14 December 2015

Derived Fields or Getting to Last Week's Top Ten by Name Instead of by Id

Last December, I blogged about how to work with timestamps in Event Log Files. The basic idea is that we generate timestamps in the form of a number that looks like this: 20151210160337.6. Only it's not really a number; it's a string. Add a few more characters and you get: 2015-12-10T16:03:37.600Z. Look more familiar? More importantly, as an ISO 8601 formatted date-time, it's easier to integrate with third party analytic platforms for trending and time series charting.
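To make the "add a few more characters" step concrete, here is a minimal Python sketch of that conversion. The function name `to_iso8601` is my own, and the sketch assumes the raw value is always fourteen digits, optionally followed by a dot and a fractional-second part, and that it is in UTC (which the trailing Z suggests). It is an illustration, not the code that generates the log files.

```python
from datetime import datetime


def to_iso8601(raw: str) -> str:
    """Turn an Event Log File style timestamp string into ISO 8601.

    Assumes the shape YYYYMMDDHHMMSS[.fraction] in UTC (an assumption).
    """
    whole, _, fraction = raw.partition(".")
    # strptime also validates that the digits form a real date and time.
    parsed = datetime.strptime(whole, "%Y%m%d%H%M%S")
    # Pad or trim the fractional part to exactly three digits (milliseconds).
    millis = (fraction + "000")[:3]
    return f"{parsed:%Y-%m-%dT%H:%M:%S}.{millis}Z"


print(to_iso8601("20151210160337.6"))  # prints 2015-12-10T16:03:37.600Z
```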

The same can be said of other fields, like the ubiquitous USER_ID field, which is stored as fifteen characters: 005B0000001WamP. However, when you integrate with an analytics platform like Wave or Splunk, the standard eighteen character id, like 005B0000001WamPIAS, makes it much easier to match against the actual User sObject so that you can retrieve other values such as the user's profile or role. The same is true of any other object that you would want to denormalize so that you can create a report on the top ten users or reports by name instead of by id.
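If you are curious how the three extra characters come about, here is a small Python sketch of the commonly described case-checksum scheme: each group of five characters contributes one suffix character that records which positions hold uppercase letters. The function name `to_18_char_id` is hypothetical, and this is my reading of the scheme rather than the platform's own code.

```python
# 32 symbols, one for each possible 5-bit "which characters are uppercase" mask.
SUFFIX_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"


def to_18_char_id(short_id: str) -> str:
    """Append the three-character case checksum to a fifteen character id."""
    if len(short_id) != 15:
        raise ValueError("expected a fifteen character id")
    suffix = ""
    for start in (0, 5, 10):
        chunk = short_id[start:start + 5]
        # Bit i is set when the i-th character of the chunk is an uppercase letter.
        mask = sum(1 << i for i, ch in enumerate(chunk) if "A" <= ch <= "Z")
        suffix += SUFFIX_ALPHABET[mask]
    return short_id + suffix


print(to_18_char_id("005B0000001WamP"))  # prints 005B0000001WamPIAS
```

Because the suffix encodes the casing, the eighteen character form can be matched safely by tools that compare strings without regard to case.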

In Winter '16, we introduced derived fields as a sleepy enhancement to Event Log Files. It is sleepy in that there is no ticker tape parade, no marketing campaigns, no clappies at Dreamforce. Instead, it's a simple and effective way of transforming data when the file is generated based on patterns and data already contained within the file. It's TEL (Transform, Extract, and Load) instead of ETL (Extract, Transform, and Load).

The original data doesn't go away. Instead, a new field is added with a '_DERIVED' suffix. That way, existing integrations won't break, and options exist for other kinds of non-standard transformations. For instance, the fifteen character USER_ID now has an eighteen character counterpart, USER_ID_DERIVED, and the TIMESTAMP has a TIMESTAMP_DERIVED equivalent. This opens new doors to future transformations.

This means applications like Wave Analytics can more easily augment or join data based on these transformed ids and timestamps.
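As a rough illustration of that kind of join outside of Wave, here is a Python sketch that counts log rows per user and labels each count with a name looked up by the eighteen character id. The function `top_by_name`, the sample rows, and the `sample_names` mapping (a stand-in for the result of querying User records) are all hypothetical; only the USER_ID and USER_ID_DERIVED column names come from the log file itself.

```python
import csv
import io
from collections import Counter


def top_by_name(log_csv, names_by_id, limit=10):
    """Count log rows per user and label each count with the user's name."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(log_csv)):
        user_id = row["USER_ID_DERIVED"]
        # Fall back to the raw id when no matching name was supplied.
        counts[names_by_id.get(user_id, user_id)] += 1
    return counts.most_common(limit)


# Hypothetical sample data: three events from two users.
sample_log = (
    "EVENT_TYPE,USER_ID,USER_ID_DERIVED\n"
    "Report,005B0000001WamP,005B0000001WamPIAS\n"
    "Report,005B0000001WamP,005B0000001WamPIAS\n"
    "Report,005B0000001WamQ,005B0000001WamQIAS\n"
)
# Stand-in for names retrieved from the User sObject.
sample_names = {"005B0000001WamPIAS": "Example User"}

print(top_by_name(sample_log, sample_names))
```

The user with a known name shows up by name, while the unmatched one falls back to its id, which is the difference between the two kinds of report.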

As a result, reports look like this:

[Screenshot: report listing entries by name]

instead of like this:

[Screenshot: report listing entries by id]

Derived fields in Event Log Files handle the problem of data preparation by transforming the data when the file is generated, making future transformations easier. Hope this helps with your integrations!
