08 December 2014

Event Monitoring + Salesforce Wave = BFF


At Dreamforce 2014, I led a session entitled Audit Analytics that described the integration between Event Monitoring and Project Wave.

Combing the two solutions is a no brainer. Event Log Files generates CSV files on a daily basis. The Wave platform makes sense of CSVs for business analysts. 

While you can watch the video at http://bit.ly/DF14AuditAnalytics, there are a couple of tips, tricks, and best practices I want to share when using Event Log Files with the Wave platform:
  1. Consider storage requirements. Event data isn't like CRM data - there's a lot more of it. One org I work with logs approximately twenty million rows of event data per day using Event Log Files. That's approximately 600 million rows per month or 3.6 billion every half year. That means you will need to consider what data you import and how you manage that data over time.
  2. Understand your schema. There are tons of great use cases that Event Log Files solve; however, the secret sauce here is understanding what's possible already. Download a sample of files and take a look in Excel or run the head command in your terminal (i.e. head -n 2 VisualforceRequest-2014-10-21.csv) to get a sense of the kinds of lenses and dashboards you want to create. Read more about the lexicon of possible field values in the Event Log File Field Lexicon blog posting.
  3. You should convert the TIMESTAMP field in each log file to something that Wave can understand and trend in their timeline graphs. Event Log Files provides an epoch style TIMESTAMP (i.e. 20140925015802.062) rather than date format (i.e. 2014-09-25T01:58:02Z). I usually build this transformation into the download process. Read more about this transformation process with my Working with Timestamps in Event Log Files blog posting.
  4. You should de-normalize Ids into Name fields where possible. For instance, instead of uploading just USER_ID, you should also upload USER_NAME so that it's more human readable. If you don't do this before you upload the data, you can always use SAQL to help normalize name fields. Read more about using pig and data pipelines to de-normalize data before importing it into Wave with the Hadoop and Pig come to the Salesforce Platform with Data Pipelines blog posting.
  5. Merge files across days to reduce the number of datasets you have to manage (i.e. awk -F ',' 'FNR > 1 {print $0}' new_* > merged_file.csv) rather than treating each day of log files as a new dataset.
  6. Import your data using the dataset loader from Github: https://github.com/forcedotcom/Analytics-Cloud-Dataset-Utils/releases. This is the easiest way to automate dataset creation and management.
Combining large scale event data about the operational health of your organization with the power of an incredible visualization platform has the ability to change how you separate truth from fiction with your users.

01 December 2014

Working with Timestamps in Event Log Files

An event in a log file represents that something happened in our application along a timeline of events.

As a result, every Event Log File contains a TIMESTAMP field which represents the time each event happened in GMT. This is useful for understanding when the event happened, for correlating user behavior during a time period, and for trending similar events over various periods of time.



The TIMESTAMP field in Event Log Files is stored as a number. This simplifies storage costs since date formatting a string comes with a storage cost and can be more difficult to transform latter. This can become a challenge when importing Event Log File data into an analytics system that requires a different date time format. And there are a lot of different kinds of date time formats that are possible.

For instance, Salesforce Analytics Cloud's Wave platform accepts a variety of different date time formats:


This means that you will have to convert the TIMESTAMP field for each row within an Event Log File into something that Wave or any other analytics platform can interpret.

I usually convert the TIMESTAMP when I download the file, that way it makes it easier to do it in one step.

To convert it, I use a simple AWK script that Aakash Pradeep wrote, in my download script or in my Mac terminal. It takes the input from a downloaded file like Login.csv and creates a new file, substituting each TIMESTAMP field value with the right format:
awk -F ','  '{ if(NR==1) printf("%s\n",$0); else{ for(i=1;i<=NF;i++) { if(i>1&& i<=NF) printf("%s",","); if(i == 2) printf "\"%s-%s-%sT%s:%s:%sZ\"", substr($2,2,4),substr($2,6,2),substr($2,8,2),substr($2,10,2),substr($2,12,2),substr($2,14,2); else printf ("%s",$i);  if(i==NF) printf("\n")}}}' Login.csv  > new_Login1.csv

Date time formats can be a challenge in any system and this utility provides me a quick and easy way of converting date time formats into something I can use with my analytics platform of choice.

24 November 2014

Downloading Event Log Files using a Script

Event Monitoring, new in the Winter '15 release, enables use cases like adoption, user audit, troubleshooting, and performance profiling using an easy to download, file based API to extract Salesforce app log data.

The most important part is making it easy to download the data so that you can integrate it with your analytics platform.

To help make it easy, I created a simple bash shell script to download these CSV (comma separated value) files to your local drive. It works best on Mac and Linux but can be made to work with Windows with a little elbow grease. You can try these scripts out at http://bit.ly/elfScripts. These scripts do require a separate JSON library called jq to parse the JSON that's returned by the REST API.

It's not difficult to build these scripts using other languages such as Ruby, Perl, or Python. What's important is the data flow.

I prompt the user to enter their username and password (which is masked). This information can just as easily be stored in environment variables or encrypted so that you can automate the download on a daily basis using CRON or launchd schedulers.

#!/bin/bash
# Bash script to download EventLogFiles
# Pre-requisite: download - http://stedolan.github.io/jq/ to parse JSON

#prompt the user to enter their username or uncomment #username line for testing purposes
read -p "Please enter username (and press ENTER): " username

#prompt the user to enter their password 
read -s -p "Please enter password (and press ENTER): " password

#prompt the user to enter their instance end-point 
echo 
read -p "Please enter instance (e.g. na1) for the loginURL (and press ENTER): " instance

#prompt the user to enter the date for the logs they want to download
read -p "Please enter logdate (e.g. Yesterday, Last_Week, Last_n_Days:5) (and press ENTER): " day

Once we have the credentials, we can log in using oAuth and get the access token.

#set access_token for OAuth flow 
#change client_id and client_secret to your own connected app - http://bit.ly/clientId
access_token=`curl https://${instance}.salesforce.com/services/oauth2/token -d "grant_type=password" -d "client_id=3MVG99OxTyEMCQ3hSja25qIUWtJCt6fADLrtDeTQA12.liLd5pGQXzLy9qjrph.UIv2UkJWtwt3TnxQ4KhuD" -d "client_secret=3427913731283473942" -d "username=${username}" -d "password=${password}" -H "X-PrettyPrint:1" | jq -r '.access_token'`

Then we can query the event log files to get the Ids necessary to download the files and store the event type and log date in order to properly name the download directory and files.

#set elfs to the result of ELF query
elfs=`curl https://${instance}.salesforce.com/services/data/v31.0/query?q=Select+Id+,+EventType+,+LogDate+From+EventLogFile+Where+LogDate+=+${day} -H "Authorization: Bearer ${access_token}" -H "X-PrettyPrint:1"`

Using jq, we can parse the id, event type, and date in order to create the directory and file names

#set the three variables to the array of Ids, EventTypes, and LogDates which will be used when downloading the files into your directory
ids=( $(echo ${elfs} | jq -r ".records[].Id") )
eventTypes=( $(echo ${elfs} | jq -r ".records[].EventType") )
logDates=( $(echo ${elfs} | jq -r ".records[].LogDate" | sed 's/'T.*'//' ) )

We create the directories to store the files. In this case, we download the raw data and then convert the timestamp to something our analytics platform will understand better.

Then we can iterate through each download, renaming it to the Event Type + Log Date so that we easily refer back to it later on. I also transform the Timestamp field to make it easier to import into an analytics platform like Project Wave from Salesforce Analytics Cloud.

#loop through the array of results and download each file with the following naming convention: EventType-LogDate.csv
for i in "${!ids[@]}"; do
    
    #make directory to store the files by date and separate out raw data from 
    #converted timezone data
    mkdir "${logDates[$i]}-raw"
    mkdir "${logDates[$i]}-tz"

    #download files into the logDate-raw directory
    curl "https://${instance}.salesforce.com/services/data/v31.0/sobjects/EventLogFile/${ids[$i]}/LogFile" -H "Authorization: Bearer ${access_token}" -H "X-PrettyPrint:1" -o "${logDates[$i]}-raw/${eventTypes[$i]}-${logDates[$i]}.csv" 

    #convert files into the logDate-tz directory for Salesforce Analytics
    awk -F ','  '{ if(NR==1) printf("%s\n",$0); else{ for(i=1;i<=NF;i++) { if(i>1&& i<=NF) printf("%s",","); if(i == 2) printf "\"%s-%s-%sT%s:%s:%sZ\"", substr($2,2,4),substr($2,6,2),substr($2,8,2),substr($2,10,2),substr($2,12,2),substr($2,14,2); else printf ("%s",$i);  if(i==NF) printf("\n")}}}' "${logDates[$i]}-raw/${eventTypes[$i]}-${logDates[$i]}.csv" > "${logDates[$i]}-tz/${eventTypes[$i]}-${logDates[$i]}.csv"

done

Downloading event log files is quick and efficient.  You can try these scripts out at http://bit.ly/elfScripts. Give it a try!

10 November 2014

Event Log File Field Lexicon

Event Log Files, new in the Winter '15 release, enables adoption, troubleshooting, and auditing use cases using an easy to download, file based API to extract Salesforce app log data.

It's an extremely rich data source, originally created by Salesforce developers to better understand the operational health of the overall service and better support our customers.

Extending access to these log files provides our customers the ability to support themselves using some of the same tools we've used to support them.

Most fields in the log files are self-describing like CLIENT_IP or TIMESTAMP. However, some of the log file fields can be difficult to understand without a lexicon.

There are a couple of reasons for this. One reason is because some fields are derived where data is encoded in an enumerated value or with an acronym which is defined in a separate place in the code.

A lot of time, this is done because less characters or numeric codes take up less total storage space which is important when you're storing terabytes of log files every day.

But this leaves us with a problem, what in the world does the data actually mean?

For instance, rather than store 'Partner' for the API_TYPE in the API log file, we store a simple code of 'P'.

Another example is when the code is spelled out and still needs interpretation. For instance, VersionRenditionDownload for the TRANSACTION_TYPE in the ContentTransfer log file simply means someone previewed a file in the app instead of downloading it (which is actually VersionDownloadAction or VersionDownloadApi).


All of this means we need a lexicon to map codes to possible values or examples so that we understand the data we're downloading.

Below are some example fields to help make sense of the data from Event Log Files.

Common Log File Fields
These are log fields you'll see across many different log files and typically address who, what, when, where, and how.

Field NameDescriptionPossible Values or Examples (e.g.)
CLIENT_IPThe IP address of the client using Salesforce services.e.g. 192.168.0.1
EVENT_TYPEThe type of event, such as content sharing.e.g. URI
ORGANIZATION_IDThe 15-character ID of the organization.e.g. 00DB00000000mZw
REQUEST_IDThe unique ID of a single transaction.e.g. 3nWgxWbDKWWDIk0FKfF5DV
REQUEST_STATUSThe status of the request for a page view or user interface action.Possible values include:
• S: Success
• F: Failure
• U: Uninitialized
TIMESTAMPThe access time of Salesforce services, in UTC time.e.g. 20130715233322.670,
which equals 2013-07-15T23:33:22.670+0000.
URIThe URI of the page receiving the request.e.g. /home/home.jsp
USER_IDThe 15-character ID of the user using Salesforce services, whether through the UI or the API.e.g. 005B00000018C2g

Log File Specific Fields
These are log fields that are typically unique to one or two log files and typically represent a type, operation, or other enumerated value.

EventType (File Type)Field NameDescriptionPossible Values or Examples (e.g.)
APEX_CALLOUT_EVENTMETHODThe HTTP method of the callout.e.g. GET, POST, PUT, DELETE
APEX_CALLOUT_EVENTTYPEThe type of calloute.g. REST, AJAX
APEX_TRIGGER_EVENTTRIGGER_TYPEThe type of this trigger.The types of triggers are:
• AfterInsert
• AfterUpdate
• BeforeInsert
• BeforeUpdate
API_EVENTMETHOD_NAMEThe API method that is invoked.e.g. query(), insert(), upsert(), delete()
API_EVENTAPI_TYPEThe type of API invoked.values include:
• X: XmlRPC
• O: Old SOAP
• E: SOAP Enterprise
• P: SOAP Partner
• M: SOAP Metadata
• I: SOAP Cross Instance
• S: SOAP Apex
• D: Apex Class
• R: REST API
• T: SOAP Tooling
ASYNC_REPORT_EVENTDISPLAY_TYPEThe report display type, indicating the run mode of the report.Possible values include:
• D: Dashboard
• S: Show Details
• H: Hide Details
ASYNC_REPORT_EVENTRENDERING_TYPEThe report rendering type, describing the format of the report output.Possible values include:
• W: Web (HTML)
• E: Email
• P: Printable
• X: Excel
• C: CSV (comma-separated values)
• J: JSON (JavaScript object notation)
CONTENT_DOCUMENT_LINK_EVENTSHARING_OPERATIONThe type of sharing operation on the document.e.g. INSERT, UPDATE, or DELETE.
CONTENT_DOCUMENT_LINK_EVENTSHARING_PERMISSIONWhat permissions the document was shared with.The possible values include:
• V: Viewer
• C: Collaborator
• I: Inferred—that is, the sharing permissions were inferred from a relationship between the viewer and document. For example, a document’s owner has a sharing permission to the document itself. Or, a document can be a part of a content collection, and the viewer has sharing permissions to the collection, rather than explicit permissions to the document directly.
CONTENT_TRANSFER_EVENTTRANSACTION_TYPEThe operation performed.The possible operations include:
• VersionDownloadAction and
VersionDownloadApi represent downloads via the user interface and API respectively.
• VersionRenditionDownload represents a file preview action.
• saveVersion represents a file being uploaded.
DASHBOARD_EVENTDASHBOARD_TYPEThe type of dashboard.Valid types include:
• R: Run as Running User
• C: Run as Context User
• S: Run as Specific User
LOGOUT_EVENTUSER_INITIATED_LOGOUTThe user type used when logging out.The value is 1 if the user intentionally logged out by
clicking the Logout link, and 0 if they were logged out by a timeout or other implicit logout action.
MDAPI_OPERATION_EVENTOPERATIONThe operation being performede.g. DEPLOY, RETRIEVE, LIST,
DESCRIBE
PACKAGE_INSTALL_EVENTOPERATION_TYPEThe type of package operation.Possible values include:
• INSTALL
• UPGRADE
• EXPORT
• UNINSTALL
• VALIDATE_PACKAGE
•INIT_EXPORT_PKG_CONTROLLER
REPORT_EVENTDISPLAY_TYPEThe report display type, indicating the run mode of the report.Possible values include:
• D: Dashboard
• S: Show Details
• H: Hide Details
REPORT_EVENTRENDERING_TYPEThe report rendering type, describing the format of the report output.Possible values include:
• W: Web (HTML)
• E: Email
• P: Printable
• X: Excel
• C: CSV (comma-separated values)
• J: JSON (JavaScript object notation)
REST_API_EVENTMETHODThe HTTP method of the requeste.g. GET, POST, PUT, DELETE
SITES_EVENTHTTP_METHODThe HTTP method of the requestGET, POST, PUT, DELETE
SITES_EVENTREQUEST_TYPEThe request type.Possible values include:
• page: a normal request for a page
• content_UI: a content request for a page
originated in the user interface
• content_apex: a content request initiated
by an Apex call
• PDF_UI: a request for a page in PDF format
through the user interface
• PDF_apex: a request for PDF format by an
Apex call (usually a Web Service call)
UI_TRACKING_EVENTCONNECTION_TYPEMethod used by the mobile device to connect to the web.Values can include:
• CDMA1x
• CDMA
• EDGE
• EVDO0
• EVDOA
• EVDOB
• GPRS
• HSDPA
• HSUPA
• HRPD
• LTE
• OFFLINE
• WIFI
VISUALFORCE_EVENTREQUEST_TYPEThe request type.Possible values include:
• page: a normal request for a page
• content_UI: a content request for a page
originated in the user interface
• content_apex: a content request initiated
by an Apex call
• PDF_UI: a request for a page in PDF format
through the user interface
• PDF_apex: a request for PDF format by an
Apex call (usually a Web Service call)

03 November 2014

Hadoop and Pig come to the Salesforce Platform with Data Pipelines


Event Log Files is big - really, really big. This isn't your everyday CRM data where you may have hundreds of thousands of records or even a few million here and there. One organization I work with does approximately twenty million rows of event data per day using Event Log Files. That's approximately 600 million rows per month or 3.6 billion every half year.

Because the size of the data does matter, we need tools that can orchestrate and process this data for a variety of use cases. For instance, one best practice when working with Event Log Files is to de-normalize Ids into Name fields. Rather than reporting on the most adopted reports by Id, it's better to show the most adopted reports by Name.

There are many ways to handle this operation outside of the platform. However, on the platform there's really only been one way to handle this in the past with Batch Apex.

In pilot with the Winter '15 release (page 198 of the release notes), data pipelines provides a way for users to upload data into Hadoop and run Pig scripts against it. The advantage is that it handles many different data sources including sobjects, chatter files, and external objects at scale.

I worked with Prashant Kommireddi on the following scripts which help me understand which reports users are viewing:

1. Export user Ids and Names using SOQL into userMap.csv (Id,Name) which I upload to chatter files
-- 069B0000000NBbN = userMap file stored in chatter
user = LOAD 'force://chatter/069B0000000NBbN' using gridforce.hadoop.pig.loadstore.func.ForceStorage() as (Id, Name);
-- loop through user map to reduce Id from 18 to 15 characters to match the log lines
subUser = foreach user generate SUBSTRING(Id, 0, 15) as Id, Name;
-- storing into FFX to retrieve in next step
STORE subUser INTO 'ffx://userMap15.csv' using gridforce.hadoop.pig.loadstore.func.ForceStorage();

2. Export report Ids and Names using SOQL into reportMap.csv (Id,Name) which I upload to chatter files
-- 069B0000000NBbD = reportMap file stored in chatter
report = LOAD 'force://chatter/069B0000000NBbD' using gridforce.hadoop.pig.loadstore.func.ForceStorage() as (Id, Name);
-- loop through user map to reduce Id from 18 to 15 characters to match the log lines
subReport = foreach report generate SUBSTRING(Id, 0, 15) as Id, Name;
-- storing into FFX to retrieve in next step
STORE subReport INTO 'ffx://reportMap15.csv' using gridforce.hadoop.pig.loadstore.func.ForceStorage();

3. createReportExport - Upload ReportExport.csv to chatter files and run script to combine all three
-- Step 1: load users and store 15 char id
userMap = LOAD 'ffx://userMap15.csv' using gridforce.hadoop.pig.loadstore.func.ForceStorage() as (Id, Name);
-- Step 2: load reports and store 15 char id
reportMap = LOAD 'ffx://reportMap15.csv' using gridforce.hadoop.pig.loadstore.func.ForceStorage() as (Id, Name);
-- Step 3: load full schema from report export elf csv file
elf = LOAD 'force://chatter/069B0000000NB1r' using gridforce.hadoop.pig.loadstore.func.ForceStorage() as (EVENT_TYPE, TIMESTAMP, REQUEST_ID, ORGANIZATION_ID, USER_ID, RUN_TIME, CLIENT_IP, URI, CLIENT_INFO, REPORT_DESCRIPTION);
-- Step 4: remove '/' from URI field to create an Id map
cELF = foreach elf generate EVENT_TYPE, TIMESTAMP, REQUEST_ID, ORGANIZATION_ID, USER_ID, RUN_TIME, CLIENT_IP, SUBSTRING(URI, 1, 16) as URI, CLIENT_INFO, REPORT_DESCRIPTION;
-- Step 5: join all three files by the common user Id field
joinUserCELF = join userMap by Id, cELF by USER_ID;
joinReportMapELF = join reportMap by Id, cELF by URI;
finalJoin = join joinUserCELF by cELF::USER_ID, joinReportMapELF by cELF::USER_ID;
-- Step 6: generate output based on the expected column positions
elfPrunedOutput = foreach finalJoin generate $0, $1, $2, $3, $4, $5, $7, $8, $10, $11, $12, $13;
-- Step 7: store output in CSV
STORE elfPrunedOutput INTO 'force://chatter/reportExportMaster.csv' using gridforce.hadoop.pig.loadstore.func.ForceStorage();

By combining the power of data pipelines, I can transform the following Wave platform report from:

To:


To learn more about data pipelines and using Hadoop at Salesforce, download the Data Pipelines Implementation Guide (Pilot) and talk with your account executive about getting into the pilot.

27 October 2014

Tracking User Activity Across Browsers and Mobile Devices

I'm often asked whether we can track user activity at a more granular level than what's currently provided with Login History, Setup Audit Trail, and other existing monitoring features in Salesforce.

When I tell them yes, people's imaginations immediate kick into over-drive. Without understanding what is possible, people begin to imagine every possible way data can be accessed whether through a button click, running a report, viewing a list view, hovering over a related list, or looking at search results.

There are many ways users may interact with data in Salesforce. This blog post is designed to separate out fact from fiction when understanding how granular we can track user activity while working with the new Event Log Files functionality.

Event Log Files provides self-service access for customers to server generated log records. This means that a server interaction had to happen in order to record the event. The most typical server interaction is the change in a URI (uniform resource identifier, analogous to the URL you see in your address bar).


For example, when I clicked on the Marc Benioff contact record from the Home tab, the URL in the address bar changed by adding the contact id. As a result, the entry in the log file shows a Referrer URI of /home/home.jsp and a URI of /0033000000Vt4Od.

This URI interaction unto itself is powerful considering salesforce grew up as a native web application. Most things that we click ultimately change the URI. The easiest way to test this is to click something in the application and see if the address bar changes.

De-coding the URIs isn't difficult, it just takes a lexicon:
Standard object based pages:

/d: Detail- Detailing a single record and with its associated records
/m: Hover- HoverDetail page that uses a mini layout and no header/footer
/e: editPage Allowing the editing of a single record
/p: printableView- In a relatively unadorned format, detailing a single record and all of its associated records.
/o: Overview of a single entity.
/l: list- A filtered list of a single entity
/x: Printable list: A filtered list of a single entity. Does not have a help link, since you can't click links on paper.
/r: Refresh list: A stripped down version of a list filtered by ids
/s: special: Special is used for "other" pages where you want to reuse parts of the edit/detail page
/h: history: Show the history (used only in forecasting)
/a: Assign: Entity Owner Change page
/c: calendar: Time-based (Calendar) view of list data
/n: mini edit: Mini layout edit page

Custom Objects based Pages

/d: Detail
/m: Hover
/e: Edit
/p: Printable View
/o: Overview
/l: List
/x: Printable List
/a: Assign
/r: Refresh List

We can now track when someone prints a page or list view, edits a record or creates one, changes ownership, or even refreshes a list.

URI events mainly track what happens in the browser. In order to track similar interactions on a mobile device using a Salesforce 1 application, we have a separate UI Tracking log event.

At Dreamforce 2014's True to the Core session Mark Bruso asked me about this distinction and what we actually track.

Most of the time, when people ask about Salesforce 1 mobile, it's to validate the effectiveness of a BYOD (Bring Your Own Device) mobile strategy with a focus on the type of device, network used, and operating system of choice. The goal is typically to rationalize an investment in mobile in addition to understanding what their users are doing when they are in a Salesforce 1 application.


However, we also track a couple key attributes in the UI Tracking log file including Referrer and Action. This are analogous to the URI and Referrer URI attributes in the URI log file.

When you combine these two data sets, the big picture emerges across these different platforms. Now we can track what's happening with a user regardless of whether they use a browser or a mobile phone.


This is powerful for answering questions like:
  1. What % of my users are in the browser versus on the phone?
  2. Where are users using their mobile devices since they may be on the road?
  3. Where are my users spending their time and on what records?
  4. How frequently are they logging in and what hours of the day?
  5. Who clicked on what, when, where, and how?
Tracking user activity in salesforce isn't rocket surgery. We can't track everything a client side script might, but we can track a lot. And what we track enables us to re-create what a user did and paint a picture that helps address a variety of adoption, troubleshooting, and audit use cases.

20 October 2014

Salesforce Application Monitoring with Event Log Files



Have you ever:
  • wondered how to make your sales and support reps more successful?
  • wanted to track the adoption of projects that you roll out on the Salesforce platform like S1, Chatter, or the Clone This User app from Arkus?
  • wanted to find out which apex classes are succeeding and how long it takes for your Visualforce pages to render in production?
  • find out why some reports run slower than others?
  • needed to audit when ex-employees leave the company with your customer list?
Application Monitoring using Event Log Files, new in the Winter '15 release, enables all of these use cases and many, many more using an easy to download, file based API to extract Salesforce app log data.

When we started building this feature, which has been in pilot for over a year, we talked with a lot of customers who wanted access to our server logs for a variety of use cases.

What we heard from many of those customers was that they wanted to easily integrate the log data from all of their organizations with their back-end reporting and audit systems so they could drill down into the day-to-day detail. As a result, you won't find a user interface within setup to access these files; everything is done through the API in order to make integration easy.

The idea behind this feature is simple. Everyday, Salesforce generates a massive amount of app log data on our servers.

Our app logs do not contain customer data but instead contain metadata about the events happening in an organization. As a result, we store a report id rather than a report name or an account id instead of an account name. This obfuscation of the data enables customer's to normalize ids into names for records.

Every night, we ship these logs to a Hadoop server where we map reduce over the log data to create Event Log Files for organizations enabled with the feature.

As a result, every morning, a customer can download the previous day's log events in the form of CSV (comma separated values) files using the API. We chose CSV as a file format because it's easy to use when migrating data between systems.

Once you have this file, you can easily integrate it with a data warehouse analytics tool, build an app on top of a platform like force.com, or buy an ISV (Independent Software Vendor) built app to analyze and work with the data.

To make it easy to try this feature out and build new kinds of apps, we are including one day of data retention for all Developer Edition organizations. That means if you have a Developer Edition organization already, just log into it using the API and you'll have access to the previous day's worth of log data. If you don't already have one, just go to http://developerforce.com/signup to get your free Developer Edition org.

Application monitoring at Salesforce with Event Log Files has just made auditing and tracking user adoption easier than ever before.



Icons by DryIcons

06 October 2014

#Where is the Salesforce Hacker at Dreamforce 2014



This will be my tenth year presenting at the conference. And every year, I look forward to this event!

When I joined in the Summer of 2005, the big news of the conference was:
  • Customizable Forecasting
  • AppExchange
  • Field History Tracking on Cases, Solutions, Contracts, and Assets
Customizable Forecasting is now in it's third iteration and looks better than ever. 

AppExchange has over twenty-five hundred apps that have been installed over two and one half million times times.

And Field History has grown to almost all objects and over one hundred billion rows of audit data.

But what makes Dreamforce truly remarkable is definitely not the features that we highlight, the band that headlines the conference, or the orders of magnitude growth - it's the people that come to the conference. Every year, I talk with as many customers, partners, and vendors as I can. I love Dreamforce for their stories, for their use cases, for their challenges that they bring to the conference in hopes of replacing those challenges with solutions.

In the words of a colleague of mine, this conference is magical.

If you get a chance, stop by some of my sessions listed below and feel free to introduce yourself. I would love to meet you!


Session
Location
Time
Registration Link
New Event Monitoring: Understand Your Salesforce Org Activity Like Never Before
InterContinental San Francisco Grand Ballroom AB
Monday, October 13th 

12:00 PM - 12:40 PM
Learn the Four A's to Admin Success
San Francisco Marriott Marquis, Yerba Buena - Salon 7
Monday, October 13th

3:30 PM - 
4:10 PM
Event Monitoring for Admins
Moscone Center West Admin Theater Zone
Tuesday, October 14th

11:00 AM - 11:20 AM
Project Wave Use Case: Audit Analytics
Moscone Center West 2007
Tuesday, October 14th

4:00 PM - 
4:40 PM
Do-It-Yourself Access Checks with Custom Permissions
Moscone Center West 2009
Tuesday, October 14th 

5:00 PM - 
5:40 PM
Creating Dynamic Visualizations with Event Log Files
Moscone Center West 2006
Wednesday, October 15th 

3:15 PM - 
3:55 PM
Parker Harris's True to the Core: What’s Next for Our Core Products
YBCA - Lam Research Theater
Wednesday, October 15th 

5:00 PM - 
5:40 PM

22 July 2014

DIY salesforce.com monitoring with real-time SMS notifications


Recently, while on a customer on-site, I was asked a simple question - how can we do real-time monitoring of salesforce.com?

These were system administrators and operations people used to monitoring the uptime of their data center. Of course they expected real-time monitoring and automated alerts.

There are many ways to monitor salesforce. And when there isn't standard functionality to monitor, there is always a custom solution.

About a week ago, I started running into some issues with a new service that I was building. I was inspired by a sparkfun blog article I read about an open API based on Phant that allows you to post arbitrary custom values for real-time monitoring. I decided to build my own real-time monitoring system based on a simple heartbeat design that would notify me when my heartbeat skipped a beat. And when it didn't skip a beat, I just wanted to log the success and chart the trend over time for discussion with our engineers. This was similar to the requirements I heard while at the on-site with my customer.

I had some basic requirements for the first iteration of my monitoring service:

  1. it had to be automated to provide real-time data
  2. it had to perform the simplest query to determine availability 
  3. the query mechanism needed to be secure and hosted outside of salesforce 
  4. the charting and notification systems had to be as simple as possible, preferably no passwords or fees for using it in the first iteration. As long as I could obfuscate sensitive data, it could even be publicly exposed data.

My first prototype was done in about half an hour.

  • I created a bash shell script that I hosted on a Linux box under my desk. This was the secure part hosted outside of salesforce.
  • I created a CRON job on my Linux box set to run the shell script every minute. This would consume 1440 API calls a day as a result but I thought I could fine tune the frequency of the script later to suit my needs. Increasing the real time nature increases cost of API calls and vice versa, I can decrease cost by loosening my requirements. This was the automated part of the solution.
  • The shell script data flow was simple: log in using OAuth and curl, query to get a count of an sObject, and parse the result. If the result has a number, consider it a success, otherwise consider it a failure and log the error.
  • I used a free data publishing service from data.sparkfun.com. Originally created for publicly accessible IoT (Internet of Things) apps like weather device data, it made it trivial to expose the data I needed in a simple Rest API. In the next iteration, I would use keen.io which has more functionality and freemium options but involved more design than necessary wiring up my first iteration. You can check out my live heartbeat monitor that I'm still using to monitor my service.
  • I created a google charting API report to visualize the data. This was the visualization part of the solution and entirely based on a phant.io blog posting.
  • I used a freemium SMS service called SendHub to handle the notifications. I originally used Twilio but needed a simpler, freemium option for the first iteration.




Every minute, the CRON job would wake the bash shell script. The script would log into salesforce using the rest API, query a count of my new sobject, and if successful it would log a row to sparkfun which I viewed on their public page. If it failed, I would log another row to sparkfun with the error message. I then sent a SMS notification of the failure to my cell phone. To view a trend of successes and failures over time (which was useful to see what happened when I was away from my phone or asleep), I used my Google charting report.

This DIY project highlights a simple case of real-time monitoring.  If you want to try it out, you can find the code for this project in my Github repository - heartbeatMonitor.