24 November 2014

Downloading Event Log Files using a Script

Event Monitoring, new in the Winter '15 release, enables use cases like adoption, user audit, troubleshooting, and performance profiling using an easy to download, file based API to extract Salesforce app log data.

The most important part is making it easy to download the data so that you can integrate it with your analytics platform.

To help make it easy, I created a simple bash shell script to download these CSV (comma separated value) files to your local drive. It works best on Mac and Linux but can be made to work with Windows with a little elbow grease. You can try these scripts out at http://bit.ly/elfScripts. These scripts do require a separate JSON library called jq to parse the JSON that's returned by the REST API.

It's not difficult to build these scripts using other languages such as Ruby, Perl, or Python. What's important is the data flow.

I prompt the user to enter their username and password (which is masked). This information can just as easily be stored in environment variables or encrypted so that you can automate the download on a daily basis using CRON or launchd schedulers.

#!/bin/bash
# Bash script to download EventLogFiles
# Pre-requisite: download - http://stedolan.github.io/jq/ to parse JSON

#prompt the user to enter their username or uncomment #username line for testing purposes
read -p "Please enter username (and press ENTER): " username

#prompt the user to enter their password 
read -s -p "Please enter password (and press ENTER): " password

#prompt the user to enter their instance end-point 
echo 
read -p "Please enter instance (e.g. na1) for the loginURL (and press ENTER): " instance

#prompt the user to enter the date for the logs they want to download
read -p "Please enter logdate (e.g. Yesterday, Last_Week, Last_n_Days:5) (and press ENTER): " day

Once we have the credentials, we can log in using oAuth and get the access token.

#set access_token for OAuth flow 
#change client_id and client_secret to your own connected app - http://bit.ly/clientId
access_token=`curl https://${instance}.salesforce.com/services/oauth2/token -d "grant_type=password" -d "client_id=3MVG99OxTyEMCQ3hSja25qIUWtJCt6fADLrtDeTQA12.liLd5pGQXzLy9qjrph.UIv2UkJWtwt3TnxQ4KhuD" -d "client_secret=3427913731283473942" -d "username=${username}" -d "password=${password}" -H "X-PrettyPrint:1" | jq -r '.access_token'`

Then we can query the event log files to get the Ids necessary to download the files and store the event type and log date in order to properly name the download directory and files.

#set elfs to the result of ELF query
elfs=`curl https://${instance}.salesforce.com/services/data/v31.0/query?q=Select+Id+,+EventType+,+LogDate+From+EventLogFile+Where+LogDate+=+${day} -H "Authorization: Bearer ${access_token}" -H "X-PrettyPrint:1"`

Using jq, we can parse the id, event type, and date in order to create the directory and file names

#set the three variables to the array of Ids, EventTypes, and LogDates which will be used when downloading the files into your directory
ids=( $(echo ${elfs} | jq -r ".records[].Id") )
eventTypes=( $(echo ${elfs} | jq -r ".records[].EventType") )
logDates=( $(echo ${elfs} | jq -r ".records[].LogDate" | sed 's/'T.*'//' ) )

We create the directories to store the files. In this case, we download the raw data and then convert the timestamp to something our analytics platform will understand better.

Then we can iterate through each download, renaming it to the Event Type + Log Date so that we easily refer back to it later on. I also transform the Timestamp field to make it easier to import into an analytics platform like Project Wave from Salesforce Analytics Cloud.

#loop through the array of results and download each file with the following naming convention: EventType-LogDate.csv
for i in "${!ids[@]}"; do
    
    #make directory to store the files by date and separate out raw data from 
    #converted timezone data
    mkdir "${logDates[$i]}-raw"
    mkdir "${logDates[$i]}-tz"

    #download files into the logDate-raw directory
    curl "https://${instance}.salesforce.com/services/data/v31.0/sobjects/EventLogFile/${ids[$i]}/LogFile" -H "Authorization: Bearer ${access_token}" -H "X-PrettyPrint:1" -o "${logDates[$i]}-raw/${eventTypes[$i]}-${logDates[$i]}.csv" 

    #convert files into the logDate-tz directory for Salesforce Analytics
    awk -F ','  '{ if(NR==1) printf("%s\n",$0); else{ for(i=1;i<=NF;i++) { if(i>1&& i<=NF) printf("%s",","); if(i == 2) printf "\"%s-%s-%sT%s:%s:%sZ\"", substr($2,2,4),substr($2,6,2),substr($2,8,2),substr($2,10,2),substr($2,12,2),substr($2,14,2); else printf ("%s",$i);  if(i==NF) printf("\n")}}}' "${logDates[$i]}-raw/${eventTypes[$i]}-${logDates[$i]}.csv" > "${logDates[$i]}-tz/${eventTypes[$i]}-${logDates[$i]}.csv"

done

Downloading event log files is quick and efficient.  You can try these scripts out at http://bit.ly/elfScripts. Give it a try!

10 comments:

  1. Replies
    1. Thanks Ryan for following the blog!

      Delete
    2. This comment has been removed by a blog administrator.

      Delete
  2. ny updates to this?

    The sample script in this blog works perfectly against my Developer account but not production.

    The result of the token request is:
    {
    "error" : "invalid_grant",
    "error_description" : "authentication failure"
    }

    I've been round and round. It MUST be something simple I missed on the Salesforce setup side.

    ReplyDelete
    Replies
    1. Can you check to see if a security token is needed for your user?

      Delete
    2. You can also get a sample script or download files via http://Salesforce-elf.herokuapp.com

      Delete
    3. ello Adam,
      I have a problem with your scripts on Cygwin.
      1. {access_token} variable get correct values but when it's used on curl query it doesn't show any results :(
      I mean this part:
      elfs=`curl https://${instance}.salesforce.com/services/data/v32.0/query?q=Select+Id+,+EventType+,+LogDate+From+EventLogFile+Where+LogDate+=+${day} -H "Authorization: Bearer ${access_token}" -H "X-PrettyPrint:1"`

      2. If I assign a fixed value to that the scipt starts file download but I get the following message

      .csv.-03-27.csv to URI
      .csver of input files: 1 merged to output file: URI
      -2017-03-27.csv]estalforceRequest
      .csv.-03-27.csv to VisualforceRequest
      .csver of input files: 1 merged to output file: VisualforceRequest
      -2017-03-27.csv]aveChange
      .csv.-03-27.csv to WaveChange
      .csver of input files: 1 merged to output file: WaveChange

      and no files are downloaded.
      I've tried using the script on Ubuntu 16.04 with a similar result :(

      Please help.

      Delete
    4. Hi Jakub,

      Thanks for reading the blog! I'm sorry you're having this issue.

      A way to simplify this is to bootstrap the bash script via the web browser app on heroku: http://Salesforce-elf.herokuapp.com which will auto generate the bash script and may help to troubleshoot what's going on here.

      Something odd is definitely happening in your file because it's trying to merge files. The issue with that is that the code would have to try to merge the files into a single one which is another reason to try to simplify the code to try downloading just one day and one file type first before trying to download all files at once.

      Delete