09 November 2015

Using the Setup Audit Trail API - the best kept secret of Winter '16

The Setup Audit Trail has been around for a long time. It's part of the Salesforce trust platform and is built into every edition. It tracks setup changes made by administrators, such as adding users, activating users, changing permissions (including escalations of privilege), and changing metadata like creating fields or deleting objects.

Until very recently, the Setup Audit Trail was only available as a manual download to a CSV (comma-separated values) file using a link on a page in Setup. I kept hearing how customers hated putting manual processes in place to click the link once a month and then merge the newly downloaded CSV files into previous exports. It made audit and compliance use cases much harder than they needed to be. Everyone was asking to integrate the Setup Audit Trail using the API so they could schedule regular, automated downloads of audit trail data to analytics and SIEM tools.

Starting with Winter '16, we added the Setup Audit Trail to the API as the SetupAuditTrail sObject. There are a couple of key use cases that you might want to try out using a tool like Workbench.

1. I want to know everyone who logged in as a particular end-user:
SELECT Action, CreatedById, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE CreatedBy.Name = 'Jim Rivera' 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

2. I want to know everyone an admin user logged in as:
SELECT Action, CreatedBy.Name, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE DelegateUser = 'at@xx.com' 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

3. I want to know everything that any users with a specific profile (or role) did in setup:
SELECT Action, CreatedBy.Profile.Name, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE CreatedBy.Profile.Name = 'EMEA VP' 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

4. I want to know every user who was 'frozen' in the last week:
SELECT Action, CreatedById, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE Action = 'frozeuser' AND CreatedDate = LAST_N_DAYS:7 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

5. I want to know everything a specific user did last week:
SELECT Action, CreatedById, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE CreatedBy.Name = 'Adrian Kunzle' AND CreatedDate = LAST_N_DAYS:7 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

I use LIMIT 10 just to test the queries and keep them from taking a long time to return, which is a good idea when experimenting with new queries.
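
If you'd rather script these queries than run them in Workbench, the same SOQL works against the REST API query endpoint. Here's a minimal sketch using curl; the instance hostname and access token are placeholders you'd swap for your own (for example, from an OAuth flow):

# Query SetupAuditTrail over the REST API (API v35.0 shipped with Winter '16).
# Replace the instance hostname and $ACCESS_TOKEN with values for your own org.
curl "https://na2.salesforce.com/services/data/v35.0/query/?q=SELECT+Action,CreatedById,CreatedDate,DelegateUser,Display,Section+FROM+SetupAuditTrail+ORDER+BY+CreatedDate+DESC+LIMIT+10" \
  -H "Authorization: Bearer $ACCESS_TOKEN"

From there it's easy to schedule the same call with a cron job or an ETL tool so the audit trail lands in your analytics or SIEM platform on a regular cadence.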

Once you know what kind of queries you can write, you can create powerful apps that combine the SetupAuditTrail API with an app-building platform like Node.js on Heroku.
One such application was created by the incredible Alessandro. You can download the source code from his GitHub repository or try it out with this free Heroku app.

You can also explore Setup Audit Trail data using an analytics platform like Wave.

The Setup Audit Trail is a powerful way of tracking important administrative audit events, and now it's even more accessible through the API.

02 November 2015

These are a few of my favorite things


Event Monitoring provides easily downloadable CSV (comma-separated value) log files. The value of this data is really in the insights you gain from rolling it up and aggregating it around questions like:
  1. How many accounts were accessed in the last week?
  2. How many reports were downloaded by a specific user in the last 6 months?
  3. How many total executions did I have in the past year?
  4. How big are the files that I need to store on an ongoing basis for regulatory reasons?
Normally you would spend time integrating, enriching, and analyzing log data using an analytics platform. We have a great app ecosystem of ISVs (independent software vendors) that provide pre-built insights into this data.

But sometimes you just want a quick and easy way to answer these questions without first creating an integration or enriching the data. In particular, you may want to analyze data that spans extremely long periods of time, or volumes that you haven't already integrated into your analytics platform of choice. This actually came up in conversation with a customer administrator who told me that he was downloading Event Log Files and using a command line tool called grep to ask simple questions about the data; in effect, he was trying to find a needle in the log file haystack without using an analytics platform.

Event Log Files generate CSV log file content. As a result, it's easy to report on it by piping commands together in a Unix-based CLI (Command Line Interface or Terminal). This is definitely not for the button-click crowd, but it's incredibly fast and efficient for processing large amounts of data. And since it's a good idea to keep the source log files around for a long period of time, you can always write quick piped commands to answer ad-hoc questions that come up when you don't want to first stage or munge the data into an analytics platform.

If you haven't ever worked with a CLI, I recommend first reading this great blog post from Lifehacker or this DataVu blog post. Also, if you're a Windows-only user, I wouldn't really continue with this post unless your files are on a Unix-based system like Mac or Linux that you can SSH into.

Here are a few of my favorite things when it comes to CLI commands:

Use case: log in to the file server where you're keeping the source log files
Example Prompt: ssh username@boxname
Notes: you'll need a password, which means that if you don't own that server, you'll need to request access

Use case: navigate to the directory where the log files are for a specific date (assuming you're storing them by date rather than by type or something else)
Example Prompt: cd Documents/logs/2015-09-30
Notes: if you don't manage this server, you'll need to find out where the files are first

Use case: list the files based on last modified date
Example Prompt: ls -tr
Example Output:
ApexCallout-2015-09-30.csv
ApexExecution-2015-09-30.csv
ApexSoap-2015-09-30.csv
Notes: I love this command. It's super simple and sorts by last modified time, so the most recently changed files end up at the bottom of the list, right above your prompt.

Use case: list the files based on size sorted from largest to smallest
Example Prompt: ls -less | sort -n -r
Example Output:
4116672 -rw-r--r--  1 auser  SFDC\Domain Users  2107733241 Oct  1 21:43 ApexTrigger-2015-09-30.csv
3512792 -rw-r--r--  1 auser  SFDC\Domain Users  1798546132 Oct  1 21:53 UITracking-2015-09-30.csv
3437816 -rw-r--r--  1 auser  SFDC\Domain Users  1760159289 Oct  1 21:58 URI-2015-09-30.csv
Notes: This is really helpful for finding the size of files, for instance if you just want to view the smallest file as a sample data set.

Use case: view the entire file in the terminal
Example Prompt: cat PackageInstall-2015-09-30.csv
Example Output:
"EVENT_TYPE","TIMESTAMP","REQUEST_ID","ORGANIZATION_ID","USER_ID","RUN_TIME","CPU_TIME","CLIENT_IP","URI","OPERATION_TYPE","IS_SUCCESSFUL","IS_PUSH","IS_MANAGED","IS_RELEASED","PACKAGE_NAME","FAILURE_TYPE","TIMESTAMP_DERIVED","USER_ID_DERIVED"
"PackageInstall","20150930175506.083","4-YHvN7eC7w3t6H5TippD-","00D000000000062","0053000000BqwWz","175","","","","INSTALL","0","0","0","0","SOS Quick Setup & Dashboard","OTHER","2015-09-30T17:55:06.083Z","0053000000BqwWzAAJ"
"PackageInstall","20150930175953.848","4-YIEBVUUMqV2bH5Tiluk-","00D000000000062","0053000000BqwWz","96","","","","INSTALL","0","0","0","0","SOS Quick Setup
Dashboard","OTHER","2015-09-30T17:59:53.848Z","0053000000BqwWzAAJ"
Notes: This is really helpful for small files but can be overwhelming when you have a larger one unless you first reduce the output using a command like grep.

Use case: only view the first 10 lines of a file
Example Prompt: head -10 ApexTrigger-2015-09-30.csv
Example Output:
"EVENT_TYPE","TIMESTAMP","REQUEST_ID","ORGANIZATION_ID","USER_ID","RUN_TIME","CPU_TIME","CLIENT_IP","URI","REQUEST_STATUS","DB_TOTAL_TIME","TRIGGER_ID","TRIGGER_NAME","ENTITY_NAME","TRIGGER_TYPE","TIMESTAMP_DERIVED","USER_ID_DERIVED"
"ApexTrigger","20151101000000.414","408evS044zRUwrH5Tipt5-","00D000000000062","00530000003ffnX","","","","","","","01q3000000008Xn","OrderTrigger","Order","BeforeUpdate","2015-11-01T00:00:00.414Z","00530000003ffnXAAQ"
"ApexTrigger","20151101000000.496","408evS044zRUwrH5Tipt5-","00D000000000062","00530000003ffnX","","","","","","","01q3000000008Xn","OrderTrigger","Order","AfterUpdate","2015-11-01T00:00:00.496Z","00530000003ffnXAAQ"
Notes: This is really helpful for very large files when you want to quickly validate the file headers against some sample data without having to load all of the data into the terminal or first reduce it using a command like grep.

Use case: only view the last 10 lines of a file
Example Prompt: tail -10 ApexTrigger-2015-09-30.csv
Example Output:
"ApexTrigger","20151101084720.661","4096gnLuAlCzC6H5Tilrk-","00D000000000062","00530000001fAyR","","","","","","","01q300000000014","eimContractTrigger","Contract","AfterUpdate","2015-11-01T08:47:20.661Z","00530000001fAyRAAU"
"ApexTrigger","20151101084720.661","4096gnLuAlCzC6H5Tilrk-","00D000000000062","00530000001fAyR","","","","","","","01q3000000007mq","RenewalsContractTrigger","Contract","AfterUpdate","2015-11-01T08:47:20.661Z","00530000001fAyRAAU"
Notes: This is really helpful for very large files when you want to find where the file ends without having to load all of the data into the terminal. Several times I've had questions about where a file ended and when the last events were processed for the day. Tail helps me find where the file, and the day, ends.

Use case: find out how many total transactions you had for a day, sorted from highest to lowest
Example Prompt: wc -cl /Users/atorman/Documents/logs/2015-09-30/*.csv | sort -n -r 
Example Output:
31129864 9577494169 total
8904594 2107733241 /Users/auser/Documents/logs/2015-09-30/ApexTrigger-2015-09-30.csv
5655267 1760159289 /Users/auser/Documents/logs/2015-09-30/URI-2015-09-30.csv
Notes: This is really helpful for capacity planning or for quick questions around total transactions by file type or date. I'm often asked how many log lines were generated or how big the files are that we're collecting. This is also really easy to output to a file so you can provide the information to another person. Just change the prompt to wc -cl /Users/atorman/Documents/logs/*/*.csv | sort -n -r >> lineCount.csv.

Use case: simple report on file size (in blocks) by file type (*.csv), sorted from largest to smallest
Example Prompt: du -a *.csv | sort -n -r 
Example Output:
1676752 UITracking-2015-11-01.csv
1094160 ApexExecution-2015-11-01.csv
554040 ApexSoap-2015-11-01.csv
443408 RestApi-2015-11-01.csv
276112 URI-2015-11-01.csv
Notes: Similar to wc -c, du helps with capacity planning and enables you to answer quick questions around the relative size of each file or event type.

Use case: simple report on human-readable size by file type (*.csv)
Example Prompt: du -hs *.csv | sort -n
Example Output:
1.4M ApexCallout-2015-11-01.csv
1.7M TimeBasedWorkflow-2015-11-01.csv
3.2M QueuedExecution-2015-11-01.csv
3.6M Logout-2015-11-01.csv
5.7M Report-2015-11-01.csv
Notes: Similar to wc -c, this helps with capacity planning and with answering quick questions around the total size of a file or event type.

Use case: merge multiple CSV files together into a new CSV
Example Prompt: cat *.csv > new.csv
Notes: This is really helpful when merging multiple files of the same type (e.g. API) that span multiple days into a single CSV prior to loading to an analytics platform.
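
One caveat: each daily file starts with its own header row, so a plain cat repeats the header throughout the merged output. Here's a small awk sketch that keeps only the first header, assuming every file shares the same header and using API-*.csv purely as an example pattern:

# FNR==1 is the header row of each input file; keep it only for the very first file (NR==1).
awk 'FNR > 1 || NR == 1' API-*.csv > API-merged.csv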

Use case: get all of the report export log lines for a specific user (e.g. '00530000000h51Z')
Example Prompt: grep -r '00530000000h51Z' 2015*/ReportExport*.csv
Example Output:
2015-10-13/ReportExport-2015-10-13.csv:"ReportExport","20151013205624.313","4-nPScfLnMsalbH5Tipnt-","00D000000000062","00530000000h51Z","","","102.14.229.01","/00O30000008ZXB4","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"
Notes: I <3 grep! Grep is basically a simple search, and as a result, is a good tool for finding a needle in the haystack of log lines. Grep is really helpful for providing a quick audit report for your auditors on everything a specific user did. It's also really powerful to combine this with other commands by using a pipe (i.e. '|'). For instance, by adding '| wc -l' to the end of the above command, you can get the total number of reports the user exported instead of the specific report log lines. This is similar to performing a count() in SOQL and filtering by a specific user Id. Now you're using the command line for reporting purposes!
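
For example, here's a quick sketch of that count, reusing the same user Id from above:

# Count how many reports the user exported instead of listing each log line.
grep -r '00530000000h51Z' 2015*/ReportExport*.csv | wc -l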

Use case: get all of the events where an account was accessed
Example Prompt: grep 'EVENT_TYPE\|/001' URI*.csv | head -10
Example Output:
"URI","20151101000017.742","408ewGnZuAVzcrH5Tipu4-","00D000000000062","00530000003jh6y","3597","695","102.14.229.01","/0013000001J4uxa","S","1801149015","8408","430","https-//na2-salesforce-com/00630000010Pxtw?srPos=0-srKp=006","2015-11-01T00:00:17.742Z","00530000003jh6yAAA"
Notes: This is really helpful for providing a quick audit report for your auditors on account access. This is also another good prompt to add '| wc -l' to in order to find out how many times accounts were accessed.

Use case: convert timestamp from a number to a string prior to importing to a reporting application
Example Prompt: awk -F ','  '{ if(NR==1) printf("%s\n",$0); else{ for(i=1;i<=NF;i++) { if(i>1&& i<=NF) printf("%s",","); if(i == 2) printf "\"%s-%s-%sT%s:%s:%sZ\"", substr($2,2,4),substr($2,6,2),substr($2,8,2),substr($2,10,2),substr($2,12,2),substr($2,14,2); else printf ("%s",$i);  if(i==NF) printf("\n")}}}' "${eventTypes[$i]}-raw/${eventTypes[$i]}-${logDates[$i]}.csv" > "${eventTypes[$i]}/${eventTypes[$i]}-${logDates[$i]}.csv"
Notes: This one takes a lot more work but is really helpful for transforming data before loading it into a system that has specific formatting requirements.
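
The ${eventTypes[$i]} and ${logDates[$i]} variables come from the surrounding shell script, so here's a simplified sketch of the same idea against a single file; the Login file name is just an example, and the substr offsets assume the quoted TIMESTAMP format shown in the outputs above:

# Rewrite column 2 ("20150930175506.083") as a quoted ISO-8601 string, leaving the header row alone.
awk -F',' 'NR==1 {print; next}
  { t=$2; gsub(/"/,"",t)
    $2="\"" substr(t,1,4) "-" substr(t,5,2) "-" substr(t,7,2) "T" substr(t,9,2) ":" substr(t,11,2) ":" substr(t,13,2) "Z\""
    print }' OFS=',' Login-2015-09-30.csv > Login-formatted-2015-09-30.csv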

There are many more great utilities available on the command line which, when coupled with pipes (|) and shell scripts, make it easy to automate many simple tasks or perform ad-hoc queries against the raw log files.
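
To give you an idea, here's a minimal sketch of a script that answers one of those ad-hoc questions across every daily directory; the logs path and user Id are placeholders for your own environment:

#!/bin/bash
# For each daily log directory, count how many URI log lines mention a given user
# and write a date,count line to a CSV for reporting.
USER_ID='00530000000h51Z'
for day in /Users/auser/Documents/logs/2015-*; do
  count=$(cat "$day"/URI-*.csv 2>/dev/null | grep -c "$USER_ID")
  echo "$(basename "$day"),$count"
done > uriCountsByDay.csv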

And those are a few of my favorite things!