23 December 2015

A Salesforce Wave App for Comparing Permissions

Comparing users' permissions within the Salesforce platform has been a long-standing request and a non-trivial task for administrators. I previously wrote a blog post, Comparing Profiles and Permission Sets, about this very topic, outlining some of the challenges of reporting on users' access.

Since then, it's come up on social media a couple of times, most recently in a Twitter post.



While apps like the Perm-Comparator from John Brock are still the go-to solution for comparing users, permission sets, and profiles, it occurred to me that you could also solve this big data problem with the Salesforce Wave platform.

So I set out to use the tooling available via the Wave platform to create a simple app that compares permission assignments and their containers. It is just a proof of concept, but it can be a good starting point for admins who already have access to the Wave platform and want to compare users' permissions.



The reason to create this kind of dashboard is to explore both the permission set assignments to users (which include profiles) and the underlying permission set and profile containers.

This is done by auto-generating datasets using the Wave dataset builder, a point-and-click tool that accesses these objects via the Salesforce API.



The Wave platform is an incredible tool for exploring complex datasets and cuts through the complexity of users, permission sets, and profiles like a hot knife through butter. If you have the Wave platform already and would like to try this out after the Spring '16 release, check out my Github repository, permissionsWaveApp.

14 December 2015

Derived Fields or Getting to Last Week's Top Ten by Name Instead of by Id

Last December, I blogged about how to work with timestamps in Event Log Files. The basic idea is that we generate timestamps in the form of a number that looks like this: 20151210160337.6. Only it's not really a number; it's a string. For instance, when you add a couple more characters, you get: 2015-12-10T16:03:37.600Z. Look more familiar? More importantly, as an ISO 8601 formatted date-time type, it's easier to integrate with third-party analytics platforms for trending and time-series charting.

The same can be said of other fields like the ubiquitous USER_ID field, which is stored as fifteen characters: 005B0000001WamP. However, in order to integrate with an analytics platform like Wave or Splunk, having the standard eighteen-character id like 005B0000001WamPIAS makes it much easier to match with the actual User sObject so that you can retrieve other values such as the user's profile or role. The same is true of any other object that you want to denormalize so that you can create a report on the top ten users or reports by name instead of by id.
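Before derived fields existed, you had to handle these transformations yourself. Here's a minimal Python sketch of both conversions (the function names are my own, but the 15-to-18 character expansion is the standard Salesforce checksum-suffix algorithm):

def to_iso_timestamp(ts):
    """Convert an Event Log File TIMESTAMP like '20151210160337.6'
    into an ISO 8601 string like '2015-12-10T16:03:37.600Z'."""
    date, _, millis = ts.partition('.')
    return '{}-{}-{}T{}:{}:{}.{}Z'.format(
        date[0:4], date[4:6], date[6:8],
        date[8:10], date[10:12], date[12:14],
        millis.ljust(3, '0'))

def to_18_char_id(id15):
    """Expand a case-sensitive 15-character Salesforce id into the
    case-insensitive 18-character form by appending a 3-character suffix."""
    alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ012345'
    suffix = ''
    for start in (0, 5, 10):
        chunk = id15[start:start + 5]
        # Build a 5-bit value: bit i is set when character i is uppercase.
        value = sum(1 << i for i, c in enumerate(chunk) if c.isupper())
        suffix += alphabet[value]
    return id15 + suffix

print(to_iso_timestamp('20151210160337.6'))   # 2015-12-10T16:03:37.600Z
print(to_18_char_id('005B0000001WamP'))       # 005B0000001WamPIAS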

In Winter '16, we introduced derived fields as a sleepy enhancement to Event Log Files. It is sleepy in that there is no ticker tape parade, no marketing campaigns, no clappies at Dreamforce. Instead, it's a simple and effective way of transforming data when the file is generated based on patterns and data already contained within the file. It's TEL (Transform, Extract, and Load) instead of ETL (Extract, Transform, and Load).

The original data doesn't go away. Instead, a new field is added with a '_DERIVED' suffix. That way, existing integrations won't break, and options exist for other kinds of non-standard transformations. For instance, the fifteen-character USER_ID now has an eighteen-character USER_ID_DERIVED counterpart, and TIMESTAMP has a TIMESTAMP_DERIVED equivalent. This opens new doors to future transformations.


This means applications like Wave Analytics can more easily augment or join data based on these transformed ids and timestamps.



As a result, reports look like this:



instead of like this:



Derived fields in Event Log Files handle the problem of data preparation by transforming the data as the file is generated, making downstream transformations easier. Hope this helps with your integrations!

09 November 2015

Using the Setup Audit Trail API - the best kept secret of Winter '16

The Setup Audit Trail has been around for a long time. It's part of the Salesforce trust platform and is built into every edition. It tracks changes made in Setup by administrators, such as adding new users, activating users, changing permissions (including escalations of privilege), and changing metadata like creating fields or deleting objects.

Until very recently, the Setup Audit Trail was only available as a manual download to a CSV (comma-separated values) file using a link on a page in Setup. I kept hearing how customers hated putting manual processes in place to click the link once a month and merge newly downloaded CSV files into previous exports. It made audit and compliance use cases much harder than they needed to be. Everyone was asking to integrate the Setup Audit Trail using the API so they could schedule regular, automated downloads of audit trail data to analytics and SIEM tools.

Starting with Winter '16, we added the Setup Audit Trail to the API as the SetupAuditTrail sObject. There are a couple of key use cases that you might want to try out using a tool like Workbench.

1. I want to know everyone who logged in as a particular end-user:
SELECT Action, CreatedById, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE CreatedBy.Name = 'Jim Rivera' 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

2. I want to know everyone an admin user logged in as:
SELECT Action, CreatedBy.Name, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE DelegateUser = 'at@xx.com' 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

3. I want to know everything that any users with a specific profile (or role) did in setup:
SELECT Action, CreatedBy.Profile.Name, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE CreatedBy.Profile.Name = 'EMEA VP' 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

4. I want to know every user who was 'frozen' in the last week:
SELECT Action, CreatedById, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE Action = 'frozeuser' AND CreatedDate = Last_n_Days:7 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

5. I want to know everything a specific user did last week:
SELECT Action, CreatedById, CreatedDate, DelegateUser, Display, Id, Section 
FROM SetupAuditTrail 
WHERE CreatedBy.Name = 'Adrian Kunzle' AND CreatedDate = Last_n_Days:7 
ORDER BY CreatedDate DESC NULLS FIRST LIMIT 10

I use LIMIT 10 just to test the queries and keep them from taking a long time to return, which is a good idea when experimenting with new queries.
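If you'd rather script these queries than run them in Workbench, the same SOQL works through the REST API. Here's a minimal Python sketch using the requests library; the instance URL and access token are placeholders you'd supply from your own OAuth flow, and the query is the 'frozen users' example above:

import requests

INSTANCE_URL = 'https://yourInstance.salesforce.com'   # placeholder
ACCESS_TOKEN = '<access token from your OAuth flow>'   # placeholder

soql = ("SELECT Action, CreatedBy.Name, CreatedDate, Display, Section "
        "FROM SetupAuditTrail "
        "WHERE Action = 'frozeuser' AND CreatedDate = LAST_N_DAYS:7 "
        "ORDER BY CreatedDate DESC LIMIT 10")

response = requests.get(
    INSTANCE_URL + '/services/data/v35.0/query/',
    headers={'Authorization': 'Bearer ' + ACCESS_TOKEN},
    params={'q': soql})
response.raise_for_status()

for record in response.json()['records']:
    print(record['CreatedDate'], record['Action'], record['Display'])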

Once you know what kind of queries you can write, you can create incredible apps that combine the SetupAuditTrail API with an app-building platform like Node.js on Heroku:
The application above was created by the incredible Alessandro. You can download the source code from his GitHub repository or try it out with this free Heroku app.

You can also explore Setup Audit Trail data using an analytics platform like Wave:


The Setup Audit Trail is a powerful way of tracking important administrative audit events and now it's even more accessible through the API.

02 November 2015

These are a few of my favorite things


Event Monitoring provides easily downloadable CSV (comma-separated value) log files. The value of this data is really in the insights that you gain from rolling up and aggregating the data based on questions like:
  1. how many accounts were accessed in the last week?
  2. how many reports were downloaded by a specific user in the last 6 months?
  3. how many total executions did I have in the past year?
  4. how big are the files that I need to store on an ongoing basis for regulatory reasons?
Normally you would spend time integrating, enriching, and analyzing log data using an analytics platform. We have a great app ecosystem of ISVs (independent software vendors) that provide pre-built insights into this data.

But sometimes you just want a quick and easy way to answer questions about this data without first creating an integration or enriching the data. In particular, you may want to analyze data that spans extremely long periods of time and significant volumes that you haven't already integrated into your analytics platform of choice. This actually came up in conversation with a customer administrator who told me that he was downloading Event Log Files and using a command line tool called grep to ask simple questions about the data; in effect, he was trying to find a needle in the log file haystack without using an analytics platform.

Event Log Files generate CSV log file content. As a result, it's easy to report on it using pipes in a Unix-based CLI (Command Line Interface or Terminal). This is definitely not for the button-click crowd, but it's incredibly fast and efficient for processing large amounts of data quickly. And since it's a good idea to keep a source system of files for a long period of time, you can always write quick piped commands to answer ad-hoc questions that come up where you don't want to first stage or munge the data into an analytics platform.

If you haven't ever worked with a CLI, I recommend first reading this great blog post from Lifehacker or this DataVu blog post. Also, if you're a Windows-only user, I wouldn't really continue with this post unless your files are on a Unix-based system like Mac or Linux that you can SSH into.

Here are a few of my favorite things when it comes to CLI commands:

Use case: log in to the file server where you're keeping the source log files
Example Prompt: ssh username@boxname
Notes: you'll need a password which means if you don't own that server, you'll need to request access

Use case: navigate to the directory where the log files are for a specific date (assuming you're storing them by date rather than by type or something else)
Example Prompt: cd Documents/logs/2015-09-30
Notes: if you don't manage this server, you'll need to find out where the files are first

Use case: list the files based on last modified date
Example Prompt: ls -tr
Example Output:
ApexCallout-2015-09-30.csv
ApexExecution-2015-09-30.csv
ApexSoap-2015-09-30.csv
Notes: I love this command. It's super simple and assumes you're dealing with the latest change first.

Use case: list the files based on size sorted from largest to smallest
Example Prompt: ls -less | sort -n -r
Example Output:
4116672 -rw-r--r--  1 auser  SFDC\Domain Users  2107733241 Oct  1 21:43 ApexTrigger-2015-09-30.csv
3512792 -rw-r--r--  1 auser  SFDC\Domain Users  1798546132 Oct  1 21:53 UITracking-2015-09-30.csv
3437816 -rw-r--r--  1 auser  SFDC\Domain Users  1760159289 Oct  1 21:58 URI-2015-09-30.csv
Notes: This is really helpful for finding the size of files, for instance if you just want to view the smallest file as a sample data set.

Use case: view the entire file in the terminal
Example Prompt: cat PackageInstall-2015-09-30.csv
Example Output:
"EVENT_TYPE","TIMESTAMP","REQUEST_ID","ORGANIZATION_ID","USER_ID","RUN_TIME","CPU_TIME","CLIENT_IP","URI","OPERATION_TYPE","IS_SUCCESSFUL","IS_PUSH","IS_MANAGED","IS_RELEASED","PACKAGE_NAME","FAILURE_TYPE","TIMESTAMP_DERIVED","USER_ID_DERIVED"
"PackageInstall","20150930175506.083","4-YHvN7eC7w3t6H5TippD-","00D000000000062","0053000000BqwWz","175","","","","INSTALL","0","0","0","0","SOS Quick Setup & Dashboard","OTHER","2015-09-30T17:55:06.083Z","0053000000BqwWzAAJ"
"PackageInstall","20150930175953.848","4-YIEBVUUMqV2bH5Tiluk-","00D000000000062","0053000000BqwWz","96","","","","INSTALL","0","0","0","0","SOS Quick Setup
Dashboard","OTHER","2015-09-30T17:59:53.848Z","0053000000BqwWzAAJ"
Notes: This is really helpful for small files but can be overwhelming when you have a larger one unless you first reduce the output using a command like grep.

Use case: only view the first 10 lines of a file
Example Prompt: head -10 ApexTrigger-2015-09-30.csv
Example Output:
"EVENT_TYPE","TIMESTAMP","REQUEST_ID","ORGANIZATION_ID","USER_ID","RUN_TIME","CPU_TIME","CLIENT_IP","URI","REQUEST_STATUS","DB_TOTAL_TIME","TRIGGER_ID","TRIGGER_NAME","ENTITY_NAME","TRIGGER_TYPE","TIMESTAMP_DERIVED","USER_ID_DERIVED"
"ApexTrigger","20151101000000.414","408evS044zRUwrH5Tipt5-","00D000000000062","00530000003ffnX","","","","","","","01q3000000008Xn","OrderTrigger","Order","BeforeUpdate","2015-11-01T00:00:00.414Z","00530000003ffnXAAQ"
"ApexTrigger","20151101000000.496","408evS044zRUwrH5Tipt5-","00D000000000062","00530000003ffnX","","","","","","","01q3000000008Xn","OrderTrigger","Order","AfterUpdate","2015-11-01T00:00:00.496Z","00530000003ffnXAAQ"
Notes: This is really helpful for really large files when you want to quickly validate the file headers with some sample data without having to load all of the data to the terminal or first reduce the data using a command like grep.

Use case: only view the last 10 lines of a file
Example Prompt: tail -10 ApexTrigger-2015-09-30.csv
Example Output:
"ApexTrigger","20151101084720.661","4096gnLuAlCzC6H5Tilrk-","00D000000000062","00530000001fAyR","","","","","","","01q300000000014","eimContractTrigger","Contract","AfterUpdate","2015-11-01T08:47:20.661Z","00530000001fAyRAAU"
"ApexTrigger","20151101084720.661","4096gnLuAlCzC6H5Tilrk-","00D000000000062","00530000001fAyR","","","","","","","01q3000000007mq","RenewalsContractTrigger","Contract","AfterUpdate","2015-11-01T08:47:20.661Z","00530000001fAyRAAU"
Notes: This is really helpful for really large files to find when and where the file ends without having to load all of the data to the terminal. Several times, I had questions about where the file ended and when the last events were processed for the day. Tail helps me find out where the file and the day end.

Use case: find out how many total transactions you had for a day, sorted from highest to lowest
Example Prompt: wc -cl /Users/atorman/Documents/logs/2015-09-30/*.csv | sort -n -r 
Example Output:
31129864 9577494169 total
8904594 2107733241 /Users/auser/Documents/logs/2015-09-30/ApexTrigger-2015-09-30.csv
5655267 1760159289 /Users/auser/Documents/logs/2015-09-30/URI-2015-09-30.csv
Notes: This is really helpful for capacity planning or for quick questions around total transactions by file type or date. I'm often asked how many log lines were generated or how big the files are that we're collecting. This is also really easy to output to a CSV so you can provide the information to another person. Just change the prompt to wc -cl /Users/atorman/Documents/logs/*/*.csv | sort -n -r >> lineCount.csv.

Use case: simple report on disk usage (in blocks) by file type (*.csv)
Example Prompt: du -a *.csv | sort -n -r 
Example Output:
1676752 UITracking-2015-11-01.csv
1094160 ApexExecution-2015-11-01.csv
554040 ApexSoap-2015-11-01.csv
443408 RestApi-2015-11-01.csv
276112 URI-2015-11-01.csv
Notes: Similar to wc -c, du helps with capacity planning and enables you to answer quick questions about total volume for a file or event type.

Use case: simple report on size by file type (*.csv)
Example Prompt: du -hs *.csv | sort -n
Example Output:
1.4M ApexCallout-2015-11-01.csv
1.7M TimeBasedWorkflow-2015-11-01.csv
3.2M QueuedExecution-2015-11-01.csv
3.6M Logout-2015-11-01.csv
5.7M Report-2015-11-01.csv
Notes: Similar to wc -c, helps with capacity planning and to answer quick questions around total transactions for a file or event type.

Use case: merge multiple CSV files together into a new CSV
Example Prompt: cat *.csv > new.csv
Notes: This is really helpful when merging multiple files of the same type (e.g. API) that span multiple days into a single CSV prior to loading to an analytics platform.

Use case: get all of the report export log lines for a specific user (e.g. '00530000000h51Z')
Example Prompt: grep -r '00530000000h51Z' 2015*/ReportExport*.csv
Example Output:
2015-10-13/ReportExport-2015-10-13.csv:"ReportExport","20151013205624.313","4-nPScfLnMsalbH5Tipnt-","00D000000000062","00530000000h51Z","","","102.14.229.01","/00O30000008ZXB4","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"
Notes: I <3 grep! Grep is basically a simple search, and as a result, is a good tool for finding a needle in the haystack of log lines. Grep is really helpful for providing a quick audit report for your auditors on everything a specific user did. It's also really powerful to combine this with other commands by using a pipe (i.e. '|'). For instance, by adding '| wc -l' to the end of the above command, you can get the total number of reports the user exported instead of the specific report log lines. This is similar to performing a count() in SOQL and filtering by a specific user Id. Now you're using the command line for reporting purposes!
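For example, the count-only version of the report export command above looks like this:
Example Prompt: grep -r '00530000000h51Z' 2015*/ReportExport*.csv | wc -l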

Use case: get all of the events where an account was accessed
Example Prompt: grep 'EVENT_TYPE\|/001' URI*.csv | head -10
Example Output:
"URI","20151101000017.742","408ewGnZuAVzcrH5Tipu4-","00D000000000062","00530000003jh6y","3597","695","102.14.229.01","/0013000001J4uxa","S","1801149015","8408","430","https-//na2-salesforce-com/00630000010Pxtw?srPos=0-srKp=006","2015-11-01T00:00:17.742Z","00530000003jh6yAAA"
Notes: This is really helpful for providing a quick audit report for your auditors on account access. This is also another good prompt to add '| wc -l' to in order to find out how many times a specific account was accessed.

Use case: convert timestamp from a number to a string prior to importing to a reporting application
Example Prompt: awk -F ','  '{ if(NR==1) printf("%s\n",$0); else{ for(i=1;i<=NF;i++) { if(i>1&& i<=NF) printf("%s",","); if(i == 2) printf "\"%s-%s-%sT%s:%s:%sZ\"", substr($2,2,4),substr($2,6,2),substr($2,8,2),substr($2,10,2),substr($2,12,2),substr($2,14,2); else printf ("%s",$i);  if(i==NF) printf("\n")}}}' "${eventTypes[$i]}-raw/${eventTypes[$i]}-${logDates[$i]}.csv" > "${eventTypes[$i]}/${eventTypes[$i]}-${logDates[$i]}.csv"
Notes: this one takes a lot more work but is really helpful for transforming data before loading it into a system that has specific formatting requirements.
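If the awk one-liner above is hard to follow, or your files contain commas inside quoted values, the same transformation is easier to express in Python, where the csv module handles quoting for you. A sketch (the file paths are placeholders; it assumes TIMESTAMP is the second column, as in the files above, and drops the milliseconds just like the awk version):

import csv

def convert_timestamps(in_path, out_path, timestamp_col=1):
    # Rewrite a raw Event Log File CSV, converting the TIMESTAMP column
    # (e.g. 20150930175506.083) to ISO 8601 (e.g. 2015-09-30T17:55:06Z).
    with open(in_path, newline='') as src, open(out_path, 'w', newline='') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst, quoting=csv.QUOTE_ALL)
        writer.writerow(next(reader))        # copy the header row unchanged
        for row in reader:
            t = row[timestamp_col]
            row[timestamp_col] = '{}-{}-{}T{}:{}:{}Z'.format(
                t[0:4], t[4:6], t[6:8], t[8:10], t[10:12], t[12:14])
            writer.writerow(row)

convert_timestamps('PackageInstall-raw/PackageInstall-2015-09-30.csv',
                   'PackageInstall/PackageInstall-2015-09-30.csv')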

There are many more great utilities available on the command line which, when coupled with pipes (|) and shell scripts, make for an easy way to automate many simple tasks or perform ad-hoc queries against the raw log files.

And those are a few of my favorite things!

29 October 2015

LogDate vs CreatedDate - when to use one vs the other in an integration

Why are there two date-time fields for each Event Log File: LogDate and CreatedDate? Shouldn't one be good enough?

It seems like a straightforward question, but it comes up frequently and the answer can affect how you integrate with Event Log Files.

Let's start with the definition of each:
  • LogDate tracks usage activity of a log file for a 24-hour period, from 12:00 a.m. to 11:59 p.m.
  • CreatedDate tracks when the log file was generated.
Why is this important? Why have these two different timestamps? Because having both ensures the reliability and eventual consistency of log delivery within a massively distributed system.

Each customer is co-located on a logical collection of servers we call a 'pod'. You can read more about pods and the multi-tenant architecture on the developer force blog.

There can be anywhere from ten to one hundred thousand customer organizations on a single pod. Each pod has numerous app servers which, at any given time, handle requests from any of those customer organizations. In such a large, distributed system, it's possible, though infrequent, for an app server to go down.

As a result, while a customer's transactions are being captured, if an app server does go down, the transactions can be routed to another server seamlessly without affecting the end-user's experience or the integrity of the data. For a variety of reasons, app servers can go up or down throughout the day, but what's important is that this activity doesn't affect the end-user's experience of the app or the integrity of the customer's data.

Each app server captures its own log files throughout the day, regardless of which customer's transactions are being handled by the server. Each log file therefore represents log entries for all customers who had transactions on that app server throughout the day. At the end of the day (~2am local server time), Salesforce ships the log files from active app servers to HDFS, where Hadoop jobs run. The Hadoop job generates the Event Log File content (~3am local server time) for each customer based on the app logs that were shipped earlier. This job is what generates the Event Log File content that is accessible to the customer via the API.

It's possible that some log files will have to be shipped at a later date, for example from an app server that was offline during part of the day and comes back online after the log files are shipped for that day. Log files may therefore be considered eventually consistent: the log shipper or Hadoop job may pick up a past file in a future job run, so it's possible that Salesforce will catch up at a later point. We have built look-back functionality to address this scenario. Every night when we run the Hadoop job, we check to see if new files exist for previous days and then re-generate new Event Log File content, overwriting the existing log files that were previously generated.

This is why we have both CreatedDate and LogDate fields: LogDate reflects the actual 24-hour period when the user activity occurred, and CreatedDate reflects when the actual Event Log File was generated. So it's possible, due to look-back functionality, that we will re-generate a previous LogDate's file and, in the process, write more lines than we did on the previous day, with the newly available app server files co-mingled with the app server log files that were originally received.



This eventual consistency of log files may impact your integration with Event Log Files.

The easy way to integrate with Event Log Files is to use LogDate and write a daily query that simply asks for the last 'n' days of log files based on the LogDate:

Select LogFile from EventLogFile where LogDate=Last_n_Days:7

However, if you query on LogDate, it is possible to miss data that you might get from downloading it later. For instance, if you downloaded yesterday's log files and then re-download them tomorrow, you may actually have more log lines in the newer download. This is because some app log files may have caught up, overwriting the original log content with more log lines.

To ensure a more accurate query that also captures look back updates of the previous day's log files, you should use CreatedDate:

Select LogFile from EventLogFile where CreatedDate=Last_n_Days:7

This is a more complicated integration because you have to keep track of the CreatedDate for each LogDate and EventType that was previously downloaded, in case a newer CreatedDate appears for a file you already have. You may also need to handle event row de-duplication where you've already loaded log lines from a previous download into an analytics tool like Splunk, only to find additional log lines added in a subsequent download.
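To make that more concrete, here is a minimal Python sketch of a CreatedDate-based integration using the requests library. The instance URL, token, and state file name are placeholders; the idea is to remember the CreatedDate last seen for each EventType and LogDate pair and re-download a file only when a newer CreatedDate shows up:

import json, os, requests

INSTANCE_URL = 'https://yourInstance.salesforce.com'   # placeholder
ACCESS_TOKEN = '<access token>'                        # placeholder
HEADERS = {'Authorization': 'Bearer ' + ACCESS_TOKEN}
STATE_FILE = 'elf_state.json'   # CreatedDate last seen per (EventType, LogDate)

state = json.load(open(STATE_FILE)) if os.path.exists(STATE_FILE) else {}

soql = ('SELECT Id, EventType, LogDate, CreatedDate, LogFile '
        'FROM EventLogFile WHERE CreatedDate = LAST_N_DAYS:7')
result = requests.get(INSTANCE_URL + '/services/data/v35.0/query/',
                      headers=HEADERS, params={'q': soql}).json()

for rec in result['records']:
    key = rec['EventType'] + ':' + rec['LogDate']
    if state.get(key) == rec['CreatedDate']:
        continue                              # already have the latest version
    content = requests.get(INSTANCE_URL + rec['LogFile'], headers=HEADERS)
    with open('{}-{}.csv'.format(rec['EventType'], rec['LogDate'][:10]), 'wb') as f:
        f.write(content.content)              # overwrite any earlier download
    state[key] = rec['CreatedDate']

with open(STATE_FILE, 'w') as f:
    json.dump(state, f)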

There's one option that simplifies this a little bit. You can overwrite past data every time you run your integration. This is what we do with some analytics apps that work with Event Log Files; the job automatically overwrites the last seven days worth of log data with each job rather than appending new data and de-duplicating older downloads.

This may seem counterintuitive, but believe it or not, having look-back is a really good thing: it increases the reliability and eventual consistency of working with logs and ensures you get all of the data you expect to be there.

19 October 2015

ELF on ELK on Docker


The ELF on ELK on Docker repository is available!

You can download it from Github: https://github.com/developerforce/elf_elk_docker.

What in the world is ELK? How does an ELF fit on top of an ELK? Who is this Docker I keep hearing about? Why do I feel like I've fallen down the on-premise rabbit hole of acronym-based logging solutions??!!

Okay, let's back up a second. We're trying to solve the problem of creating insights on top of Event Log File (ELF) data.

ELF stands for Event Log Files, Salesforce's solution for providing an easy-to-download set of organization-specific log files, covering everything from page views to report downloads. You can't really swing a cat by its tail (not that I would try) without hitting a blog post on SalesforceHacker.com about Event Log Files. Event Monitoring is the packaging of Event Log Files.

Since we launched Event Log Files last November, I've talked with a lot of customers about how to derive insights and visualizations on top of the log data. One of the solutions I keep hearing about is the ELK stack.

ELK stands for Elasticsearch, Logstash, and Kibana. The ELK stack is an open-source, scalable log management stack that supports exploration, analysis, and visualization of log data.

It consists of three key solutions:
  1. Elasticsearch: A Lucene-based search server for storing log data.
  2. Logstash: ETL process for retrieving, transforming, and pushing logs into data warehouses.
  3. Kibana: Web GUI for exploring, analyzing, and visualizing log data in Elasticsearch.
ELK requires multiple installations and configurations on top of commoditized hardware or IaaS like AWS. To simplify the installation and deployment process, we use Docker.

Docker is an emerging open source solution for software containers. From the Docker website:
"Docker is an open platform for building, shipping and running distributed applications. It gives programmers, development teams and operations engineers the common toolbox they need to take advantage of the distributed and networked nature of modern applications."
With Docker, all the user needs to do to start working with ELF on ELK is:
  1. download the ELF on ELK from Github
  2. change the sfdc_elf.config file (add authorization credentials)
  3. run Docker from the terminal
The purpose of the plug-in is to reduce the time it takes to integrate Event Log Files into ELK, not to provide out-of-the-box dashboards like this one that I quickly created:

As a result, once you start importing Event Log Files into ELK through this ETL plug-in, you'll still need to create the visualizations on top of the data. The advantage of Kibana is that it makes that part point-and-click easy.

Depending on how you configure Docker and ELK, you might want to expose your new dashboards to the corporate network. I found the following terminal command helps to enable access across the VPN:
VBoxManage controlvm "default" natpf1 "tcp-port8081,tcp,,8081,,8081";
ELF on ELK on Docker provides an on-premise, scalable solution for visualizing Event Monitoring data.

The ELF on ELK on Docker plug-in was created by the dynamic duo of Abhishek Sreenivasa and Mohammaed Islam.

Let us know what you think!

12 October 2015

IdeaExchange 50K Club

Platform Monitoring team at Salesforce
At Dreamforce 2015, the Platform Monitoring team received an award for retiring fifty thousand IdeaExchange points. I'm really proud of our team and our accomplishments!

This was a big deal for us for a number of reasons. The team was formed two years ago to solve the problem of providing customers access to log and event data. This data helps customers gain insights into the operational health of their organization, including how to support their end-users, audit user activity, and optimize programmatic performance. In addition, we put together a temporary team of architects to take on some additional stories that weren't part of the team's charter but that we felt nonetheless should be completed.

And in that time we closed some of the following ideas:
We take the ideas on the IdeaExchange seriously on our team because we know it's important to our customers and our users. 

I wish we could get to all of the ideas that are out there. But prioritizing is a bit like being Charlie in Willy Wonka and the Chocolate Factory. Early in the movie, when Charlie walks into the candy store, he has to choose from everything but only has the one dollar that he found. I always wonder how he knew to pick a Wonka bar from everything he could choose from to spend his single dollar.

It's not an easy task to prioritize the stories we complete. Many of the great ideas on the IdeaExchange seem like they should be really easy to solve. But in truth, if they were easy to solve, we probably would have solved them a long time ago. Unfortunately, we have to choose and it's never an easy choice. 

I hope you keep posting ideas, commenting on them, and voting on them. Even though sometimes it may not seem like we're listening, we are and we're really excited to complete as many of these ideas as we can!

09 October 2015

Cause and Consequence: Need of real time security actions

Hi there! I'm Jari Salomaa (@salomaa) and I recently joined Salesforce to work on security product management, covering various exciting new features, frameworks, and capabilities in the always interesting world of security, privacy, and compliance. I have a history with static, dynamic, and behavioral analysis that we can talk about some other time.

One of these cutting-edge projects, on which I'm collaborating with the founder of this fantastic blog, Adam Torman, is Transaction Security.

Introduction

Transaction Security is a real-time security event framework built inside Salesforce Shield, a new, very focused security offering from Salesforce for our most sophisticated customers with specific security needs. Having security built into the Salesforce platform gives customers best-of-breed performance, rich intelligence, and a flexible user experience that is ready to integrate with customers' existing security investments, visualizations, dashboards, and so on.

Salesforce Shield offers various security components, and Event Monitoring offers the most value in forensic investigations: digging deep into who, where, what, and how.

Having access to Salesforce Shield's Event Monitoring logs gives Salesforce administrators the ability to integrate the different event flows, such as "login", "resource access", "entity", and "data exfiltration" event information, with their data visualization dashboards like Splunk, New Relic, Salesforce Wave, and so on.

Once administrators and organizations have settled on their prioritized security use cases from their Event Monitoring logs, they can use the Transaction Security framework to build real-time security policies. For example, Transaction Security can apply concurrent-session login policy logic so that only two administrator sessions may be open at any given time, or so that users with the Standard User profile are limited to five active concurrent sessions. If end users have more open sessions than allowed, they are automatically forced to close them before continuing. In real time. As it happens.

Building Real Time Security Policies

Discussions with many security teams around the world keep highlighting the same questions: who is accessing my data, who is exfiltrating or downloading it, and what can I do about it? Since Salesforce touches many aspects of the business lifecycle, what is important and confidential may differ from one company to another. This is why we have chosen to introduce Transaction Security in the form of an easy-to-use interface where you define the event type.

We currently support four (4) different event types:
  1. Login - for user sessions
  2. Entity - for authentication providers, sessions, client browsers and IP
  3. DataExport - for Account, Contact, Lead and Opportunity objects
  4. AccessResource - for connected apps, reports and dashboards
Each of the corresponding real-time events has a set of defined actions.

Administrators can choose anything from receiving email and in-app notifications to real-time actions: blocking the operation, enforcing two-factor authentication (2FA), or ending the active session. You can also choose to take no action and just receive real-time alerts. Isn't that neat?

Each policy type automatically generates Apex code that is highly customizable, so you can define the specific condition or additional criteria for the action.

As a security administrator in Salesforce, you can edit the Apex to define a more specific condition for the action. For example, you can have the action fire only when specific platform conditions occur.

For example, you may want to restrict access to approved corporate platforms. If you have a corporate phone program based on iOS or Android, or specific operating systems and browsers in use like Windows, OS X, Safari, or Chrome, you can block access requests coming from environments unapproved by IT, or at least require higher assurance with two-factor authentication to validate that they are not coming from unwanted and untrusted sources. This can be a really useful way to protect sensitive reports and dashboards, mass data exports with Data Loader, or simply user and administrator logins.

Next Steps

So what can customers do to enable real time security policies for their Salesforce applications?

Salesforce Shield and Event Monitoring are prerequisites for enabling Transaction Security in your production orgs. Please have a conversation with your Salesforce Account Executive about Salesforce Shield. We have also enabled Transaction Security policies in Developer Edition orgs so you can try before you buy.

Once it's enabled, point your mouse to Setup -> Transaction Security and select Enable Transaction Security Policies. Have a look at the security release notes and product help documentation for additional Apex class examples.

You can also follow me and send questions on Twitter using the handle @salomaa, or post your questions or comments below. Looking forward to hearing what you think!

05 October 2015

Event Monitoring Trailhead Module



Last March, I wrote a blog post on how to get started with Event Monitoring. It was a huge help to many of my customers because it got them up and running quickly with Event Monitoring.

Recently, our fantastic documentation writer, Justine Heritage, created an incredible Trailhead module that surpasses my blog post and provides a great experience for learning about Event Monitoring. It's divided into three units: Get Started with Event Monitoring, Query Event Log Files, and Download and Visualize Event Log Files. It even has a fictitious character named Rob Burgle who burgled intellectual property before leaving his company.

But what exactly is Trailhead?

Trailhead is an e-learning system created by the Salesforce developer marketing team, with content generated by the documentation writers in research and development (R&D). Trailhead's mission is to help people learn more about the Salesforce platform and applications in a fun and rewarding self-paced learning environment. Each learning module has units of content. Each unit has some kind of challenge, such as a quiz or an interactive exercise. Upon successful completion of the challenge, the learner is awarded a badge.

Why does the team's makeup matter?

Because the developer marketing team has long had the mission of evangelizing the Salesforce platform. As a result, they have the background and experience to help people understand how to build apps on the platform. And the documentation team has direct access to the latest and greatest features and changes because they're integrated into the R&D scrum process. As a result, they are in the best position to generate and update content about products.

E-learning has been around for a long time at Salesforce. I used to help write e-learning modules when I was on the training team. Salesforce University has had various levels of e-learning for years. Nonetheless, Trailhead is something different. If you came to Dreamforce 2015, you would have seen the buzz in the developer and administrator zones around Trailhead. You might have even seen a bear walking around evangelizing the learning platform.

What's the big deal if e-learning has been around for a long time? Why is this time different?

For a couple of reasons:
  1. The generation of content is a somewhat disruptive model organizationally. It's not instructional designers from training creating content, it's doc writers - the same people who write the online help users use every day. 
  2. The delivery of the content is done outside of the traditional Learning Management Systems that training typically uses which provides some flexibility to custom design a system to deliver content. And the content is fast - few modules have units longer than ~15 minutes. 
  3. Each module has a challenge. Whether that challenge is a quiz or a hands-on activity in a developer edition org, it forces a level of interactivity that is typically higher than what traditional e-learning systems can provide. 
  4. Learners earn badges that they can promote on social media. Gamification isn't new, nor is it new in e-learning. But it's the icing on the cake and a motivator for people to finish modules rather than abandon them mid-way. 
  5. It's free to sign up and use.
Since we unveiled the Event Monitoring Trailhead module, we've had @mentions in over a dozen blogs and many Twitter posts on this one module alone. Below is a short list of just some of them:
Trailhead is a phenomenon. It's different from traditional e-learning. And it's an incredible way to learn about Event Monitoring. Give it a try. Who knows, you might even earn your first badge!

To start your new learning adventure with Event Monitoring and Trailhead, click here.

03 August 2015

Who saw the CEO's pension record?

That's how the use case starts. It's like a game of Clue; it's very open-ended and leaves a lot of questions that need to be answered, like:
  • why is it important to track this?
  • who really needs this information? Legal, risk, admin, an auditor, etc...
  • do you need to know every / any place where the data was 'seen' like a report view, list view, search result, API query, lookup, or any of the other ways you can access data on the platform or just when they drill into the record detail (e.g. is it sufficient to just track who clicked into the record)?
  • what data did they see when they saw the record; in particular did they see PII (personally identifiable information) or sensitive (salary, amount invested, diagnosis, etc...) data?
  • what did they do with the data after they saw it?
  • most importantly, once they have this information, what will they do with it? Take an employee to court, terminate, put them up on a wall of shame, document for a regulator?
These are difficult questions. Regardless of the answers, there are a lot of things Event Monitoring can do to help with this use case.

Since the primary case was answering who saw the CEO's pension record, lets walk through that example with Event Monitoring.

It starts with a record id. Did you ever notice that when you view a record, the browser address bar shows the record id?


A record id is an immutable (meaning it can't be edited), fifteen-character, unique record locator. Because it never changes, you can always find the record it belongs to. All you have to do is drop it in the address bar or use the API to query based on the id. That way, even if you rename the record, we'll always be able to track down the original easily.

As a result, every time a user clicks into a record, we capture that id in the address bar. It's possible we capture more information as well, for instance, whether the record was transferred, edited, cloned, or printed. The key here is that there has to be a server interaction that changes the address in the browser address bar. So if you click a link that only changes something in the browser, we won't capture it. But if the screen refreshes with a new link in the address bar, chances are we've captured it in the URI log file.

In the case of the CEO's pension record, we clicked into the record:

As a result, we can track this in the URI log file:

The log file won't tell us the name of the record or any details like the CEO's pension amount; however, we can always query the API to find out more about the record that was accessed:
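For example, you can take the fifteen-character id captured in the URI log line and query it directly. A sketch (the object name here is hypothetical, since a pension record would typically live on a custom object in your org):

SELECT Id, Name, Owner.Name, LastModifiedBy.Name FROM Pension__c WHERE Id = '<record id from the URI log>'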

Now it's possible that we saw the CEO's pension record in a report:

We'll track that the report was accessed in both the Report and URI log files, but we won't list the records that were viewed, like the CEO's pension record. However, if the user clicked into the record from the report, we'll capture the click-through.

In addition, if the user exported the results from the report, we won't know that the user exported the CEO's pension record; however, we will be able to identify the criteria used in the report and the fields included in the export in the Report Export log file:

Similarly, a user might search for the CEO's pension record:

We won't store the records viewed in the search results nor the field values that were viewed; however, if you click into the record from a search view, we do capture both the record id and the search position from the address bar:

And if the user accesses the pension record from an integrated application using the API, we'll know if they queried (SOQL) or searched (SOSL) the pension object, but not the specific results from the query unless they updated it through the API:

Event Monitoring captures a lot of data for many different use cases. Understanding specifically what it captures, and what it doesn't, helps ensure we meet the right use case each time.

28 July 2015

Who stole the cookie from the cookie jar?

Sample Visualforce Page using Google Charting API
Have you ever needed to track what users view, not just what they change? Have you ever had security, risk, or legal ask for a report on user activity for audit or regulatory reasons? Have you ever needed to track users' actions down to the device level so that activity on the phone, tablet, and web desktop are tracked separately?

Starting with the Summer '15 release, we're introducing key data leakage detection information through a pilot program. The goal of the pilot is to enable customers to query specific data leakage use cases in near real-time for analysis purposes.

The initial Data Leakage Detection pilot tracks SOQL queries in near real-time from the SOAP, REST, and Bulk APIs. Because more than half of all data access on the platform happens via these APIs, organizations can gain greater insights into:

  • Who saw what data
  • When they saw that data
  • Where they accessed the data
  • What fields they accessed
  • How long a query took
  • How many records they accessed

When combined with the Login Forensics pilot, you can also track every query back to a unique login to identify anomalies in user behavior.

Each event consists of key information about the API transaction including:

  • AdditionalInfo
  • ApiType
  • ApiVersion
  • Client
  • ElapsedTime
  • EventTime
  • Id
  • LoginHistoryId
  • ObjectType
  • Operation
  • RowsProcessed
  • Soql
  • SourceIp
  • UserAgent
  • UserId
  • Username

This means you can find out who (e.g. Username), saw what (e.g. Soql) including sensitive PII (Personally Identifiable Information) fields, how much (e.g. RowsProcessed), how long (e.g. ElapsedTime), when (e.g. EventTime), how (e.g. UserAgent), and from where (e.g. SourceIp). Plus, you can correlate all of this information back to the original Login (e.g. LoginHistoryId) to profile user behavior and disambiguate between legitimate and questionable activity beyond the login.

Once the pilot is enabled in your organization, you can visualize a set of events using the sample Visualforce page with Google Charting API from my Github repository.

To learn more about the Data Leakage Detection pilot functionality, read the pilot tip sheet. To participate in the pilot, please contact your account executive or customer support rep.

13 July 2015

Track your Apex limits in production using Apex Limit Events

Salesforce Apex is the backbone of the programmatic platform. With it, you can push the customization of any org beyond the button-click realm.

Example of an app you can build with the Apex Limit Events Pilot
Apex runs in a multitenant environment. The Apex runtime engine strictly enforces limits to ensure that runaway Apex doesn’t monopolize shared resources. If some Apex code ever exceeds a limit, the associated governor issues a run-time exception that cannot be handled. Apex limits are defined in the Salesforce documentation.

I wrote about tracking Apex limits in a blog post last May (That which doesn't limit us makes us stronger). As a Force.com developer, you have the ability to instrument your Apex code with System.Limits methods that allow you to compare how much you're using against the ceiling of what's allowed by any limit.

However, the more instrumentation you add, the more opportunity you have for code and performance issues. It's the Heisenberg concept: that which you try to measure affects the measurement. And even when you do try to track limits in this fashion, the results are stored in developer-oriented tools like the Debug Logs or the Force.com Console logs, which are really only meant to be used in sandbox. But what about DevOps? How can they monitor the health of their developers' code in production?

In response, we are introducing the Apex Limit Events pilot program in Summer '15. The goal of this pilot is to enable operations to monitor their production instances in near real-time, with automated hourly roll-up aggregate metrics that tell you the state and health of your Apex relative to its limits.

Each event consists of key information about the Apex execution in the context of a limit including:
  • EntryPointId
  • EntryPointName
  • EntryPointType
  • EventTime
  • ExecutionIdentifier
  • Id
  • LimitType
  • LimitValue
  • NamespacePrefix
  • UserId
  • Username
Each hourly metric consists of key aggregate information including:
  • Distinct Number of Apex Transactions
  • Distinct Number of Apex Transactions With Limits Exceeding 60% Threshold
  • Distinct Number of Apex Transactions With Limits Exceeding 60% Threshold By Entry Point Name
  • Distinct Number of Apex Transactions With Limits Exceeding 60% Threshold By Limit Type
  • Average Limit Value By Entry Point Name
  • Average Limit Value By Limit Type
Apex limit events are accessible via the Salesforce public API through the ApexLimitEvent sObject. The only user interface is an org preference in Setup that an administrator can use to enable or disable the feature within their org.
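As a sketch of what that looks like (the sObject and field names follow the pilot list above, so confirm them in your org once the pilot is enabled), a query for the most recent limit events might look like:

SELECT EventTime, EntryPointName, EntryPointType, LimitType, LimitValue, Username 
FROM ApexLimitEvent 
ORDER BY EventTime DESC LIMIT 10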

To help developers and operations bootstrap their efforts to take advantage of this feature, I built a Github repo including a:

1. Visualforce page and tab integrating Apex limit events with the Google Charting API
2. LimitsTest Apex class that exposes a REST API web service you can use to intentionally exceed limits and load data into the ApexLimitEvent sObject
3. ApexLimitJob Apex class that can be used with scheduled Apex to generate limits
4. limitsHammer python code that makes it easy to generate limits data

To participate in the pilot, contact Salesforce support or your account executive. For more information about the pilot, check out the tip sheet.

09 June 2015

The Salesforce Hacker Way

This post is dedicated to the innovation exchange students who visited Salesforce on their recent tour of Silicon Valley companies. Innovation exchange is a summer program offered by my alma mater, James Madison University (JMU).

As JMU alumni, a couple of us here at Salesforce hosted a question and answer session for the students in this program. The panel members came from product, engineering, design, and analyst relations.

The students came prepared with a set of great questions like:

  • How do you know when you have enough product to launch?
  • How do you know when to kill a product?
  • How do you take an idea to market?
  • How is Salesforce different from Dropbox?
  • Who competes with Salesforce?
  • How do you get a job out of college?

We shared our perspective about what it means to work as a team building products that customers love. We talked about ideas like launch vehicles (e.g. pilots vs betas), about getting the right people in the design room early, and about our experiences getting our first jobs out of college as well as the meandering path that led us all to Salesforce.

It was a great session and it got me thinking about sharing some fundamental product guidelines that I call the Salesforce Hacker way (in priority order because I'm a product manager and that's just how I roll):

  • Have faith. Not faith in the religious sense of the word but instead faith in yourself and your team. Otherwise, how will your ideas survive the dark night of other people telling you to work on other priorities first?
  • Figure out what's most important. When stack ranking ten stories, there's the first priority and then there's everything else. And whatever you choose, someone will tell you it is wrong. See rule one. Rob Woollen, former SVP of platform, gave this tip to me.
  • People over technology every time. Creating new products using new technologies like BigData for the Internet of Things use cases using Agile methodology with Full Stack developers may win you a game of engineering bingo. But if it doesn't solve real people's problems, what good is it? It's like a tree falling in the forest and no one being around to hear it.
  • Always be listening. Sales has their mantra, 'always be closing'. But for product, it's listening. Incidentally, when people say they are good listeners, they're talking, not listening. Listening is a skill everyone needs to develop; even those people who think they're good at it. 
  • Fail fast, fail often, fail spectacularly. Fear of failure blocks innovation. Learning from failure enables the next product to be better. Make sure your product organization supports your ability to fail as much as your ability to succeed. One of my best learning experiences came from a product that never launched leading to my next product being a success.
  • Complex designs lead to simple user experiences; simple designs lead to complex user experiences. How can you scope, constrain, and deliver the minimal amount of product necessary to trade customer value for feedback? The best feedback comes from less product, not more. I got this gem from the venerable Craig Villamor during a PTOn project a couple of years back. PTOn is where you take time out to work on a project that is not necessarily related to your current goals or team objectives.
  • Iterate, iterate, iterate. The longer a product stays in design or development, the bigger the chance of it never launching. Iterating enables a feedback loop from customers which will influence a roadmap of enhancements. 
  • Change takes courage. Designing and delivering a product may mean changing someone's concept of how things currently work. Treat that transition with respect and find ways to overcome the fear that comes with any change.
  • Stay calm, especially when everyone else isn't. This one is pure psychology. When stressed, our fight or flight reflex takes over. But what if there was a third option? When everyone else is stressed, a calm voice of reason can diffuse almost any situation.
  • You'd be amazed at what a smile and question can accomplish. Seems silly but keep in mind that everyone in the product lifecycle will protect resources, over estimate time or effort, and challenge ideas or priorities. These are actually good things to have in a product lifecycle. So keep smiling and don't be afraid to ask a clarifying question. My favorite starts with, 'why...?' and usually ends with, 'if you had this product, what would you actually do with it.' 
  • Transparency and honesty are critical. If you're not willing to communicate via a banner trailing in the sky behind a plane, then you need to question what you're saying and to whom.
  • The size of an opportunity is directly proportional to the size of the problem. Keep looking for ways to disrupt people's mindsets and be ready to embrace it when it comes. 
  • Momentum is not the same as inertia. Never dismiss the power of momentum or how hard it is to start building a product. Never mistake inertia for forward progress. Products die sooner from inertia but get built with the right amount of momentum. 
  • Written specs are out of date as soon as the first person reads them. Requirements come from the tests and the documentation comes from the code. Everything else is a well articulated conversation between partners who have taken ownership of an idea.
  • Have Fun. Building product isn't easy but it is a lot of fun. If it's not fun, find a new job.
  • Come up with ten impossible ideas everyday. This exercise reminds me about possibilities in the face of compromise and even the basic laws of physics. This exercise actually originated in Alice in Wonderland.
I don't really expect anyone, including the students I met last week, to follow any of these guidelines. In some situations, I don't even follow all of them. That's why they're called guidelines.

It took me a long time to learn them through a set of experiences and challenges that culminated in successful products that I'm truly proud of. Many of these guidelines are inherited, borrowed, and stolen from the incredible people I work with on a daily basis. But if you work in product, or aspire to build incredible products one day and don't have a framework already, you might find some minimally viable goodness in these words. And from there, iterate, iterate, iterate.

That's the Salesforce Hacker way.

01 June 2015

Cloudlock Event Monitoring Viewer


Event Monitoring makes downloading log files easy using the Salesforce API. 

But what happens when you just want a quick look at the events in those files? And how can you easily map the location of each event based on the IP address?

API first features like Event Monitoring make it easy to create apps that meet a wide set of use cases. Last week's blog post was about an app that makes downloading Event Monitoring files easy. This week's blog post is about an app that makes it easy to view event log files within a salesforce organization.

Cloudlock, one of our Event Monitoring partners, created a free app using the Salesforce1 platform that you can install in your org. It enables administrators to easily filter files by date and type, view events within smaller files, download larger files that may not be easily viewable in the page, and map events by IP with a Google Maps mashup.



Cloudlock also provides an integration with report exports, as well as a host of other incredible security features, in their full paid app offering. You can read more about it and about visualizing anomalies by clicking the Learn More button in the app.

You can install this app using this link. It's free to use and provides access to your Event Monitoring data within your Salesforce organization.

It's now easier than ever to view and map your Event Log File data with this great free app from Cloudlock!

26 May 2015

Download Event Log Files Using the ELF Browser


Event Monitoring makes downloading application logs easy using the Salesforce API.

But what happens if you don't know how to use the API? Or you don't have an operating system that makes running a download script easy? Or you've never written a download script before? Or you just want to quickly download a newly generated file without messing around with code?

Introducing the Salesforce Event Log File Browser: https://salesforce-elf.herokuapp.com/.

This browser based app, built by Abhishek Sreenivasa and the platform monitoring team, uses Ruby on Rails and is hosted on Heroku.

It's designed to enable administrators or developers, who just want to focus on the log data, to easily download an Event Log File without writing any code or setting up any integrations. This makes it perfect for both trying out Event Monitoring as well as getting started downloading log files.

The app is designed to be very simple. After logging into your production or sandbox organization using OAuth, you are presented a list of downloadable files.

Because you may have up to thirty days of files, you can filter on both the date range and the file type to find specific files that you want to download.



You can either download a file by selecting the green download action icon, or get a jump start on a simple cURL script by selecting the light blue page action. The latter action was created to help bootstrap the integration effort. For instance, if an integration specialist asks how to create a script to automate the downloads on a daily basis, you could give them this script to help get them started.
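Under the covers, that script boils down to a single authenticated GET against the EventLogFile record's LogFile field. As a sketch (the instance, record id, token, and API version are placeholders):

curl "https://yourInstance.salesforce.com/services/data/v35.0/sobjects/EventLogFile/<event log file id>/LogFile" -H "Authorization: Bearer <access token>" -o EventLogFile.csv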

The code for this app is available publicly on GitHub. You can log any issues you encounter directly in this GitHub repo as well. The app is licensed under MIT licensing terms, so you're free to take the source code and modify it to meet your use case.

API first features like Event Monitoring make it easy to create apps that meet a wide set of use cases. The Salesforce Event Log File Browser app is just one example.

While this browser doesn't take the place of an automated download script, it does simplify the trial experience and enables simple downloads of Event Log Files without writing any code or understanding how OAuth works. And because all organizations now have at least the login and logout log file types, if not all twenty-nine types, anyone should be able to use it. Happy downloading!