03 August 2015

Who saw the CEO's pension record?

That's how the use case starts. It's like a game of Clue: it's very open-ended and raises a lot of questions that need answers, such as:
  • why is it important to track this?
  • who really needs this information? Legal, risk, admin, an auditor, etc...
  • do you need to know every / any place where the data was 'seen' like a report view, list view, search result, API query, lookup, or any of the other ways you can access data on the platform or just when they drill into the record detail (e.g. is it sufficient to just track who clicked into the record)?
  • what data did they see when they saw the record; in particular did they see PII (personally identifiable information) or sensitive (salary, amount invested, diagnosis, etc...) data?
  • what did they do with the data after they saw it?
  • most importantly, once they have this information, what will they do with it? Take an employee to court, terminate, put them up on a wall of shame, document for a regulator?
These are difficult questions. Regardless of the answers, there are a lot of things Event Monitoring can do to help with this use case.

Since the primary case was answering who saw the CEO's pension record, let's walk through that example with Event Monitoring.

It starts with a record id. Did you ever notice that when you view a record, the browser address bar shows the record id?

A record id is an immutable (it can't be edited), fifteen-character, unique record locator. Because it never changes, you can always find the record it belongs to. All you have to do is drop it in the address bar or use the API to query based on the id. That way, even if you rename the record, we'll always be able to track down the original easily.

As a result, every time a user clicks into a record, we capture that id from the address bar. We may capture more information as well, for instance, whether the record was transferred, edited, cloned, or printed. The key here is that there has to be a server interaction that changes the address in the browser address bar. So if you click a link that only changes something in the browser, we won't capture it. But if the screen refreshes with a new link in the address bar, chances are we've captured it in the URI log file.

In the case of the CEO's pension record, we clicked into the record:

As a result, we can track this in the URI log file:
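
As a rough sketch of how you might look for that click in a downloaded URI log file, the Python below filters a CSV for rows whose URI contains the record id. The column names (TIMESTAMP, USER_ID, URI), the ids, and the sample rows are all assumptions made for illustration, so check them against your own log files before relying on this.

```python
import csv
import io


def find_record_views(log_csv_text, record_id):
    """Return (timestamp, user_id, uri) for every row whose URI mentions record_id."""
    reader = csv.DictReader(io.StringIO(log_csv_text))
    hits = []
    for row in reader:
        uri = row.get("URI", "")
        if record_id in uri:
            hits.append((row.get("TIMESTAMP", ""), row.get("USER_ID", ""), uri))
    return hits


# Made-up sample rows; the column names and ids are assumptions, not real log data.
SAMPLE_LOG = (
    '"EVENT_TYPE","TIMESTAMP","USER_ID","URI"\n'
    '"URI","20150803101500.000","005x0000001AAAA","/a0Bx0000000AbCd"\n'
    '"URI","20150803101700.000","005x0000001BBBB","/home/home.jsp"\n'
    '"URI","20150803102000.000","005x0000001CCCC","/a0Bx0000000AbCd/e"\n'
)

views = find_record_views(SAMPLE_LOG, "a0Bx0000000AbCd")
for timestamp, user_id, uri in views:
    print(timestamp, user_id, uri)
```

A plain substring match is deliberately simple. It also catches variations such as a suffix after the id, which is useful when you care about more than the plain detail view.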

The log file won't tell us the name of the record or any details like the CEO's pension amount; however, we can always query the API to find out more about the record that was accessed:
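
One way to do that lookup is a SOQL query by id through the REST API. The sketch below only builds the query URL and does not call the server. The instance URL, the Pension__c object, the field list, and the API version are hypothetical placeholders chosen for the example.

```python
import re
from urllib.parse import quote, unquote

# Record ids are 15 characters; the API also returns an 18-character form.
RECORD_ID_PATTERN = re.compile(r"^[A-Za-z0-9]{15}([A-Za-z0-9]{3})?$")


def build_record_query_url(instance_url, api_version, object_name, fields, record_id):
    """Build a REST query URL that selects the given fields for one record id."""
    if not RECORD_ID_PATTERN.match(record_id):
        raise ValueError("expected a 15- or 18-character record id")
    soql = "SELECT {} FROM {} WHERE Id = '{}'".format(
        ", ".join(fields), object_name, record_id
    )
    return "{}/services/data/v{}/query?q={}".format(
        instance_url.rstrip("/"), api_version, quote(soql)
    )


# Hypothetical instance, object, and id, used only for illustration.
url = build_record_query_url(
    "https://example.my.salesforce.com",
    "34.0",
    "Pension__c",
    ["Id", "Name"],
    "a0Bx0000000AbCd",
)
print(url)
print(unquote(url.split("?q=", 1)[1]))
```

Validating the id before building the query keeps a value copied from a log file from being pasted into SOQL unchecked.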

Now it's possible that we saw the CEO's pension record in a report:

We'll track that the report was accessed in both the Report and URI log files, but we won't list the records that were viewed, such as the CEO's pension record. However, if the user clicked into the record from the report, we'll capture the click through.

In addition, if the user exported the results from the report, we won't know that the user exported the CEO's pension record; however, we will be able to identify the criteria used in the report and the fields included in the export in the Report Export log file:

Similarly, a user might search for the CEO's pension record:

We won't store the records viewed in the search results nor the field values that were viewed; however, if you click into the record from a search view, we do capture both the record id and the search position from the address bar:
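
To make that concrete, here is a small sketch that pulls the record id and the search position out of a logged address. It assumes the position travels in a query parameter named srPos, which is how I recall these links looking; treat the parameter name and the sample URI as assumptions and confirm them against your own logs.

```python
from urllib.parse import parse_qs, urlsplit


def parse_search_click(uri):
    """Return (record_id, search_position) from a click-through URI.

    search_position is None when the assumed srPos parameter is absent.
    """
    parts = urlsplit(uri)
    # The record id is taken to be the first path segment.
    record_id = parts.path.strip("/").split("/")[0]
    position = parse_qs(parts.query).get("srPos", [None])[0]
    if position is not None and position.isdigit():
        return record_id, int(position)
    return record_id, None


# Made-up URIs for illustration.
from_search = parse_search_click("/a0Bx0000000AbCd?srPos=2&srKp=a0B")
direct = parse_search_click("/a0Bx0000000AbCd")
print(from_search)
print(direct)
```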

And if the user accesses the pension record from an integrated application using the API, we'll know if they queried (SOQL) or searched (SOSL) the pension object, but not the specific results from the query unless they updated it through the API:

Event Monitoring captures a lot of data for many different use cases. Understanding specifically what it captures, and what it doesn't, helps ensure we apply it to the right use case each time.

28 July 2015

Who stole the cookie from the cookie jar?

[Figure: Sample Visualforce Page using Google Charting API]

Have you ever needed to track what users view, not just what they change? Have you ever had security, risk, or legal ask for a report on user activity for audit or regulatory reasons? Have you ever needed to track users' actions down to the device level so that activity on the phone, tablet, and web desktop is tracked separately?

Starting with the Summer '15 release, we're introducing key data leakage detection information through a pilot program. The goal of the pilot is to enable customers to query specific data leakage use cases in near real-time for analysis purposes.

The initial release of the Data Leakage Detection pilot tracks SOQL queries in near real-time from the SOAP, REST, and Bulk APIs. Because more than half of all data access on the platform happens through these APIs, organizations can gain greater insights into:

  • Who saw what data
  • When they saw that data
  • Where they accessed the data
  • What fields they accessed
  • How long a query took
  • How many records they accessed

When combined with the Login Forensics pilot, you can also track every query back to a unique login to identify anomalies in user behavior.

Each event consists of key information about the API transaction including:

  • AdditionalInfo
  • ApiType
  • ApiVersion
  • Client
  • ElapsedTime
  • EventTime
  • Id
  • LoginHistoryId
  • ObjectType
  • Operation
  • RowsProcessed
  • Soql
  • SourceIp
  • UserAgent
  • UserId
  • Username

This means you can find out who (e.g. Username), saw what (e.g. Soql) including sensitive PII (Personally Identifiable Information) fields, how much (e.g. RowsProcessed), how long (e.g. ElapsedTime), when (e.g. EventTime), how (e.g. UserAgent), and from where (e.g. SourceIp). Plus, you can correlate all of this information back to the original Login (e.g. LoginHistoryId) to profile user behavior and disambiguate between legitimate and questionable activity beyond the login.
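
As a minimal sketch of that kind of profiling, the Python below totals RowsProcessed per Username and LoginHistoryId and flags any login session that pulled more rows than a chosen threshold. The field names come from the list above. The event values, the ids, and the threshold are made up for illustration, and in practice the events would come from an API query rather than a hard-coded list.

```python
from collections import defaultdict


def rows_by_login(events):
    """Total RowsProcessed for each (Username, LoginHistoryId) pair."""
    totals = defaultdict(int)
    for event in events:
        key = (event["Username"], event["LoginHistoryId"])
        totals[key] += int(event["RowsProcessed"])
    return dict(totals)


def flag_heavy_readers(totals, threshold):
    """Return the (Username, LoginHistoryId) pairs whose row total exceeds threshold."""
    return sorted(key for key, rows in totals.items() if rows > threshold)


# Made-up events; only the field names are taken from the pilot description.
events = [
    {"Username": "alice@example.com", "LoginHistoryId": "0Yax0000000AAA1", "RowsProcessed": 120},
    {"Username": "alice@example.com", "LoginHistoryId": "0Yax0000000AAA1", "RowsProcessed": 80},
    {"Username": "bob@example.com", "LoginHistoryId": "0Yax0000000BBB1", "RowsProcessed": 50000},
]

totals = rows_by_login(events)
flagged = flag_heavy_readers(totals, threshold=10000)
print(totals)
print(flagged)
```

A fixed threshold is the simplest possible rule. Comparing each user against their own history would be a natural next step, but that goes beyond what this sketch tries to show.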

Once the pilot is enabled in your organization, you can visualize a set of events using the sample Visualforce page with Google Charting API from my GitHub repository.

To learn more about the Data Leakage Detection pilot functionality, read the pilot tip sheet. To participate in the pilot, please contact your account executive or customer support rep.

13 July 2015

Track your Apex limits in production using Apex Limit Events

Salesforce Apex is the backbone of the programmatic platform. With it you can push the customization of any org beyond the button click realm.

[Figure: Example of an app you can build with the Apex Limit Events Pilot]

Apex runs in a multitenant environment. The Apex runtime engine strictly enforces limits to ensure that runaway Apex doesn’t monopolize shared resources. If some Apex code ever exceeds a limit, the associated governor issues a run-time exception that cannot be handled. Apex limits are defined in the Salesforce documentation.

I wrote about tracking Apex limits in a blog post last May (That which doesn't limit us makes us stronger). As a Force.com developer, you can instrument your Apex code with System.Limits methods that let you compare how much of a limit you're using against the ceiling of what's allowed.

However, the more instrumentation you add, the more opportunity you have for code and performance issues. It's the Heisenberg concept: that which you try to measure affects the measurement. And even when you do try to track limits in this fashion, the results are stored in developer-oriented tools like the Debug Logs or the Force.com Console logs, which are really only meant to be used in a sandbox. But what about DevOps? How can they monitor the health of their developers' code in production?

In response, we are introducing the Apex Limit Events pilot program in Summer '15. The goal of this pilot is to enable operations to monitor their production instances in near real-time, with automated hourly roll-up metrics that tell you the state and health of your Apex relative to its limits.

Each event consists of key information about the Apex execution in the context of a limit including:
  • EntryPointId
  • EntryPointName
  • EntryPointType
  • EventTime
  • ExecutionIdentifier
  • Id
  • LimitType
  • LimitValue
  • NamespacePrefix
  • UserId
  • Username
Each hourly metric consists of key aggregate information including:
  • Distinct Number of Apex Transactions
  • Distinct Number of Apex Transactions With Limits Exceeding 60% Threshold
  • Distinct Number of Apex Transactions With Limits Exceeding 60% Threshold By Entry Point Name
  • Distinct Number of Apex Transactions With Limits Exceeding 60% Threshold By Limit Type
  • Average Limit Value By Entry Point Name
  • Average Limit Value By Limit Type
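
To show roughly how roll-ups like these relate to the raw events, here is a sketch that computes a few of them in Python. It assumes LimitValue holds the percentage of the limit consumed, which the 60% threshold suggests but which you should verify. The LimitType names and event values are invented for the example.

```python
from collections import defaultdict

THRESHOLD = 60  # percent of a limit consumed; assumes LimitValue is a percentage


def summarize(events):
    """Compute a few roll-up metrics from a list of limit events."""
    all_transactions = set()
    over_threshold = set()
    over_by_type = defaultdict(set)
    values_by_type = defaultdict(list)
    for event in events:
        transaction = event["ExecutionIdentifier"]
        all_transactions.add(transaction)
        values_by_type[event["LimitType"]].append(event["LimitValue"])
        if event["LimitValue"] > THRESHOLD:
            over_threshold.add(transaction)
            over_by_type[event["LimitType"]].add(transaction)
    return {
        "transactions": len(all_transactions),
        "over_threshold": len(over_threshold),
        "over_threshold_by_limit_type": {t: len(ids) for t, ids in over_by_type.items()},
        "average_by_limit_type": {t: sum(v) / len(v) for t, v in values_by_type.items()},
    }


# Made-up events; the LimitType names here are placeholders.
events = [
    {"ExecutionIdentifier": "tx1", "LimitType": "SoqlQueries", "LimitValue": 80},
    {"ExecutionIdentifier": "tx1", "LimitType": "CpuTime", "LimitValue": 30},
    {"ExecutionIdentifier": "tx2", "LimitType": "SoqlQueries", "LimitValue": 40},
    {"ExecutionIdentifier": "tx3", "LimitType": "CpuTime", "LimitValue": 90},
]

summary = summarize(events)
print(summary)
```

Sets are used for the transaction counts so that a transaction with several limit events is still counted once, matching the "distinct" wording in the metric names.
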
Apex limit events are accessible via the Salesforce public API through the apexlimitevent sobject. The only user interface is an org preference in setup that an administrator can use to enable or disable the feature within their org.

To help developers and operations bootstrap their efforts to take advantage of this feature, I built a GitHub repo that includes:

1. Visualforce page and tab integrating Apex limit events with the Google Charting API
2. LimitsTest Apex class that exposes a REST API web service you can use to intentionally exceed limits and load data into the apexlimitevent sobject
3. ApexLimitJob Apex class that can be used with scheduled Apex to generate limits
4. limitsHammer python code that makes it easy to generate limits data

To participate in the pilot, contact Salesforce support or your account executive. For more information about the pilot, check out the tip sheet.