21 October 2013

Visualizing Identity Fraud Using Login History



I love the cool data visualizations that I find on the web. I spend hours upon hours perusing great visualizations and infographics on sites like visual.ly, flowingdata.com, and Information is Beautiful. It's amazing how much I have learned about the US debt ceiling, the complete history of coffee, or ten things I never knew about Disneyland.

While these sites visualize great esoteric knowledge, I'm always amazed at the lack of great, easy-to-use data visualization tools within the enterprise for understanding common problems, such as discovering when a user's credentials have been compromised.

In salesforce.com, administrators can use login history to help discover identity fraud. Administrators can download the last six months of login data, including who logged in, how they did it, and from where. This is incredibly valuable data, but it is also a sea of information that only grows with the number of users and the adoption of the platform.

Yesterday, I came across a great new app that makes visualizing data sets like this incredibly easy.

The site is called Raw, from DensityDesign. It lets you paste data into a free-form text box and then build a chart by dragging and dropping fields onto any of several easy-to-use vector graphics, including treemaps, bubble charts, dendrograms, alluvial diagrams, circle packings, and scatterplots.

To try this, first use Workbench to download your login history.
You should also download your users so that you can translate the user id in login history into a user name.
To translate user ids into user names, use the VLOOKUP function in a spreadsheet. This step is optional, but it makes the visualization much easier to understand.
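If you would rather script the lookup than use a spreadsheet, here is a minimal Python sketch of the same join. It assumes the two exports are CSV files whose headers match the standard field names (`UserId` on login history, `Id` and `Username` on users); the sample rows and the `add_usernames` helper are made up for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io


def add_usernames(login_rows, user_rows, id_field="UserId"):
    """Return copies of the login rows with a Username column added."""
    # Build the lookup table: user id -> username (the VLOOKUP equivalent).
    names = {user["Id"]: user["Username"] for user in user_rows}
    # Fall back to the raw id when a login has no matching user row.
    return [
        {**row, "Username": names.get(row[id_field], row[id_field])}
        for row in login_rows
    ]


# Made-up sample data standing in for the two Workbench exports.
users_csv = "Id,Username\n005A,alice@example.com\n005B,bob@example.com\n"
logins_csv = (
    "UserId,SourceIp,Browser,Platform,Application\n"
    "005A,10.0.0.1,Chrome,Mac OSX,Browser\n"
    "005B,10.0.0.2,Firefox,Windows 7,Browser\n"
    "005C,10.0.0.3,Safari,Mac OSX,Browser\n"
)

users = list(csv.DictReader(io.StringIO(users_csv)))
logins = list(csv.DictReader(io.StringIO(logins_csv)))
joined = add_usernames(logins, users)

for row in joined:
    print(row["Username"], row["SourceIp"], row["Browser"])
```

To run it against real files, replace the `io.StringIO(...)` wrappers with `open("your_export.csv", newline="")`.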

Once you have your complete login history data, copy it and paste it into the Raw free-form text box.
One of the great features of this app is that it automatically detects the columns in the data and offers them as options. I kept it simple: I wanted to understand the relationship between users and how they logged in to my organization, so I used an alluvial diagram and compared the username, platform, sourceip, browser, and application attributes.
This gave me a great visualization of how these attributes relate to one another.

It also let me export the resulting visualization to SVG, PNG, or JSON so I could use it elsewhere. The first image in this blog post came from the PNG export.
With this visualization, I can now look for anomalies where a user may log in from an unknown IP, browser, application, or computing platform. These anomalies help me understand whether a user has had their account compromised and whether identity fraud has taken place within my organization.
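The thin, lonely strands in the alluvial diagram can also be roughed out in code. The sketch below is only one possible heuristic, not how Raw works: it counts how often each user appears with each value of each attribute and flags logins where a value shows up at most `max_count` times for that user. The `rare_logins` function, the threshold, and the sample rows are all hypothetical, and the column names assume the joined data described above.

```python
from collections import Counter

FIELDS = ("SourceIp", "Browser", "Platform", "Application")


def rare_logins(rows, fields=FIELDS, max_count=1):
    """Return (row, odd_fields) pairs for logins with rarely seen values."""
    # Count occurrences of every (user, attribute, value) combination.
    counts = Counter()
    for row in rows:
        for field in fields:
            counts[(row["Username"], field, row[field])] += 1

    flagged = []
    for row in rows:
        # An attribute is "odd" if this user has rarely used that value.
        odd = [
            field
            for field in fields
            if counts[(row["Username"], field, row[field])] <= max_count
        ]
        if odd:
            flagged.append((row, odd))
    return flagged


# Made-up sample: three routine logins and one from an unfamiliar IP.
usual = {
    "Username": "alice@example.com",
    "SourceIp": "10.0.0.1",
    "Browser": "Chrome",
    "Platform": "Mac OSX",
    "Application": "Browser",
}
sample = [dict(usual) for _ in range(3)]
sample.append({**usual, "SourceIp": "203.0.113.9"})

flagged = rare_logins(sample)
for row, odd in flagged:
    print(row["Username"], "unusual:", odd)
```

A user with only one or two logins will be flagged on every attribute, so treat the output as a list of rows to inspect rather than as proof of compromise.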

Visualizations of large data sets are all over the internet, and there are great tools to help explore the data. Why shouldn't we be able to use them to solve our everyday enterprise-sized problems?

3 comments:

  1. I think your first workbench screenshot might be wrong, it's showing a query on CrossOrgEventLogFileMetrics, which isn't exposed to non-SFDC orgs. Assuming you had a screenshot of exporting LoginHistory instead?

    I was also going to comment that you don't need two queries and a VLOOKUP, but it seems that you do since the userId on LoginHistory is a polymorphic lookup and unless you have TYPEOF you do have to do it this way.

    Very neat example!

    Replies
    1. Hi Chris,

      Great catch on the screenshot - just updated it to reflect the correct SOQL.

      It's a bummer about the UserId field - I was really hoping to pull out more information about the user with the login history SOQL query, but then I remembered that they had to make some optimizations on that table since it's one of the bigger tables out there.

      Glad you like the blog posting!

  2. Hmmm...I wonder if you could do something like this completely on platform using a Login History report, some VF and the Analytics API.
