29 October 2015

LogDate vs CreatedDate - when to use one vs the other in an integration

Why are there two date-time fields for each Event Log File: LogDate and CreatedDate? Shouldn't one be good enough?

It seems like it should be a straightforward question, but it comes up frequently, and the answer can affect how you integrate with Event Log Files.

Let's start with the definition of each:
  • LogDate tracks usage activity of a log file for a 24-hour period, from 12:00 a.m. to 11:59 p.m.
  • CreatedDate tracks when the log file was generated.
Why is this important? Why have these two different timestamps? Because having both ensures the reliability and eventual consistency of log delivery within a massively distributed system.

Each customer is co-located on a logical collection of servers we call a 'pod'. You can read more about pods and the multi-tenant architecture on the developer force blog.

There can be anywhere from ten to one hundred thousand customer organizations on a single pod. Each pod has numerous app servers which, at any given time, handle requests from any of those customer organizations. In such a large, distributed system, it's possible, though infrequent, for an app server to go down.

As a result, if an app server does go down while a customer's transactions are being captured, those transactions can be routed to another server seamlessly without affecting the end user's experience or the integrity of the data. For a variety of reasons, app servers can go up or down throughout the day, but what's important is that this activity doesn't affect the end user's experience of the app or the integrity of the customer's data.

Each app server captures its own log files throughout the day, regardless of which customers' transactions it handles. Each log file therefore contains log entries for all customers who had transactions on that app server throughout the day. At the end of the day (~2 a.m. local server time), Salesforce ships the log files from active app servers to an HDFS cluster where Hadoop jobs run. A Hadoop job then generates the Event Log File content (~3 a.m. local server time) for each customer based on the app logs that were shipped earlier; this is the content that customers access via the API.

It's possible that some log files will have to be shipped at a later date. For example, an app server that was offline during part of the day might come back online after the log files for that day have already been shipped. Log files are therefore eventually consistent: the log shipper or Hadoop job may pick up a past file in a future run, so Salesforce can catch up at a later point. We built look back functionality to address this scenario. Every night when the Hadoop job runs, we check whether new files exist for previous days and, if so, re-generate the Event Log File content for those days, overwriting the log files that were previously generated.

This is why we have both CreatedDate and LogDate fields: LogDate reflects the actual 24-hour period when the user activity occurred, and CreatedDate reflects when the Event Log File itself was generated. So it's possible, due to look back functionality, that we will re-generate a previous LogDate's file and, in the process, write more lines than it had before, with the newly available app server logs co-mingled with the ones that were originally received.



This eventual consistency of log files may impact your integration with Event Log Files.

The easy way to integrate with Event Log Files is to use LogDate and write a daily query that simply asks for the last 'n' days of log files based on the LogDate:

SELECT LogFile FROM EventLogFile WHERE LogDate = LAST_N_DAYS:7
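If it helps to see what that looks like in practice, here is a minimal sketch in Python using the requests library. It assumes you've already obtained an OAuth access token and know your instance URL (both placeholders below); the query endpoint and the LogFile blob endpoint are the standard EventLogFile REST API, but the file naming and the lack of error handling and pagination are purely illustrative.

import requests

INSTANCE_URL = "https://yourInstance.salesforce.com"   # assumption: your org's instance
ACCESS_TOKEN = "<your OAuth access token>"             # assumption: a valid session
API_VERSION = "v35.0"
HEADERS = {"Authorization": "Bearer " + ACCESS_TOKEN}

def query(soql):
    # Run a SOQL query through the REST API and return the records
    # (ignores queryMore/nextRecordsUrl pagination for brevity)
    url = "%s/services/data/%s/query" % (INSTANCE_URL, API_VERSION)
    resp = requests.get(url, headers=HEADERS, params={"q": soql})
    resp.raise_for_status()
    return resp.json()["records"]

def download_log(record):
    # The LogFile field is a blob; fetching its REST URL returns the raw CSV content
    url = INSTANCE_URL + record["attributes"]["url"] + "/LogFile"
    resp = requests.get(url, headers=HEADERS)
    resp.raise_for_status()
    return resp.content

records = query("SELECT Id, EventType, LogDate, CreatedDate FROM EventLogFile "
                "WHERE LogDate = LAST_N_DAYS:7")
for rec in records:
    # one CSV per EventType per LogDate, e.g. Login-2015-10-28.csv
    filename = "%s-%s.csv" % (rec["EventType"], rec["LogDate"][:10])
    with open(filename, "wb") as f:
        f.write(download_log(rec))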

However, if you query on LogDate, it is possible to miss data that would only appear in a later download. For instance, if you downloaded yesterday's log files and then re-download them tomorrow, you may actually have more log lines in the newer download. This is because some app log files may have caught up, and the original log content was overwritten with more log lines.

To ensure a more accurate query that also captures look back updates of previous days' log files, you should use CreatedDate:

SELECT LogFile FROM EventLogFile WHERE CreatedDate = LAST_N_DAYS:7

This is a more complicated integration because you will have to keep track of the CreatedDate for each LogDate and EventType you previously downloaded, so you can detect when a newer CreatedDate means a file has been re-generated. You may also need to handle event row de-duplication if you've already loaded log lines from a previous download into an analytics tool like Splunk, only to find additional log lines added in a subsequent download.
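One way to sketch that bookkeeping, reusing the hypothetical query and download_log helpers from the earlier example, is to keep a small local state file that maps each (LogDate, EventType) pair to the CreatedDate you last downloaded, and to re-download and overwrite whenever the server-side CreatedDate is newer. The state-file format and names here are just illustrative.

import json, os

STATE_FILE = "elf_state.json"   # illustrative: local record of what we've already pulled

def load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {}

def save_state(state):
    with open(STATE_FILE, "w") as f:
        json.dump(state, f, indent=2)

state = load_state()
records = query("SELECT Id, EventType, LogDate, CreatedDate FROM EventLogFile "
                "WHERE CreatedDate = LAST_N_DAYS:7")
for rec in records:
    key = "%s|%s" % (rec["LogDate"][:10], rec["EventType"])
    if state.get(key) == rec["CreatedDate"]:
        continue   # nothing new for this LogDate/EventType since the last run
    # CreatedDate is newer (or we've never seen this file): re-download and overwrite,
    # which also replaces any shorter version written before look back caught up
    filename = "%s-%s.csv" % (rec["EventType"], rec["LogDate"][:10])
    with open(filename, "wb") as f:
        f.write(download_log(rec))
    state[key] = rec["CreatedDate"]
save_state(state)

Anything downstream of those files (Splunk, a data warehouse, etc.) still needs its own de-duplication strategy, since a re-generated file repeats the rows you already loaded plus the new ones.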

There's one option that simplifies this a little bit: you can overwrite past data every time you run your integration. This is what we do with some analytics apps that work with Event Log Files; the job automatically overwrites the last seven days' worth of log data on each run rather than appending new data and de-duplicating against older downloads.
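In code, the overwrite approach collapses to a loop with no state tracking at all (again assuming the hypothetical helpers from the earlier sketch). Because each local file is keyed by EventType and LogDate, a re-generated file for a past LogDate simply replaces the older, shorter copy.

# Simpler "overwrite" integration: no state file, no de-duplication
for rec in query("SELECT Id, EventType, LogDate, CreatedDate FROM EventLogFile "
                 "WHERE CreatedDate = LAST_N_DAYS:7"):
    filename = "%s-%s.csv" % (rec["EventType"], rec["LogDate"][:10])
    with open(filename, "wb") as f:
        f.write(download_log(rec))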

This may seem counterintuitive, but believe it or not, having look back is a really good thing: it increases the reliability and eventual consistency of the logs, ensuring you get all of the data you expect to be there.

19 October 2015

ELF on ELK on Docker


The ELF on ELK on Docker repository is available!

You can download it from GitHub: https://github.com/developerforce/elf_elk_docker.

What in the world is ELK? How does an ELF fit on top of an ELK? Who is this Docker I keep hearing about? Why do I feel like I've fallen down the on-premise rabbit hole of acronym based logging solutions??!!

Okay, let's back up a second. We're trying to solve the problem of creating insights on top of Event Log File (ELF) data.

ELF stands for Event Log Files. It's Salesforce's solution for providing an easy-to-download set of organization-specific log files, covering everything from page views to report downloads. You can't really swing a cat by its tail (not that I would really try) without hitting a blog post on SalesforceHacker.com about Event Log Files. Event Monitoring is the product packaging of Event Log Files.

Since we launched Event Log Files last November, I've talked with a lot of customers about how to derive insights and visualizations on top of the log data. One of the solutions I keep hearing about is the ELK stack.

ELK stands for Elasticsearch, Logstash, and Kibana. The ELK stack is an open-source, scalable log management stack that supports exploration, analysis, and visualization of log data.

It consists of three key solutions:
  1. Elasticsearch: A Lucene-based search server for storing log data.
  2. Logstash: ETL process for retrieving, transforming, and pushing logs into data warehouses.
  3. Kibana: Web GUI for exploring, analyzing, and visualizing log data in Elasticsearch.
ELK requires multiple installations and configurations on top of commodity hardware or IaaS like AWS. To simplify the installation and deployment process, we use Docker.

Docker is an emerging open source solution for software containers. From the Docker website:
"Docker is an open platform for building, shipping and running distributed applications. It gives programmers, development teams and operations engineers the common toolbox they need to take advantage of the distributed and networked nature of modern applications."
With Docker, all the user needs to do to start working with ELF on ELK is:
  1. download the ELF on ELK repository from GitHub
  2. change the sfdc_elf.config file (add authorization credentials)
  3. run Docker from the terminal
The purpose of the plug-in is to reduce the time it takes to integrate Event Log Files into ELK, not to provide out-of-the-box dashboards like this one that I quickly created:

As a result, once you start importing Event Log Files into ELK through this ETL plug-in, you'll still need to create the visualizations on top of the data. The advantage of Kibana is that it makes that part point-and-click easy.

Depending on how you configure Docker and ELK, you might want to expose your new dashboards onto the corporate network. I found the following terminal command helps to enable access across the VPN:
VBoxManage controlvm "default" natpf1 "tcp-port8081,tcp,,8081,,8081";
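# forwards host port 8081 to port 8081 in the VM named "default"; the natpf1 rule format is name,protocol,hostIP,hostPort,guestIP,guestPort (empty IPs use the defaults)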
ELF on ELK on Docker provides an on-premise, scalable solution for visualizing Event Monitoring data.

The ELF on ELK on Docker plug-in was created by the dynamic duo of Abhishek Sreenivasa and Mohammaed Islam.

Let us know what you think!

12 October 2015

IdeaExchange 50K Club

Platform Monitoring team at Salesforce
At Dreamforce 2015, the Platform Monitoring team received an award for retiring fifty thousand IdeaExchange points. I'm really proud of our team and our accomplishments!

This was a big deal for us for a number of reasons. The team was formed two years ago to solve the problem of providing customers access to log and event data. This data helps customers gain insights into the operational health of their organization, including how to support their end users, audit user activity, and optimize programmatic performance. In addition, we put together a temporary team of architects to take on some additional stories that weren't part of the team's charter but that we felt should nonetheless be completed.

And in that time we closed some of the following ideas:
We take the ideas on the IdeaExchange seriously on our team because we know they're important to our customers and our users.

I wish we could get to all of the ideas that are out there. But prioritizing is a bit like being Charlie in Willy Wonka and the Chocolate Factory. Early in the movie, when Charlie walks into the candy store, he has to choose from everything but only has the one dollar that he found. I always wonder how he knew to spend his single dollar on a Wonka bar out of everything he could have chosen.

It's not an easy task to prioritize the stories we complete. Many of the great ideas on the IdeaExchange seem like they should be really easy to solve. But in truth, if they were easy to solve, we probably would have solved them a long time ago. Unfortunately, we have to choose and it's never an easy choice. 

I hope you keep posting ideas, commenting on them, and voting on them. Even though sometimes it may not seem like we're listening, we are and we're really excited to complete as many of these ideas as we can!

09 October 2015

Cause and Consequence: The need for real-time security actions

Hi there! I'm Jari Salomaa (@salomaa), and I recently joined Salesforce to work on security product management, covering various exciting new features, frameworks, and capabilities in the always interesting world of security, privacy, and compliance. I have a history with static, dynamic, and behavioral analysis that we can talk about some other time...

One of these cutting-edge projects, which I'm collaborating on with the founder of this fantastic blog, Adam Torman, is Transaction Security.

Introduction

Transaction Security is a real-time security event framework built inside Salesforce Shield, a new, focused security offering from Salesforce for our most sophisticated customers with specific security needs. Having security built in to the Salesforce platform gives customers best-of-breed performance, rich intelligence, and a flexible user experience, ready to integrate with their existing security investments, visualizations, dashboards, and so on.

Salesforce Shield offers various security components, of which Event Monitoring offers the most value for forensic investigations: digging deep into the who, where, what, and how.

Having access to Salesforce Shield's Event Monitoring logs gives Salesforce administrators the ability to integrate the different event flows, such as "login", "resource access", "entity", and "data exfiltration" events, into data visualization tools like Splunk, New Relic, Salesforce Wave, etc.

Once administrators and organizations have identified their prioritized security use cases from their Event Monitoring logs, they can use the Transaction Security framework to build real-time security policies. For example, Transaction Security can apply Concurrent Session Login Policy logic so that only two administrator sessions may be open at any given time, or so that users with the Standard User profile are limited to five active concurrent sessions. If for some reason end users have more open sessions, they are automatically forced to close them before continuing. Real time. As it happens.

Building Real Time Security Policies

Discussions with many security teams around the world highlight the same questions: who is accessing my data, who is exfiltrating or downloading my data, and what can I do about it? Since Salesforce touches many aspects of the business lifecycle, what is important and confidential may differ from one company to another. This is why we have chosen to introduce Transaction Security in the form of an easy-to-use interface where you define the event type.

We currently support four (4) different event types:
  1. Login - for user sessions
  2. Entity - for authentication providers, sessions, client browsers and IP
  3. DataExport - for Account, Contact, Lead and Opportunity objects
  4. AccessResource - for connected apps, reports and dashboards
Each of the corresponding real-time events has a set of defined actions.

Administrators can choose anything from email and in-app notifications to real-time actions: blocking the operation, enforcing two-factor authentication (2FA), or ending the active session. You can also choose to take no action and just receive real-time alerts. Isn't that neat?

Each policy type automatically generates Apex code that is highly customizable, so you can define the specific condition or additional criteria around the action.

As a security administrator in Salesforce, you can edit the Apex to define a more specific condition for the action. For example, you can define the action to fire only when specific platform conditions occur.

For example, you may want to restrict access to approved corporate platforms. If you have a corporate phone program standardized on iOS or Android, or specific operating systems and browsers in use, like Windows, OS X, Safari, or Chrome, you can block access requests coming from environments unapproved by IT, or at least require higher assurance with two-factor authentication to validate that they are not coming from unwanted and untrusted sources. This can be a really useful way to protect sensitive reports and dashboards, mass data exports with Data Loader, or simply user and administrator logins.

Next Steps

So what can customers do to enable real time security policies for their Salesforce applications?

You are required to have Salesforce Shield and Event Monitoring as a prerequisite to enable Transaction Security in your production orgs. Please have a conversation with your Salesforce Account Executive about Salesforce Shield. We have also enabled Transaction Security policies in Developer Edition orgs, so you can try before you buy.

Once it's enabled, point your mouse to Setup -> Transaction Security and enable Transaction Security Policies. Have a look at the security release notes and product help documentation for additional Apex class examples.

You can also follow me and send questions on Twitter at @salomaa, or leave your questions or comments below. Looking forward to hearing what you think!

05 October 2015

Event Monitoring Trailhead Module



Last March, I wrote a blog post on how to get started with Event Monitoring. It was a huge help to many of my customers because it got them up and running quickly with Event Monitoring.

Recently, our fantastic documentation writer, Justine Heritage, created an incredible Trailhead module that surpasses my blog post and provides a great experience for learning about Event Monitoring. It's divided into three units: Get Started with Event Monitoring, Query Event Log Files, and Download and Visualize Event Log Files. It even has a fictitious character named Rob Burgle who burgled intellectual property before leaving his company.

But what exactly is Trailhead?

Trailhead is an e-learning system created by the Salesforce developer marketing team, with content generated by the documentation writers in research and development (R&D). Trailhead's mission is to help people learn more about the Salesforce platform and applications in a fun and rewarding self-paced learning environment. Each learning module has units of content, and each unit has some kind of challenge, such as a quiz or an interactive exercise. Upon successful completion of the challenge, the learner is awarded a badge.

Why is the team makeup important?

Because the developer marketing team has long had the mission of evangelizing the Salesforce platform, they have the background and experience to help people understand how to build apps on it. And the documentation team has direct access to the latest and greatest features and changes because they're integrated into the R&D scrum process, which puts them in the best position to generate and update content about products.

E-learning has been around for a long time at Salesforce. I used to help write e-learning modules when I was on the training team, and Salesforce University has had various levels of e-learning for years. Nonetheless, Trailhead is something different. If you came to Dreamforce 2015, you would have seen the buzz around Trailhead in the developer and administrator zones. You might have even seen a bear walking around evangelizing the learning platform.

What's the big deal if e-learning has been around for a long time? Why is this time different?

For a few reasons:
  1. The generation of content is a somewhat disruptive model organizationally. It's not instructional designers from training creating content; it's doc writers - the same people who write the online help that users read every day. 
  2. The delivery of the content is done outside of the traditional Learning Management Systems that training typically uses, which provides the flexibility to custom-design a system to deliver content. And the content is fast - few modules have units longer than ~15 minutes. 
  3. Each unit has a challenge. Whether that challenge is a quiz or a hands-on activity in a Developer Edition org, it forces a level of interactivity that is typically higher than what traditional e-learning systems can provide. 
  4. Learners earn badges that they can promote on social media. Gamification isn't new, nor is it new in e-learning. But it's the icing on the cake and a motivator for people to finish modules rather than abandon them midway. 
  5. It's free to sign up and use.
Since we unveiled the Event Monitoring Trailhead module, we've had @mentions in over a dozen blogs and many Twitter posts just on this one module. Below is a short list of just some of them:
Trailhead is a phenomenon. It's different from traditional e-learning. And it's an incredible way to learn about Event Monitoring. Give it a try. Who knows, you might even earn your first badge!

To start your new learning adventure with Event Monitoring and Trailhead, click here.