Recently, while on site with a customer, I was asked a simple question: how can we do real-time monitoring of salesforce.com?
These were system administrators and operations people used to monitoring the uptime of their data center. Of course, they expected real-time monitoring and automated alerts.
There are many ways to monitor Salesforce. And when the standard functionality doesn't cover what you want to monitor, there is always a custom solution.
About a week ago, I started running into some issues with a new service that I was building. I was inspired by a SparkFun blog article I read about an open API, based on Phant, that lets you post arbitrary custom values for real-time monitoring. I decided to build my own real-time monitoring system around a simple heartbeat design that would notify me when my heartbeat skipped a beat. And when it didn't skip a beat, I just wanted to log the success and chart the trend over time for discussion with our engineers. This was similar to the requirements I had heard at the on-site with my customer.
I had some basic requirements for the first iteration of my monitoring service:
- it had to be automated to provide real-time data
- it had to perform the simplest query to determine availability
- the query mechanism needed to be secure and hosted outside of salesforce
- the charting and notification systems had to be as simple as possible, preferably with no passwords or fees in the first iteration. As long as I could obfuscate sensitive data, the data could even be publicly exposed.
My first prototype was done in about half an hour.
- I created a Bash shell script that I hosted on a Linux box under my desk. This was the secure part hosted outside of Salesforce.
- I created a cron job on my Linux box set to run the shell script every minute. This consumes 1,440 API calls a day, but I figured I could fine-tune the frequency later to suit my needs. The trade-off is simple: polling more often gets closer to real time but costs more API calls, and loosening the real-time requirement lowers the cost. This was the automated part of the solution.
- The shell script data flow was simple: log in using OAuth and curl, query to get a count of an sObject, and parse the result. If the result contains a number, consider it a success; otherwise, consider it a failure and log the error.
- I used a free data publishing service from data.sparkfun.com. Originally created for publicly accessible IoT (Internet of Things) apps like weather device data, it made it trivial to expose the data I needed through a simple REST API. In the next iteration, I would use keen.io, which has more functionality and freemium options but required more design work than I wanted for a first iteration. You can check out my live heartbeat monitor that I'm still using to monitor my service.
- I created a Google charting API report to visualize the data. This was the visualization part of the solution and was based entirely on a phant.io blog posting.
- I used a freemium SMS service called SendHub to handle the notifications. I originally used Twilio but needed a simpler, freemium option for the first iteration.
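The trade-off between polling frequency and API consumption can be worked out with a little arithmetic. The following is a minimal Python sketch, not part of the original Bash script. The function name `api_calls_per_day` is hypothetical, and it assumes each run counts as a single API call, which is what the 1,440-calls-a-day figure implies. It models a cron minute field of the form `*/N`, which fires at minutes 0, N, 2N, ... within each hour.

```python
HOURS_PER_DAY = 24


def api_calls_per_day(step_minutes: int, calls_per_run: int = 1) -> int:
    """Estimate daily API calls for a cron minute field of '*/step_minutes'.

    Assumption: each run of the script costs `calls_per_run` API calls.
    """
    if not 1 <= step_minutes <= 59:
        raise ValueError("step_minutes must be between 1 and 59")
    # '*/N' in the minute field matches minutes 0, N, 2N, ... below 60.
    runs_per_hour = len(range(0, 60, step_minutes))
    return runs_per_hour * HOURS_PER_DAY * calls_per_run


if __name__ == "__main__":
    for step in (1, 5, 15):
        print(f"every {step} min: {api_calls_per_day(step)} calls/day")
```

Under these assumptions, polling every minute costs 1,440 calls a day, while backing off to every 5 or 15 minutes drops that to 288 or 96, at the price of noticing an outage later.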
Every minute, the cron job would wake the Bash shell script. The script would log into Salesforce using the REST API and query a count of my new sObject. If the query succeeded, it would log a row to SparkFun, which I viewed on their public page. If it failed, it would log a row to SparkFun with the error message and send an SMS notification of the failure to my cell phone. To view the trend of successes and failures over time (which was useful for seeing what happened while I was away from my phone or asleep), I used my Google charting report.
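The flow above can be sketched as a single function. This is an illustrative Python stand-in, not the Bash script from the repository. The three callables (`query_count`, `log_row`, `send_sms`) are hypothetical placeholders for the Salesforce query, the data-logging call, and the SMS call, so the decision logic can be read and tested without any network access.

```python
from typing import Callable


def heartbeat(
    query_count: Callable[[], str],
    log_row: Callable[[str, str], None],
    send_sms: Callable[[str], None],
) -> bool:
    """Run one heartbeat: query, log the outcome, and alert on failure.

    All three callables are hypothetical stand-ins for the real services.
    Returns True when the query produced a number, False otherwise.
    """
    try:
        raw = query_count()
        count = int(raw)  # a numeric result means the service answered
    except Exception as exc:
        # Any error, or a non-numeric response, counts as a skipped beat.
        message = f"{type(exc).__name__}: {exc}"
        log_row("failure", message)
        send_sms(f"Heartbeat failed: {message}")
        return False
    log_row("success", str(count))
    return True


if __name__ == "__main__":
    # Wire the function to print-based fakes to see both paths.
    heartbeat(lambda: "42", lambda s, d: print("log:", s, d), print)
    heartbeat(lambda: "not a number", lambda s, d: print("log:", s, d), print)
```

Passing the services in as arguments is a design choice made for this sketch: it keeps the success and failure rule in one place and makes it easy to swap the logging or notification provider later.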
This DIY project highlights a simple case of real-time monitoring. If you want to try it out, you can find the code for this project in my GitHub repository: heartbeatMonitor.