09 April 2015

The Power of Compression

I had an interesting customer case come up last week. The customer was using the Splunk App for Salesforce. But unfortunately, they kept getting an error when trying to download Event Log Files.

We finally discovered that there is a ten minute timeout on API requests by default. That means an API call must complete within ten minutes or an error will occur. Because we were trying to download a large amount content in California, but the data center was based in Chicago, the addition of network latency and file size contributed to the ten minute timeout.

Upon further investigation, we determined that weren't compressing the CSV file content over the network and instead downloading everything in an uncompressed format. However, we also determined that there is a white list of content types where you can request compression during download including:
  • text/html
  • application/json
  • text/css
  • text/javascript
  • text/xml
  • application/javascript
  • application/x-javascript
  • application/vnd.edgemart
As a result, even when requesting text/csv content in a compressed format, we were still delivering it uncompressed. We have since patched a fix to allow compression when you request text/csv file content using the API.

It's important to know that this is an optional configuration. Nothing changes from current functionality while downloading Event Log Files. Your scripts that worked previously should still work. However, if you add the compression flag to your API request or header, we'll transmit your Event Log File in a compressed format enabling quicker delivery.

How much quicker?

With some initial testing I conducted downloading NA1 files from California using a modified python download script, I was able to download files on average 65% faster than before with this option.


To request compression with cURL, it's as easy as adding the --compression flag:

curl --compressed "https://na8.soma.salesforce.com/services/data/v33.0/sobjects/EventLogFile/0ATD00000000HnXOB3/LogFile" -H "Authorization: Bearer ${access_token}" -H "X-PrettyPrint:1" -o "compressedELF.csv"

Otherwise, you typically can request compression via a header. For example, with Python:

request.add_header('Accept-encoding', 'gzip')

So regardless of whether your files are large or you are in a different region from the data center, it's still worth downloading your files using compression.


Icons by DryIcons

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.