To get customers up and running faster, I created a script that incorporates all of these best practices. You can download elfWave.sh from my Github repository and give it a try. Alternatively, you can use the script to understand the data flow if you build your own version in another language or with a middleware integration tool.
There are a few gotchas specific to this script and process that you should be aware of:
- jq is a prerequisite; I recommend downloading it into your /bin directory.
- datasetutils.jar is a prerequisite that makes it easy to upload your data into Wave. You can download the latest version from the Analytics Cloud Github repo, or download version 32.0.0 (the one I used) from my Github repo.
- Event Monitoring and Wave need to be enabled in your org. The user downloading the Event Monitoring files needs the 'View Event Log Files' user permission on their profile or permission set, and the user uploading files to Analytics Cloud needs the 'Upload External Data to Analytics Cloud' user permission on their profile or permission set.
- When running the script, be sure it is executable (e.g. chmod +x elfWave.sh). If you didn't save the script in your /bin directory, navigate to the directory where you saved it (e.g. cd ~/Downloads) and invoke it from there (e.g. sh elfWave.sh or ./elfWave.sh).
- The script has been tested on Mac OS X and Ubuntu Linux, but may need some additional configuration to run on Windows (e.g. installing Cygwin and curl).
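Since missing prerequisites are the most common failure mode, a quick pre-flight check can save a round-trip. This is a minimal sketch, not part of elfWave.sh itself; the `check_tool` helper name is my own:

```shell
#check_tool reports whether a named command is available on the PATH
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

#the process relies on jq, curl, awk, and java (for datasetutils)
for tool in jq curl awk java; do
  check_tool "$tool" || echo "Please install $tool before running elfWave.sh"
done
```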
Below is a breakdown of the more important parts of the script:
If you are not scheduling the script to run automatically, you should prompt the user for logins to both their Event Monitoring and Wave orgs. These are likely the same org; however, I'm allowing for the possibility that they are two different orgs.
```shell
#prompt the user to enter their username for the target (Wave) org
read -p "Please enter the Wave target org username (and press ENTER): " tUsername

#prompt the user to enter their password for the target (Wave) org
read -s -p "Please enter the Wave target org password (and press ENTER): " tPassword

#prompt the user to enter their username for the source (Event Monitoring) org
read -p "Please enter username for Event Monitoring source org (and press ENTER): " username

#prompt the user to enter their password for the source (Event Monitoring) org
read -s -p "Please enter password for Event Monitoring source org (and press ENTER): " password

#prompt the user to enter their instance end-point for the source (Event Monitoring) org
echo
read -p "Please enter instance (e.g. na1) for the Event Monitoring source org loginURL (and press ENTER): " instance

#prompt the user to enter the date for the logs they want to download for the source (Event Monitoring) org
read -p "Please enter logdate (e.g. Yesterday, Last_Week, Last_n_Days:5) (and press ENTER): " day
```
You can choose to upload any single event type or all 28 types to Wave. You might want to try just one event type (e.g. API or URI) as a test. If you don't choose anything, all 28 are selected by default.
```shell
#prompt the user to enter the eventType they want to download for the source (Event Monitoring) orgs
printf 'What EventType do you want to download?\n'
printf '1. All 28 event types (Default)\n'
printf '2. API\n'
printf '3. ApexCallout\n'
printf '4. ApexExecution\n'
printf '5. ApexSoap\n'
printf '6. ApexTrigger\n'
printf '7. AsyncReportRun\n'
printf '8. BulkApi\n'
printf '9. ChangeSetOperation\n'
printf '10. ContentDistribution\n'
#….
read eventMenu

case $eventMenu in
  1) eventType=${eventType:-All} ;;
  2) eventType=${eventType:-API} ;;
  3) eventType=${eventType:-ApexCallout} ;;
  4) eventType=${eventType:-ApexExecution} ;;
  5) eventType=${eventType:-ApexSoap} ;;
  6) eventType=${eventType:-ApexTrigger} ;;
  7) eventType=${eventType:-AsyncReportRun} ;;
  8) eventType=${eventType:-BulkApi} ;;
  9) eventType=${eventType:-ChangeSetOperation} ;;
  10) eventType=${eventType:-ContentDistribution} ;;
  #….
  *) eventType=${eventType:-All} ;;
esac
echo ${eventType}
```
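The breakdown skips the login and query step that produces the `${elfs}` JSON and the `${ids}`, `${eventTypes}`, and `${logDates}` arrays consumed by the download loop. As a rough sketch of what that step looks like against the REST API: the `build_soql` and `fetch_elf_records` helper names are mine, the `CLIENT_ID`/`CLIENT_SECRET` OAuth placeholders are assumptions, and the real script may authenticate differently. Note that the logdate values the script prompts for (Yesterday, Last_Week, Last_n_Days:5) are SOQL date literals:

```shell
#build the SOQL query from the user's earlier choices of $day and $eventType
build_soql() {
  soql="SELECT+Id,EventType,LogDate+FROM+EventLogFile+WHERE+LogDate=${day}"
  if [ "${eventType}" != "All" ]; then
    soql="${soql}+AND+EventType='${eventType}'"
  fi
  echo "${soql}"
}

#hypothetical fetch step: obtain a token via the OAuth username-password flow
#(CLIENT_ID/CLIENT_SECRET are placeholders), then query EventLogFile records;
#the later loops read Id, EventType, and LogDate out of ${elfs} with jq
fetch_elf_records() {
  access_token=$(curl -s "https://login.salesforce.com/services/oauth2/token" \
    -d "grant_type=password" -d "client_id=${CLIENT_ID}" \
    -d "client_secret=${CLIENT_SECRET}" \
    -d "username=${username}" -d "password=${password}" | jq -r ".access_token")
  elfs=$(curl -s "https://${instance}.salesforce.com/services/data/v32.0/query?q=$(build_soql)" \
    -H "Authorization: Bearer ${access_token}")
  echo ${elfs} | jq -r ".records[].Id"
}

day=Yesterday
eventType=API
build_soql
```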
Download the files into the eventType-raw directory, convert the TIMESTAMP field from an integer into date/time format, and store the results in the eventType directory.
```shell
#loop through the array of results and download each file with the following
#naming convention: EventType.csv
for i in "${!ids[@]}"; do
  #make directory to store the files by date and separate out raw data from
  #converted timezone data
  mkdir "${eventTypes[$i]}-raw"
  mkdir "${eventTypes[$i]}"

  #download files into the ${eventTypes[$i]}-raw directory
  curl "https://${instance}.salesforce.com/services/data/v32.0/sobjects/EventLogFile/${ids[$i]}/LogFile" \
    -H "Authorization: Bearer ${access_token}" \
    -H "X-PrettyPrint:1" \
    -o "${eventTypes[$i]}-raw/${eventTypes[$i]}-${logDates[$i]}.csv"

  #convert files into the ${eventTypes[$i]} directory for Salesforce Analytics
  awk -F ',' '{
    if (NR==1) printf("%s\n", $0)
    else {
      for (i=1; i<=NF; i++) {
        if (i>1 && i<=NF) printf("%s", ",")
        if (i==2)
          printf "\"%s-%s-%sT%s:%s:%sZ\"", substr($2,2,4), substr($2,6,2), substr($2,8,2), substr($2,10,2), substr($2,12,2), substr($2,14,2)
        else printf("%s", $i)
        if (i==NF) printf("\n")
      }
    }
  }' "${eventTypes[$i]}-raw/${eventTypes[$i]}-${logDates[$i]}.csv" > "${eventTypes[$i]}/${eventTypes[$i]}-${logDates[$i]}.csv"
done
```
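To see what the awk program does in isolation, here is the TIMESTAMP rewrite run against a made-up two-line CSV. The field names and values are invented for illustration; the files quote every field, which is why the substr offsets start at 2, skipping the opening quote:

```shell
#sample row with the raw integer TIMESTAMP (contents are made up)
printf '%s\n' '"EVENT_TYPE","TIMESTAMP","USER_ID"' \
  '"API","20140918012800","005000000000001"' > sample-raw.csv

#rewrite field 2 as an ISO 8601 date/time so Wave recognizes it
awk -F ',' '{
  if (NR==1) printf("%s\n", $0)
  else {
    for (i=1; i<=NF; i++) {
      if (i>1) printf("%s", ",")
      if (i==2)
        printf "\"%s-%s-%sT%s:%s:%sZ\"", substr($2,2,4), substr($2,6,2), substr($2,8,2), substr($2,10,2), substr($2,12,2), substr($2,14,2)
      else printf("%s", $i)
      if (i==NF) printf("\n")
    }
  }
}' sample-raw.csv > sample.csv

cat sample.csv
#"EVENT_TYPE","TIMESTAMP","USER_ID"
#"API","2014-09-18T01:28:00Z","005000000000001"
```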
Because there may be 'n' days of logs resulting in 'n' files per event type, we merge them into a single eventType file that is uploaded into a single dataset with the eventType name. This simplifies the development of lenses and dashboards that trend events over a period of days.
```shell
#variable to count the number of unique event types
uEventTypes=( $(echo ${elfs} | jq -r ".records[].EventType" | uniq) )

#merge data into single CSV file
for j in "${uEventTypes[@]}"; do
  output_file="$j.csv"
  count=0
  for f in `ls $j/*.csv`; do
    echo "still merging [$f]"
    echo "merging file: $f to $output_file."
    if [ $count -eq 0 ]; then
      awk -F ',' '{print $0}' $f 1>$output_file
    else
      awk -F ',' 'FNR>1 {print $0}' $f 1>>$output_file
    fi
    count=`expr $count + 1`
    echo "number of input files: $count merged to output file: $output_file"
  done
done
```
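The merge rule — keep the header row from the first file, skip it in every later file — can also be expressed with `tail -n +2` instead of `awk 'FNR>1'`. A small self-contained demonstration, with made-up file names and contents:

```shell
#two single-day files for the same (hypothetical) event type
mkdir -p API
printf '%s\n' 'EVENT_TYPE,TIMESTAMP' 'API,2014-09-18T01:28:00Z' > API/API-2014-09-18.csv
printf '%s\n' 'EVENT_TYPE,TIMESTAMP' 'API,2014-09-19T02:30:00Z' > API/API-2014-09-19.csv

count=0
for f in API/*.csv; do
  if [ $count -eq 0 ]; then
    cat "$f" > API.csv          #first file: keep the header row
  else
    tail -n +2 "$f" >> API.csv  #later files: drop the duplicate header
  fi
  count=$((count + 1))
done
wc -l < API.csv   #3 lines: one header plus one data row per day
```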
Using the datasetutils Java app, upload the transformed, merged CSV files into Wave.
```shell
#load CSV files to datasets in Wave
for i in `ls *.csv`; do
  #variables to specify file and dataset name
  eventFile=`echo $i`
  eventName=`echo $i | sed 's/\.csv//g'`

  #comment next line to test before uploading to Wave
  java -jar datasetutils-32.0.0.jar --action load --u ${tUsername} --p ${tPassword} --inputFile ${eventFile} --dataset ${eventName}
done
```
Finally, because we may want to repeat this process on a regular basis (e.g. daily, weekly, or monthly), choose whether to remove the files locally or keep them for long-term storage and audit purposes.
```shell
#prompt user to clean up data and directories
read -p "Do you want to delete data directories and files? (Y/N) " del
if [ $del == Y ] || [ $del == y ] || [ $del == Yes ] || [ $del == yes ]; then
  #clean up data directories
  for i in "${!uEventTypes[@]}"; do
    rm -r "${uEventTypes[$i]}-raw"
    rm -r "${uEventTypes[$i]}"
    rm "${uEventTypes[$i]}.csv"
  done
  rm -r "archive"
  echo "The files were removed."
elif [ $del == N ] || [ $del == n ] || [ $del == No ] || [ $del == no ]; then
  #leave data and directories for audit reasons
  echo "The files will remain."
fi
```
This script doesn't replace the need for an integration specialist to automate the ETL process; however, it should hopefully make it easier to trial both Event Monitoring and Wave.