Links
Comment on page

Splunk

Overview & Setup

Splunk is an intelligence analysis tool to make sense of vast amounts of data.
  • can run python script to save sanitized snapshots on a recurring basis
  • can make API requests to other APIs, and has an API to make requests against
  • can save alerts for the appearance of specific data

Hosted

Access it here. For this iteration, there’s a shared username and password. Reach out to Alec Akin for it. This is installed on a $10 Vultr instance with 2gb ram, 55gb ssd, and 1 core.

Local

If you want to install a private instance for local use, it only takes about 10 minutes to set up on Mac or Windows. Here’s where to get started.

Example use cases

When fed a batch of 3,000,000 documents from NIBRS, it quickly revealed that the 8 days responsible for the most hate crime in the United States were those immediately following September 11. The 9th was the final day of the Rodney King riots. It’s also a way to power parts of the site: a front end app could consume parts of the Splunk API.

Workflows powered by Splunk

Query data → Analyze data → Export insights
Upload data → Analyze data* *The user agrees that we can keep the data, and provides information or verification about it.
Save an Analysis → Share the Analysis with someone else
Save an Analysis → Revisit it with updated data Alert user if an analysis changes based on updated information
Query data from many dolt repos with the same format

Specific abilities granted by Splunk

  • easily write regex
  • accept any type of data
    • oddly / non-delimited
    • many file types
  • faster analysis / searching on the server rather than locally
  • automatically find "interesting fields"
  • search
  • analysis

Basic usage

Here’s how to make queries in SPL, Splunk’s proprietary searching language.
index=nibrs source="hatecrimes.csv" incident_date="12-SEP-01"