The datasets database is a utility we maintain continuously. It catalogs known police datasets. Each dataset has a status and can potentially be scraped.


Datasets are maintained in DoltHub, where they can be easily viewed. The same tables are available in a Hadoop PostgreSQL mirror, which is a better back end for Scrapers.

Terms & Hierarchy

  1. Country: The United States has ~18,000 Agencies across all regions.

  2. Region: A state, county, municipality, or other region containing multiple police agencies.

  3. Agency: A specific police organization, typically containing many Datasets.

  4. Dataset: A URL where one type of police data can be found.

  5. Scraper: A bit of code which downloads everything it can find on a Dataset.
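
The hierarchy above can be sketched as nested data types. This is only an illustration, not the actual database schema: the class names mirror the terms in the list, but every field name (`url`, `data_type`, `status`), the default status value, and the `all_datasets` helper are assumptions made up for this example. A Scraper is left out because it would need network access to a real Dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    url: str                 # where one type of police data can be found
    data_type: str           # hypothetical label, e.g. "incident_reports"
    status: str = "unknown"  # hypothetical status value


@dataclass
class Agency:
    name: str
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Region:
    name: str
    agencies: List[Agency] = field(default_factory=list)


@dataclass
class Country:
    name: str
    regions: List[Region] = field(default_factory=list)


def all_datasets(country: Country) -> List[Dataset]:
    """Walk Country -> Region -> Agency -> Dataset and collect every Dataset."""
    return [
        dataset
        for region in country.regions
        for agency in region.agencies
        for dataset in agency.datasets
    ]


# Example built from made-up values; none of these come from the real tables.
example = Country(
    name="United States",
    regions=[
        Region(
            name="Example County",
            agencies=[
                Agency(
                    name="Example Police Department",
                    datasets=[
                        Dataset(
                            url="https://example.com/incidents",
                            data_type="incident_reports",
                        )
                    ],
                )
            ],
        )
    ],
)
```

Calling `all_datasets(example)` returns a flat list of every Dataset under the Country, which is the list a Scraper would work through.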