What is a Data Source?
A Data Source is a web page, file, database, or filing cabinet that contains records about a criminal legal system agency (law enforcement, courts, corrections). Often, these records are published by the agency itself. Many more are hidden behind records request processes.
Data Sources can be records describing an agency's activities, like traffic stops or use of force policies. They describe agencies in the criminal legal system like the Pittsburgh Bureau of Police or Allegheny County Jail.
People need to be able to find data to do anything with it. The foundation of our work is creating a common system for classifying and tracking public data. Does your organization have a giant, unwieldy spreadsheet tracking FOIA requests and web sites? You're not alone, and you can share your work with others.
- Automatic archives of each URL, creating a lasting resource for future research and web scraping. As it stands, information is lost to time due to data retention policies.
- A classification system using metadata about which records are available, how they were collected, and how they relate to other records. This is the path to doing big, complicated aggregation projects.
- Better transparency. We can improve transparency by being a hub for people who are using what's already there, finding its limits, and addressing them one by one.
- Shared tools. When someone finds a Data Source in our database, they will also be able to see associated scrapers, extractions, and archives.
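To make the classification idea above more concrete, here is a minimal Python sketch of what one metadata record for a Data Source might look like. Every field name, the `sources_for_agency` helper, the example entries, and the `example.org` URL are hypothetical placeholders chosen for illustration; this is not the project's actual schema or data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    """Hypothetical metadata record; field names are illustrative only."""
    name: str
    agency: str                              # the agency the records describe
    record_type: str                         # which records are available
    source_url: Optional[str] = None         # None when there is no web URL
    access_method: str = "web page"          # how the records were collected
    archive_urls: List[str] = field(default_factory=list)  # archived copies
    related: List[str] = field(default_factory=list)       # names of related records


def sources_for_agency(sources: List[DataSource], agency: str) -> List[DataSource]:
    """Return the records in `sources` that describe `agency`."""
    return [s for s in sources if s.agency == agency]


# Made-up example entries, not real tracked sources.
sources = [
    DataSource("Traffic stops", "Pittsburgh Bureau of Police", "traffic stops",
               source_url="https://example.org/traffic-stops"),
    DataSource("Use of force policy", "Pittsburgh Bureau of Police", "policies",
               access_method="records request"),
    DataSource("Jail records", "Allegheny County Jail", "unspecified"),
]

pbp = sources_for_agency(sources, "Pittsburgh Bureau of Police")
print([s.name for s in pbp])
```

Keeping the agency, the collection method, and links to archives and related records on the same record is what would let a shared database answer questions like "what exists for this agency, and how was it obtained?" without a separate spreadsheet per organization.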
Finding all the Data Sources for your hometown is a human-sized project that can make a real impact.
If it's about a criminal legal system agency, we want to track it. This includes FOIA'd documents, web URLs, and independently scraped records.