A URL pointing to a place on a police website where public records may be scraped, like "police-agency.com/arrest-reports". Each Dataset needs a Scraper.
Scraper / Data Scraper
A bit of Python code responsible for collecting public records from an agency website within our legal requirements. Some of the code is "common," shared between Scrapers. The starting point for information about Scrapers is the GitHub repo.
Colloquially, "scraper" may refer to a person writing a Scraper.
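As a rough illustration only (not the project's actual common code), a Scraper's core job is pulling structured records out of an agency page. The sketch below uses Python's standard-library HTML parser to collect table rows from a hypothetical records page; real Scrapers in the GitHub repo will differ.

```python
from html.parser import HTMLParser

class RecordParser(HTMLParser):
    """Collect the cell text of each table row on a records page.

    Hypothetical sketch: assumes records are laid out as an HTML table,
    which any real Dataset may or may not do.
    """

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.records = []   # one list of cell strings per table row
        self._row = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "tr" and self._row:
            self.records.append(self._row)
            self._row = []

    def handle_data(self, data):
        if self.in_cell and data.strip():
            self._row.append(data.strip())

# Stand-in for HTML fetched from a Dataset URL:
sample = "<table><tr><td>2021-01-01</td><td>Arrest</td></tr></table>"
parser = RecordParser()
parser.feed(sample)
# parser.records now holds [["2021-01-01", "Arrest"]]
```

In a real Scraper the HTML would come from the Dataset's URL rather than a string literal, and the parsing logic would be tailored to that agency's page layout.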
The result of running a Scraper is an Extraction, which represents all the records found at a given Dataset at the time the Scraper was run.
Packaged with each Extraction is metadata recording when the Extraction was made, from which Dataset, and by which Scraper.
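The metadata described above can be pictured as a small record attached to every Extraction. The field names below are illustrative assumptions, not the project's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ExtractionMetadata:
    """Hypothetical sketch of the metadata packaged with an Extraction:
    when it was made, from which Dataset, and by which Scraper."""
    dataset_url: str
    scraper_name: str
    scraper_version: str
    extracted_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

meta = ExtractionMetadata(
    dataset_url="https://police-agency.com/arrest-reports",
    scraper_name="arrest_reports_scraper",
    scraper_version="1.0.0",
)
```

Keeping this record alongside the extracted data is what makes each Extraction auditable: anyone can trace a record back to the Dataset and Scraper that produced it.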
By observing the requirements for Data Intake, we can be sure that every record has an auditable history and passes the "bright-line" test for tracing data back to its origin. We aren't inventing data: we're surfacing what's already public, and we can prove it.