Data Sources data dictionary
To see which options are available for select fields, consult the submission form.
Required properties for submission
submitted_name, submitter_contact_info, record_type, agency_supplied (+ other "provenance" properties, if "no")
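As a rough illustration of the required-property rule above, a submission could be checked as in the sketch below. The function and constant names are hypothetical, and treating supplying_entity as the provenance property needed when agency_supplied is "no" is an assumption; the submission form remains the authority on what is actually required.

```python
from typing import Any, Mapping

# Field names are taken from the required-properties list above.
REQUIRED_FIELDS = (
    "submitted_name",
    "submitter_contact_info",
    "record_type",
    "agency_supplied",
)

# Assumption: which provenance properties become required when
# agency_supplied is "no" is not spelled out; supplying_entity is a guess.
PROVENANCE_FIELDS_IF_NOT_SUPPLIED = ("supplying_entity",)


def _is_blank(value: Any) -> bool:
    """Treat None and empty or whitespace-only strings as missing.

    A boolean False is a real answer ("no"), so it is not blank.
    """
    if value is None:
        return True
    return isinstance(value, str) and not value.strip()


def missing_required_fields(submission: Mapping[str, Any]) -> list[str]:
    """Return the names of required properties that are absent or blank."""
    missing = [f for f in REQUIRED_FIELDS if _is_blank(submission.get(f))]
    if submission.get("agency_supplied") is False:
        missing += [
            f
            for f in PROVENANCE_FIELDS_IF_NOT_SUPPLIED
            if _is_blank(submission.get(f))
        ]
    return missing
```

An empty list means every required property is present; otherwise the list names what still needs to be filled in.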
What is it?
name
string
Uses submitted_name if present; otherwise concatenates record_type + " for " + agency_described. The result can look odd when one or both are missing.
submitted_name
string
Required for individual Data Source submissions for clarity.
record_type
string
What kind of data is accessible from this source? For more info, see the Record Types taxonomy.
tags
array
Are there any keyword descriptors which might help people find this in a search? Try to limit tags to information which can't be contained in other properties.
description
string (textarea)
Information to give clarity and confidence about what this source is, how it was processed, and whether the person reading the description might want to use it. Especially important if the source is difficult to preview or categorize.
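The fallback rule for name described above can be sketched as follows. This is only an illustration: derive_name is a hypothetical helper, it assumes the agency IDs in agency_described have already been resolved to display names, and joining several agencies with ", " is an assumption.

```python
from typing import Optional, Sequence


def derive_name(
    submitted_name: Optional[str],
    record_type: Optional[str],
    agency_described: Sequence[str] = (),
) -> str:
    """Prefer submitted_name; otherwise build record_type + " for " + agency."""
    if submitted_name:
        return submitted_name
    # Assumption: multiple agency names are joined with ", ".
    agencies = ", ".join(agency_described)
    # Missing parts are left empty, which is how the odd-looking names arise.
    return f"{record_type or ''} for {agencies}"
```

With no submitted_name, derive_name(None, "Arrest Records", ["Example PD"]) gives "Arrest Records for Example PD", while derive_name(None, None, []) gives " for ", which shows why the result can look odd when the inputs are missing.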
Agency
agency_described
array (foreignkey based on an agency's ID within Airtable)
To which criminal legal system agency or agencies does this Data Source refer?
agency_aggregation
array
If present, the Data Source describes multiple agencies. Can be an item like local or county.
state
string
2-character ISO code, related to the associated Agency object, if present.
county
string
Related to the associated Agency object, if present.
municipality
string
Related to the associated Agency object, if present.
agency_type
string
Related to the associated Agency object, if present.
jurisdiction_type
array
Related to the associated Agency object, if present. What is the highest level of jurisdiction for the agency? Can be an item like local or county.
Provenance
Where did it come from?
agency_supplied
boolean
Is the relevant Agency also the entity supplying the data? This may be "no" if the Agency or local government contracted with a third party to publish this data, or if a third party was the original record-keeper.
supplying_entity
string
If the Agency didn't publish this, who did?
agency_originated
boolean
Is the relevant Agency also the original record-keeper? This is usually "yes", unless a third party collected data about a police Agency.
originating_entity
string
If the Agency was not the original record-keeper, who was?
Access & format
source_url
string
A URL where these records can be found or are referenced.
readme_url
string
A URL where supplementary information about the source is published.
access_type
array
Array items can have values such as Web page or API.
record_format
array
What format(s) are the records in natively? Array items can have values such as CSV, JSON, XML, RDF, RSS, HTML table, and others.
detail_level
array
Is this an individual record, an aggregated set of records, or a summary without underlying data?
size
string
The file size on disk of all the data at this source, if downloaded.
data_portal_type
string
Some data is published via a standard third-party portal, typically named somewhere on the page.
access_notes
string
Is anything special required to access the data?
Coverage & retention
coverage_start
date
The earliest date covered by this source, if known, in the format YYYY-MM-DD.
coverage_end
date
The date at which updates stop, in the format YYYY-MM-DD.
source_last_updated
date
The date this source was last updated, in the format YYYY-MM-DD.
update_frequency
array
How often is this data source updated?
update_method
array
Are records replaced (Overwrite) or added (Insert)?
retention_schedule
array
How long are records kept? Are there published guidelines regarding how long important information must remain accessible for future use?
number_of_records_available
integer
How many similar pieces of information are available at this source?
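A small sketch of how the coverage dates above might be sanity-checked. It assumes the date strings are ISO 8601 calendar dates (YYYY-MM-DD), which is what Python's date.fromisoformat accepts; if the stored format differs, datetime.strptime with the matching pattern would be needed instead. The helper names are hypothetical.

```python
from datetime import date
from typing import Optional


def parse_iso_date(value: Optional[str]) -> Optional[date]:
    """Parse a YYYY-MM-DD string; None or empty input means "unknown"."""
    if not value:
        return None
    # Raises ValueError for strings that are not valid calendar dates.
    return date.fromisoformat(value)


def coverage_is_consistent(
    coverage_start: Optional[str], coverage_end: Optional[str]
) -> bool:
    """True unless both dates are known and the start falls after the end."""
    start = parse_iso_date(coverage_start)
    end = parse_iso_date(coverage_end)
    if start is None or end is None:
        return True
    return start <= end
```

An unknown date is treated as consistent here because coverage_start is only recorded "if known"; a stricter check could flag the gap instead.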
Meta & utility
scraper_url
string
The URL of any web scraping efforts associated with this Data Source.
url_status
array
The status of the source_url, including options like ok, none found, and broken.
"ok"
approval_status
array
Set manually by the PDAP team; statuses include: approved, rejected, needs identification
null
data_source_created
datetime
The date this source was first created in our database.
Date of data source submission, in the format YYYY-MM-DD.
agency_described_linked_uid
string
The Airtable-generated UID of an associated Agency.
airtable_uid
string
The Airtable-generated UID of this particular data source.