Data Sources data dictionary

To see which options are available for select fields, consult the submission form.

Required properties for submission

submitted_name, submitter_contact_info, record_type, agency_supplied (plus the other "provenance" properties, if agency_supplied is "no")
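The required-property rule above can be checked mechanically before a submission is sent. The following is a minimal sketch, not the project's actual validation: the function name `missing_properties` is hypothetical, booleans are assumed to arrive as Python `True`/`False` rather than "yes"/"no", and which provenance properties become required when agency_supplied is "no" is an assumption based on the Provenance table.

```python
# Properties listed as required for every submission.
REQUIRED = ("submitted_name", "submitter_contact_info", "record_type", "agency_supplied")


def missing_properties(submission: dict) -> list:
    """Return the names of required properties that are absent or empty."""

    def is_blank(key):
        value = submission.get(key)
        # False is a real answer for boolean properties, so it is not blank.
        return value is None or value == "" or value == []

    missing = [key for key in REQUIRED if is_blank(key)]

    # Assumption: when the agency did not supply the data, the submitter
    # must say who did, and whether the agency was the original record-keeper.
    if submission.get("agency_supplied") is False:
        missing += [k for k in ("supplying_entity", "agency_originated") if is_blank(k)]
        # Assumption: originating_entity only matters when the agency
        # was not the original record-keeper.
        if submission.get("agency_originated") is False and is_blank("originating_entity"):
            missing.append("originating_entity")
    return missing
```

An empty list means the submission carries everything this sketch treats as required.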

What is it?

Property
Type
Description

name

string

Uses submitted_name if present; otherwise concatenates record_type + " for " + agency_described. The result can be malformed when one or both of those are not present.

submitted_name

string

Required for individual Data Source submissions, so that each source has a clear name.

record_type

string

What kind of data is accessible from this source? For more info, see the Record Types taxonomy.

tags

array

Are there any keyword descriptors which might help people find this in a search? Try to limit tags to information which can't be contained in other properties.

description

string (textarea)

Information to give clarity and confidence about what this source is, how it was processed, and whether the person reading the description might want to use it. Especially important if the source is difficult to preview or categorize.
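The rule described for the name property can be sketched as below. This is an illustration under assumptions, not the project's implementation: `derive_name` is a hypothetical helper, agency_described is assumed to be already resolved from IDs to a list of agency names, joining several agencies with ", " is a guess, and the fallback for missing parts is one possible way to avoid a dangling " for ".

```python
def derive_name(submitted_name, record_type, agency_described):
    """Prefer submitted_name; otherwise build '<record_type> for <agency>'."""
    if submitted_name:
        return submitted_name

    # agency_described is an array; joining names with ", " is an assumption.
    agencies = ", ".join(agency_described or [])

    if record_type and agencies:
        return f"{record_type} for {agencies}"

    # Fall back to whichever part exists instead of emitting a dangling " for ".
    return record_type or agencies or None
```

With made-up sample values, `derive_name(None, "Sample record type", ["Example PD"])` yields `"Sample record type for Example PD"`.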

Agency

Property
Type
Description

agency_described

array (foreign key based on an agency's ID within Airtable)

To which criminal legal system agency or agencies does this Data Source refer?

agency_aggregation

array

If present, the Data Source describes multiple agencies. Can be an item like local or county.

state

string

2-character ISO code, related to the associated Agency object, if present.

county

string

Related to the associated Agency object, if present.

municipality

string

Related to the associated Agency object, if present.

agency_type

string

Related to the associated Agency object, if present.

jurisdiction_type

array

Related to the associated Agency object, if present. What is the highest level of jurisdiction for the agency? Can be an item like local or county.

Provenance

Where did it come from?

Property
Type
Description

agency_supplied

boolean

Is the relevant Agency also the entity supplying the data? This may be "no" if the Agency or local government contracted with a third party to publish this data, or if a third party was the original record-keeper.

supplying_entity

string

If the Agency didn't publish this, who did?

agency_originated

boolean

Is the relevant Agency also the original record-keeper? This is usually "yes", unless a third party collected data about a police Agency.

originating_entity

string

If the Agency was not the original record-keeper, who was?

Access & format

Property
Type
Description

source_url

string

A URL where these records can be found or are referenced.

readme_url

string

A URL where supplementary information about the source is published.

access_type

array

Array items can have values such as Web page or API.

record_format

array

What format(s) are the records in natively? Array items can have values such as CSV, JSON, XML, RDF, RSS, HTML table and others

detail_level

array

Is this an individual record, an aggregated set of records, or a summary without underlying data?

size

string

The file size on disk of all the data at this source, if downloaded.

data_portal_type

string

Some data is published via a standard third-party portal, typically named somewhere on the page.

access_notes

string

Is anything special required to access the data?

last_cached

date

When was this last archived by our automated archives app? Formatted as YYYY-DD-MM.

Coverage & retention

Property
Type
Description

coverage_start

date

The earliest date covered by this source, if known, in the format YYYY-DD-MM.

coverage_end

date

The date at which updates stop, in the format YYYY-DD-MM.

source_last_updated

date

The date this source was last updated, in the format YYYY-DD-MM.

update_frequency

array

How often is this data source updated?

update_method

array

Are records replaced (Overwrite) or added (Insert)?

retention_schedule

array

How long are records kept? Are there published guidelines regarding how long important information must remain accessible for future use?

number_of_records_available

integer

How many similar pieces of information are available at this source?

Meta & utility

Property
Type
Description
Default value

scraper_url

string

The URL of any web scraping efforts associated with this Data Source.

url_status

array

The status of the source_url, including options like ok, none found, broken.

"ok"

approval_status

array

Set manually by the PDAP team; statuses include: approved, rejected, needs identification.

null

data_source_created

datetime

The date this source was first created in our database.

Date of data source submission, in the format YYYY-DD-MM.

agency_described_linked_uid

string

The Airtable-generated UID of an associated Agency

airtable_uid

string

The Airtable-generated UID of this particular data source
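The default values listed in this table ("ok" for url_status, null for approval_status) can be applied to a record that omits those properties. This is a minimal sketch: `with_defaults` is a hypothetical helper, and a record is assumed to be a plain dictionary keyed by property name.

```python
# Defaults taken from the "Default value" column; null is represented as None.
DEFAULTS = {"url_status": "ok", "approval_status": None}


def with_defaults(record: dict) -> dict:
    """Return a copy of record with the documented defaults for absent keys."""
    # Keys present in record override the defaults; record itself is not modified.
    return {**DEFAULTS, **record}
```

Values already present on a record are left alone, so a source marked broken stays broken.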
