Data Sources database

Use the API to find, use, and manage Data Sources.

Base URL

https://data-sources.pdap.io/api

Search Tokens

GET [base-url]/search-tokens

The search tokens endpoint is located in resources/SearchTokens.py. It generates an API token that is valid for 5 minutes and forwards the search parameters to the Quick Search endpoint. This endpoint is meant for use by the front end only.

Query Parameters

Name | Type | Description
arg1* | String | The first argument that will be forwarded on to the appropriate endpoint. Currently either "search" for quick-search or "id" for data-sources.
arg2* | String | The second argument that will be forwarded on to the appropriate endpoint. Currently used only for "location" for quick-search.
endpoint | String | The endpoint that will be accessed after a search token is generated.

count int
data array[object]
    agency_name string
    municipality string
    state_iso string
    data_source_name string
    description string
    record_type string
    source_url string
    record_format string
    coverage_start string
    coverage_end string
    agency_supplied boolean
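
As a rough illustration of the query parameters above, the following Python sketch assembles a search-tokens URL. The helper name build_search_tokens_url and the sample search values are hypothetical, and passing "quick-search" as the endpoint value is an assumption based on the parameter descriptions; the sketch only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

BASE_URL = "https://data-sources.pdap.io/api"


def build_search_tokens_url(endpoint, arg1, arg2):
    """Return the search-tokens URL with its query string (hypothetical helper)."""
    # urlencode escapes spaces and other reserved characters in the values.
    params = {"endpoint": endpoint, "arg1": arg1, "arg2": arg2}
    return f"{BASE_URL}/search-tokens?{urlencode(params)}"


# Assumed values: a quick-search for "calls for service" in "chicago".
url = build_search_tokens_url("quick-search", "calls for service", "chicago")
print(url)
```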

Quick Search Data Sources by search term and location

GET [base-url]/quick-search/{search}/{location}

The quick search endpoint is located in resources/QuickSearch.py. The quick search endpoint executes its search using the agency_source_link table in the Data Sources database, which links each data source in the data_sources table with its associated agency in the agencies table. This endpoint is meant for use by the search tokens endpoint only.

Path Parameters

Name | Type | Description
search* | String | Checks partial matches on any of the following properties on the data_sources table: "name", "description", "record_type", and "tags". The search term is case insensitive and matches both singular and pluralized versions of the term.
location* | String | Checks partial matches on any of the following properties on the agencies table: "county_name", "state_iso", "municipality", "agency_type", "jurisdiction_type", "name".

Headers

Name | Type | Description
Authorization* | String | Value formatted as "Bearer [access token/api key]"

count int
data array[object]
    agency_name string
    municipality string
    state_iso string
    data_source_name string
    description string
    record_type string
    source_url string
    record_format string
    coverage_start string
    coverage_end string
    agency_supplied boolean
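
The path parameters and the Authorization header above might be combined as in the Python sketch below. It is illustrative only: the helper name, the sample search values, and the placeholder key are assumptions, and the request object is built but never sent.

```python
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://data-sources.pdap.io/api"


def build_quick_search_request(search, location, api_key):
    """Build (but do not send) a quick-search request (hypothetical helper)."""
    # Percent-encode each path segment so spaces and slashes stay inside it.
    path = f"/quick-search/{quote(search, safe='')}/{quote(location, safe='')}"
    return Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


request = build_quick_search_request("calls for service", "Cook County", "YOUR_API_KEY")
print(request.full_url)
```

Encoding each segment separately keeps a search term that contains "/" from being read as an extra path segment.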

Data Sources

Get all Data Sources

GET [base-url]/data-sources

The data sources endpoint is located in resources/DataSources.py. The data sources endpoint returns all approved rows in the corresponding Data Sources database table by default. An optional JSON object can be passed to get data sources needing identification instead.

Headers

Name | Type | Description
Authorization* | String | Value formatted as "Bearer [access token/api key]"

Request Body

Name | Type | Description
Data | JSON | To get data sources needing identification, pass {"approved": false}

count int
data array[object]
    name string
    submitted_name string
    description string
    record_type string
    source_url string
    agency_supplied boolean
    supplying_entity string
    agency_originated string
    originating_entity string
    agency_aggregation string
    coverage_start string
    coverage_end string
    source_last_updated string
    retention_schedule string
    detail_level string
    number_of_records_available string
    size string
    access_type array
    data_portal_type string
    record_format array
    update_frequency string
    update_method string
    tags string
    readme_url string
    scraper_url string
    data_source_created string
    airtable_source_last_modified string
    url_status string
    rejection_note string
    last_approval_editor object
    agency_described_submitted string
    agency_described_not_in_database string
    approval_status string
    record_type_other string
    data_portal_type_other string
    records_not_online string
    data_source_request string
    url_button string
    tags_other string
    access_notes string
    last_cached string

Data Sources for Map

GET [base-url]/data-sources-map

The data sources for map endpoint is located in resources/DataSourcesMap.py. By default, the data sources for map endpoint returns all approved rows in the corresponding Data Sources database table, with only the columns relevant to mapping.

Headers

Name | Type | Description
Authorization* | String | Value formatted as "Bearer [access token/api key]"

count int
data array[object]
    data_source_id string
    name string
    agency_id string
    agency_name string
    state_iso string
    municipality string
    county_name array[string]
    record_type string
    lat float
    lng float

Get Data Source by Id

GET [base-url]/data-sources-by-id/[id]

The data sources endpoint is located in resources/DataSources.py. The data source by id endpoint returns just the row for the data source that corresponds to the id passed.

Path Parameters

Name | Type | Description
id* | String | Data source id

Headers

Name | Type | Description
Authorization* | String | Value formatted as "Bearer [access token/api key]"

count int
data array[object]
    name string
    submitted_name string
    description string
    record_type string
    source_url string
    agency_supplied boolean
    supplying_entity string
    agency_originated string
    originating_entity string
    agency_aggregation string
    coverage_start string
    coverage_end string
    source_last_updated string
    retention_schedule string
    detail_level string
    number_of_records_available string
    size string
    access_type array
    data_portal_type string
    record_format array
    update_frequency string
    update_method string
    tags string
    readme_url string
    scraper_url string
    data_source_created string
    airtable_source_last_modified string
    url_status string
    rejection_note string
    last_approval_editor object
    agency_described_submitted string
    agency_described_not_in_database string
    approval_status string
    record_type_other string
    data_portal_type_other string
    records_not_online string
    data_source_request string
    url_button string
    tags_other string
    access_notes string
    last_cached string
    homepage_url string
    count_data_sources string
    agency_type string
    multi_agency string
    submitted_name string
    jurisdiction_type string
    state_iso string
    municipality string
    zip_code string
    county_fips string
    county_name array
    lat float
    lng float
    data_sources array
    no_web_presence string
    airtable_agency_last_modified string
    data_sources_last_updated string
    approved boolean
    rejection_reason string
    last_approval_editor object
    agency_created string
    county_airtable_uid string
    defunct_year string
    data_source_id string
    agency_id string
    agency_name string

Create Data Source

POST [base-url]/data-sources

The data sources endpoint is located in resources/DataSources.py. The create data source endpoint posts a new data source to the database and returns True if successful and False if not.

Headers

Name | Type | Description
Authorization* | String | Value formatted as "Bearer [access token/api key]"

Request Body

Name | Type | Description
Data | JSON | A JSON object of the data source information. Refer to the Data Source dictionary for available fields: https://docs.pdap.io/activities/data-dictionaries/data-sources-data-dictionary. However, the following fields cannot be edited: rejection_note, data_source_request, approval_status, airtable_uid, airtable_source_last_modified

Below is an example of an acceptable body:

{
    "name": "Calls for Service for Chicago Police Department - IL",
    "record_type": "Calls for Service",
    "source_url": "https://informationportal.igchicago.org/911-calls-for-cpd-service",
    "coverage_start": "2019-01-01"
}

message string
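
One way a client might prepare the POST body is sketched below in Python. It drops the fields listed above as not editable before serializing the rest. The helper name and the client-side filtering are assumptions made for illustration, not documented server behavior, and the request is built without being sent.

```python
import json
from urllib.request import Request

BASE_URL = "https://data-sources.pdap.io/api"

# Fields the request body description lists as not editable on create.
READ_ONLY_ON_CREATE = {
    "rejection_note",
    "data_source_request",
    "approval_status",
    "airtable_uid",
    "airtable_source_last_modified",
}


def build_create_request(data_source, api_key):
    """Build (but do not send) a create request (hypothetical helper)."""
    payload = {k: v for k, v in data_source.items() if k not in READ_ONLY_ON_CREATE}
    return Request(
        f"{BASE_URL}/data-sources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, not documented above
        },
        method="POST",
    )


request = build_create_request(
    {
        "name": "Calls for Service for Chicago Police Department - IL",
        "record_type": "Calls for Service",
        "source_url": "https://informationportal.igchicago.org/911-calls-for-cpd-service",
        "coverage_start": "2019-01-01",
        "approval_status": "approved",  # removed by the filter
    },
    "YOUR_API_KEY",
)
print(json.loads(request.data))
```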

Update Data Source by Id

PUT [base-url]/data-sources-by-id/[id]

The data sources endpoint is located in resources/DataSources.py. The update data source by id endpoint updates a data source and returns a status to confirm a successful update.

Path Parameters

Name | Type | Description
id* | String | Data source id

Headers

Name | Type | Description
Authorization* | String | Value formatted as "Bearer [access token/api key]"

Request Body

Name | Type | Description
Data | JSON | A JSON object of the data to be updated. Refer to the Data Source dictionary for available fields: https://docs.pdap.io/activities/data-dictionaries/data-sources-data-dictionary. However, the following fields cannot be edited: data_source_request, airtable_uid, airtable_source_last_modified

Below is an example of an acceptable body:

{
    "name": "Calls for Service for Chicago Police Department - IL",
    "record_type": "Calls for Service",
    "source_url": "https://informationportal.igchicago.org/911-calls-for-cpd-service",
    "coverage_start": "2019-01-01"
}

{"message": "Data source successfully updated."}
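
The update request could be assembled along the lines of the Python sketch below. Instead of silently dropping fields, it refuses a body that contains any of the fields listed above as not editable. The helper name, the sample id "abc123", and the client-side check are all hypothetical; the request is built without being sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://data-sources.pdap.io/api"

# Fields the request body description lists as not editable on update.
LOCKED_ON_UPDATE = {"data_source_request", "airtable_uid", "airtable_source_last_modified"}


def build_update_request(data_source_id, changes, api_key):
    """Build (but do not send) an update request (hypothetical helper)."""
    locked = sorted(LOCKED_ON_UPDATE.intersection(changes))
    if locked:
        raise ValueError(f"fields cannot be edited: {', '.join(locked)}")
    return Request(
        f"{BASE_URL}/data-sources-by-id/{quote(data_source_id, safe='')}",
        data=json.dumps(changes).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, not documented above
        },
        method="PUT",
    )


request = build_update_request("abc123", {"coverage_start": "2019-01-01"}, "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```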

Archives

Get all Archived urls

GET [base-url]/archives

The archives endpoint is located in resources/Archives.py. The get method on the archives endpoint returns all rows for urls that the automatic archives script has cached in the Internet Archive.

Headers

Name | Type | Description
Authorization* | String | Value formatted as "Bearer [access token/api key]"

id string
source_url string
update_frequency string
last_cached string

Update an Archived url

PUT [base-url]/archives

The archives endpoint is located in resources/Archives.py. The put method on the archives endpoint updates the data source matching the passed id. If only last_cached is passed, it updates the last_cached date. If broken_source_url_as_of is passed as well, it updates both dates and sets url_status to 'broken'.

Headers

Name | Type | Description
Authorization* | String | Value formatted as "Bearer [access token/api key]"

Request Body

Name | Type | Description
id* | String | The airtable uid for the data source that was cached
broken_source_url_as_of* | Date | The current date if the url is no longer active, otherwise None
last_cached* | Date | The current date since the data source url was just cached in the Internet Archive

count int
data array[object]
    agency_name string
    municipality string
    state_iso string
    data_source_name string
    description string
    record_type string
    source_url string
    record_format string
    coverage_start string
    coverage_end string
    agency_supplied boolean
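
The request body fields for the archives put method can be sketched in Python as follows. The helper name and the sample id are hypothetical, and serializing dates as ISO 8601 strings is an assumption, since the expected date format is not stated here.

```python
import json
from datetime import date


def build_archive_update(data_source_id, cached_on, is_broken=False):
    """Return the PUT body for one cached data source (hypothetical helper)."""
    # Assumption: dates travel as ISO 8601 strings such as "2024-01-15".
    stamp = cached_on.isoformat()
    return {
        "id": data_source_id,
        "last_cached": stamp,
        # Set only when the url is no longer active, otherwise None (JSON null).
        "broken_source_url_as_of": stamp if is_broken else None,
    }


body = build_archive_update("rec123", date(2024, 1, 15), is_broken=True)
print(json.dumps(body))
```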

Homepage search cache

GET [base-url]/homepage-search-cache

Headers

Name | Type | Description
Authorization | String | Value formatted as "Bearer [access token/api key]"

Responses

[
    {
        "SUBMITTED_NAME": "string",
        "JURISDICTION_TYPE": "string",
        "STATE_ISO": "string",
        "MUNICIPALITY": "string",
        "COUNTY_NAME": "string",
        "AIRTABLE_UID": "string",
        "COUNT_DATA_SOURCES": "integer",
        "ZIP_CODE": "string",
        "NO_WEB_PRESENCE": "Boolean"
    }
]

POST [base-url]/homepage-search-cache

Headers

Name | Type | Description
Authorization | String | Value formatted as "Bearer [access token/api key]"

Request body

Name | Type | Description
agency_uid | String | The UID of the agency
search_results | [String] | List of search results to be cached

Example request body

{
    "agency_uid": "uid123",
    "search_results": ["result1", "result2"]
}

Responses

{
    "message": "Search Cache Updated"
}

Agencies

Get all Agencies

GET [base-url]/agencies/{page}

The agencies endpoint is located in resources/Agencies.py. The agencies endpoint returns 1000 rows from the corresponding Data Sources database table offset by the page number passed.

Path Parameters

Name | Type | Description
page* | String | Passing 1 will return the first 1000 rows. Subsequent page numbers return subsequent results.

Headers

Name | Type | Description
Authorization | String | Value formatted as "Bearer [access token/api key]"

count int
data array[object]
    agency_name string
    municipality string
    state_iso string
    data_source_name string
    description string
    record_type string
    source_url string
    record_format string
    coverage_start string
    coverage_end string
    agency_supplied boolean
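
Because each call returns at most 1000 rows, a client that wants every agency has to walk the pages. The Python sketch below shows one way to do that. It assumes a page with fewer than 1000 rows marks the end, which this page does not confirm. The names iter_agencies and fake_fetch_page are hypothetical, and the fetch function is a local stand-in for the HTTP call so that the example runs offline.

```python
def iter_agencies(fetch_page, page_size=1000):
    """Yield agency rows page by page until a short page signals the end."""
    page = 1
    while True:
        rows = fetch_page(page).get("data", [])
        yield from rows
        if len(rows) < page_size:
            break
        page += 1


# Stand-in for the HTTP call: 2,500 fake rows served in pages of 1,000.
FAKE_ROWS = [{"agency_name": f"Agency {i}"} for i in range(2500)]
CALLS = []


def fake_fetch_page(page):
    CALLS.append(page)
    chunk = FAKE_ROWS[(page - 1) * 1000 : page * 1000]
    return {"count": len(chunk), "data": chunk}


agencies = list(iter_agencies(fake_fetch_page))
print(len(agencies), CALLS)
```

In real use, fetch_page would issue GET [base-url]/agencies/{page} with the Authorization header and return the decoded JSON response.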
