# Resources for using data

## Using & analyzing data

[Common abbreviations in criminal legal system records](http://amerusa.net/resource_documents/CriminalRecordAbbreviations.pdf)

[QA process for data analysis](https://source.opennews.org/articles/qa-process-confidence-data-stories/)

[CPE's guide on analyzing traffic stop data](https://policingequity.org/data-collection-insights/33-cpe-toolkit-stop-data-collection-guidebook/file)

[RATH](https://rath.kanaries.net/) by Kanaries for no-code exploratory data analysis

[Splunk](https://www.splunk.com/) is a platform for searching and analyzing large volumes of data.

## Collecting data

[Compendium of state open records laws](https://www.rcfp.org/open-government-guide/)

[ScrapingBee](https://www.scrapingbee.com/) for scraping JavaScript-heavy websites with a headless browser

[Best practices for data collection](https://inspectelement.org/best-practices-data-collection.html) from Inspect Element

[Finding undocumented APIs](https://inspectelement.org/apis.html) from Inspect Element

[Automatic DocumentCloud scraping](https://www.muckrock.com/news/archives/2022/may/24/release-notes-keep-an-eye-on-your-favorite-agencie/), which requires a verified MuckRock account. If you don't have one, you can use ours! Reach out to us.

[The Berkeley COPWATCH "People's Database" guide](https://www.berkeleycopwatch.org/people-s-database) for creating a community accountability tool

## Processing data

[Crosswalker](https://github.com/washingtonpost/crosswalker) is for joining columns of text data that don’t match perfectly.

[CJWorkbench](https://github.com/CJWorkbench/cjworkbench), a spreadsheet-like program with powerful tools for data journalism.

[PDF processing and OCR](https://github.com/chadday/nicar_ocr) from Chad Day for NICAR

[Useful file types: Parquet, SQLite, FlatGeobuf](https://observablehq.com/@asg017/nicar23-lightning-talk-tipsheet-3-file-formats) from Alex Garcia

[Frictionless data](https://frictionlessdata.io/) for transforming and describing messy datasets, making them more interoperable, and creating a pipeline

[Kyle Walker's guide](https://walker-data.com/umich-workshop-2023/python/#1) for using Python to do spatial mapping and analysis

## Learning Python

[Consuming APIs with Python](https://realpython.com/python-api/#consuming-apis-with-python-practical-examples) from Real Python

[An intro to your first notebook using Python & Jupyter](https://github.com/palewire/first-python-notebook) from Palewire

[A mini Python and Jupyter bootcamp](https://github.com/ireapps/pycar) from NICAR

[Python scraping](https://github.com/oxylabs/web-scraping-tutorials/tree/main/python) from Oxylabs

