2021-03-27 database working session
Date
27 Mar 2021
Participants
Mitch Miller
Kristin Tynski
Jeff Jockisch
Goals
Discussion topics
Action items
Former user (Deleted) Find out how much overhead (time) is involved in an end-to-end dolt scrape → public
Former user (Deleted) Add data format (NIBRS) column to dataset catalogue
Former user (Deleted) Identify minimum data properties to meet NIBRS data format, has this somewhere already https://pdap.atlassian.net/browse/PDAP-118
Former user (Deleted) Document data_types priority in scrapers readme. Arrest Reports, Traffic Stops, Incident reports. What’s most available. What most easily paints the full picture. Tiers. https://pdap.atlassian.net/browse/PDAP-119
Former user (Deleted) Validate workflow: localized, raw data is stored in dolt → it can then be aggregated / ETL’d to a centralized server. Dolt provides the audit trail and transparency.
Former user (Deleted) Expose basic roadmap in documentation with backup DB → API
Josh Chamberlain Draft a policy and rationale for “fields not to collect” → A&P
Josh Chamberlain Draft a policy and rationale for mirroring dataset websites → A&P
Former user (Deleted) Import dataset catalogue form submissions and deprecate the form
Richard Ji Get @stabs base scraper approved
Mitch Miller is making an ETL framework
https://pdap.atlassian.net/browse/PDAP-113
https://pdap.atlassian.net/browse/PDAP-114
Decisions
@stabs needs to be recognized. Let’s be sure to celebrate their hard work.