May 1, 2021

Attendees

  • Josh

  • Mitch

  • Jeff

  • Eddie

Topics

Topic

Notes

Downloadable scrapers package

Chain of custody

  • Where

  • When

  • Who

  • With what scraper code

What granularity do we want for audit history? Cellwise (track changes per cell)

We need auth; without it, the "who" is too easy to spoof
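A minimal sketch of what a chain-of-custody record and a cellwise audit entry could look like, to make the where / when / who / scraper-code idea concrete. All class, field, and function names here are hypothetical and were not decided in the meeting; hashing the scraper source is one assumed way to pin down "with what scraper code".

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional
import hashlib


@dataclass(frozen=True)
class CustodyRecord:
    """Hypothetical provenance record for one scrape run."""
    source_url: str      # where the data came from
    scraped_at: str      # when, as an ISO 8601 timestamp
    scraped_by: str      # who ran the scraper (should be an authenticated identity)
    scraper_sha256: str  # with what scraper code (SHA-256 of the scraper source)


@dataclass(frozen=True)
class CellChange:
    """Hypothetical cell-level audit entry that points back to its custody record."""
    row_id: str
    column: str
    old_value: Optional[str]
    new_value: Optional[str]
    custody: CustodyRecord


def make_custody_record(source_url, scraped_by, scraper_source, now=None):
    """Build a custody record; `now` can be passed in to make runs reproducible."""
    now = now or datetime.now(timezone.utc)
    digest = hashlib.sha256(scraper_source.encode("utf-8")).hexdigest()
    return CustodyRecord(source_url, now.isoformat(), scraped_by, digest)
```

Without auth, `scraped_by` is only a self-reported string, which is the spoofing concern noted above.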

Data collision

Treat each timestamp as its own piece of data
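One way to read this, sketched under assumptions: key every observation by (record key, timestamp), so a later scrape adds a new entry instead of overwriting an earlier one. The `ObservationStore` class and its methods are hypothetical, and an in-memory dict stands in for whatever storage is actually used.

```python
from datetime import datetime


class ObservationStore:
    """Hypothetical store that keeps every timestamped value for a record."""

    def __init__(self):
        # record key -> {observed_at: value}
        self._observations = {}

    def add(self, key, observed_at, value):
        """Record a value as seen at `observed_at`; earlier observations are kept."""
        self._observations.setdefault(key, {})[observed_at] = value

    def history(self, key):
        """Return all (observed_at, value) pairs for a record, oldest first."""
        items = self._observations.get(key, {}).items()
        return sorted(items, key=lambda item: item[0])

    def latest(self, key):
        """Return the most recently observed value, or None if the key is unknown."""
        history = self.history(key)
        return history[-1][1] if history else None


store = ObservationStore()
store.add("agency-1", datetime(2021, 4, 1), {"count": 10})
store.add("agency-1", datetime(2021, 5, 1), {"count": 12})
```

Two scrapes of the same record then show up as two history entries rather than a collision.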

Datasets

Maintain source info / keep it up to date

Auth

  • Keybase (infra)

  • Django built-in

  • GitHub for scraping (pub/privkey)

  • Medium-term SSO

Incentives

  • Dolt bounties

  • Paid scrapers: pay for a quick project (Fiverr, Freelancer)

    • Someone paid has ~guaranteed expertise. Collect feedback from professional consumers

$

We're covering a few licenses, but need more donations. Considering Patreon

Support

EFF, ACLU, NFOIC. Ways they could help:

  1. volunteer time

  2. volunteer money

  3. introduce us

  4. use the data

Scraper utilities

Data centers

We have NY and SF data centers in DigitalOcean, but they don't talk to each other

Security

https://pdap.atlassian.net/browse/PDAP-162

  1. perimeter

  2. secrets manager

Who can make data PR approvals?

Front end

Last updated