# 2021-04-17 Meeting notes

## Date <a href="#id-2021-04-17meetingnotes-44rdcxdsdddate" id="id-2021-04-17meetingnotes-44rdcxdsdddate"></a>

17 Apr 2021

## Participants <a href="#id-2021-04-17meetingnotes-participants" id="id-2021-04-17meetingnotes-participants"></a>

Mitch Miller

[Josh Chamberlain](https://pdap.atlassian.net/wiki/people/6068f9e790e3950069fbaaf4?ref=confluence)

[Eric Turner](https://pdap.atlassian.net/wiki/people/6069da262b469c007014d7fa?ref=confluence)

Jeff Joskisch

[Richard Ji](https://pdap.atlassian.net/wiki/people/5f8f95be0e068b00766b6903?ref=confluence)

## Discussion topics <a href="#id-2021-04-17meetingnotes-discussiontopics" id="id-2021-04-17meetingnotes-discussiontopics"></a>

| Item                    | Notes                                                                                                                                                                                                                                                                                                                                                                                                                        |
| ----------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Dolt / Databases        | <ul><li><p>Lots of changes being made in datasets</p><ul><li>agencies</li></ul></li><li>Does Dolt support SQL COPY?</li><li>Richard has MongoDB creds for anyone who would like to experiment</li><li>How slow is Dolt? Breakingly slow? We’re going to have \~2 million records.</li><li>We can host a mirror on our server and run an instance if people want the same data quicker / without the Dolt UI</li></ul> |
| Podcast fame            | <p>Jeff was interviewed on <a href="https://www.spirion.com/privacy-please-podcast/">Privacy Please</a>, releasing this week (Wednesday), and mentioned PDAP</p> |
| OCR                     | <p>TensorFlow has been suggested</p><p>We may need a lot of training data, which we don’t have.</p><ul><li>Is there a way we could slowly start to feed this stuff to TensorFlow now?</li></ul><p>Requirements:</p><ul><li>We need to be able to comma-delimit things on the way in</li></ul><p>In the meantime, there’s no harm in publishing unedited PDFs</p><ul><li>may inspire contributors to help with OCR</li></ul> |
| FE                      | <p>There are some folks ready to work on stuff for when we have data</p><p>For now it’s pretty small and people could clone it and run it locally</p><p>Miles is converting Gatsby to JSX, which will make iteration easier</p> |
| MongoDB                 | We should have a template if we’re using MongoDB |
| Docker Compose file     | Mitch is working on a Docker Compose file for Dolt and MongoDB, which will be helpful as we get our ETL framework together |

## Action items <a href="#id-2021-04-17meetingnotes-actionitems" id="id-2021-04-17meetingnotes-actionitems"></a>

* [Eric Turner](https://pdap.atlassian.net/wiki/people/6069da262b469c007014d7fa?ref=confluence) to ping Mitch in Slack when the sql-server POC is done (nearly)
* [Josh Chamberlain](https://pdap.atlassian.net/wiki/people/6068f9e790e3950069fbaaf4?ref=confluence) policy / rationale for PII → docs (this is a high priority)
* [Josh Chamberlain](https://pdap.atlassian.net/wiki/people/6068f9e790e3950069fbaaf4?ref=confluence) make shitty base tables from examples of other data types
* [Josh Chamberlain](https://pdap.atlassian.net/wiki/people/6068f9e790e3950069fbaaf4?ref=confluence) do meeting notes in Docs next time so they can be shared

