Personally Identifiable Information

Also known as PII

Defining PII

[...] for example, first and last name, address, work telephone number, email address, home telephone number, and general educational credentials. The definition of PII is not anchored to any single category of information or technology. Rather, it requires a case-by-case assessment of the specific risk that an individual can be identified. Non-PII can become PII whenever additional information is made publicly available, in any medium and from any source, that, when combined with other available information, could be used to identify an individual.

This definition of public data is also supported in hiQ Labs v. LinkedIn.


We want people to feel sound ethically and legally when they contribute to PDAP. To that end, these are our goals:

  1. Draw a “bright line” from our Data Sources Database to the source material. We are not altering public records; we are helping people find them.

  2. Avoid aggregating already-public PII in a way that would make it more usable (e.g. turning arrest records from different jurisdictions into a unified database). Citation.

    • We don't want to become or facilitate a database of people impacted by the legal system.

  3. Strive for completeness, so that we cannot be accused of editorializing by omission.

    • The board decided submitting PII is allowed.

    • We can track whether a given Data Source contains PII.

    • PII collection is not required. If we aggregate data, we can choose to omit PII like name and address because those properties do not become statistically useful in aggregate, only dangerous.

  4. Be transparent in our published resources and information.

    • Our open-source tools are transparent in how they work.

    • Documentation about published data and tools should be clear about any properties which are collected—or left behind—and why the decision was made.
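
As an illustration of goal 3, the sketch below shows one way a Data Source could carry a flag for whether it contains PII, and how name and address could be dropped before data is aggregated. This is a hypothetical example, not a description of PDAP's tools: the `DataSource` class, the `contains_pii` field, the `PII_FIELDS` set, the example URL, and the sample records are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical set of property names treated as PII in this sketch.
PII_FIELDS = {"name", "address"}


@dataclass
class DataSource:
    """Hypothetical record describing one data source."""
    url: str            # link back to the source material
    contains_pii: bool  # tracked flag: does the source material include PII?


def strip_pii(record: dict) -> dict:
    """Return a copy of the record without the properties in PII_FIELDS."""
    return {key: value for key, value in record.items() if key not in PII_FIELDS}


def count_by(records: list[dict], field: str) -> Counter:
    """Count the values of one non-PII property across PII-stripped records."""
    if field in PII_FIELDS:
        raise ValueError(f"refusing to aggregate on PII field: {field}")
    return Counter(strip_pii(record)[field] for record in records)


# Made-up sample data; the source is flagged as containing PII.
source = DataSource(url="https://example.org/records", contains_pii=True)
records = [
    {"name": "Person A", "address": "1 Main St", "incident_type": "traffic stop"},
    {"name": "Person B", "address": "2 Oak Ave", "incident_type": "traffic stop"},
    {"name": "Person C", "address": "3 Elm Rd", "incident_type": "welfare check"},
]

incident_counts = count_by(records, "incident_type")
print(incident_counts)
```

The flag stays on the Data Source itself, so the link to the source material is preserved, while the aggregate keeps only properties that are statistically useful.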
