
Quantifying PII Exposure in Synthetic Data

Blog post from Gretel.ai

Post Details
Company: Gretel.ai
Author: Alexa Haushalter
Word Count: 2,382
Language: English
Hacker News Points: -
Summary

Gretel's PII Replay is a new privacy metric that identifies sensitive values in the original training data and counts how often those values appear in the synthetic output. It complements Membership Inference Protection and Attribute Inference Protection as a further check that synthetic data stays private by design. The metric uses Gretel Transform to identify and classify PII in the original training data, so users can see whether any of that PII shows up in their synthetic data. Strategies to minimize PII replay include running Transform before generating synthetics, choosing a model designed to minimize PII replay, applying differential privacy, removing unnecessary columns during pre-processing, and combining pre- and post-processing steps deliberately.
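
The counting idea behind the metric can be illustrated with a minimal Python sketch. This is not Gretel's implementation: the function name `pii_replay`, the report keys, and the sample rows are all hypothetical. It also assumes the PII columns have already been identified (the step the summary attributes to Gretel Transform) and that a replay means an exact value match.

```python
from collections import Counter


def pii_replay(train_rows, synth_rows, pii_columns):
    """Count how often training PII values reappear in synthetic rows.

    Hypothetical helper: assumes PII columns are already known and
    treats a replay as an exact match on the cell value.
    """
    report = {}
    for col in pii_columns:
        # Distinct non-null values seen in the training data for this column.
        train_values = {row[col] for row in train_rows if row.get(col) is not None}
        # Frequency of each non-null value in the synthetic data.
        synth_counts = Counter(
            row[col] for row in synth_rows if row.get(col) is not None
        )
        # Training values that also occur in the synthetic output.
        replayed = {v: synth_counts[v] for v in train_values if v in synth_counts}
        report[col] = {
            "unique_training_values": len(train_values),
            "unique_values_replayed": len(replayed),
            "synthetic_rows_with_replay": sum(replayed.values()),
        }
    return report


# Made-up sample data for illustration only.
train = [
    {"name": "Ada Smith", "email": "ada@example.com"},
    {"name": "Bo Chen", "email": "bo@example.com"},
]
synth = [
    {"name": "Ada Smith", "email": "kai@example.org"},
    {"name": "Ada Smith", "email": "lee@example.org"},
    {"name": "Cy Diaz", "email": "bo@example.com"},
]

report = pii_replay(train, synth, ["name", "email"])
for column, stats in report.items():
    print(column, stats)
```

Exact matching is the simplest choice and misses PII embedded in free text or reformatted values, so a sketch like this is only a rough lower bound on replay.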