
Practical Privacy with Synthetic Data

What's this blog post about?

This post describes implementing a practical attack on synthetic data models to measure unintended memorization in neural networks, as introduced in "The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks" by Nicholas Carlini et al. The author uses this attack to evaluate how well synthetic data models, trained with various neural network and differential privacy parameter settings, protect sensitive data and secrets in a dataset. The experiments use a smaller dataset containing sensitive location data, which is considered challenging to anonymize. Canary values are inserted into each model's training data, and each model's propensity to memorize and replay those canaries is then measured. The results show that differential privacy prevented memorization of secrets across all tested configurations, and that gradient clipping also prevented any replay of canary values, at only a small cost in model accuracy.
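The core technique the post describes (planting canary secrets in the training data, then checking whether the trained generator replays them) can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not Gretel's implementation or the exposure metric from the Carlini paper; `load_location_rows`, `train_synthetic_model`, the canary format, and the insertion count are hypothetical.

```python
# Minimal sketch of a canary-insertion memorization test.
# Assumes a generic text-based synthetic data workflow; the helper names
# in the usage comment are hypothetical, not a real library API.
import random
import secrets


def make_canary(num_digits: int = 8) -> str:
    """Build a random secret that should never appear naturally in the data."""
    digits = "".join(str(secrets.randbelow(10)) for _ in range(num_digits))
    return f"canary-{digits}"


def insert_canaries(training_rows, canary: str, copies: int = 10):
    """Append `copies` rows containing the canary and shuffle them into the data."""
    rows = list(training_rows)
    rows.extend([f"user_note: {canary}"] * copies)
    random.shuffle(rows)
    return rows


def replay_rate(synthetic_rows, canary: str) -> float:
    """Fraction of generated rows that replay the canary verbatim."""
    hits = sum(canary in row for row in synthetic_rows)
    return hits / max(len(synthetic_rows), 1)


# Usage with a hypothetical generator (illustrative only):
# canary = make_canary()
# train_rows = insert_canaries(load_location_rows(), canary, copies=10)
# model = train_synthetic_model(train_rows)   # e.g., with or without DP noise / gradient clipping
# print(replay_rate(model.generate(10_000), canary))
```

Comparing the replay rate (or a stronger signal such as the canary's exposure, as defined in the Carlini paper) across models trained with and without differential privacy is what lets the post quantify how well each configuration protects the planted secrets.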

Company
Gretel.ai

Date published
April 27, 2021

Author(s)
Alex Watson

Word count
1003

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.