How we accidentally discovered personal data in a popular Kaggle dataset

What's this blog post about?

Upcoming features in the Gretel Public Beta include automatic data labeling using Natural Language Processing (NLP) and neural network-based entity recognition for names and addresses, along with managed regular expressions and custom extractors. These features make it possible to discover personally identifiable information (PII), such as full names and email addresses, in datasets like Lending Club's financial dataset on Kaggle. Gretel aims to help developers share data more safely by providing workflows for understanding a dataset's contents and making informed decisions about its safety.
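To make the regular-expression part of that idea concrete, here is a minimal sketch of scanning free-text fields for email-like strings. It uses only Python's standard `re` module and is not Gretel's API. The `find_emails` helper, the `desc` field name, and the sample rows are all hypothetical, and the pattern is deliberately simplified.

```python
import re

# Simplified email pattern for illustration only (an assumption, not a
# full RFC 5322 validator and not Gretel's managed expression).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(rows, fields):
    """Return (row_index, field, match) for each email-like string found."""
    hits = []
    for i, row in enumerate(rows):
        for field in fields:
            # Treat missing or None values as empty text.
            value = row.get(field) or ""
            for match in EMAIL_RE.findall(str(value)):
                hits.append((i, field, match))
    return hits


# Hypothetical records, made up for this sketch (not real dataset rows).
sample_rows = [
    {"id": 1, "desc": "Loan for home repairs"},
    {"id": 2, "desc": "Contact me at jane.doe@example.com for details"},
    {"id": 3, "desc": None},
]

hits = find_emails(sample_rows, ["desc"])
for row_index, field, email in hits:
    print(f"row {row_index}, field {field!r}: {email}")
```

A scan like this only catches values that follow a predictable pattern. Names and addresses rarely do, which is presumably why the summary pairs regular expressions with entity recognition.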

Company
Gretel.ai

Date published
Aug. 24, 2020

Author(s)
John Myers

Word count
923

Hacker News points
1

Language
English


By Matt Makai. 2021-2024.