Detect human names in logs with ML in Sensitive Data Scanner
Blog post from Datadog
Modern applications often log more information than necessary, including personally identifiable information (PII) such as human names, which can create privacy and compliance problems. Datadog's Sensitive Data Scanner (SDS) addresses this with human name detection, which uses a machine learning (ML) pipeline to identify and optionally obfuscate human names in logs. This approach reduces the need for fragile regex patterns and helps organizations comply with privacy regulations such as GDPR. Detected names can be redacted, hashed, or masked, so sensitive values are handled appropriately while the rest of the log line stays intact. SDS also provides a centralized findings page for reviewing detections, and rules can be tuned or suppressed to reduce false positives. By adding human name detection to their workflows, organizations can strengthen PII compliance, apply consistent governance, and simplify the management of sensitive data in their logs.
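
To make the difference between redaction, hashing, and masking concrete, here is a minimal Python sketch of the three actions applied to a name span that has already been detected. This is not Datadog's implementation or API: the function names, the placeholder text, the truncated SHA-256 digest, and the sample log line are all assumptions chosen for illustration, and the span is located with a simple string search rather than an ML model.

```python
import hashlib


def redact(text, start, end, placeholder="[REDACTED]"):
    """Replace the detected span with a fixed placeholder (hypothetical helper)."""
    return text[:start] + placeholder + text[end:]


def hash_match(text, start, end, digest_chars=16):
    """Replace the detected span with a truncated SHA-256 digest.

    The same name always yields the same token, so events can still be
    correlated without exposing the name itself. (Illustrative choice of
    hash and length; a real deployment would likely also use a salt or key.)
    """
    digest = hashlib.sha256(text[start:end].encode("utf-8")).hexdigest()
    return text[:start] + digest[:digest_chars] + text[end:]


def mask(text, start, end, keep_last=0, mask_char="*"):
    """Overwrite the detected span with mask characters, optionally
    leaving the last `keep_last` characters visible."""
    match = text[start:end]
    keep_last = max(0, min(keep_last, len(match)))
    visible = match[len(match) - keep_last:]
    masked = mask_char * (len(match) - keep_last) + visible
    return text[:start] + masked + text[end:]


# Hypothetical log line; in practice the span would come from the detector.
log_line = "user=Jane Doe action=login status=ok"
name = "Jane Doe"
start = log_line.index(name)
end = start + len(name)

print(redact(log_line, start, end))
print(hash_match(log_line, start, end))
print(mask(log_line, start, end))
```

In each case only the detected span changes, so the surrounding fields (`action=login`, `status=ok`) remain searchable. Hashing preserves the ability to group events by the same person, while redaction and masking discard that link.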