Removing PII Data from OpenAI API Calls with Presidio and FastAPI

Blog post from Ploomber

Post Details
Company: Ploomber
Date Published:
Author: -
Word Count: 1,497
Language: English
Hacker News Points: -
Summary

Presidio, an open-source framework from Microsoft, can anonymize personally identifiable information (PII) before it is sent to OpenAI's API. Integrating it directly, however, means configuring it and checking compliance separately in every application that calls the API. To remove that operational burden, the post proposes a centralized alternative: a reverse proxy built with FastAPI and Presidio. The proxy intercepts all OpenAI API calls, anonymizes sensitive data with Presidio, and forwards the sanitized requests to OpenAI, so privacy protection is applied consistently across an organization without changes to individual applications. The setup has two limitations: it currently removes PII only for the /chat/completions/ endpoint, and Presidio's default settings may not match every company's data policy, which can lead to information loss. For more customization and auditing, the post suggests an enterprise-grade solution that assigns unique identifiers to redacted data and provides a user interface for customizing PII rules. Deployment uses Ploomber Cloud, which integrates the proxy with OpenAI's API while maintaining data privacy.
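
The proxy flow summarized above (intercept, anonymize, forward) might look roughly like the following. This is a minimal sketch, not the post's actual implementation: the names `sanitize_chat_payload`, `presidio_redact`, and `create_app` are hypothetical, `httpx` is an assumed choice of HTTP client, and Presidio is run with its default recognizers for English. Streaming responses and non-string message content are not handled.

```python
import copy
from functools import lru_cache
from typing import Callable


def sanitize_chat_payload(payload: dict, redact: Callable[[str], str]) -> dict:
    """Return a copy of a chat-completions body with string message contents redacted."""
    clean = copy.deepcopy(payload)  # never mutate the caller's request body
    for message in clean.get("messages", []):
        content = message.get("content")
        if isinstance(content, str):  # list-style (multimodal) content is left untouched
            message["content"] = redact(content)
    return clean


@lru_cache(maxsize=1)
def _engines():
    # Imported lazily: building the engines loads an NLP model and is slow.
    from presidio_analyzer import AnalyzerEngine
    from presidio_anonymizer import AnonymizerEngine

    return AnalyzerEngine(), AnonymizerEngine()


def presidio_redact(text: str) -> str:
    """Anonymize text with Presidio's default settings (assumed: English input)."""
    analyzer, anonymizer = _engines()
    findings = analyzer.analyze(text=text, language="en")
    return anonymizer.anonymize(text=text, analyzer_results=findings).text


def create_app(redact: Callable[[str], str] = presidio_redact,
               upstream: str = "https://api.openai.com"):
    """Build a FastAPI app that sanitizes and forwards chat-completions calls."""
    import httpx
    from fastapi import FastAPI, Request, Response

    app = FastAPI()

    @app.post("/v1/chat/completions")
    async def chat_completions(request: Request):
        body = sanitize_chat_payload(await request.json(), redact)
        # Pass the caller's API key through unchanged.
        headers = {"Authorization": request.headers.get("authorization", "")}
        async with httpx.AsyncClient(timeout=60) as client:
            reply = await client.post(
                f"{upstream}/v1/chat/completions", json=body, headers=headers
            )
        return Response(
            content=reply.content,
            status_code=reply.status_code,
            media_type=reply.headers.get("content-type"),
        )

    return app
```

If this were saved as a module named `proxy.py` (a hypothetical name), it could be served with `uvicorn proxy:create_app --factory`, and client applications would then point their OpenAI base URL at the proxy instead of `https://api.openai.com`. Keeping the redaction step as an injectable function means the Presidio defaults can be swapped for a policy-specific redactor without touching the routing code.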