Retrieving Privacy-Safe Documents Over A Network
Blog post from LlamaIndex
A recent blog post introduces llama-index-networks, an extension library for building networks of retrieval-augmented generation (RAG) systems that query diverse knowledge stores. The post emphasizes the importance of privacy when sharing data across such networks and demonstrates how to turn private data into privacy-safe versions using Privacy-Enhancing Techniques, specifically Differential Privacy. The narrative follows three fictional characters (Alex, Bob, and Beth) to illustrate a scenario in which Bob and Beth must protect their sensitive data before sharing it with Alex. The blog explains how differentially private algorithms provide mathematical guarantees that limit an adversary's ability to identify individuals in a dataset, which makes it possible to create synthetic data that can be safely shared. Using the Symptom2Disease dataset as an example, the post demonstrates the generation of privacy-safe synthetic observations and shows the advantages of a NetworkRetriever over the individual contributors' retrievers, illustrating the balance between privacy protection and data utility.
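To make the idea of a differential privacy guarantee more concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a counting query. This is not the method the post uses (the post generates synthetic observations), and it does not touch the llama-index-networks API; it only illustrates how calibrated noise, governed by a privacy budget epsilon, masks the contribution of any single record. The function names `laplace_noise` and `private_count` and the toy records are hypothetical, chosen for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon gives an
    epsilon-differentially-private answer. Smaller epsilon means more
    noise: stronger privacy, lower utility.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy records, loosely shaped like symptom/label pairs.
records = [
    {"symptoms": "fever and cough", "label": "flu"},
    {"symptoms": "runny nose", "label": "cold"},
    {"symptoms": "high fever, body aches", "label": "flu"},
]

noisy = private_count(records, lambda r: r["label"] == "flu", epsilon=1.0)
print(f"noisy count of 'flu' records: {noisy:.2f}")
```

The same trade-off the post highlights appears here in miniature: lowering `epsilon` hides individual records more effectively but makes the released number less useful.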