
RAG vs. Long-Context Models. Do we still need RAG?

Blog post from Unstructured

Post Details
Company: Unstructured
Date Published:
Author: Maria Khalusova
Word Count: 1,660
Language: English
Summary

Retrieval augmented generation (RAG) enhances large language models (LLMs) by grounding their text generation in relevant information retrieved from external knowledge bases, addressing limitations such as hallucinations that affect models trained only on publicly available data. While expanding context windows, such as Gemini 1.5 Pro's 2-million-token capacity and the prospect of models with effectively infinite context, offer potential advantages, RAG remains crucial for its efficiency, scalability, and cost-effectiveness. It provides transparency and accountability by letting LLMs trace information back to its source, which is critical in sectors like finance, healthcare, and law. RAG also supports role-based access control by retrieving only the information a specific query requires, further strengthening data security. Even in a future where infinite-context models exist, RAG's ability to efficiently retrieve and manage diverse data sources, together with its computational efficiency and transparency, ensures its continued relevance.