Company:
Date Published:
Author: Amnon Catav
Word count: 1052
Language: English
Hacker News points: None

Summary

Large Language Models (LLMs) have recently seen a push toward larger context windows, with companies such as Anthropic and OpenAI releasing models that accept far more tokens; studies show, however, that growing the context can reduce accuracy and raise computation costs. Research such as the Stanford paper "Lost in the Middle" highlights that LLMs struggle to extract relevant information from large, incoherent contexts, which increases the risk of hallucination. Experiments demonstrate that LLMs perform better when given a few highly relevant documents than many unfiltered ones. Retrieval systems, which have been optimized over decades, offer a more efficient alternative by supplying focused, relevant context through a method known as Retrieval Augmented Generation (RAG). This approach improves accuracy and reduces cost compared to filling a large context window, even when the input is a single long document. Retrieval systems are therefore crucial for model accuracy and efficiency, translating into lower operating costs and a reduced hallucination risk in generative AI applications.
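The RAG approach described above can be sketched minimally: rather than stuffing every document into the context window, score each document against the query and pass only the top-k most relevant ones to the model. This is an illustrative sketch only; a production system would use embeddings and a vector index rather than the simple word-overlap score used here, and the function names are hypothetical.

```python
# Minimal RAG sketch (assumptions: word-overlap scoring stands in for a real
# embedding-based retriever; prompt format is illustrative, not a real API).

def score(query: str, doc: str) -> float:
    """Fraction of query words that also appear in the document."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / (len(q) or 1)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str], k: int = 2) -> str:
    """Build a focused prompt from only the retrieved documents."""
    context = "\n---\n".join(retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

docs = [
    "RAG retrieves relevant documents before generation.",
    "The weather in Paris is mild in spring.",
    "Long context windows increase computation cost.",
]
print(build_prompt("Why use RAG instead of long context windows?", docs))
```

The key point mirrors the article's argument: the model sees only the two most relevant documents, not the irrelevant one, so the context stays small, cheap, and focused.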