What is a Retriever?

Post Details

Company

LlamaIndex

Date Published

Jan. 30, 2024

Author

bstadt

Word Count

1,003

Language

English

Hacker News Points

-

Source URL

www.llamaindex.ai/blog/building-a-fully-open-source-retriever-with-nomic-embed-and-llamaindex-fc3d7f36d3e4

Summary

Retrieval-augmented generation (RAG) pairs a language model with a retriever and a database, which reduces hallucinations and improves response quality without retraining the model. An embedding model converts each document in the database into a vector; at query time the query is converted into a vector as well, and the documents whose vectors lie closest to it are returned. The post illustrates this with a tutorial on building a fully open-source retriever using LlamaIndex and Nomic Embed, an open-source embedding model that is reported to outperform OpenAI Ada. Open-source models like Nomic Embed are fully auditable and adaptable, which matters for safe AI deployment in high-impact fields such as defense and finance, and they avoid the vendor lock-in associated with closed-source models. The tutorial sets up a retriever in which LlamaIndex uses Nomic Embed to embed both documents and queries, providing the search step of a RAG system.
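
The embed-then-match step described in the summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tutorial's code: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model such as Nomic Embed, and `cosine` and `retrieve` are illustrative helper names rather than LlamaIndex APIs.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: word counts as a sparse vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Embed the documents and the query, then return the top_k closest documents."""
    doc_vectors = [toy_embed(doc) for doc in documents]
    query_vector = toy_embed(query)
    ranked = sorted(
        range(len(documents)),
        key=lambda i: cosine(query_vector, doc_vectors[i]),
        reverse=True,
    )
    return [documents[i] for i in ranked[:top_k]]


documents = [
    "nomic embed is an open source embedding model",
    "llamaindex connects language models to external data",
    "retrievers return the documents most similar to a query",
]

if __name__ == "__main__":
    print(retrieve("which documents are similar to my query", documents))
```

In a real system the document vectors would be computed once and stored in a vector index, and a learned embedding model would take the place of the word-count vectors, so that matching reflects meaning rather than exact word overlap.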