Company
Date Published
Author
David Xue
Word count
1081
Language
English
Hacker News points
None

Summary

Ask Astro, a RAG-based chat assistant built on large language models, has become a popular tool for questions about Astronomer products and Apache Airflow, but as usage grew, challenges with document retrieval and answer accuracy emerged. The system initially used LangChain, GPT-3.5, GPT-4, and OpenAI's text-embedding-ada-002 for document retrieval, but it occasionally surfaced irrelevant documents, particularly for queries that did not align well with the embedding model. To address this, a hybrid search approach was introduced that combines BM25 sparse vector search with dense vector search, leveraging both keyword matching and semantic similarity. Hybrid search also yields a larger pool of candidate documents, making the system more robust and enabling more accurate re-ranking with Cohere Rerank. Cohere Rerank processes the user prompt together with each candidate document's content to compute a relevancy score, and the top eight highest-scoring documents are retained. Together, hybrid search and Cohere Rerank improved response accuracy by 13.5% and resolved the earlier retrieval issues, setting a new standard for Ask Astro's answer accuracy and document relevance.
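
The pipeline described above can be sketched in plain Python. This is a toy illustration, not Ask Astro's actual code: the BM25 scoring follows the standard Okapi formula, but the dense similarity uses token-set Jaccard overlap as a stand-in for text-embedding-ada-002 embeddings, and the `rerank` function is a hypothetical placeholder for a call to the Cohere Rerank API. The `alpha`, `pool`, and `top_k` parameters are illustrative assumptions.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 sparse (keyword-based) scores for each document."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    df = Counter(t for d in docs_tokens for t in set(d))  # document frequency
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def dense_scores(query_tokens, docs_tokens):
    """Toy stand-in for embedding similarity: Jaccard overlap of token sets."""
    q = set(query_tokens)
    return [len(q & set(d)) / len(q | set(d)) for d in docs_tokens]

def rerank(query_tokens, doc_tokens):
    """Hypothetical placeholder for Cohere Rerank: fraction of query
    terms present in the document."""
    return sum(t in doc_tokens for t in set(query_tokens)) / len(set(query_tokens))

def hybrid_retrieve(query, docs, alpha=0.5, pool=20, top_k=8):
    """Fuse sparse and dense scores, take a candidate pool, then rerank
    and keep the top_k highest-scoring documents."""
    qt = query.lower().split()
    dts = [d.lower().split() for d in docs]

    def norm(xs):  # min-max normalize so the two score scales are comparable
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

    fused = [alpha * s + (1 - alpha) * e
             for s, e in zip(norm(bm25_scores(qt, dts)), norm(dense_scores(qt, dts)))]
    candidates = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)[:pool]
    ranked = sorted(candidates, key=lambda i: rerank(qt, dts[i]), reverse=True)
    return [docs[i] for i in ranked[:top_k]]
```

In the real system, `rerank` would be a batched call to the Cohere Rerank endpoint with the user prompt and each candidate's text, and the dense scores would come from a vector database query; the structure (fuse, pool, rerank, keep top eight) is the part the summary describes.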