Company
Date Published
Author
David Xue
Word count
1081
Language
English
Hacker News points
None

Summary

Ask Astro, a RAG-based chat assistant built on large language models, has become a popular tool for questions about Astronomer products and Apache Airflow, but as usage grew, challenges with document retrieval and answer accuracy emerged. The system initially used LangChain, GPT-3.5, GPT-4, and OpenAI's text-embedding-ada-002 for document retrieval, but it occasionally surfaced irrelevant documents, particularly for queries that did not align well with the embedding model. To address this, a hybrid search approach was introduced that combines BM25 sparse vector search with dense vector search, leveraging both keyword matching and semantic similarity. Hybrid search also yields a larger pool of candidate documents, making the system more robust and enabling more accurate re-ranking with Cohere Rerank. Cohere Rerank processes the user prompt together with each candidate document's content to compute a relevancy score, and the top eight highest-scoring documents are retained. Together, hybrid search and Cohere Rerank improved response accuracy by 13.5% and resolved the earlier retrieval issues, setting a new standard for Ask Astro's answer accuracy and document relevance.
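
The pipeline described above can be sketched in plain Python. This is a toy illustration, not Ask Astro's actual code: the BM25 scoring follows the standard Okapi formula, but the dense similarity uses token-set Jaccard overlap as a stand-in for text-embedding-ada-002 embeddings, and the `rerank` function is a hypothetical placeholder for a call to the Cohere Rerank API. The `alpha`, `pool`, and `top_k` parameters are illustrative assumptions.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 sparse (keyword-based) scores for each document."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    df = Counter(t for d in docs_tokens for t in set(d))  # document frequency
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        s = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

def dense_scores(query_tokens, docs_tokens):
    """Toy stand-in for embedding similarity: Jaccard overlap of token sets."""
    q = set(query_tokens)
    return [len(q & set(d)) / len(q | set(d)) for d in docs_tokens]

def rerank(query_tokens, doc_tokens):
    """Hypothetical placeholder for Cohere Rerank: fraction of query
    terms present in the document."""
    return sum(t in doc_tokens for t in set(query_tokens)) / len(set(query_tokens))

def hybrid_retrieve(query, docs, alpha=0.5, pool=20, top_k=8):
    """Fuse sparse and dense scores, take a candidate pool, then rerank
    and keep the top_k highest-scoring documents."""
    qt = query.lower().split()
    dts = [d.lower().split() for d in docs]

    def norm(xs):  # min-max normalize so the two score scales are comparable
        lo, hi = min(xs), max(xs)
        return [(x - lo) / (hi - lo) if hi > lo else 0.0 for x in xs]

    fused = [alpha * s + (1 - alpha) * e
             for s, e in zip(norm(bm25_scores(qt, dts)), norm(dense_scores(qt, dts)))]
    candidates = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)[:pool]
    ranked = sorted(candidates, key=lambda i: rerank(qt, dts[i]), reverse=True)
    return [docs[i] for i in ranked[:top_k]]
```

In the real system, `rerank` would be a batched call to the Cohere Rerank endpoint with the user prompt and each candidate's text, and the dense scores would come from a vector database query; the structure (fuse, pool, rerank, keep top eight) is the part the summary describes.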