Hybrid Search Using Reciprocal Rank Fusion in SQL
Blog post from SingleStore
Developers building AI applications often use hybrid search, which combines techniques such as vector similarity search and full-text search, to improve the relevance of results. Reciprocal Rank Fusion (RRF) is a popular way to blend these rankings because it uses only each item's rank position within each list, so it does not depend on the magnitude or range of the underlying scores. Each list contributes 1 / (k + rank) to an item's fused score, where k is a smoothing constant (60 is a common choice) that keeps the weight differences between adjacent ranks gradual, so no single top-ranked hit dominates. In SQL, for example in SingleStore, RRF can be expressed with Common Table Expressions (CTEs) and the ROW_NUMBER window function: each CTE runs one search as its own ranked query, which lets that search keep using its index. The query computes separate ranked lists for the vector and text searches, then combines them with a FULL OUTER JOIN so that items appearing in only one list are still included. Because the query processor does the ranking, no application-side merging code is needed, which saves development time and resources. SQL's flexibility also leaves room for more advanced reranking models on top of the fused list, making hybrid search a robust building block for AI systems.
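To make the scoring concrete, here is a minimal Python sketch of the fusion step on its own. It is not the SQL from the post: the function name rrf_fuse, the document ids, and the default k = 60 are assumptions chosen for illustration. The enumerate call stands in for ROW_NUMBER, and summing over whichever lists contain an item mirrors a FULL OUTER JOIN in which a missing side contributes zero.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse several ranked lists of item ids with Reciprocal Rank Fusion.

    ranked_lists: iterable of lists, each ordered best-first.
    k: smoothing constant (60 is a commonly used value; an assumption here).
    Returns a list of (item, score) pairs, best-first.
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        # Ranks start at 1, like ROW_NUMBER() over an ordered result set.
        for rank, item in enumerate(ranked, start=1):
            scores[item] += 1.0 / (k + rank)
    # Items absent from a list simply receive no contribution from it.
    # Ties are broken by item id so the output order is deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical results from a vector search and a full-text search.
vector_hits = ["doc_a", "doc_b", "doc_c"]
text_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([vector_hits, text_hits])
for item, score in fused:
    print(f"{item}\t{score:.6f}")
```

In this example doc_b and doc_a appear in both lists and therefore rise above doc_c and doc_d, each of which is found by only one search. A larger k flattens the gap between adjacent ranks; a smaller k gives the top positions more weight.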