From ts_rank to BM25. Introducing pg_textsearch: True BM25 Ranking and Hybrid Retrieval Inside Postgres

Blog post from Tiger Data

Post Details
Company: Tiger Data
Date Published:
Author: Todd
Word Count: 3,741
Language: English
Hacker News Points: -
Summary

pg_textsearch is a new PostgreSQL extension that adds true BM25 ranking to Postgres, so AI-native applications can run keyword search and vector search in a single database. It addresses the limitations of Postgres' native full-text ranking (ts_rank) with three BM25 components: inverse document frequency weighting, term frequency saturation, and document length normalization. This matters most for systems such as Retrieval-Augmented Generation (RAG) pipelines and chat agents, which depend on retrieving precise, contextually relevant information. Because the extension runs inside Postgres, it supports hybrid search that pairs the conceptual similarity of vector search with the precision of keyword matching. The preview release focuses on a memtable layer for fast in-memory operations; disk-based segments and advanced query optimizations are planned for later releases.
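
To make the three ranking components named above concrete, here is a minimal sketch of textbook Okapi BM25 in Python. It is not pg_textsearch's implementation. The function name `bm25_scores`, the pre-tokenized input format, the Lucene-style IDF (with `1 +` inside the logarithm), and the defaults `k1=1.2` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    Illustrative sketch only: `query` is a list of terms and `docs` is a
    list of token lists. k1 and b are conventional defaults, assumed here.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: documents longer than average are penalized.
        norm = 1 - b + b * len(d) / avg_len
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # IDF weighting: rare terms contribute more than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # TF saturation: the gain from repeats levels off toward k1 + 1.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * norm)
        scores.append(score)
    return scores
```

Calling `bm25_scores(["postgres"], [["postgres", "search"], ["vector", "embedding"]])` returns one score per document, with 0.0 for any document that contains none of the query terms. Setting `b=0` switches off length normalization, which makes it easy to observe the saturation effect in isolation.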