Efficient personal search at large scale
Blog post from Vespa
Vespa offers a cost-effective approach to personal search at massive scale, called streaming search. In large personal data stores such as Gmail, the cost of index updates and queries is a major problem for traditional methods. Streaming search removes the need to maintain expensive global indexes by searching each user's small data set separately. In streaming mode, Vespa stores the raw user data in a log-level store, distributes it across nodes so that queries are handled efficiently, and implements a full search engine over the raw data without global indexing.

The result is lower cost and latencies that stay stable over time, which suits applications such as email search, personal suggestions, and search over private content. Vespa's implementation supports structured and text search, advanced relevance, and features such as faceting, giving a scalable and proven framework for personal search.
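
The idea can be illustrated with a toy model. The sketch below is not Vespa's implementation or API: the class `StreamingStore`, its methods, and the hash-based placement are hypothetical names and choices made for illustration only. It assumes that each user's raw documents are kept together on one node, that a write is a plain append, and that a query scans only the raw data of the user who issued it.

```python
import hashlib
from collections import defaultdict


class StreamingStore:
    """Toy model (hypothetical): raw documents grouped per user, no index."""

    def __init__(self, num_nodes=4):
        self.num_nodes = num_nodes
        # Each node maps user_id -> append-only list of raw documents.
        self.nodes = [defaultdict(list) for _ in range(num_nodes)]

    def node_for(self, user_id):
        # Stable hash, so a user's data always lands on the same node.
        digest = hashlib.sha256(user_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_nodes

    def put(self, user_id, text):
        # A write only appends raw data; there is no index to update.
        self.nodes[self.node_for(user_id)][user_id].append(text)

    def search(self, user_id, query):
        # Scan only this user's documents, on the one node that holds them.
        terms = query.lower().split()
        docs = self.nodes[self.node_for(user_id)].get(user_id, [])
        return [d for d in docs if all(t in d.lower().split() for t in terms)]


store = StreamingStore()
store.put("alice", "Lunch on Friday")
store.put("alice", "Quarterly report draft")
store.put("bob", "Friday deadline for report")
print(store.search("alice", "friday"))  # ['Lunch on Friday']
```

In this model the cost of a query grows with the size of one user's data rather than with the size of the whole corpus, and a write never triggers index maintenance. That is the trade-off the summary describes. A real engine would add tokenization, ranking, and structured matching on top of the scan.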