Introducing Document Enrichment with Large Language Models in Vespa
Blog post from Vespa
Vespa has introduced a new capability for document enrichment using Large Language Models (LLMs). It enhances search applications by transforming raw text into structured data and adding contextual information, thereby improving search relevance. Tasks such as named entity extraction, categorization, keyword generation, anonymization, translation, and summarization traditionally required dedicated natural language processing pipelines or third-party APIs; with LLMs, they can now be accomplished without custom code.

Vespa supports both local LLMs and external OpenAI-compatible APIs, and has added a new indexing expression called "generate" that invokes an LLM during document ingestion to populate enriched fields. Because enrichment happens at feed time, no latency is added at query time.

This approach differs from retrieval-augmented generation (RAG), which relies on LLMs at query time. Instead, documents are enriched during ingestion, allowing them to be more effectively indexed and searched. The performance and cost of document enrichment depend on the chosen LLM, with smaller models offering cost-effective solutions for less complex tasks. Vespa's document enrichment is scalable and can be extended with custom components for specific applications, offering a practical and powerful method for large-scale document enrichment in search applications.
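As a rough illustration of how the "generate" indexing expression could look in a schema, consider the sketch below. The field name `keywords`, the source field `text`, and the generator id `keyword_generator` are illustrative assumptions, not taken from the post; the generator id would refer to an LLM component (a local model or an OpenAI-compatible client) configured separately in `services.xml`.

```
schema doc {
    document doc {
        field text type string {
            indexing: summary | index
        }
    }
    # Synthetic field: populated at feed time by the LLM component
    # registered under the id "keyword_generator" (assumed name)
    field keywords type array<string> {
        indexing: input text | generate keyword_generator | index | summary
    }
}
```

Since the generated field lives outside the `document` block, the raw documents you feed are unchanged; the enriched values exist only in the index, which is what keeps LLM cost and latency confined to ingestion.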