
Enhancing Vespa’s Embedding Management Capabilities

Blog post from Vespa

Post Details

Author: Jo Kristian Bergum
Word count: 833
Language: English
Summary

Vespa has announced upgrades to its embedding management capabilities, enhancing its support for inference with text embedding models by integrating Hugging Face models, including multilingual options, with GPU acceleration for faster processing. The updates let developers implement semantic search applications without running separate systems for embedding inference and vector search. Vespa now supports embedding models in ONNX format, enabling streamlined deployment and improved scalability, while the Vespa Model Hub offers a wider selection of state-of-the-art text embedding models for developers to explore. These improvements reduce latency and cost while supporting cross-lingual applications, allowing developers to use powerful models with minimal configuration. Vespa Cloud further simplifies scaling by automatically adapting to changes in inference traffic volume, and GPU acceleration improves both performance and cost-effectiveness for embedding inference.
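To make the workflow concrete, a minimal sketch of how such an embedder might be declared in a Vespa application's services.xml is shown below. The component id `e5`, the model choice (intfloat/e5-small-v2), and the exact file URLs are illustrative assumptions, not taken from the post; the general pattern is a `hugging-face-embedder` component pointing at an ONNX transformer model and its tokenizer.

```xml
<!-- Hypothetical services.xml fragment: declares a Hugging Face
     embedder component inside a Vespa container cluster.
     Model name and URLs are assumptions for illustration. -->
<container id="default" version="1.0">
  <component id="e5" type="hugging-face-embedder">
    <transformer-model
      url="https://huggingface.co/intfloat/e5-small-v2/resolve/main/model.onnx"/>
    <tokenizer-model
      url="https://huggingface.co/intfloat/e5-small-v2/resolve/main/tokenizer.json"/>
  </component>
</container>
```

With a component like this in place, a schema field can invoke it at indexing time (for example, an indexing expression along the lines of `input text | embed e5 | attribute | index` on a tensor field), so documents and queries are embedded inside Vespa rather than by a separate inference service.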