
Accelerating stateless model evaluation on Vespa

Blog post from Vespa

Post Details
Company:
Date Published:
Author: Lester Solbakken
Word Count: 2,164
Language: English
Hacker News Points: -
Summary

Vespa.ai now accelerates stateless model evaluation in its container cluster with ONNX Runtime, so documents and queries can be processed and transformed efficiently before they are stored or executed. This opens up new use cases, such as generating a vector representation of natural language text and using it for nearest neighbor retrieval. Vespa supports both stateless and stateful model evaluation. Container nodes handle stateless evaluation, where a model is evaluated once per query or document. Content nodes handle stateful evaluation, where a model is evaluated many times using combined query and document data; this is where model evaluation has traditionally been performed. The platform automatically enables a REST API for model discovery and evaluation, and custom request handlers, document processors, and searchers can use machine-learned models for query processing and document transformation. Because the models run inside the container cluster, low-latency computation over large datasets needs no external model server, which reduces system complexity and improves scalability.
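The split between stateless and stateful evaluation can be illustrated with a small sketch. This is not Vespa code and uses no Vespa API: `embed` is a toy stand-in for a real embedding model (which Vespa would run through ONNX Runtime), and `process_document`, `search`, and the field names are hypothetical names chosen for illustration only.

```python
from math import sqrt


def embed(text, dims=4):
    """Toy stand-in for an embedding model (assumption, not a real model).

    Sums character codes into `dims` buckets and normalizes to unit length.
    """
    vec = [0.0] * dims
    for i, ch in enumerate(text.lower()):
        vec[i % dims] += ord(ch)
    norm = sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def process_document(doc):
    """Stateless step: one model evaluation per document, before storage."""
    return {**doc, "embedding": embed(doc["text"])}


def search(query_text, stored_docs, k=2):
    """Return the ids of the k stored documents closest to the query."""
    # Stateless step: the model is evaluated once for the query.
    query_vec = embed(query_text)
    # Stateful step: one scoring evaluation per stored document,
    # combining query data with document data.
    scored = [(dot(query_vec, d["embedding"]), d["id"]) for d in stored_docs]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


if __name__ == "__main__":
    docs = [
        process_document({"id": "d1", "text": "vespa"}),
        process_document({"id": "d2", "text": "a"}),
    ]
    print(search("vespa", docs, k=1))
```

In this sketch the embedding runs once per query or document, which mirrors the work the summary assigns to container nodes, while the per-document scoring loop mirrors the repeated evaluation done on content nodes.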