
Leveraging frozen embeddings in Vespa with SentenceTransformers

Blog post from Vespa

Post Details
Company: Vespa
Author: Andrii Yurkiv
Word Count: 1,666
Language: English
Summary

Leveraging frozen embeddings in a Vespa search application with SentenceTransformers offers a streamlined way to manage the complexity of hybrid search systems, particularly in dynamic environments like e-commerce where search patterns change frequently. By freezing the document vector representations and retraining only the query-side representation, this approach avoids recomputing document embeddings every time the model is retrained, which eases the maintenance burden. The article details how to implement frozen embeddings with a bi-encoder model that uses asymmetric dense layers: the document and query models share the transformer weights, are trained with the sentence-transformers library, exported to ONNX format, and integrated into Vespa through custom embedding components. Sharing the transformer weights between the document and query models keeps memory usage efficient, and the training procedure is effectively plug-and-play, improving the manageability and scalability of Vespa applications.
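On the Vespa side, the frozen document embeddings typically live in a tensor field that is ranked against a query tensor. The schema fragment below is a generic sketch with assumed names and dimensions (`embedding`, `x[384]`, the `semantic` rank profile), not the configuration from the article:

```
schema doc {
    document doc {
        field text type string {
            indexing: summary | index
        }
        # Frozen document embedding, stored once and never re-computed.
        field embedding type tensor<float>(x[384]) {
            indexing: attribute | index
            attribute {
                distance-metric: angular
            }
        }
    }
    rank-profile semantic {
        inputs {
            query(q) tensor<float>(x[384])
        }
        first-phase {
            expression: closeness(field, embedding)
        }
    }
}
```

Because only the query-side model changes between retrainings, updating search behavior amounts to swapping the ONNX query encoder used by the custom embedding component, while the `embedding` field's contents remain untouched.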
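The core idea, a frozen document side with a trainable query side on top of a shared backbone, can be sketched in plain PyTorch. This is a minimal illustration under assumptions of my own (a toy `shared_encoder` stands in for the transformer, dimensions and the training objective are invented), not the author's actual sentence-transformers implementation:

```python
# Sketch of frozen embeddings with asymmetric dense heads.
# The shared encoder and both heads are illustrative stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)
EMB_DIM, HIDDEN = 32, 16

# Stand-in for the shared transformer backbone (weights used by both sides).
shared_encoder = nn.Sequential(nn.Linear(EMB_DIM, HIDDEN), nn.ReLU())

# Asymmetric dense heads: one for documents, one for queries.
doc_head = nn.Linear(HIDDEN, HIDDEN)
query_head = nn.Linear(HIDDEN, HIDDEN)

# Freeze the backbone and the document head: document embeddings
# must stay fixed so the index never needs re-embedding.
for p in list(shared_encoder.parameters()) + list(doc_head.parameters()):
    p.requires_grad = False

def embed_docs(x):
    return doc_head(shared_encoder(x))

def embed_queries(x):
    return query_head(shared_encoder(x))

# Embed documents once, as you would when feeding them into the index.
docs = torch.randn(8, EMB_DIM)
frozen_doc_embs = embed_docs(docs).detach().clone()

# Retrain only the query head on fresh relevance data (toy objective here).
optimizer = torch.optim.SGD(query_head.parameters(), lr=0.1)
queries = torch.randn(8, EMB_DIM)
for _ in range(5):
    optimizer.zero_grad()
    scores = (embed_queries(queries) * frozen_doc_embs).sum(dim=1)
    loss = -scores.mean()  # pull matched query/document pairs together
    loss.backward()
    optimizer.step()

# Document embeddings are bit-identical after training: no re-indexing needed.
print("doc embeddings unchanged:", torch.equal(embed_docs(docs), frozen_doc_embs))
```

Only the query head's parameters receive gradients, so retraining changes how queries are mapped into the shared vector space while every stored document vector stays valid.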