Leveraging frozen embeddings in Vespa with SentenceTransformers

Post Details

Company

Vespa

Date Published

July 10, 2023

Author

Andrii Yurkiv

Word Count

1,666

Language

English

Hacker News Points

-

Source URL

blog.vespa.ai/leveraging-frozen-embeddings-in-vespa-with-sentence-transformers

Summary

Frozen embeddings, trained with SentenceTransformers and served in Vespa, lower the maintenance cost of hybrid search systems, particularly in domains such as e-commerce where search patterns change frequently. The approach freezes the document vector representations and updates only the query representations, which reduces the need to recompute document embeddings each time the model is retrained. The article details how to implement frozen embeddings with a bi-encoder model that has asymmetric dense layers, using the sentence-transformers library for training, and how to integrate the resulting models into Vespa by exporting them to ONNX format and adding custom embedding components. Because the document and query models share transformer weights, memory is used efficiently, and the training procedure for embedding generation is plug-and-play, which makes Vespa applications easier to manage and scale.
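The idea of freezing the document side while retraining only the query side can be illustrated with a minimal sketch in plain Python. This is not the article's code and does not use sentence-transformers or Vespa: `toy_base_encoder`, `AsymmetricBiEncoder`, and `retrain_query_head` are hypothetical names, the character-based encoder merely stands in for the shared transformer, and the "retraining" step simply swaps in new weights.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def toy_base_encoder(text: str, dim: int = 4) -> Vector:
    """Hypothetical stand-in for the shared, frozen transformer plus pooling."""
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def dense(weights: Matrix, vec: Vector) -> Vector:
    """Apply a bias-free dense layer: one dot product per output row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def identity(dim: int) -> Matrix:
    """Identity weights, used here as an arbitrary starting point."""
    return [[1.0 if r == c else 0.0 for c in range(dim)] for r in range(dim)]


class AsymmetricBiEncoder:
    """Shared base encoder with separate dense heads for documents and queries."""

    def __init__(self, dim: int = 4) -> None:
        self.dim = dim
        self.doc_head = identity(dim)    # assumed frozen once documents are indexed
        self.query_head = identity(dim)  # the only part updated on retraining

    def embed_document(self, text: str) -> Vector:
        return dense(self.doc_head, toy_base_encoder(text, self.dim))

    def embed_query(self, text: str) -> Vector:
        return dense(self.query_head, toy_base_encoder(text, self.dim))

    def retrain_query_head(self, new_weights: Matrix) -> None:
        """Placeholder for real training: replace the query head weights only."""
        self.query_head = [list(row) for row in new_weights]


if __name__ == "__main__":
    model = AsymmetricBiEncoder(dim=4)
    # Document vectors are computed once and stored in the index.
    stored = {doc: model.embed_document(doc) for doc in ["red running shoes", "wool socks"]}

    # Simulate a retraining round that touches only the query head.
    model.retrain_query_head([[2.0 if r == c else 0.0 for c in range(4)] for r in range(4)])

    # The stored document vectors still match what the model would produce now.
    print(all(model.embed_document(doc) == vec for doc, vec in stored.items()))
```

Because the base encoder and the document head never change in this sketch, the stored document vectors stay valid after the query head is replaced, which is the property that avoids re-embedding the corpus.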