More efficient multi-vector embeddings with MUVERA

Blog post from Weaviate

Post Details
Company: Weaviate
Author: Roberto Esposito, Joon-Pil (JP) Hwang
Word Count: 4,261
Language: English
Summary

Weaviate 1.31 introduces MUVERA, an encoding algorithm that converts each multi-vector embedding into a single fixed-size vector, significantly reducing memory and computational costs. Multi-vector models use far more memory than single-vector models and are slower to import and search; MUVERA addresses this by replacing each variable-length set of vectors with one fixed-dimensional encoding. In tests on the LoTTE dataset, MUVERA reduced the memory footprint by approximately 70% and cut import times from over 20 minutes to 3-6 minutes, at the cost of a slight loss in recall. Raising the HNSW ef value recovers recall but may reduce query throughput. MUVERA is best suited to large-scale deployments where memory costs are substantial and to applications that can tolerate minor recall degradation. It is available in Weaviate 1.31 and later, with configuration options for balancing these trade-offs.
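
To make the idea of a fixed-dimensional encoding concrete, here is a minimal NumPy sketch of how a variable-length set of vectors can be collapsed into one fixed-size vector. It is a simplified illustration, not Weaviate's implementation: it assumes SimHash-style bucketing with random hyperplanes, a random ±1 projection, and several independent repetitions, and it omits refinements such as filling empty buckets. The function names (`make_fde_params`, `encode`) and parameter names (`k_sim`, `d_proj`, `reps`) are hypothetical, chosen for this example only.

```python
import numpy as np

def make_fde_params(dim, k_sim=3, d_proj=8, reps=4, seed=0):
    """Draw the shared random state. Documents and queries must use the same params."""
    rng = np.random.default_rng(seed)
    return {
        # k_sim random hyperplanes per repetition define 2**k_sim buckets
        "planes": rng.standard_normal((reps, k_sim, dim)),
        # random +/-1 projection from dim down to d_proj, per repetition
        "proj": rng.choice([-1.0, 1.0], size=(reps, dim, d_proj)) / np.sqrt(d_proj),
        "k_sim": k_sim,
    }

def encode(vectors, params, is_query):
    """Collapse an (n, dim) multi-vector embedding into one fixed-size vector."""
    vectors = np.asarray(vectors, dtype=np.float64)
    k_sim = params["k_sim"]
    n_buckets = 2 ** k_sim
    bit_weights = 1 << np.arange(k_sim)
    chunks = []
    for planes, proj in zip(params["planes"], params["proj"]):
        # Sign pattern against the hyperplanes picks a bucket for each vector
        bits = (vectors @ planes.T) > 0
        bucket_ids = bits.astype(np.int64) @ bit_weights
        reduced = vectors @ proj  # (n, d_proj)
        block = np.zeros((n_buckets, proj.shape[1]))
        np.add.at(block, bucket_ids, reduced)  # sum the vectors in each bucket
        if not is_query:
            # Assumption in this sketch: documents average per bucket, queries sum
            counts = np.bincount(bucket_ids, minlength=n_buckets)
            nonempty = counts > 0
            block[nonempty] /= counts[nonempty, None]
        chunks.append(block.ravel())
    # Output length is reps * 2**k_sim * d_proj, regardless of n
    return np.concatenate(chunks)

params = make_fde_params(dim=64)
rng = np.random.default_rng(1)
doc_fde = encode(rng.standard_normal((50, 64)), params, is_query=False)
query_fde = encode(rng.standard_normal((10, 64)), params, is_query=True)
score = float(doc_fde @ query_fde)  # a single dot product stands in for multi-vector scoring
```

With these settings a 50-vector document and a 10-vector query both encode to 4 × 8 × 8 = 256 values, so each can be stored and searched as one ordinary vector. Larger `k_sim`, `d_proj`, or `reps` lengthen the encoding, which is the kind of memory-versus-recall trade-off the summary describes.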