MUVERA: Making Multivectors More Performant

Post Details

Company

Qdrant

Date Published

Sept. 5, 2025

Author

Kacper Łukawski

Word Count

1,696

Language

English

Hacker News Points

-

Source URL

qdrant.tech/articles/muvera-embeddings

Summary

MUVERA embeddings, developed by Google Research, address slow multi-vector search by transforming each multi-vector representation into a single vector for fast initial retrieval, then reranking the top results with the original multi-vectors. This combines the speed of single-vector search with the accuracy of multi-vector retrieval. MUVERA embeddings are built by clustering the vector space with Locality-Sensitive Hashing (LSH) techniques such as SimHash, which turns variable-length sequences of vectors into fixed-dimensional representations. FastEmbed 0.7.2 supports MUVERA, and the article reports approximately 7x faster search while maintaining search quality, making it a practical option for multi-vector retrieval applications. However, MUVERA embeddings are larger than traditional single-vector embeddings, so storage and retrieval costs need careful consideration.
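
To make the two-stage idea concrete, here is a minimal NumPy sketch of the approach the summary describes: SimHash assigns each token vector to a bucket, the buckets are concatenated into one fixed-size vector, and the top candidates are reranked with MaxSim on the original multi-vectors. This is an illustration under simplifying assumptions, not FastEmbed's implementation or API: it uses a single set of random hyperplanes, random data in place of real embeddings, and leaves empty buckets as zeros. The names `simhash_buckets`, `fixed_dim_encoding`, `maxsim`, and `search` are hypothetical.

```python
import numpy as np

def simhash_buckets(vectors, hyperplanes):
    """Assign each vector a bucket id from the signs of its projections."""
    bits = (vectors @ hyperplanes.T) > 0              # (n, k_sim) booleans
    weights = 2 ** np.arange(hyperplanes.shape[0])    # binary place values
    return bits.astype(np.int64) @ weights            # ids in [0, 2**k_sim)

def fixed_dim_encoding(vectors, hyperplanes, is_query):
    """Collapse a variable-length multi-vector into one fixed-size vector."""
    n_buckets = 2 ** hyperplanes.shape[0]
    buckets = simhash_buckets(vectors, hyperplanes)
    out = np.zeros((n_buckets, vectors.shape[1]))
    for b in range(n_buckets):
        members = vectors[buckets == b]
        if len(members) == 0:
            continue  # assumption: empty buckets stay zero in this sketch
        # Queries sum their bucket members, documents average them.
        out[b] = members.sum(axis=0) if is_query else members.mean(axis=0)
    return out.ravel()

def maxsim(query, doc):
    """Exact multi-vector score: best document match per query vector, summed."""
    return float((query @ doc.T).max(axis=1).sum())

def search(query, docs, doc_fdes, hyperplanes, top_k=10, limit=3):
    """Stage 1: single-vector dot product. Stage 2: MaxSim rerank."""
    query_fde = fixed_dim_encoding(query, hyperplanes, is_query=True)
    candidates = np.argsort(-(doc_fdes @ query_fde))[:top_k]
    reranked = sorted(candidates, key=lambda i: maxsim(query, docs[i]), reverse=True)
    return [int(i) for i in reranked[:limit]]

# Random stand-ins for token-level embeddings (hypothetical data).
rng = np.random.default_rng(0)
dim, k_sim = 16, 3
hyperplanes = rng.normal(size=(k_sim, dim))
docs = [rng.normal(size=(int(rng.integers(5, 20)), dim)) for _ in range(50)]
query = rng.normal(size=(4, dim))

doc_fdes = np.stack(
    [fixed_dim_encoding(d, hyperplanes, is_query=False) for d in docs]
)
results = search(query, docs, doc_fdes, hyperplanes)
print(doc_fdes.shape, results)
```

Each encoding here has `2**k_sim * dim` components (128 in this sketch), which illustrates the storage trade-off the summary mentions: the fixed-size vector grows with the number of buckets, so it is typically larger than a conventional single-vector embedding.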