8-bit Rotational Quantization: How to Compress Vectors by 4x and Improve the Speed-Quality Tradeoff of Vector Search

Blog post from Weaviate

Post Details
Company: Weaviate
Date Published:
Author: Tobias Christiani
Word Count: 6,624
Language: English
Hacker News Points: -
Summary

Vector quantization compresses the vectors stored in a vector database, which reduces storage requirements and speeds up search by making distance computations cheaper. Depending on the method, quantization can reduce memory usage by 4x to 32x, usually at some cost in search quality. 8-bit Rotational Quantization, developed by Weaviate, compresses vectors by 4x while maintaining high recall, and it improves the speed-quality tradeoff of vector search. The method applies a random rotation to each vector before scalar quantization, so that high-dimensional embeddings are well suited to being quantized one dimension at a time. To keep the rotation cheap in high dimensions, Weaviate uses fast pseudorandom rotations based on the Fast Walsh-Hadamard Transform. The method needs no extensive pre-processing or training, which makes 8-bit Rotational Quantization a promising default option for vector search in large-scale databases and, according to the post, lets it outperform traditional methods in both speed and resource usage.
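
The pipeline described in the summary (pseudorandom rotation built on the Fast Walsh-Hadamard Transform, followed by 8-bit scalar quantization) can be illustrated with a minimal Python sketch. This is not Weaviate's implementation: the function names, the `rounds` and `seed` parameters, the zero-padding to a power of two, and the per-vector min/max quantization grid are all assumptions made for illustration.

```python
import math
import random


def fwht(values):
    """Orthonormal fast Walsh-Hadamard transform; length must be a power of two."""
    x = list(values)
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine entries that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = x[i], x[i + h]
                x[i], x[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # makes the transform norm-preserving
    return [value * scale for value in x]


def make_rotation(dim, rounds=3, seed=0):
    """Build a pseudorandom rotation: `rounds` of (random sign flips, then FWHT).

    Assumption: vectors are zero-padded to the next power of two.
    """
    rng = random.Random(seed)
    padded = 1 << (dim - 1).bit_length()
    signs = [[rng.choice((-1.0, 1.0)) for _ in range(padded)] for _ in range(rounds)]

    def rotate(vector):
        if len(vector) != dim:
            raise ValueError("unexpected dimension")
        out = list(vector) + [0.0] * (padded - dim)
        for flips in signs:
            out = fwht([value * sign for value, sign in zip(out, flips)])
        return out

    return rotate


def quantize_8bit(vector):
    """Map each entry to one of 256 evenly spaced levels between min and max."""
    lo, hi = min(vector), max(vector)
    step = (hi - lo) / 255 or 1.0  # avoid a zero step for constant vectors
    codes = bytes(min(255, max(0, round((value - lo) / step))) for value in vector)
    return codes, lo, step


def dequantize(codes, lo, step):
    """Reconstruct approximate floats from the 8-bit codes."""
    return [lo + step * code for code in codes]


# Usage: a vector with one dominant entry is spread out by the rotation
# before being stored as one byte per dimension.
rotate = make_rotation(dim=6)
rotated = rotate([0.0, 0.0, 5.0, 0.0, 0.0, 0.0])
codes, lo, step = quantize_8bit(rotated)
approx = dequantize(codes, lo, step)
```

Because each sign flip and each normalized transform preserves vector length, the rotation leaves norms and inner products unchanged, so distances can still be estimated from the rotated, quantized vectors. Storing one byte per dimension instead of a 32-bit float gives the 4x compression mentioned in the title, ignoring the padding and the two floats kept per vector in this sketch.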