8-bit Rotational Quantization: How to Compress Vectors by 4x and Improve the Speed-Quality Tradeoff of Vector Search

Post Details

Company

Weaviate

Date Published

Aug. 26, 2025

Author

Tobias Christiani

Word Count

6,624

Language

English

Hacker News Points

-

Source URL

weaviate.io/blog/8-bit-rotational-quantization

Summary

Vector quantization compresses the vectors stored in a vector database, reducing memory requirements and speeding up search by making distance computations cheaper. Depending on the method, memory usage drops by 4x to 32x, at a potential cost in search quality. 8-bit Rotational Quantization, developed by Weaviate, compresses vectors by 4x while improving the speed-quality tradeoff and maintaining high recall. The method applies a random rotation to each vector before scalar quantization, so that the rotated entries are well suited to being quantized, which makes it effective on high-dimensional embeddings. To keep the rotation cheap, Weaviate uses fast pseudorandom rotations based on the Fast Walsh-Hadamard Transform, and the method needs no extensive pre-processing or training. This makes 8-bit Rotational Quantization a promising default option for vector search in large-scale databases, outperforming traditional methods in both speed and resource usage.
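
To make the idea concrete, here is a minimal Python sketch of "rotate, then scalar-quantize to 8 bits". It is not Weaviate's implementation. The function names, the use of random sign flips between transform rounds, the choice of three rounds, the per-vector min/max quantization range, and the power-of-two dimension are all assumptions made for illustration.

```python
import math
import random


def fwht(values):
    """Orthonormal Fast Walsh-Hadamard Transform of a power-of-two-length list."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly step: combine pairs that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # scaling makes the transform norm-preserving
    return [v * scale for v in out]


def make_signs(dim, rounds=3, seed=0):
    """Hypothetical helper: one random +/-1 vector per round (rounds is an assumption)."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(dim)] for _ in range(rounds)]


def pseudo_rotate(vector, signs):
    """Apply 'flip random signs, then FWHT' once per round.

    Each round is an orthogonal map, so lengths and dot products are preserved.
    """
    out = list(vector)
    for round_signs in signs:
        out = fwht([x * s for x, s in zip(out, round_signs)])
    return out


def quantize_8bit(vector):
    """Scalar-quantize to one byte per entry using this vector's own min/max (assumption)."""
    lo, hi = min(vector), max(vector)
    step = (hi - lo) / 255.0 or 1.0  # avoid a zero step for constant vectors
    codes = bytes(min(255, max(0, round((x - lo) / step))) for x in vector)
    return codes, lo, step


def dequantize_8bit(codes, lo, step):
    """Map byte codes back to approximate float values."""
    return [lo + c * step for c in codes]
```

In this sketch the same `signs` must be reused for every stored vector and every query, since distances are only preserved when all vectors go through the same rotation. Storing one byte per dimension instead of a 32-bit float is where the 4x compression comes from; the rotation spreads a vector's mass across all dimensions so that a single uniform quantization grid wastes fewer levels on outlier entries.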