TurboQuant in Qdrant
Blog post from Qdrant
Qdrant 1.18 introduces TurboQuant, a vector quantization method developed by Google Research that compresses production embeddings further while keeping recall high. TurboQuant is a rotation-based quantizer with several operating points, such as 4-bit, 2-bit, and 1-bit, which lets it outperform the existing Scalar Quantization (SQ) and Binary Quantization (BQ) methods on both storage efficiency and recall. In the benchmarks, TurboQuant 4-bit delivers recall competitive with SQ at half the memory, while TurboQuant 2-bit and 1-bit achieve significantly higher recall than BQ at the same storage footprint. The implementation adds length renormalization and per-coordinate calibration, which keep performance robust across a range of datasets. Migrating from SQ or BQ requires only a configuration change and a re-index.
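The storage comparison above comes down to bits per dimension, which a short calculation makes concrete. This is a rough sketch, not Qdrant's accounting. It assumes SQ stores one 8-bit integer per dimension and BQ one bit per dimension, uses a hypothetical 1536-dimensional embedding, and ignores any per-vector overhead such as stored norms or calibration data. The names `BITS_PER_DIM`, `code_bytes`, and `compression_ratio` are illustrative only.

```python
# Bits stored per vector dimension for each method.
# Assumptions: SQ = int8 (8 bits), BQ = 1 bit; the TurboQuant widths are the
# operating points named in the text. Per-vector metadata is not counted.
BITS_PER_DIM = {
    "float32 (uncompressed)": 32,
    "SQ (int8)": 8,
    "TurboQuant 4-bit": 4,
    "TurboQuant 2-bit": 2,
    "TurboQuant 1-bit": 1,
    "BQ (1-bit)": 1,
}


def code_bytes(dim: int, bits_per_dim: int) -> int:
    """Bytes needed for one quantized vector, rounded up to whole bytes."""
    return (dim * bits_per_dim + 7) // 8


def compression_ratio(bits_per_dim: int) -> float:
    """Size reduction relative to 32-bit floats."""
    return 32 / bits_per_dim


if __name__ == "__main__":
    dim = 1536  # hypothetical embedding size, chosen for illustration
    for name, bits in BITS_PER_DIM.items():
        size = code_bytes(dim, bits)
        ratio = compression_ratio(bits)
        print(f"{name:24s} {size:6d} B/vector  {ratio:4.0f}x smaller")
```

Under these assumptions a 4-bit code takes half the bytes of an int8 SQ code, and a 1-bit TurboQuant code takes the same space as a 1-bit BQ code, so the recall differences reported in the benchmarks are compared at matching memory budgets.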