
ScyllaDB Vector Search: 1B Vectors with 2ms P99s and 250K QPS Throughput

Blog post from ScyllaDB

Post Details
Company: ScyllaDB
Author: Szymon Wasik
Word Count: 1,462
Language: English
Summary

ScyllaDB Vector Search is designed to serve billion-scale vector datasets with low latency and high throughput. A benchmark on the yandex-deep_1b dataset (1 billion vectors of 96 dimensions) is offered as validation. The architecture separates storage from indexing while presenting a single interface to the user: ScyllaDB nodes store structured data and vector embeddings in a distributed table, and a dedicated Vector Store service, implemented in Rust and powered by the USearch engine, builds approximate-nearest-neighbour indexes in memory to keep latencies predictably in the single-digit milliseconds. Two usage scenarios were tested. The first prioritized ultra-low latency at moderate recall and reached 252,000 queries per second; the second prioritized high recall at slightly higher latency and sustained 6,500 queries per second. Because structured and unstructured data are retrieved from the same system, there is no separate vector database to operate and less network traffic to pay for. Planned enhancements, including scalar quantization and sharding, are intended to raise performance further for latency-critical real-time AI workloads such as fraud detection and recommendation systems.
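
To give a sense of why in-memory indexing at this scale is demanding, and why scalar quantization is on the roadmap, here is a rough back-of-envelope sketch. It assumes 32-bit float components and a simple symmetric int8 quantization scheme. Neither detail is stated in the summary, and the helper names are hypothetical, so treat this as an illustration rather than a description of how ScyllaDB or USearch actually stores vectors. Index graph overhead is not counted.

```python
# Back-of-envelope sizing for the benchmark dataset described above.
# Assumptions (not from the source): float32 components, symmetric int8
# quantization, and no index-graph overhead counted.

NUM_VECTORS = 1_000_000_000  # from the summary: 1 billion vectors
DIMENSIONS = 96              # from the summary: 96 dimensions


def raw_vector_bytes(num_vectors: int, dims: int, bytes_per_component: int) -> int:
    """Bytes needed to hold the raw vector components only."""
    return num_vectors * dims * bytes_per_component


def quantize_int8(vector: list[float]) -> tuple[list[int], float]:
    """Symmetric scalar quantization: map each float to an int in [-127, 127]."""
    max_abs = max(abs(x) for x in vector) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    return [round(x / scale) for x in vector], scale


def dequantize_int8(codes: list[int], scale: float) -> list[float]:
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]


f32_gb = raw_vector_bytes(NUM_VECTORS, DIMENSIONS, 4) / 1e9
i8_gb = raw_vector_bytes(NUM_VECTORS, DIMENSIONS, 1) / 1e9
print(f"float32: {f32_gb:.0f} GB, int8: {i8_gb:.0f} GB")
# prints "float32: 384 GB, int8: 96 GB"
```

Under these assumptions, one byte per component instead of four cuts the raw vector footprint to a quarter, at the cost of a per-component rounding error of at most half the quantization step.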