How databases handle 10 million devices in high-cardinality benchmarks
Blog post from QuestDB
High cardinality refers to the number of unique elements in a set, and in databases, particularly time series databases handling large IoT or monitoring workloads, it has a direct impact on performance. This article explores how high cardinality manifests in time series data and how quickly it grows as new categories or tags are added: each new tag multiplies the number of possible series, so a dataset with 10,000 devices, 10 regions, and 20 firmware versions already contains up to 10,000 × 10 × 20 = 2,000,000 unique series.

It then uses the Time Series Benchmark Suite (TSBS) to measure database performance under varying degrees of cardinality. In these benchmarks, QuestDB sustains consistent performance across cardinality levels thanks to parallelized operations, architectural choices that optimize both the read and the write path, and its ability to efficiently process indexed columns with many unique values, a workload where many other systems degrade.

Finally, the article outlines the configuration options QuestDB provides to optimize ingestion, such as commit lag and maximum uncommitted rows, and emphasizes the importance of schema planning for high-cardinality datasets so that resources are allocated efficiently and the system remains stable. The sketches below illustrate each of these points.
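To make the cardinality growth concrete, here is a minimal SQL sketch for inspecting series cardinality. The `cpu` table, its columns, and the counts in the comments are hypothetical illustrations, not data from the benchmark:

```sql
-- Series cardinality is the number of distinct tag combinations,
-- and it multiplies with every tag added to the schema.
SELECT count() FROM (SELECT DISTINCT hostname FROM cpu);
-- hypothetically 10,000 distinct devices

SELECT count() FROM (SELECT DISTINCT hostname, region FROM cpu);
-- adding one 10-value tag can push this toward 10,000 x 10 = 100,000
```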
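On the ingestion-tuning side, QuestDB allows commit lag and maximum uncommitted rows to be set per table. A minimal sketch using the parameter names from the QuestDB 6.x documentation (newer releases have since renamed commit lag); the table name and values are purely illustrative, not recommendations:

```sql
-- Per-table ingestion tuning for out-of-order writes (QuestDB 6.x names).
ALTER TABLE measurements SET PARAM commitLag = '10s';
ALTER TABLE measurements SET PARAM maxUncommittedRows = 500000;
```

Per the same 6.x docs, the server-wide defaults for these parameters can be set via `cairo.commit.lag` and `cairo.max.uncommitted.rows` in `server.conf`.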
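For schema planning, sizing SYMBOL columns up front matters most at high cardinality. A hedged sketch of what that can look like; the table, columns, and capacity are assumptions for illustration rather than the article's benchmark schema:

```sql
CREATE TABLE cpu (
  ts         TIMESTAMP,
  -- CAPACITY pre-sizes the symbol table for the expected number of
  -- distinct values; INDEX enables fast filtering on this column.
  hostname   SYMBOL CAPACITY 10000000 INDEX,
  region     SYMBOL,      -- low cardinality, default capacity is fine
  usage_user DOUBLE
) timestamp(ts) PARTITION BY DAY;
```

Under-sizing a symbol column's capacity can force costly resizing while data is flowing in, so declaring the expected cardinality up front is a cheap way to keep ingestion rates steady.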