Large Partition Support in ScyllaDB

Post Details

Company

ScyllaDB

Date Published

Aug. 25, 2016

Author

Paweł Dziepak

Word Count

1,171

Language

English

Hacker News Points

-

Source URL

www.scylladb.com/2016/08/25/large-partitions

Summary

ScyllaDB 1.3 introduced enhanced support for large partitions, addressing the challenges they pose to data modeling and performance. Previously, large partitions could lead to memory issues, cache eviction, and performance degradation, particularly when partitions exceeded the 2 billion clustering rows limit or when dealing with unpredictable data inputs. The update allows the database engine to handle partitions at a clustering row granularity, enabling processes like SSTable compaction and streaming to operate without needing the entire partition in memory. While the new version improves performance by reading only data relevant to query results, there are still limitations, such as handling partitions larger than 128 kB during data streaming and maintaining cache efficiency for large partitions. Tools like nodetool cfstats and cfhistograms aid in monitoring partition sizes, and future updates are expected to further enhance capabilities, addressing current constraints like reversed queries needing entire partitions in memory.