Exploring How the ScyllaDB Data Cache Works

Post Details

Company

ScyllaDB

Date Published

July 26, 2018

Author

Tomasz Grabiec

Word Count

2,074

Language

English

Hacker News Points

-

Source URL

www.scylladb.com/2018/07/26/how-scylla-data-cache-works

Summary

The blog post traces the evolution of ScyllaDB's data cache from version 1.7 to 2.4, focusing on changes that address read latency and cache management problems. Initially, the cache worked at partition granularity, which was inefficient for large partitions because of read amplification and cache pollution. Version 2.0 introduced row-level granularity for population, so a partition could be cached partially. Eviction, however, remained partition-based, which led to latency spikes. Version 2.2 switched to row-level eviction: individual rows are freed based on usage, which helps keep more relevant data in cache. Version 2.4 further improved latency by allowing the merging of in-memory partition versions to be preempted, reducing how long the CPU is blocked by that work. Performance tests comparing these releases against earlier versions and against Cassandra showed significant improvements in read latency and cache management, particularly when partitions exceed the cache size.
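To make the idea of row-level population and eviction concrete, here is a minimal toy sketch in Python. It is not ScyllaDB's implementation (ScyllaDB is written in C++ and its cache is far more involved); the class `RowLruCache`, its methods, and the capacity measured in rows are all hypothetical names and simplifications chosen for illustration. It only shows the general technique: tracking recency per row rather than per partition, so that a hot row in a large partition can stay cached while colder rows of the same partition are freed.

```python
from collections import OrderedDict


class RowLruCache:
    """Toy cache that populates and evicts at row granularity (illustrative only)."""

    def __init__(self, capacity_rows):
        # Capacity is counted in rows here; a real cache would track memory.
        self.capacity_rows = capacity_rows
        # Maps (partition_key, row_key) -> value, least recently used first.
        self._rows = OrderedDict()

    def get(self, partition_key, row_key):
        key = (partition_key, row_key)
        if key not in self._rows:
            return None  # cache miss; the caller would read from disk
        self._rows.move_to_end(key)  # mark this row as most recently used
        return self._rows[key]

    def put(self, partition_key, row_key, value):
        key = (partition_key, row_key)
        self._rows[key] = value
        self._rows.move_to_end(key)
        # Evict single rows, not whole partitions, until we fit again.
        while len(self._rows) > self.capacity_rows:
            self._rows.popitem(last=False)

    def cached_rows(self, partition_key):
        # Row keys of this partition currently cached, oldest first.
        return [rk for (pk, rk) in self._rows if pk == partition_key]


# One partition with more rows than the cache can hold.
cache = RowLruCache(capacity_rows=3)
for row in range(3):
    cache.put("p1", row, f"v{row}")
cache.get("p1", 0)         # row 0 becomes the most recently used
cache.put("p1", 3, "v3")   # over capacity: only the coldest row (1) is evicted
remaining = cache.cached_rows("p1")
```

In this sketch the partition `p1` stays partially cached after the eviction, which is the behavior the summary attributes to row-level eviction; with partition-level eviction the whole partition would have been dropped at once.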