ScyllaDB’s Compaction Strategies Series: Space Amplification in Size-Tiered Compaction

Blog post from ScyllaDB

Post Details
Author: Nadav Har'El
Word Count: 2,266
Language: English
Summary

This post is the first in a four-part series on ScyllaDB's compaction strategies. It focuses on space amplification in the Size-Tiered Compaction Strategy (STCS), and the analysis applies equally to Apache Cassandra. STCS is the default strategy in both databases, but it suffers from space amplification: the data occupies more disk space than it would as a single, perfectly compacted sstable. That is wasteful, particularly on SSDs, where capacity is comparatively expensive. Experiments in the post show that STCS can need twice the disk space of the data, or more, because a compaction temporarily keeps both its input sstables and its output on disk. The problem is most severe when the same data is overwritten repeatedly. Upcoming posts in the series explore other strategies, such as Leveled and Hybrid Compaction, which aim to reduce space amplification while keeping write amplification low.
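
The temporary duplication described above can be illustrated with a toy model. The sketch below is not ScyllaDB's or Cassandra's implementation. It assumes that every flush writes one unit of data, that exactly `min_threshold` sstables of equal size are merged at a time, and that a compaction's inputs stay on disk until its output is complete. The function name `peak_space_amplification` and the `overwrite` flag are hypothetical, introduced only for this example.

```python
def peak_space_amplification(num_flushes, min_threshold=4, overwrite=False):
    """Toy size-tiered model: return the peak ratio of disk space used
    to the size of one perfectly compacted sstable.

    Assumptions (illustrative only): each flush writes 1 unit; exactly
    min_threshold equal-sized sstables are merged at a time (use a value
    of 2 or more); inputs remain on disk until the output is written.
    """
    tiers = {}   # sstable size -> number of sstables of that size
    live = 0     # size of the data if stored as a single compacted sstable
    peak = 1.0
    for _ in range(num_flushes):
        # Write-only workload: every flush adds new data.
        # Overwrite workload: every flush rewrites the same unit of data.
        live = 1 if overwrite else live + 1
        tiers[1] = tiers.get(1, 0) + 1
        size = 1
        # A full tier triggers a compaction, which may cascade upward.
        while tiers.get(size, 0) >= min_threshold:
            on_disk = sum(s * n for s, n in tiers.items())
            # Distinct data merges to the sum of the inputs; identical
            # data collapses to a single unit.
            merged = 1 if overwrite else size * min_threshold
            # During the compaction, inputs and output coexist on disk.
            peak = max(peak, (on_disk + merged) / live)
            tiers[size] -= min_threshold
            tiers[merged] = tiers.get(merged, 0) + 1
            size = merged
    return peak


print(peak_space_amplification(16))                  # 2.0
print(peak_space_amplification(16, overwrite=True))  # 5.0
```

In this model a write-only workload peaks at twice the size of the data, at the moment when all of it is being merged into one sstable. The overwrite workload peaks higher, because several obsolete copies of the same data sit on disk before the compaction removes them. The exact figures depend on the simplifying assumptions and are not the measurements reported in the post.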