Faster, Safer Node Operations with Repair vs Streaming
Blog post from ScyllaDB
ScyllaDB 5.4 makes Repair-Based Node Operations (RBNO) the default way to move data during node operations such as adding, removing, and replacing nodes. This is a significant shift from the streaming approach inherited from Apache Cassandra. RBNO uses row-level repair to synchronize data between nodes, which improves performance, consistency, and reliability by ensuring that the node ends up with the latest version of the data. As a result, a separate repair after the operation is no longer needed. Two related features further optimize node operations: off-strategy compaction and gossip-free node operations. Off-strategy compaction speeds up operations by deferring the integration of SSTables until the node operation completes. Gossip-free node operations ensure cluster-wide consistency and allow an automatic revert to the previous state if an error occurs. Together, these changes aim to improve data integrity, efficiency, and safety, addressing earlier shortcomings in data durability and consistency.
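To make the difference between the two approaches concrete, here is a minimal toy sketch in Python. It is not ScyllaDB's implementation: the `Replica` dictionaries, the timestamps, and the functions `stream_from_one` and `repair_from_all` are hypothetical names invented for illustration. It assumes that plain streaming copies data from a single replica, while a repair-style sync reconciles rows across all replicas and keeps the most recent write.

```python
from typing import Dict, Tuple

Row = Tuple[int, str]      # (write timestamp, value); illustrative only
Replica = Dict[str, Row]   # partition key -> row


def stream_from_one(source: Replica) -> Replica:
    """Toy model of plain streaming: copy whatever one replica holds."""
    return dict(source)


def repair_from_all(*replicas: Replica) -> Replica:
    """Toy model of row-level repair: reconcile rows across all replicas,
    keeping the row with the highest write timestamp for each key."""
    merged: Replica = {}
    for replica in replicas:
        for key, row in replica.items():
            if key not in merged or row[0] > merged[key][0]:
                merged[key] = row
    return merged


# Two existing replicas that have drifted apart.
replica_a = {"k1": (10, "old"), "k2": (5, "b")}
replica_b = {"k1": (20, "new"), "k3": (7, "c")}

streamed = stream_from_one(replica_a)
repaired = repair_from_all(replica_a, replica_b)

print(streamed["k1"])    # (10, 'old')
print(repaired["k1"])    # (20, 'new')
print(sorted(repaired))  # ['k1', 'k2', 'k3']
```

In this toy model the streamed copy inherits the stale `k1` row and misses `k3` entirely, which is the kind of gap a follow-up repair would otherwise have to close. The repair-style copy is already up to date when the operation finishes.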