Why We Changed ScyllaDB’s Data Streaming Approach
Blog post from ScyllaDB
ScyllaDB has moved from mutation-based streaming to file-based streaming, resulting in a 25-fold increase in streaming speed and a 10-fold improvement in network bandwidth. With file-based streaming, entire SSTable files are sent directly between nodes instead of being deserialized into mutation fragments on the sender and re-serialized into new SSTables on the receiver. Skipping that work significantly reduces CPU usage, especially for data models with small cells, where per-cell processing cost is high relative to the data carried. SSTable files are also more compact than the equivalent mutation fragments, so less data crosses the network. Tests on ScyllaDB nodes showed much lower CPU usage and faster data transfer with the new method. File-based streaming is available in ScyllaDB Cloud and in the ScyllaDB 2025.1 release.
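To make the contrast concrete, here is a minimal Python sketch of the two approaches. It is a toy model, not ScyllaDB's implementation: the length-prefixed "SSTable" layout, the 16-byte per-fragment header, the chunk size, and all function names are assumptions chosen only to illustrate why parsing and re-framing every cell costs more than shipping the file bytes untouched.

```python
import io
import struct

# Toy "SSTable": a compact run of length-prefixed cells.
# Hypothetical layout for illustration, not ScyllaDB's real on-disk format.
def build_toy_sstable(cells):
    out = io.BytesIO()
    for cell in cells:
        out.write(struct.pack(">H", len(cell)))  # 2-byte big-endian length
        out.write(cell)
    return out.getvalue()

def parse_cells(blob):
    """Deserialize the toy file back into individual cells."""
    cells, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        cells.append(blob[pos:pos + n])
        pos += n
    return cells

FRAGMENT_HEADER = 16  # assumed per-fragment wire overhead, purely illustrative

def mutation_based_stream(blob):
    """Sender parses every cell and frames it; receiver rebuilds the file."""
    wire = [bytes(FRAGMENT_HEADER) + cell for cell in parse_cells(blob)]
    bytes_sent = sum(len(fragment) for fragment in wire)
    rebuilt = build_toy_sstable([f[FRAGMENT_HEADER:] for f in wire])
    return rebuilt, bytes_sent

def file_based_stream(blob, chunk_size=4096):
    """Sender ships the file bytes as-is; receiver appends them unchanged."""
    received = bytearray()
    bytes_sent = 0
    for i in range(0, len(blob), chunk_size):
        chunk = blob[i:i + chunk_size]
        bytes_sent += len(chunk)
        received += chunk
    return bytes(received), bytes_sent

# Many small cells: the case where per-cell work and framing hurt most.
sstable = build_toy_sstable([b"x" * 4 for _ in range(1000)])
copy_a, sent_mutation = mutation_based_stream(sstable)
copy_b, sent_file = file_based_stream(sstable)
print("mutation-based bytes on the wire:", sent_mutation)
print("file-based bytes on the wire:", sent_file)
```

Both paths leave the receiver with an identical file, but in this toy the mutation-based path touches every cell twice and adds framing per cell, while the file-based path does no per-cell work at all. The specific byte counts depend entirely on the assumed header size and should not be read as ScyllaDB measurements.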