Database Internals: Working with IO
Blog post from ScyllaDB
The blog post, an excerpt from the book "Database Performance at Scale," surveys Linux I/O methods and their implications for database performance, particularly on modern SSDs. It outlines the trade-offs between traditional read/write, memory-mapped I/O (mmap), direct I/O (DIO), and asynchronous I/O (AIO/DIO), showing how these methods differ in cache management, I/O scheduling, thread management, and application complexity. The post also covers the impact of storage choices, such as using a filesystem versus a raw block device, and the differences between append-only writes and in-place updates. Additionally, it discusses how two SSD characteristics, IOPS and throughput, bear on database performance, and it introduces tools such as Diskplorer for understanding disk behavior under load. The io_uring API is presented as a modern approach to asynchronous I/O, with better performance and documentation than the legacy asynchronous interfaces.
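
To make the differences between the access methods concrete, here is a minimal Python sketch (not code from the post) that reads the same 4 KiB block three ways: through a traditional buffered read, through mmap, and through direct I/O with `O_DIRECT`. The function names, the temporary file, and the 4096-byte block size are assumptions made for illustration. Asynchronous I/O and io_uring are left out because the Python standard library does not expose them. Direct I/O is Linux-specific and some filesystems (tmpfs, for example) reject it, so that path returns `None` when it is unavailable.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed block size for this sketch; real devices may differ


def read_buffered(path, offset, length):
    """Traditional read: the kernel copies data out of its page cache."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def read_mmap(path, offset, length):
    """mmap: page-cache pages are mapped into the process address space."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[offset:offset + length]


def read_direct(path, offset, length):
    """Direct I/O: bypass the page cache via O_DIRECT (Linux only).

    Offset, length, and buffer address must be block-aligned.
    Returns None if direct I/O is unavailable or rejected.
    """
    flag = getattr(os, "O_DIRECT", None)
    if flag is None or not hasattr(os, "preadv"):
        return None
    try:
        fd = os.open(path, os.O_RDONLY | flag)
    except OSError:
        return None
    try:
        buf = mmap.mmap(-1, length)  # anonymous mappings are page-aligned
        try:
            n = os.preadv(fd, [buf], offset)
            return bytes(buf[:n])
        finally:
            buf.close()
    except OSError:
        return None
    finally:
        os.close(fd)


def demo():
    """Write four distinct blocks, then read the second one three ways."""
    data = b"".join(bytes([i]) * BLOCK for i in range(4))
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
        tmp.flush()
        os.fsync(tmp.fileno())  # make sure the data has reached the device
        path = tmp.name
    try:
        expected = data[BLOCK:2 * BLOCK]
        results = {
            "buffered": read_buffered(path, BLOCK, BLOCK),
            "mmap": read_mmap(path, BLOCK, BLOCK),
            "direct": read_direct(path, BLOCK, BLOCK),
        }
    finally:
        os.unlink(path)
    return expected, results


if __name__ == "__main__":
    expected, results = demo()
    for name, value in results.items():
        status = "unavailable" if value is None else value == expected
        print(name, status)
```

All three paths return the same bytes. What differs is who manages caching and alignment: the kernel does so for buffered reads and mmap, while with direct I/O the application must supply aligned buffers and, in a real database, its own cache.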