Different I/O Access Methods for Linux, What We Chose for ScyllaDB, and Why
Blog post from ScyllaDB
Server application developers often focus on network I/O, but database developers must also consider file I/O, and in particular which access method to use on Linux. The post outlines four methods: traditional read/write, mmap, Direct I/O (DIO), and asynchronous direct I/O (AIO/DIO). They differ in their tradeoffs around cache control, copying, MMU activity, I/O scheduling, thread scheduling, I/O alignment, and application complexity. Traditional read/write and mmap rely heavily on the kernel for caching and scheduling, while DIO and AIO/DIO hand more control to the application at the cost of added complexity.

ScyllaDB chose AIO/DIO to maximize performance and control, and uses the Seastar framework to manage the resulting complexity and to optimize I/O for operations such as compaction and queries. This approach lets ScyllaDB sidestep the limitations of kernel-managed caching and scheduling, manage its cache itself, and align small reads for efficiency. It also points to the possibility of driving NVMe drives directly in the future.
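
To make the I/O alignment tradeoff concrete, here is a minimal sketch of what aligning a small read involves: the requested byte range is widened to block boundaries, one aligned read is issued, and the result is trimmed back to the bytes that were asked for. The names `aligned_range`, `read_small`, and `BLOCK_SIZE` are hypothetical and are not taken from ScyllaDB or Seastar, and the 4096-byte block size is an assumption. To stay runnable on any filesystem, the sketch uses an ordinary buffered `os.pread` rather than `O_DIRECT` or Linux AIO, so it illustrates only the alignment arithmetic, not the access method itself.

```python
import os
import tempfile
from typing import Tuple

BLOCK_SIZE = 4096  # assumed alignment; a real device may require 512 or 4096


def aligned_range(offset: int, length: int, block: int = BLOCK_SIZE) -> Tuple[int, int]:
    """Return (aligned_offset, aligned_length) covering [offset, offset + length)."""
    start = (offset // block) * block               # round the start down
    end = -(-(offset + length) // block) * block    # round the end up
    return start, end - start


def read_small(fd: int, offset: int, length: int, block: int = BLOCK_SIZE) -> bytes:
    """Serve an unaligned request with one block-aligned read, then trim it."""
    start, span = aligned_range(offset, length, block)
    # Buffered read for illustration only. With O_DIRECT the destination
    # buffer would also have to be aligned in memory.
    data = os.pread(fd, span, start)
    skip = offset - start
    return data[skip:skip + length]


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(bytes(range(256)) * 64)  # 16 KiB of sample data
        f.flush()
        print(aligned_range(5000, 100))  # (4096, 4096)
        print(read_small(f.fileno(), 5000, 4))
```

A 100-byte request at offset 5000 turns into a full 4096-byte read starting at offset 4096. The application pays for the extra bytes, which is part of why small-read alignment appears among the tradeoffs listed above.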