Investigating Linux Phantom Disk Reads
Blog post from QuestDB
QuestDB, an open-source time-series database designed for high-performance workloads, observed unexpected disk reads during a write-only workload. The anomaly was traced to the Linux kernel's readahead behavior: handling a large number of column files put the system under memory pressure, and readahead then triggered disk reads the workload did not need. The team used Linux utilities such as blktrace and debugfs to track disk read events and identify their source. Disabling readahead through the madvise system call eliminated the redundant reads. The case shows why it pays to understand Linux buffered I/O and to use system-level tools when diagnosing performance problems. The fix led to improvements in QuestDB, and the experience reinforced the value of user feedback in improving the database's performance and functionality.
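As a rough illustration of the madvise-based fix, the sketch below maps a file into memory and advises the kernel that access will be random, which on Linux discourages readahead for that mapping. It is not QuestDB's actual code: the summary does not say which advice value was used, so MADV_RANDOM is an assumption here, and the function name map_without_readahead is hypothetical. It relies on Python's mmap.madvise wrapper (Python 3.8 or later).

```python
import mmap
import os


def map_without_readahead(path, length):
    """Map `length` bytes of `path` read-write and hint random access.

    Hypothetical helper for illustration only. The file must already
    be at least `length` bytes long.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # Shared, writable mapping backed by the page cache.
        m = mmap.mmap(fd, length, access=mmap.ACCESS_WRITE)
    finally:
        # The mapping remains valid after the descriptor is closed.
        os.close(fd)

    # Assumption: MADV_RANDOM is the advice that suppresses readahead.
    # The constant is only defined on platforms that support it.
    if hasattr(mmap, "MADV_RANDOM"):
        m.madvise(mmap.MADV_RANDOM)
    return m
```

The advice is only a hint and applies to the mapped range, so each mapped file would need the same call. Whether reads actually disappear is best confirmed with a tool such as blktrace, as in the investigation.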