The author, a performance engineer at SingleStore, ran into an unusual Linux performance problem while running a synthetic workload against a columnstore table. The workload used 16 threads to execute a simple query, yet the CPU cores sat idle about 50% of the time. Using `perf_events` and an `awk` script, the author built an off-CPU flame graph, which showed that every `mmap` syscall was taking around 10-20 ms because of contention on the `mm->mmap_sem` lock. The root cause was SingleStore's use of `mmap`: the workload was inadvertently benchmarking Linux's `mmap` implementation, which added significant overhead. Switching from `mmap` to the traditional file `read` interface nearly doubled throughput and made the workload CPU bound, as expected.
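
The two access patterns contrasted above can be sketched in a few lines of Python. This is only an illustration, not SingleStore's code: the helper names, the 64 KiB chunk size, and the temporary test file are all hypothetical, and the sketch assumes a Unix-like system (it uses `os.pread`). It shows that a per-request `mmap` and a positional `read` return the same bytes, while only the `mmap` variant creates and destroys a mapping on every call. Each map and unmap has to modify the process address space, which is the structure `mm->mmap_sem` protects.

```python
import mmap
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 64 * 1024   # hypothetical chunk size; a multiple of the mmap offset granularity
NUM_THREADS = 16    # same thread count as the workload described above


def read_with_mmap(fd, offset, length):
    # Maps the requested range, copies it out, then unmaps it.
    # Every call changes the address space of the process.
    with mmap.mmap(fd, length, access=mmap.ACCESS_READ, offset=offset) as m:
        return m[:]


def read_with_pread(fd, offset, length):
    # Plain positional read: no mapping is created or destroyed.
    buf = bytearray()
    while len(buf) < length:
        part = os.pread(fd, length - len(buf), offset + len(buf))
        if not part:  # end of file
            break
        buf += part
    return bytes(buf)


def read_all(reader, path, num_chunks):
    # Reads num_chunks consecutive chunks concurrently and joins them in order.
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=NUM_THREADS) as pool:
            futures = [
                pool.submit(reader, fd, i * CHUNK, CHUNK)
                for i in range(num_chunks)
            ]
            return b"".join(f.result() for f in futures)
    finally:
        os.close(fd)


def make_test_file(num_chunks):
    # Writes a temporary file in which chunk i is filled with the byte i % 256.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(num_chunks):
            f.write(bytes([i % 256]) * CHUNK)
    return path
```

Calling `read_all(read_with_mmap, path, n)` and `read_all(read_with_pread, path, n)` on the same file yields identical data. The sketch makes no claim about timing; reproducing the contention itself would need a native workload like the one described above.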