Reducing latency spikes by tuning the CPU scheduler
Blog post from ScyllaDB
An investigation of latency spikes in ScyllaDB 1.0.x found that the scylla-jmx service, which runs alongside the ScyllaDB server, added significant request latency through scheduling delays. The effect was most noticeable under a light, read-only workload that fit entirely in memory. Tuning the CPU scheduler, including parameters such as sched_latency_ns and sched_min_granularity_ns, and adjusting the cgroup settings used for process management largely mitigated the problem: maximum latency dropped to around 5 ms, compared with the tens of milliseconds observed previously. The upcoming ScyllaDB 1.2 release incorporates these fixes, ensuring that scylla-jmx no longer interferes with server performance when idle, and automatically applies the scheduler configuration needed to minimize the impact of other processes on system latency.
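To make the two scheduler parameters named above more concrete, here is a minimal Python sketch. It is not taken from the ScyllaDB post. It assumes a Linux kernel that exposes the CFS tunables as files under /proc/sys/kernel (some newer kernels expose them elsewhere or not at all), and the helper names `read_tunables` and `cfs_period_ns` are hypothetical. The period calculation is a simplified model of how CFS stretches its scheduling period once enough tasks are runnable, and the numbers in the usage note are illustrative, not the values ScyllaDB applies.

```python
from pathlib import Path

# Assumption: the kernel exposes CFS tunables under /proc/sys/kernel.
# On kernels where these files are absent, read_tunables() reports None.
PROC_SYS_KERNEL = Path("/proc/sys/kernel")
TUNABLES = ("sched_latency_ns", "sched_min_granularity_ns")


def read_tunables(base=PROC_SYS_KERNEL):
    """Return {name: int or None} for each tunable looked up under base."""
    values = {}
    for name in TUNABLES:
        path = Path(base) / name
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # File missing, unreadable, or not an integer.
            values[name] = None
    return values


def cfs_period_ns(nr_running, latency_ns, min_granularity_ns):
    """Simplified model of the CFS scheduling period.

    While few tasks are runnable, every task gets a turn within latency_ns.
    Once there are more tasks than fit at min_granularity_ns each, the
    period grows to nr_running * min_granularity_ns.
    """
    # Ceiling division: how many tasks fit in one latency window.
    nr_latency = max(1, -(-latency_ns // min_granularity_ns))
    if nr_running > nr_latency:
        return nr_running * min_granularity_ns
    return latency_ns


if __name__ == "__main__":
    for name, value in read_tunables().items():
        shown = value if value is not None else "not available"
        print(f"{name} = {shown}")
```

Under this model, with an illustrative 6 ms latency target and 0.75 ms minimum granularity, 16 runnable tasks stretch the period to `cfs_period_ns(16, 6_000_000, 750_000)`, which is 12 ms. Lowering both values shortens how long a runnable task can wait for its turn, which is the general direction of the tuning the post describes.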
| Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM |
|---|---|---|---|---|---|
| Observability | 1 | 9 | 6 | 3 | -25% |