Over the past month, MySQL query times rose noticeably. An initial look at the usual suspects (long-running queries, database size, and indexes) turned up nothing significant. Using Percona's open-source tools, including pt-stalk to collect diagnostic data whenever thread counts ran high, the investigation ruled out machine, network, and IO problems. The breakthrough was finding semaphore contention in MySQL 5.1 on kernel_mutex around transactions, a problem James Golick had previously written about. Following Golick's findings, the stock malloc was replaced with TCMalloc, an allocator known for reducing lock contention in multithreaded programs. The change sharply reduced the number of pt-stalk triggers and, surprisingly, improved overall query performance by 30 percent, far exceeding expectations and allowing for a moment of quiet celebration.
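The source does not say how the allocator was swapped in or how the swap was verified. One way to confirm that a running mysqld has actually picked up TCMalloc is to look for the library among the process's memory mappings. The sketch below assumes a Linux host where `/proc/<pid>/maps` is readable; the function names, the `SAMPLE_MAPS` text, and the library path in it are hypothetical and only illustrate the shape of the check.

```python
from pathlib import Path
from typing import List

# Made-up sample in the /proc/<pid>/maps layout:
# address perms offset dev inode [pathname]
SAMPLE_MAPS = """\
00400000-00a00000 r-xp 00000000 08:01 131 /usr/sbin/mysqld
7f1c2a400000-7f1c2a42d000 r-xp 00000000 08:01 262 /usr/lib/libtcmalloc_minimal.so.4
7f1c2a42d000-7f1c2a62c000 ---p 0002d000 08:01 262 /usr/lib/libtcmalloc_minimal.so.4
7f1c2b000000-7f1c2b021000 rw-p 00000000 00:00 0
"""


def loaded_allocator_libs(maps_text: str, needle: str = "tcmalloc") -> List[str]:
    """Return the distinct mapped file paths whose file name contains `needle`."""
    found: List[str] = []
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # anonymous mapping: no pathname column
        path = parts[5].strip()
        file_name = path.rsplit("/", 1)[-1]
        if needle in file_name and path not in found:
            found.append(path)
    return found


def process_uses_tcmalloc(pid: int) -> bool:
    """Read /proc/<pid>/maps (Linux only) and report whether tcmalloc is mapped."""
    maps_text = Path(f"/proc/{pid}/maps").read_text()
    return bool(loaded_allocator_libs(maps_text))


if __name__ == "__main__":
    print(loaded_allocator_libs(SAMPLE_MAPS))
```

On a real server, `process_uses_tcmalloc` would be called with the mysqld process ID after the restart; an empty result would suggest the process is still on the stock allocator.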