The Big OOM Theory
Blog post from Sysdig
The post recounts how a Sysdig customer engineer troubleshot Out of Memory (OOM) kills in a Docker environment. A major hosting provider ran a LAMP stack in containers, and the kernel was frequently killing the MySQL container in response to OOM events. Sysdig's own agent was the first suspect, but investigation with Sysdig Monitor and the open-source Sysdig tools showed that the PHP-FPM container was the one leaking memory. By analyzing system calls and memory allocation patterns, with help from tools such as gdb, the engineer traced the leak to the ionCube PHP extension. Disabling the extension resolved it.

The post draws two lessons. Tools such as Sysdig give the visibility needed to understand how containers actually behave, and setting resource limits on each container keeps a problem such as a memory leak in one container from affecting the others.
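The point about resource limits can be made concrete with a small sketch. It is not taken from the original post: it assumes a Linux host using cgroup v2, where each container's cgroup directory exposes the files `memory.current`, `memory.max`, and `memory.events`. The function name `read_memory_status` and the returned dictionary keys are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import Optional


def read_memory_status(cgroup_dir: str) -> dict:
    """Read usage, limit, and OOM-kill count from a cgroup v2 directory.

    Hypothetical helper: assumes cgroup v2 file names and formats.
    """
    base = Path(cgroup_dir)

    # memory.current holds the current usage in bytes.
    current = int((base / "memory.current").read_text().strip())

    # memory.max holds the limit in bytes, or the literal "max" if unlimited.
    raw_max = (base / "memory.max").read_text().strip()
    limit: Optional[int] = None if raw_max == "max" else int(raw_max)

    # memory.events is a list of "key value" lines; oom_kill counts
    # processes in this cgroup killed by the OOM killer.
    oom_kills = 0
    events = base / "memory.events"
    if events.exists():
        for line in events.read_text().splitlines():
            key, _, value = line.partition(" ")
            if key == "oom_kill":
                oom_kills = int(value)

    return {
        "current_bytes": current,
        "limit_bytes": limit,
        "usage_ratio": None if limit is None else current / limit,
        "oom_kills": oom_kills,
    }
```

Called on the cgroup directory of each container, a helper like this would show which container has no limit set (`limit_bytes` is `None`) and which one is approaching its limit. The directory path depends on the host's cgroup layout and is deliberately not assumed here.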