Home / Companies / Datadog / Blog / Post Details
Content Deep Dive

The trouble with mounting

Blog post from Datadog

Post Details
Company
Date Published
Author
Greg Meyer, Yann Mahe
Word Count
663
Language
English
Hacker News Points
-
Summary

The Datadog team discovered a bug in their agent that caused it to hang indefinitely on systems with certain types of network file systems, specifically NFS. The issue was traced to the `os.statvfs` function, which is a Linux system call that can cause the system to hang if used in an unkillable state. This happens when trying to stat a remote directory mounted with NFS, and it's a known issue due to the way NFS handles connections. To fix this, the Datadog team had to implement a workaround by running the `statvfs` call on a separate thread and using a timeout to prevent the main thread from hanging. This solution comes with trade-offs, such as slightly increased memory usage on systems with hard-mounted NFS disks, but it allows the agent to operate in a wide range of heterogeneous environments.