Company
Date Published
Author
Yael Goldstein, Paul Gottschling
Word count
981
Language
English
Hacker News points
None

Summary

Datadog's Live Processes gives teams insight into their workloads by tracking resource consumption metrics, traces, and network data for each process running in their infrastructure. It correlates these data types by PID to surface issues such as network bandwidth saturation, application errors, and infrastructure latency. This helps teams quickly determine the scope of an issue, notify the relevant teams, and plan next steps before end users are affected. By correlating processes with distributed tracing and APM data, Datadog lets users determine which applications are facing resource constraints or using more resources than expected. The Live Processes view also lets users inspect the traces a process generates, view logs for applications on problematic hosts, and monitor the performance of related processes in real time. This process-level visibility helps teams respond quickly to issues and prevent similar problems in the future.
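
The idea of correlating data types by PID can be illustrated with a minimal Python sketch. This is not Datadog's implementation or API. The record shapes, field names (`cpu_pct`, `error`, `bytes_sent`), function names, and the 80% CPU threshold are all assumptions made up for the example. It only shows how per-process metrics, traces, and network connections could be grouped under a shared PID key and then filtered for processes that look unhealthy.

```python
from collections import defaultdict


def correlate_by_pid(process_metrics, traces, connections):
    """Group three hypothetical record streams under their shared PID."""
    merged = defaultdict(
        lambda: {"metrics": None, "traces": [], "connections": []}
    )
    for metric in process_metrics:
        merged[metric["pid"]]["metrics"] = metric
    for trace in traces:
        merged[trace["pid"]]["traces"].append(trace)
    for conn in connections:
        merged[conn["pid"]]["connections"].append(conn)
    return dict(merged)


def find_suspects(merged, cpu_threshold=80.0):
    """Return sorted PIDs with high CPU and at least one errored trace."""
    suspects = []
    for pid, record in merged.items():
        metrics = record["metrics"]
        if metrics is None:
            continue  # no resource data for this PID, so nothing to compare
        has_error = any(t["error"] for t in record["traces"])
        if metrics["cpu_pct"] >= cpu_threshold and has_error:
            suspects.append(pid)
    return sorted(suspects)


# Hypothetical sample data, invented for illustration only.
process_metrics = [
    {"pid": 101, "name": "web", "cpu_pct": 92.5},
    {"pid": 202, "name": "worker", "cpu_pct": 12.0},
]
traces = [
    {"pid": 101, "trace_id": "t1", "error": True},
    {"pid": 202, "trace_id": "t2", "error": False},
]
connections = [
    {"pid": 101, "remote": "10.0.0.5:5432", "bytes_sent": 1_200_000},
]

merged = correlate_by_pid(process_metrics, traces, connections)
print(find_suspects(merged))
```

Keying every record on the PID is what lets a single lookup answer questions that span data types, such as which process is both CPU-bound and producing errored traces.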