The Why, How, and What of Metrics and Observability

Post Details

Company

DigitalOcean

Date Published

Nov. 29, 2017

Author

Kunju Perath

Word Count

1,552

Language

English

Hacker News Points

-

Source URL

www.digitalocean.com/blog/observability-and-metrics

Summary

DigitalOcean, a cloud infrastructure company, uses Prometheus and Alertmanager for whitebox monitoring of its services and container clusters. The post treats observability as comprising three parts: logging, metrics, and tracing. The company uses the four golden signals (latency, traffic, errors, and saturation) to monitor request-based microservices, and the USE method (utilization, saturation, and errors) to monitor Kubernetes clusters. This setup makes it possible to identify long-term trends, analyze performance issues, and build visualizations, which in turn improves observability and server reliability.
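
To make the four golden signals concrete, here is a minimal Python sketch that derives them from a batch of request records collected over a time window. It is an illustration only, not DigitalOcean's implementation: the `Request` type, the `golden_signals` function, and the `in_flight` and `capacity` parameters are hypothetical names, the error definition (HTTP status 500 and above) is an assumption, and in a Prometheus setup these values would normally come from queries over exported metrics rather than from in-process arithmetic.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    """One served request (hypothetical record shape)."""
    duration_s: float  # time taken to serve the request, in seconds
    status: int        # HTTP status code returned


def golden_signals(
    requests: List[Request],
    window_s: float,
    in_flight: int,
    capacity: int,
) -> Dict[str, float]:
    """Compute latency, traffic, errors, and saturation for one window.

    Assumptions: server-side failures are statuses >= 500, and saturation
    is approximated as concurrent requests divided by a fixed capacity.
    """
    if not requests:
        raise ValueError("need at least one request in the window")

    # Latency: 95th percentile using the nearest-rank method.
    durations = sorted(r.duration_s for r in requests)
    rank = max(1, math.ceil(0.95 * len(durations)))
    latency_p95 = durations[rank - 1]

    # Errors: fraction of requests that failed on the server side.
    failed = sum(1 for r in requests if r.status >= 500)

    return {
        "latency_p95_s": latency_p95,
        "traffic_rps": len(requests) / window_s,  # requests per second
        "error_ratio": failed / len(requests),
        "saturation": in_flight / capacity,       # 1.0 means fully loaded
    }
```

The USE method for cluster nodes could be sketched the same way by swapping the inputs for resource measurements (for example CPU utilization, run-queue length, and hardware error counts); which resources to track is a design choice the summary does not specify.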