How Dropbox rebuilt its logging stack with Grafana Loki after a data center went dark

Post Details

Company: Grafana Labs

Date Published: June 27, 2025

Author: Colin Steele

Word Count: 934

Company Posts That Month: 23

Language: English

Hacker News Points: -

Post removed?: No

Source URL: grafana.com/blog/how-dropbox-rebuilt-its-logging-stack-with-grafana-loki-after-a-data-center-went-dark

Summary

When a power outage made the only data center hosting its Grafana Loki deployment inaccessible, Dropbox set out to rebuild its logging infrastructure. The result is a petabyte-scale, multi-region logging platform that handles up to 6 GB of logs per second, retains them for 30 days, and stays available when a data center fails. During the transition the team addressed high cardinality and memory-related crashes by enforcing strict label control, and it added stream-level controls so that no single service could overwhelm the system. Gradual deployment and testing were crucial, as was collaboration with Grafana Labs, which included a switch to a Prometheus-style database to improve performance. The new system is now an integral part of Dropbox's observability stack and has allowed the company to phase out its legacy logging system.
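
The two safeguards named in the summary, strict label control and stream-level controls, can be illustrated with a minimal Python sketch. This is not Dropbox's implementation and not Loki's API: the label allowlist, the class names, and the rate and burst values are all hypothetical, chosen only to show how dropping unapproved labels bounds the number of streams and how a per-stream token bucket keeps one noisy stream from consuming shared capacity.

```python
import time

# Hypothetical allowlist; the post does not say which labels Dropbox permits.
ALLOWED_LABELS = frozenset({"service", "env", "region"})


def stream_key(labels):
    """Drop labels outside the allowlist and return a hashable stream identity.

    High-cardinality labels (for example a per-request ID) are discarded,
    so they cannot multiply the number of distinct streams.
    """
    kept = {k: v for k, v in labels.items() if k in ALLOWED_LABELS}
    return tuple(sorted(kept.items()))


class TokenBucket:
    """Byte-based token bucket: refills at `rate` bytes/s, capped at `burst`."""

    def __init__(self, rate, burst, now):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # start full
        self.last = now

    def allow(self, size, now):
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


class StreamLimiter:
    """Keeps one token bucket per stream so limits apply per stream."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.buckets = {}

    def accept(self, labels, line, now=None):
        """Return True if the log line fits within its stream's budget."""
        now = time.monotonic() if now is None else now
        key = stream_key(labels)
        bucket = self.buckets.get(key)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.burst, now)
            self.buckets[key] = bucket
        return bucket.allow(len(line.encode("utf-8")), now)
```

In this sketch, two log lines that differ only in a non-allowlisted label such as `request_id` map to the same stream and share one budget, while a line from a different `service` gets its own bucket and is unaffected when the first stream is throttled.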

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Observability	3	1,870	422	128	+10%
Kubernetes	2	1,613	282	85	+4%
