Company
Date Published
Author
Karl-Martin Karlson
Word count
1268
Language
English
Hacker News points
None

Summary

Pipedrive, a company known for its CRM platform, ran into scalability limits with its Prometheus-based observability setup as its infrastructure expanded across multiple data centers and AWS regions. Under high load, Prometheus began to crash and run out of memory, causing delays and lost observability data. To address these issues, Pipedrive adopted Grafana Mimir, which offered better scalability and performance, including fast queries and per-tenant limits for managing data ingestion and queries more effectively. The migration was transparent to users, and it allowed Pipedrive to handle between 12 and 15 million active series while fine-tuning high-cardinality metrics. The transition reduced downtime and false alerts, improved overall system stability, and set the stage for further optimization and expansion of the observability stack. Grafana Mimir has since become a cornerstone of Pipedrive's monitoring strategy, with plans to explore additional Grafana tools such as Loki and Tempo.