Benchmarking Grafana Enterprise Metrics for horizontally scaling Prometheus up to 500 million active series
Blog post from Grafana Labs
Grafana Labs ran performance tests on Grafana Enterprise Metrics (GEM), a self-hosted Prometheus service, to evaluate how well it scales horizontally under heavy load. Using benchtool, a tool they developed, they tested GEM with up to 500 million active series and found that hardware usage grew linearly with load. The tests showed that write path latency stayed below 3.1 seconds at the 99th percentile, that more than 99% of query requests succeeded, and that median query latency stayed under 1.35 seconds. Memberlist, the alternative KV store, handled more than 500 ring members, which makes it a workable option in environments without Consul. Future testing will focus on GEM compactor and store-gateway throughput, with the goal of improving performance for queries over older data. Taken together, the results indicate that GEM scales to large metrics workloads.
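The latency and success figures above are percentile and ratio statistics over many individual requests. As a rough illustration of how such numbers can be derived, here is a minimal sketch in Python. It is not how benchtool or GEM computes them; the function names (`percentile`, `summarize`) and the nearest-rank method are assumptions chosen for clarity, and any input data is hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number ranks stay exact.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_s, status_codes):
    """Summarize a batch of requests (hypothetical helper).

    latencies_s:  per-request latency in seconds
    status_codes: per-request HTTP status code
    """
    succeeded = sum(1 for code in status_codes if 200 <= code < 300)
    return {
        "p50_s": percentile(latencies_s, 50),    # median latency
        "p99_s": percentile(latencies_s, 99),    # tail latency
        "success_ratio": succeeded / len(status_codes),
    }
```

Calling `summarize` with recorded query latencies and status codes returns the median, the 99th percentile, and the fraction of successful requests, which are the three kinds of figures quoted in the summary. In a real Prometheus setup these values are more commonly estimated from histogram buckets than from raw samples.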