Beyond a Billion Spans: Using Highlights for High-Speed Root Cause Analysis at Scale

Blog post from Coralogix

Post Details
Company: Coralogix
Date Published:
Author: Jonny Steiner
Word Count: 1,611
Language: English
Hacker News Points: -
Summary

The post "Beyond a Billion Spans: Using Highlights for High-Speed Root Cause Analysis at Scale" describes Trace Highlight Comparison, a feature introduced in late 2025 to help teams work with the very large volumes of telemetry spans that microservices architectures produce. Rather than paying the cost and delay of comprehensive indexing, the feature surfaces trends across spans and brings them directly into an active incident response. The post walks through a Priority 1 alert triggered by a critical latency spike in an eCommerce flow, using a top-down workflow: high-level performance analysis first isolates the latency to the web-app service, and the RED metrics graph and Highlights panel then narrow the investigation from macro-level aggregation down to code-level evidence. The root cause is identified as a 264ms latency increase during an HTTP GET action. The approach lets organizations move from large-scale telemetry data to a specific incident resolution without exhaustive manual searches, which the post frames as a shift in operational governance toward a pattern-first resolution strategy.
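
The pattern-first idea in the summary can be illustrated with a minimal sketch: aggregate spans by service and operation in a baseline window and an incident window, then rank the groups by how much their latency grew. This is not Coralogix's implementation or API. The `Span` class, both function names, and all sample durations are hypothetical and synthetic, chosen only to show the shape of the comparison.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record (not a real vendor schema)."""
    service: str
    operation: str
    duration_ms: float


def mean_latency_by_operation(spans):
    """Group spans by (service, operation) and return each group's mean duration."""
    groups = defaultdict(list)
    for span in spans:
        groups[(span.service, span.operation)].append(span.duration_ms)
    return {key: mean(durations) for key, durations in groups.items()}


def rank_latency_regressions(baseline, incident):
    """Return (key, delta_ms) pairs, largest latency increase first.

    Only groups present in both windows are compared.
    """
    before = mean_latency_by_operation(baseline)
    after = mean_latency_by_operation(incident)
    deltas = [(key, after[key] - before[key]) for key in after if key in before]
    return sorted(deltas, key=lambda item: item[1], reverse=True)


# Synthetic spans for illustration only; these are not figures from the post.
baseline = [
    Span("web-app", "HTTP GET", 100.0),
    Span("web-app", "HTTP GET", 120.0),
    Span("checkout", "HTTP POST", 80.0),
    Span("checkout", "HTTP POST", 90.0),
]
incident = [
    Span("web-app", "HTTP GET", 300.0),
    Span("web-app", "HTTP GET", 320.0),
    Span("checkout", "HTTP POST", 85.0),
    Span("checkout", "HTTP POST", 95.0),
]

ranking = rank_latency_regressions(baseline, incident)
for (service, operation), delta in ranking:
    print(f"{service} {operation}: {delta:+.1f} ms")
```

Ranking by the change between windows, rather than by absolute latency, is what lets a regression stand out without inspecting individual traces. A production system would more likely compare percentiles than means and would handle operations that appear in only one window.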