Customer Journeys at Capital One: Observability from Business Impact to Root Cause
Blog post from Observe
The Observe blog post describes Capital One's approach to observability: investigations start at the customer interaction layer, where automated customer journey graphs show how an incident affects business operations before anyone looks at technical details. This reverses the traditional practice of starting at the service layer. Responders first establish how an incident affects customer experiences, then trace the issue back through the service and infrastructure layers to identify the root cause. Journey graphs in the Observe observability platform give a visual representation of customer interactions, with health signals and traffic overlaid so that impact can be assessed quickly. AI-powered SRE tools further streamline the investigation by navigating the observability context graph in the telemetry data lake and linking business impact to technical causes. Because the relationships between systems and business entities are defined, the context of business impact is maintained throughout troubleshooting, which speeds incident resolution and keeps the focus on mitigating customer-facing issues.
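The top-down investigation described above can be illustrated with a small sketch. It is not Observe's or Capital One's implementation. The `Node` class, the layer names, the `impacted_journeys` and `trace_root_causes` functions, and every entity in the example graph (`checkout`, `payment-api`, `payments-db`, and so on) are hypothetical, and health is reduced to a single boolean. The idea it shows is only this: start from an unhealthy customer journey step and follow dependency edges through unhealthy services down to the infrastructure node where the trail ends.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One entity in a hypothetical context graph."""
    name: str
    layer: str  # "journey", "service", or "infrastructure"
    healthy: bool = True
    depends_on: list[str] = field(default_factory=list)


def impacted_journeys(graph: dict[str, Node]) -> list[str]:
    """Step 1: find the customer journey steps that are currently unhealthy."""
    return [n.name for n in graph.values() if n.layer == "journey" and not n.healthy]


def trace_root_causes(graph: dict[str, Node], start: str) -> list[list[str]]:
    """Step 2: follow unhealthy dependencies downward from an impacted journey step.

    Returns every path that ends at an unhealthy node with no unhealthy
    dependencies of its own (a root-cause candidate). The journey step stays
    at the head of each path, so the business impact travels with the result.
    """
    if graph[start].healthy:
        return []

    paths: list[list[str]] = []

    def walk(name: str, path: list[str], seen: frozenset[str]) -> None:
        path = path + [name]
        unhealthy_deps = [
            dep for dep in graph[name].depends_on
            if dep not in seen and not graph[dep].healthy
        ]
        if not unhealthy_deps:
            paths.append(path)
            return
        for dep in unhealthy_deps:
            walk(dep, path, seen | {dep})

    walk(start, [], frozenset({start}))
    return paths


# Hypothetical example graph: two journey steps, three services, two infrastructure nodes.
example = {
    n.name: n
    for n in [
        Node("checkout", "journey", healthy=False, depends_on=["payment-api", "cart-api"]),
        Node("login", "journey", depends_on=["auth-api"]),
        Node("payment-api", "service", healthy=False, depends_on=["payments-db"]),
        Node("cart-api", "service", depends_on=["cart-cache"]),
        Node("auth-api", "service"),
        Node("payments-db", "infrastructure", healthy=False),
        Node("cart-cache", "infrastructure"),
    ]
}

if __name__ == "__main__":
    for journey in impacted_journeys(example):
        for path in trace_root_causes(example, journey):
            print(" -> ".join(path))
```

In this toy graph the only impacted journey step is `checkout`, and the trace skips the healthy `cart-api` branch entirely. A real platform would derive health from telemetry and traffic rather than a hand-set flag.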