
Preventing network outages: How we use New Relic to monitor our multi-cloud infrastructure

Blog post from New Relic

Post Details
Company: New Relic
Author: Mehreen Tahir, Software Engineer
Word Count: 1,744
Language: English
Summary

New Relic built an internal network monitoring system, Weather Station, to maintain continuous observability across the multi-cloud infrastructure that supports thousands of customer applications. Built on New Relic's own platform, Weather Station runs more than 100,000 connectivity checks per hour to detect network issues quickly within availability zones, between regions, and across cloud providers. It combines dedicated monitoring networks with Kubernetes pods running inside production cells to give a comprehensive view of network health, which reduces both mean time to detect (MTTD) and mean time to resolve (MTTR) for network issues. The system automatically validates network paths and alerts teams with precise context, helping prevent costly outages and giving engineers more confidence during infrastructure changes. The post presents Weather Station as an example of New Relic using its own observability tools to solve complex network problems, and it stresses collecting the right data from the right places and integrating network monitoring into every part of infrastructure management.
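
To make the idea of a scoped connectivity check concrete, here is a minimal sketch in Python. It is not Weather Station's implementation, which the summary does not describe in code. The `Endpoint` type, the `classify_path` and `check_connectivity` functions, the scope labels, and the choice of a plain TCP connect as the probe are all assumptions made for illustration.

```python
import socket
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of a probe source or target."""
    cloud: str
    region: str
    zone: str
    host: str
    port: int


def classify_path(src: Endpoint, dst: Endpoint) -> str:
    """Label a network path by the widest boundary it crosses (labels are assumed)."""
    if src.cloud != dst.cloud:
        return "cross-cloud"
    if src.region != dst.region:
        return "cross-region"
    if src.zone != dst.zone:
        return "cross-zone"
    return "within-zone"


def check_connectivity(src: Endpoint, dst: Endpoint, timeout: float = 2.0) -> dict:
    """Attempt a TCP connection to dst and return the result as a plain dict."""
    started = time.monotonic()
    error = None
    try:
        # A successful TCP handshake is treated as "path is up" in this sketch.
        with socket.create_connection((dst.host, dst.port), timeout=timeout):
            ok = True
    except OSError as exc:
        ok = False
        error = str(exc)
    latency_ms = (time.monotonic() - started) * 1000.0
    return {
        "scope": classify_path(src, dst),
        "source_zone": src.zone,
        "target": f"{dst.host}:{dst.port}",
        "ok": ok,
        "latency_ms": latency_ms,
        "error": error,
    }
```

Each result carries the path scope and the source zone alongside the outcome, so an alert built on such records could say which boundary a failing path crosses. How Weather Station actually ships its results to the New Relic platform is not covered by the summary, so that step is left out here.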