Home / Companies / Datadog / Blog / Post Details
Content Deep Dive

2023-03-08 Incident: A Deep Dive into the Platform-level Impact

Blog post from Datadog

Post Details
Company
Date Published
Author
Laurent Bernaille
Word Count
3,864
Language
English
Hacker News Points
1
Summary

On March 8, 2023, Datadog experienced an outage that affected all services across multiple regions due to a systemd update in Ubuntu 22.04. The new systemd-networkd behavior led to the flushing of IP rules and loss of network connectivity for both host and pod traffic on AWS and Azure, while only affecting host traffic on Google Cloud. This incident impacted multiple regions across distinct cloud providers and delayed the recovery process due to the different actions required by each provider.