2023-03-08 Incident: A Deep Dive into the Platform-level Impact

What's this blog post about?

On March 8, 2023, Datadog experienced an outage that affected all of its services across multiple regions. The trigger was a systemd update on Ubuntu 22.04: the new systemd-networkd behavior flushed IP rules, cutting network connectivity for both host and pod traffic on AWS and Azure, and for host traffic only on Google Cloud. Because the incident hit multiple regions across distinct cloud providers, recovery was delayed by the different actions each provider required.

Company
Datadog

Date published
May 24, 2023

Author(s)
Laurent Bernaille

Word count
3864

Hacker News points
1

Language
English


By Matt Makai. 2021-2024.