Maintain observability during cloud outages with Datadog Disaster Recovery

Blog post from Datadog

Post Details
Company: Datadog
Authors: Noman Hamlani, Ron Hay, Michael Richey
Word Count: 1,063
Language: English
Summary

Datadog Disaster Recovery (DDR) is designed to maintain observability during widespread infrastructure disruptions. Teams configure a secondary Datadog site that mirrors the primary one and activate it on demand. The approach targets cloud provider outages such as past incidents at AWS and Google Cloud, which caused significant financial losses and left teams with operational blind spots. DDR supports both active-active and active-passive configurations; the latter is more cost-effective for most organizations. Managed resource synchronization keeps the secondary site prepared, so failover does not require rebuilding resources such as dashboards and monitors. Failover can be triggered through agent-based or DNS-based methods, and teams decide when to trigger it based on their operational needs, at which point telemetry data routes to the secondary site.
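
The active-passive pattern described in the summary can be illustrated with a small, generic sketch. This is not Datadog's API or the DDR implementation: the `Site` and `FailoverRouter` names, the `route` helper, and the `example.com` intake URLs are all hypothetical, chosen only to show how an operator-controlled flag might decide which site receives telemetry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A telemetry destination. Names and URLs are hypothetical."""
    name: str
    intake_url: str


@dataclass
class FailoverRouter:
    """Active-passive routing: the secondary receives data only after activation."""
    primary: Site
    secondary: Site
    failover_active: bool = False  # flipped by an operator, never automatically

    def activate_failover(self) -> None:
        self.failover_active = True

    def deactivate_failover(self) -> None:
        self.failover_active = False

    def target(self) -> Site:
        # Telemetry goes to the secondary site only while failover is active.
        return self.secondary if self.failover_active else self.primary


def route(router: FailoverRouter, payloads: list[str]) -> list[tuple[str, str]]:
    """Pair each payload with the intake URL it would be sent to."""
    url = router.target().intake_url
    return [(url, payload) for payload in payloads]


# Hypothetical sites; real intake endpoints would come from configuration.
router = FailoverRouter(
    primary=Site("primary", "https://intake.primary.example.com"),
    secondary=Site("secondary", "https://intake.secondary.example.com"),
)
```

Calling `router.activate_failover()` and then `route(router, ["metric-1"])` would pair the payload with the secondary URL; `deactivate_failover()` returns traffic to the primary. Keeping the switch as an explicit operator action mirrors the summary's point that teams choose when to fail over.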