
Cross-Region Disaster Recovery on Astro Is Now Generally Available: Here's How We Built It

Blog post from Astronomer

Post Details
Company: Astronomer
Date Published:
Author: -
Word Count: 2,532
Company Posts That Month: 12
Language: English
Hacker News Points: -
Summary

Cross-region disaster recovery (DR) on Astro is now generally available for AWS data planes, letting customers fail over their Airflow workloads to a secondary region with a single click. Astronomer built the feature to meet business-critical requirements in industries such as financial services and healthcare, where enterprise-scale Airflow operations need a standby region. It removes the need for customers to build and maintain parallel DR infrastructure themselves, which traditionally required significant engineering effort. The system provisions a secondary EKS cluster in warm standby and keeps it current by replicating three categories of data: Airflow metadata, task logs, and container images. The architecture uses Amazon Aurora global clusters for cross-region database replication, bi-directional S3 replication for task logs, and a headless database setup that reduces cost by running compute instances only when they are needed. DR operations can be automated through the Astro API and Terraform. Observability and health monitoring cover both the primary and secondary clusters, and DR awareness is centralized in the manifest system to simplify maintenance. Planned work includes extending DR support to GCP and Azure and improving the self-service migration experience for existing clusters.
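
The three replication categories named in the summary suggest a simple readiness check before a failover. The following is a minimal sketch only, not Astronomer's implementation: the category identifiers, the `ReplicationStatus` type, the `failover_blockers` function, and the 60-second lag threshold are all hypothetical and do not come from the Astro API or the original post.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the three categories the summary lists.
REQUIRED_CATEGORIES = ("airflow_metadata", "task_logs", "container_images")


@dataclass(frozen=True)
class ReplicationStatus:
    category: str
    lag_seconds: float  # how far the secondary region trails the primary


def failover_blockers(statuses, max_lag_seconds=60.0):
    """Return reasons the secondary region is not ready; empty means ready."""
    by_category = {s.category: s for s in statuses}
    blockers = []
    for category in REQUIRED_CATEGORIES:
        status = by_category.get(category)
        if status is None:
            # A category with no reported status is treated as not replicated.
            blockers.append(f"{category}: no replication status reported")
        elif status.lag_seconds > max_lag_seconds:
            blockers.append(
                f"{category}: lag {status.lag_seconds:.0f}s "
                f"exceeds {max_lag_seconds:.0f}s"
            )
    return blockers
```

A caller would collect one status per category from whatever monitoring it has and proceed with failover only when the returned list is empty. The threshold is an assumed parameter; a real value would depend on the recovery point objective.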

Trends Found in this Post
Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM
Observability | 4 | 4,496 | 812 | 176 | +40%