Improving Reliability with a More Resilient Auth Proxy Architecture
Blog post from Astronomer
Astro has re-architected its authentication layer to improve the reliability, resilience, and scalability of its Airflow platform. Dataplane-based forward authentication replaces the centralized Auth Proxy: each dataplane now runs its own forward-auth service with URI-aware authentication logic. Backend controls and UI feature flags manage the rollout so that the transition is smooth, and deployment-scoped API tokens keep operations running during control plane outages. Direct Access Tokens further decouple workloads from control plane dependencies, providing organization-, workspace-, and deployment-level access even during an outage. Together, these changes reduce the risk of authentication-related DAG failures, limit the impact of control plane incidents, and shorten recovery times, addressing customer reliability concerns and improving platform efficiency. Most customers do not need to take any action; those with specific network configurations may need to update their filters to allow new IP ranges.
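
To make the idea of URI-aware forward authentication concrete, here is a minimal Python sketch of the kind of decision a per-dataplane forward-auth service might make without calling the control plane. It is illustrative only and is not Astro's implementation. The URI layout (`/<deployment_id>/...`), the `Token` and `Deployment` records, and the `authorize` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class Token:
    """Hypothetical token record that the dataplane can validate locally."""
    scope: str     # "organization", "workspace", or "deployment"
    scope_id: str  # ID of the organization, workspace, or deployment it covers


@dataclass(frozen=True)
class Deployment:
    """Hypothetical deployment metadata cached inside the dataplane."""
    deployment_id: str
    workspace_id: str
    organization_id: str


def deployment_from_uri(uri: str) -> Optional[str]:
    """Extract the deployment ID, assuming a /<deployment_id>/... path layout."""
    parts = [p for p in urlparse(uri).path.split("/") if p]
    return parts[0] if parts else None


def authorize(uri: str, token: Optional[Token],
              deployments: Dict[str, Deployment]) -> int:
    """Return an HTTP status as a forward-auth endpoint would: 200 allows, 401/403 deny."""
    if token is None:
        return 401  # no credentials presented
    dep_id = deployment_from_uri(uri)
    dep = deployments.get(dep_id) if dep_id else None
    if dep is None:
        return 403  # URI does not map to a deployment in this dataplane
    # A token grants access when its scope ID matches the target deployment
    # at the organization, workspace, or deployment level.
    allowed = {
        "organization": dep.organization_id,
        "workspace": dep.workspace_id,
        "deployment": dep.deployment_id,
    }
    return 200 if allowed.get(token.scope) == token.scope_id else 403
```

In this sketch, a request to `/dep-a/api/v1/dags` carrying a deployment-scoped token for `dep-a` is allowed, while the same token is rejected for any other deployment. Because the decision uses only data held in the dataplane, it would keep working while the control plane is unreachable, which is the property the new architecture is aiming for.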