How Render Services Stayed Up During the AWS October Outage
Blog post from Render
On October 20, 2025, a significant AWS outage affected numerous cloud providers, but Render's infrastructure remained largely operational thanks to architectural decisions made over several years. Service in the us-east-1 region was degraded, yet no customers experienced complete downtime. Render builds on low-level AWS primitives and runs self-managed Kubernetes clusters, which gave it greater control during the incident. Its infrastructure is also distributed across regions, and most services run outside the heavily affected us-east-1 region, which limited how much of the customer base was impacted. The incident validated a strategy of avoiding over-reliance on higher-level AWS services: that choice allowed targeted interventions and kept services running throughout the outage. It also exposed areas for improvement, such as operational procedures and communication during incidents. The experience underscored the value of architectural foresight and the need for continual learning and adaptation to improve the resilience of cloud infrastructure.