Service disruption on October 20, 2025

Blog post from WorkOS

Post Details
Company: WorkOS
Date Published:
Author: -
Word Count: 955
Language: English
Hacker News Points: -
Summary

Between October 20 and 21, 2025, WorkOS experienced two significant service outages, caused by an initial AWS us-east-1 region failure and subsequent cascading outages at critical vendors.

The first outage, from 06:50 to 09:20 UTC on October 20, severely impacted Single Sign-On and AuthKit, while Directory Sync saw minimal disruption. The root cause was traced to AWS database proxy failures, which produced varying error rates across products.

The second outage ran from 18:55 UTC on October 20 to 01:50 UTC on October 21 and primarily affected AuthKit, Admin Portal, and the WorkOS Dashboard. It was caused by a feature-flag provider incident compounded by issues at a hosting infrastructure provider. The result was long request latencies and intermittent timeouts, with 70% of requests failing at the peak of the incident. WorkOS identified flaws in its integration with the feature-flag provider's SDK that contributed to the problem.

To mitigate future risks, WorkOS plans to improve system resiliency by adding timeouts, circuit breakers, and fallback logic, and by deploying services across multiple AWS regions, with all services running multi-region by Q1 2026.
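The three mitigations named in the summary (timeouts, circuit breakers, fallback logic) can be illustrated with a minimal Python sketch. This is not WorkOS's implementation or any vendor's SDK. The names `CircuitBreaker`, `GuardedFlagClient`, and `fetch_flag` are hypothetical, and `fetch_flag` stands in for whatever call reaches a feature-flag provider.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class CircuitBreaker:
    """Hypothetical breaker: opens after `max_failures` consecutive failures,
    then allows a trial call once `reset_after` seconds have passed."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: let a trial call through after the cool-down.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()


class GuardedFlagClient:
    """Wraps a flag lookup with a timeout, a breaker, and a fallback default."""

    def __init__(self, fetch_flag, breaker, timeout=0.2):
        self.fetch_flag = fetch_flag  # assumed callable: flag name -> bool
        self.breaker = breaker
        self.timeout = timeout
        self.pool = ThreadPoolExecutor(max_workers=4)

    def is_enabled(self, flag, default=False):
        if not self.breaker.allow():
            return default  # circuit open: skip the provider entirely
        future = self.pool.submit(self.fetch_flag, flag)
        try:
            # Bounds how long the caller waits; the worker thread itself
            # is not cancelled and may keep running in the background.
            value = future.result(timeout=self.timeout)
        except Exception:  # provider errors and the result timeout
            self.breaker.record_failure()
            return default
        self.breaker.record_success()
        return value
```

With a wrapper like this, a call such as `client.is_enabled("new-login-flow", default=False)` returns the default when the provider is slow or failing, so request latency stays bounded by `timeout`. Once the breaker opens, the provider is not called at all until the cool-down expires. The flag name and the default values here are illustrative only.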