Company
Date Published
Author
Jake Cooper
Word count
650
Language
English
Hacker News points
None

Summary

Railway experienced an outage on June 6th, 2025, that affected GitHub login and its backend API. The primary database came close to exhausting its connections, which delayed requests. Railway's rapid growth had put pressure on the database, which was not adequately prepared for the influx of users. When circuit breakers tripped, aggressive websocket reconnect logic sent millions of requests to the backend, overwhelming the already stressed database. The same surge flooded GitHub's OAuth endpoints with traffic, triggering secondary rate limits on Railway's OAuth app. To mitigate the issue, Railway rolled out changes to the exponential backoff in its reconnect logic and implemented its own login OAuth rate limiting on top of GitHub's, gradually rolling out fixes within 24 hours. The incident highlighted the need for improved scaling, WAF configuration, and real-time logic to prevent similar issues in the future.
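
To illustrate why reconnect behavior matters here, the following is a minimal sketch of exponential backoff with full jitter, a common way to keep many clients from reconnecting in lockstep after a failure. It is not Railway's implementation; the function name `reconnect_delay` and the `base`, `cap`, and `rng` parameters are hypothetical and chosen only for this example.

```python
import random


def reconnect_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Return a wait time in seconds before reconnect attempt `attempt` (0-based).

    Full-jitter backoff: the upper bound doubles with each attempt until it
    reaches `cap`, and the actual delay is drawn uniformly below that bound so
    that clients spread out instead of retrying at the same instant.
    All names and defaults here are illustrative assumptions.
    """
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt, 32)
    ceiling = min(cap, base * (2 ** exponent))
    return rng() * ceiling


if __name__ == "__main__":
    # Print the delay a single client would wait before each of six attempts.
    for attempt in range(6):
        print(attempt, round(reconnect_delay(attempt), 2))
```

With a fixed or very short retry interval, every disconnected client retries together and the load arrives in synchronized waves; randomizing within a growing bound trades slower individual reconnects for a smoother aggregate request rate.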