Incident Report: November 20th, 2025

Blog post from Railway

Post Details
Author: Noah Dunnagan
Word Count: 573
Summary

On November 20th, 2025, Railway experienced a major outage that delayed deployments. The cause was an issue with its deployment task queue, triggered by a sudden surge in GitHub webhook events. All running deployments and platform-level features remained online, so users who did not push new code or trigger redeployments saw no disruption. The incident began when engineers noticed an unusually low volume of GitHub webhooks for push events, followed by a 10x surge in webhook traffic that created a backlog of deployment initializations. To reduce pressure on the queue, Railway temporarily disabled deployments for the Free, Trial, and Hobby plans, and eventually for Pro, prioritizing Enterprise and Pro users. Engineers recovered the system by adding and restarting workers, then gradually re-enabled all affected deployments and confirmed full recovery later that day. To prevent a recurrence, Railway plans to improve its alerting, enhance internal monitoring for deployment spikes, and address the root cause of worker lock-ups under memory pressure, reaffirming its commitment to a reliable cloud experience.
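The mitigation described in the summary, shedding lower plan tiers so the queue can drain, can be illustrated with a small sketch. This is not Railway's implementation: the summary suggests the tiers were disabled by engineers during the incident, whereas this example automates the decision with backlog thresholds. The class name `DeployQueue`, the threshold values, and the tier ordering are all assumptions made for illustration; only the plan names come from the summary.

```python
from collections import deque
from dataclasses import dataclass, field

# Plan names come from the summary; the ordering (lowest priority first)
# is an assumption made for this sketch.
TIERS = ["Free", "Trial", "Hobby", "Pro", "Enterprise"]


@dataclass
class DeployQueue:
    """Hypothetical deployment queue that sheds lower tiers as backlog grows."""

    # Backlog size at which each tier stops being accepted.
    # Values are illustrative, not taken from the incident report.
    # Enterprise has no threshold, so it is never shed here.
    shed_thresholds: dict = field(
        default_factory=lambda: {"Free": 100, "Trial": 100, "Hobby": 200, "Pro": 400}
    )
    pending: deque = field(default_factory=deque)

    def disabled_tiers(self) -> set:
        """Return the tiers currently rejected, based on backlog size."""
        backlog = len(self.pending)
        return {t for t, limit in self.shed_thresholds.items() if backlog >= limit}

    def submit(self, deploy_id: str, tier: str) -> bool:
        """Enqueue a deployment unless its tier is currently shed."""
        if tier not in TIERS:
            raise ValueError(f"unknown tier: {tier}")
        if tier in self.disabled_tiers():
            return False  # caller would tell the user to retry later
        self.pending.append((deploy_id, tier))
        return True

    def complete_one(self):
        """Simulate a worker finishing the oldest pending deployment."""
        return self.pending.popleft() if self.pending else None
```

As the backlog drains through `complete_one`, `disabled_tiers` shrinks on its own, which mirrors the gradual re-enabling of plans described in the summary. A real system would likely add hysteresis so that tiers do not flap on and off near a threshold.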