Incident Report: November 25th, 2025

Blog post from Railway

Post Details
Author: Brody Over
Word Count: 604
Summary

An outage at Railway on November 25, 2025, disrupted deployments and parts of the dashboard. Free, Trial, and Hobby users were affected most, as their deployments were paused, while Pro deployments were delayed. The incident was traced to issues in the task queue system, which is backed by Temporal, and was exacerbated by increased latency in GitHub API calls. The result was a backlog of tasks and resource overconsumption that caused out-of-memory (OOM) failures among workers. Engineers were alerted promptly and applied several fixes, including reallocating resources and adjusting worker parameters. These gradually cleared the task backlog, and deployments were re-enabled in stages. To prevent similar outages, Railway plans to introduce an auto-tuning algorithm, scale task queue resources, and reduce dependencies on external APIs, acknowledging how important reliable deployments are to its users.
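
The link between slower external API calls and worker memory pressure can be illustrated with a rough back-of-the-envelope model. The sketch below is not Railway's code or configuration. It applies Little's law (average tasks in progress = arrival rate × time per task) and assumes that each in-progress task holds a fixed amount of memory and that tasks are spread evenly across workers. Every name and number in it is hypothetical.

```python
def in_flight_tasks(arrival_rate_per_s: float, latency_s: float) -> float:
    """Little's law: average tasks in progress = arrival rate * time per task."""
    return arrival_rate_per_s * latency_s


def worker_memory_mb(in_flight: float, mb_per_task: float, workers: int) -> float:
    """Average memory held per worker if in-flight tasks are spread evenly."""
    return in_flight * mb_per_task / workers


# All values below are hypothetical, chosen only to illustrate the effect.
ARRIVAL_RATE = 50.0       # tasks started per second
MB_PER_TASK = 20.0        # memory held by one in-progress task
WORKERS = 10              # number of worker processes
MEMORY_LIMIT_MB = 2048.0  # per-worker memory limit

# Same arrival rate, increasingly slow external API call inside each task.
for latency in (0.5, 2.0, 30.0):
    tasks = in_flight_tasks(ARRIVAL_RATE, latency)
    mem = worker_memory_mb(tasks, MB_PER_TASK, WORKERS)
    status = "OOM risk" if mem > MEMORY_LIMIT_MB else "ok"
    print(f"latency={latency:>5.1f}s in_flight={tasks:>7.0f} "
          f"mem/worker={mem:>7.0f} MB {status}")
```

With these made-up values, only the 30-second case pushes per-worker memory past the limit, even though the arrival rate never changes. This is one reason capping per-worker concurrency, the kind of worker parameter the summary mentions adjusting, can keep a latency spike from turning into memory exhaustion.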