Incident Report: July 2, 2026 — US East Services Outage
Blog post from Railway
On July 2, 2026, Railway experienced a significant outage in one of its US East availability zones. The trigger was network degradation at an ISP, which increased latency and packet loss between US regions. Efforts to reroute traffic inadvertently left the zone without a stable internet route, which exposed hidden bugs and caused degraded disk performance and private networking issues for approximately two hours. The incident was a combination of ISP degradation, mismanaged routing changes, and system behavior flaws, which together diverted traffic onto slower backup networks. Even after the initial rerouting, problems persisted because connections lingered on incorrect paths and private networking tunnels were misconfigured. Corrective actions included terminating stuck connections, restarting mesh networking agents, and improving future resiliency through infrastructure updates and better alerting. The incident highlighted the need to update the designs of older sites and to manage routing proactively to prevent similar disruptions.