Company
Date Published
Author
Jakub Oleksy
Word count
587
Language
English
Hacker News points
None

Summary

In October 2025, GitHub experienced four incidents that degraded service, caused by a mix of internal technical failures and outages at external dependencies. On October 9, a network device that was returned to production prematurely caused increased latency and error rates for authenticated users and API requests, prompting GitHub to improve its device repair validation processes. On October 17, an incorrect configuration change caused mobile push notifications to fail across all regions for 70 minutes, leading to a review of cloud resource management procedures. On October 20, a third-party dependency outage triggered a cascading failure in the Codespaces service, with error rates peaking at 71% for new codespace creations; GitHub plans to mitigate the impact of such dependencies in the future. The most severe incident occurred on October 29, when a widespread third-party provider outage disrupted Codespaces, GitHub Actions, and other services, with error rates peaking at 100% for some users; GitHub is now focusing on reducing its reliance on external providers and improving service resilience. In each incident, GitHub applied mitigations to limit the impact, and work is underway to prevent similar issues.