Company
Date Published
Author
Jakub Oleksy
Word count
638
Language
English
Hacker News points
None

Summary

In August, a significant incident degraded the availability of GitHub Codespaces: an alert on August 29 indicated widespread customer impact. The root cause was still under investigation at the time of writing, and GitHub promised more details in a report due in October. The report also revisits a July 27 incident in which virtual machine (VM) creation failed for 2-core and 4-core machine types in the East US and West US regions. The failures were triggered by a cloud provider update that was incompatible with GitHub's host VM image build process, and they led to resource exhaustion and delayed or failed codespace startups. GitHub applied temporary mitigations, adjusted its image build pipeline per the provider's recommendations, and improved its VM creation process, monitoring, and alerting to prevent a recurrence. The company says it is committed to ongoing improvements in service reliability, with updates available on its status page and engineering blog.