
Addressing GitHub’s recent availability issues

Blog post from GitHub

Post Details
Company: GitHub
Author: Mike Hanley
Word Count: 1,339
Language: English
Summary

In the week before this post was published, GitHub experienced a series of availability incidents affecting several services, all of which have since been resolved. The incidents occurred on May 9, 10, and 11, and each had a different root cause: a configuration change that degraded Git databases, inefficiencies in GitHub App authentication token issuance, and a primary database cluster crash that led to a loss of read replicas. Each incident degraded core GitHub services and disrupted operations such as GitHub Actions workflows and pull request updates. GitHub has pledged to investigate the disruptions thoroughly, improve its internal processes, add observability for high-cost query patterns, and make its failover mechanisms more resilient. Mike Hanley, GitHub's Chief Security Officer, states that the company remains committed to transparency and accountability and is continuing its work to strengthen site reliability.