An update on recent service disruptions

Blog post from GitHub

Post Details
Company: GitHub
Author: Keith Ballinger
Word Count: 761
Language: English
Summary

GitHub recently experienced multiple service disruptions caused by resource contention in its mysql1 database cluster. The contention degraded the performance and availability of services including Git operations, webhooks, API requests, and GitHub Actions. The incidents were linked primarily to peak load combined with suboptimal query performance, and several of the outages were resolved only by failing over to healthy replicas. Ongoing work to partition the main database and add clusters did not stop the problem from recurring, so GitHub audited load patterns and implemented performance fixes, including traffic redistribution and increased monitoring. Proactive measures, such as throttling webhook traffic and further database optimizations, are being pursued to prevent future incidents. An upcoming Availability Report is expected to give more detail on these challenges and the steps being taken to address them.
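
The summary mentions throttling webhook traffic but does not say how it is done. As a rough illustration only, one common way to throttle a traffic source is a token bucket, sketched below. The class name `TokenBucket`, its parameters, and the injectable `clock` are all hypothetical and are not taken from the post; this is not a description of GitHub's implementation.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter (an assumption for illustration,
    not GitHub's actual webhook throttling mechanism)."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = float(rate_per_sec)   # tokens refilled per second
        self.capacity = float(capacity)   # maximum burst size
        self.clock = clock                # injectable so tests can control time
        self.tokens = float(capacity)     # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if one request may proceed, False if it should be throttled."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a sketch like this, a delivery worker would call `allow()` before sending each webhook and would delay or queue the delivery when it returns `False`, which caps the sustained load that webhook processing can place on a shared database during peak periods.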