Company
Date Published
Author
Shlomi Noach
Word count
2268
Language
English
Hacker News points
None

Summary

GitHub employs MySQL as its primary datastore, with metadata such as issues and comments stored in MySQL while repository data resides in Git. To manage high-load operations, GitHub uses MySQL replication with a single-writer-multiple-readers design and HAProxy for load balancing. This setup involves context-aware MySQL pools where backend servers autonomously decide their participation based on replication lag, reported via HTTP status codes. HAProxy interprets these codes to include or exclude servers from handling read requests. The system dynamically manages server pools to address replication lag and operational tasks, ensuring minimal downtime. A backup pool is utilized when the main pool's capacity is compromised, allowing for continued service albeit with potentially stale data. This approach leverages shell scripts and ChatOps integration for monitoring and manual intervention, offering a flexible and scalable solution to maintain data availability and integrity.