In recent weeks, GitHub experienced service interruptions due to instability in its high-availability load balancer setup, primarily caused by excessive load on Xen servers, which led to repeated Heartbeat check timeouts and triggered unnecessary failover actions. To address this, GitHub reduced the frequency and increased the timeouts of Heartbeat checks, significantly lowering average load and variance across their Xen cluster, and eliminating false alerts. Additionally, they are implementing a dedicated high-availability pair for load balancers to isolate them from other virtual server failures and are working on improving load balancer configurations to reduce mean time to recovery. As part of ongoing infrastructure improvements, GitHub has also hired new system administrators to enhance stability and is committed to improving the overall user experience.