Today's Downtime

Post Details

Company

GitHub

Date Published

March 9, 2011

Author

Tim Sharpe

Word Count

227

Company Posts That Month

21

Language

English

Hacker News Points

-

Post removed?

No

Source URL

github.blog/news-insights/the-library/today-s-downtime

Summary

Earlier in the day, a monitoring alert reported media errors on one disk in the RAID10 array of an active MySQL server. The fault was judged non-critical, and the disk was scheduled for replacement after business hours. At around 7:30 PM PST, while the disk was being removed, an unexpected kernel lock coincided with a health check from the high availability (HA) system, which wrongly marked the server as unhealthy. The HA system responded by shutting down the affected server and starting MySQL on a standby machine, and that transition itself went smoothly. Because MySQL had been shut down uncleanly, however, InnoDB had to run its recovery process, and the size of certain tables delayed service restoration until 7:45 PM PST. According to author Tim Sharpe, the procedure has since been adjusted to prevent similar incidents.
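
To illustrate the failure mode described in the summary, the sketch below compares a health checker that fails over on a single missed probe with one that requires several consecutive misses. This is not GitHub's HA system: the function `should_fail_over`, the `failures_required` parameter, and the probe history are all hypothetical, and the post does not say which adjustment was actually made to the procedure.

```python
from typing import Iterable


def should_fail_over(probe_results: Iterable[bool], failures_required: int) -> bool:
    """Return True once `failures_required` consecutive probes have failed.

    Hypothetical illustration only; not the HA system from the post.
    """
    consecutive_failures = 0
    for probe_ok in probe_results:
        if probe_ok:
            consecutive_failures = 0  # a healthy probe resets the streak
        else:
            consecutive_failures += 1
            if consecutive_failures >= failures_required:
                return True
    return False


# Assumed probe history: one probe lands during a brief kernel stall.
probes = [True, True, False, True, True]

single_probe_decision = should_fail_over(probes, failures_required=1)
tolerant_decision = should_fail_over(probes, failures_required=3)
print(single_probe_decision, tolerant_decision)
```

With a threshold of one, the single stalled probe is enough to trigger a failover, which mirrors the erroneous decision in the incident. A higher threshold tolerates a transient stall at the cost of slower detection of a real outage.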

