Company
Date Published
Author
Greg Foster
Word count
720
Language
English
Hacker News points
None

Summary

On December 6, 2023, Graphite's web application suffered a 124-second outage that affected pull request functionality and returned 404 errors to nearly a dozen users. CTO Greg Foster learned of the problem from a user report on Slack, and the engineering team, including Alyssa and Brendan, traced it to an unintended change to the production S3 bucket configuration. Reverting that change restored the site to normal functionality. In response, Graphite plans to adopt new policies, such as testing configuration changes in a staging environment before deploying them to production, and adding linter warnings to prevent jinxing statements about system reliability. The incident report, written by Foster, highlights the team's swift response and collaboration, as well as the community's support during the disruption.