The incident began when an AWS Elastic Block Store (EBS) volume backing Datadog's Postgres database degraded, causing a noticeable slowdown in database performance. The faulty volume forced a manual failover, a process that proved slow and error-prone because it depended heavily on Chef runs to reconfigure hosts. The outage was compounded by EBS serving other critical roles as well, including storage for both the Postgres database and the configuration management server running Chef. Several factors limited the damage: Datadog's multi-zone deployment, its otherwise limited use of EBS, and the fact that data intake continued throughout the outage. The lessons learned highlight the risks of shared storage, the importance of keeping spare capacity available for recovery, and the need to wean off addictive but failure-prone technologies like EBS in favor of more robust alternatives.