Cloudflare incident on October 30, 2023

What's this blog post about?

Workers KV, a distributed key-value store used by many applications, including first-party Cloudflare products such as Pages, Access, and Zero Trust, suffered a service disruption on October 30, 2023. A problem with the deployment tool directed production traffic to a version of the service that was not authorized for production access, so requests failed with HTTP 401 (Unauthorized) errors. Parts of the Cloudflare dashboard and other services that depend on Workers KV were affected. The incident lasted approximately one hour before it was resolved. The team has identified the root cause and plans several steps to prevent a recurrence: improving the deployment tooling, enhancing the rollback process, adding pre-checks to deployments, hardening the progressive deployment scripts, and ensuring that applications are compatible with their environments during deployments. The company apologizes for the inconvenience the incident caused.

Company
Cloudflare

Date published
Nov. 1, 2023

Author(s)
Matt Silverlock, Kris Evans

Word count
1670

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.