Company
Cloudflare
Date Published
Author
Matthew Prince
Word count
2919
Language
English
Hacker News points
None

Summary

On November 18, 2025, Cloudflare suffered a significant network outage that affected its core services. The cause was not a cyber attack but an internal database permission change. That change inadvertently caused a feature file used by Cloudflare's Bot Management system to double in size, exceeding the limit the consuming software could handle and causing that software to fail. The failures surfaced as widespread HTTP 5xx errors across Cloudflare's CDN, Workers KV, and Access services, among others. Inconsistent versions of the configuration file were distributed across the network, which compounded the problem and initially led the team to suspect a DDoS attack. After identifying the faulty feature file and rolling it back, Cloudflare restored most traffic flow by 14:30 UTC and full functionality by 17:06 UTC. Cloudflare has begun work on measures to prevent similar incidents and acknowledged the outage's impact on its reputation and on the wider internet ecosystem.
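
The failure mode described above, an internally generated file growing past what its consumer can handle, can be illustrated with a minimal sketch. This is not Cloudflare's code: the `FeatureStore` class, the `MAX_FEATURES` value, and the choice to fall back to the last known-good file are all assumptions made for illustration. It shows one way a consumer might reject an oversized file and keep serving with the previous one instead of failing outright.

```python
from __future__ import annotations

# Hypothetical cap on the number of features; the real limit is not stated here.
MAX_FEATURES = 100


class FeatureFileTooLarge(ValueError):
    """Raised when a candidate feature file exceeds the configured cap."""


def validate_features(features: list[str], limit: int = MAX_FEATURES) -> list[str]:
    """Return a copy of the feature list, or raise if it exceeds the cap."""
    if len(features) > limit:
        raise FeatureFileTooLarge(
            f"{len(features)} features exceeds limit of {limit}"
        )
    return list(features)


class FeatureStore:
    """Holds the active feature list and refuses oversized updates (illustrative)."""

    def __init__(self, initial: list[str], limit: int = MAX_FEATURES) -> None:
        self.limit = limit
        self.active = validate_features(initial, limit)
        self.rejected = 0  # count of candidate files that were refused

    def try_update(self, candidate: list[str]) -> bool:
        """Swap in the candidate if valid; otherwise keep the last known-good list."""
        try:
            self.active = validate_features(candidate, self.limit)
            return True
        except FeatureFileTooLarge:
            self.rejected += 1
            return False


if __name__ == "__main__":
    store = FeatureStore([f"f{i}" for i in range(60)])
    doubled = [f"f{i}" for i in range(60)] * 2  # file that has doubled in size
    print(store.try_update(doubled), len(store.active), store.rejected)
```

The design choice in this sketch is to treat generated configuration with the same suspicion as user input: validate it before use, and degrade to the previous version rather than crash when validation fails.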