Incident report on June 22, 2026

Post Details

Company

Trigger.dev

Date Published

June 24, 2026

Author

Matt Aitken

Word Count

3,933

Company Posts That Month

1

Language

English

Hacker News Points

-

Source URL

trigger.dev/blog/incident-report-jun-22-2026

Summary

An incident caused significant disruptions in cloud services between June 22 and June 23, affecting thousands of organizations due to periods of slow dequeuing and outages in the us-east-1 and eu-central-1 regions. The root cause was a temporary AWS capacity shortage that overwhelmed the Kubernetes control plane, leading to a failure in scheduling new runs. To address the issue, the provider is implementing several measures, including deploying more than one isolated cluster per region with automatic failover, establishing a new us-west-2 region, and enabling self-serve bulk run migrations between regions. Additional improvements include enhancing backpressure mechanisms, transitioning to managed Kubernetes to relieve operational complexity, and instituting faster cluster recovery procedures. These steps aim to prevent similar incidents and ensure a more reliable service in the future.

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Kubernetes	8	1,993	294	100	+1%