Octopus Cloud connectivity disruption - report and learnings
Blog post from Octopus Deploy
On January 25, 2023, Octopus Cloud customers experienced network connectivity issues caused by an outage in Azure's internal network. The outage affected Octopus Cloud, Octopus.com/OctopusID, and the Cloud Control Center. It began when a command issued on a Microsoft Wide Area Network (WAN) router triggered recomputation problems across other routers, producing long latencies and service timeouts from 07:05 to 12:43 UTC. Octopus engineers detected and declared the incident, updated the status page, and paused processes to limit the impact while Azure worked to resolve the underlying issue. The incident was resolved after approximately four and a half hours. It highlighted how difficult upstream cloud provider outages are to manage, and it prompted Octopus to enhance its monitoring, refine its escalation processes with Azure, and improve how it communicates service impacts to customers.