Upgrade EKS Clusters across Multiple Versions in Less Than a Day — using Automated Workflows
Blog post from Orkes
Upgrading Kubernetes clusters can be labor-intensive and error-prone even on a managed service like Amazon EKS, as Orkes' experience illustrates. Amazon EKS handles the core upgrade tasks, but cloud engineers must still initiate upgrades for multiple cluster components, troubleshoot errors, and run thorough pre- and post-upgrade checks. With Kubernetes issuing three releases a year and supporting each release for only 14 months, keeping cloud infrastructure current becomes increasingly burdensome, especially for organizations managing many clusters with varying configurations. Orkes found manual upgrades difficult: they required sequential updates and extensive manual intervention, which exposed the limits of the default CLI and Amazon console processes. To streamline the operation, Orkes built an automated workflow in Conductor, an enterprise-grade orchestration platform, and used it to manage upgrades across hundreds of clusters. The workflow combines task sequencing, planned delays, Slack notifications, scheduling, and robust failure handling, turning the upgrade procedure into an automated process that saves time and reduces errors.
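
To make the sequencing and failure handling concrete, here is a minimal Python sketch of how such an upgrade plan might be laid out. It is not Orkes' workflow definition and does not use the Conductor API. It assumes that "sequential updates" means stepping through one minor version at a time, and the step names, `build_plan`, `run_plan`, and the `execute`/`notify` callbacks are all hypothetical stand-ins for real tasks such as an EKS API call or a Slack message.

```python
from dataclasses import dataclass
from typing import Callable, List


def upgrade_path(current: str, target: str) -> List[str]:
    """List every minor version between current (exclusive) and target (inclusive)."""
    cur_major, cur_minor = (int(p) for p in current.split("."))
    tgt_major, tgt_minor = (int(p) for p in target.split("."))
    if cur_major != tgt_major or tgt_minor < cur_minor:
        raise ValueError(f"cannot plan an upgrade from {current} to {target}")
    return [f"{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]


# Hypothetical step names for one version hop; "planned_delay" stands in for
# a wait between the control-plane upgrade and the component upgrades.
STEPS_PER_HOP = [
    "pre_checks",
    "upgrade_control_plane",
    "planned_delay",
    "upgrade_addons",
    "upgrade_node_groups",
    "post_checks",
]


@dataclass
class Step:
    version: str  # Kubernetes version this step moves the cluster toward
    name: str     # one of STEPS_PER_HOP


def build_plan(current: str, target: str) -> List[Step]:
    """Expand a multi-version upgrade into an ordered list of steps."""
    return [Step(v, n) for v in upgrade_path(current, target) for n in STEPS_PER_HOP]


def run_plan(
    plan: List[Step],
    execute: Callable[[Step], None],
    notify: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Run steps in order, retrying each up to max_attempts; stop on persistent failure."""
    for step in plan:
        for attempt in range(1, max_attempts + 1):
            try:
                execute(step)
                break  # step succeeded, move on to the next one
            except Exception as exc:
                notify(f"{step.name} for {step.version} failed (attempt {attempt}): {exc}")
        else:
            # Every attempt failed: abort so later steps never run on a broken cluster.
            notify(f"aborting: {step.name} for {step.version} did not succeed")
            return False
    notify("upgrade complete")
    return True


if __name__ == "__main__":
    # Dry run: print each step instead of touching a real cluster.
    plan = build_plan("1.27", "1.30")
    run_plan(plan, execute=lambda s: print(f"{s.version}: {s.name}"), notify=print)
```

Keeping the plan as plain data separates what should happen from how each step is executed. That is roughly the role an orchestrator plays: the ordering, retries, and notifications live in the workflow, while each task only needs to know how to do one thing.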