Reliable Deployments for Large Kubernetes Fleet
Blog post from Fastly
Fastly's work on Kubernetes Continuous Delivery (CD) shows the challenges of managing multi-cluster systems at scale and the solutions the team arrived at. GitOps initially provided consistency, but it lacked orchestration and validation capabilities, so Fastly built a lightweight orchestration layer on top of ArgoCD. As the platform grew, a more automated and standardized workflow became essential for handling progressive rollouts and validation gates across multiple environments. Fastly used ArgoCD for manifest management and Terratest for infrastructure testing, and developed a custom solution to post ArgoCD diffs to GitHub. Issues with Progressive Syncs in ArgoCD prompted Fastly to write its own sync orchestrator, a Python CLI that ensured applications were synced in the desired order. To streamline the pipeline further, Fastly adopted Argo Workflows as the automation engine, which made application promotions flexible and reliable while still leaving room for human intervention. This customized approach gave Fastly the control and predictability it needed to deliver software safely while fitting its specific requirements.
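
To make the idea of ordered syncs with validation gates more concrete, here is a minimal Python sketch. It is not Fastly's orchestrator: the `Stage` and `RolloutResult` types, the `run_rollout` function, and the stage and application names are all hypothetical, and the sync and validation steps are passed in as plain callables so the ordering logic can be read (and tested) without a cluster. The optional `argocd_sync` helper assumes the `argocd` CLI is installed and logged in, and shells out to `argocd app sync` followed by `argocd app wait --health`.

```python
import subprocess
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Stage:
    """One step of a progressive rollout (hypothetical type for this sketch)."""
    name: str
    apps: List[str]


@dataclass
class RolloutResult:
    """Records which apps were synced and where the rollout stopped, if it did."""
    synced: List[str] = field(default_factory=list)
    failed_stage: Optional[str] = None


def run_rollout(
    stages: Sequence[Stage],
    sync_app: Callable[[str], None],
    validate_stage: Callable[[Stage], bool],
) -> RolloutResult:
    """Sync stages strictly in order; stop at the first failed validation gate."""
    result = RolloutResult()
    for stage in stages:
        # Sync every application in this stage before evaluating its gate.
        for app in stage.apps:
            sync_app(app)
            result.synced.append(app)
        # Validation gate: later stages are never touched if this one fails.
        if not validate_stage(stage):
            result.failed_stage = stage.name
            break
    return result


def argocd_sync(app: str) -> None:
    """Assumed sync step: trigger a sync, then block until the app is healthy."""
    subprocess.run(["argocd", "app", "sync", app], check=True)
    subprocess.run(["argocd", "app", "wait", app, "--health"], check=True)
```

A caller might build a list such as `[Stage("dev", [...]), Stage("staging", [...]), Stage("prod", [...])]` and pass `argocd_sync` together with a validation callable that runs smoke tests or waits for a human approval. Keeping the gate as an injected function is a design choice made for this sketch: the same loop can then serve both automated checks and manual promotion.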