Orchestrating GPU workloads on Runpod with dstack
Blog post from Runpod
For machine learning (ML) teams, orchestration means automating how compute resources are provisioned and managed so that costs fall and efficiency rises. dstack is a lightweight, open-source alternative to traditional orchestrators such as Kubernetes and Slurm. It is GPU-native by design and integrates with modern cloud providers, including Runpod. It simplifies day-to-day work by providing interactive development environments, task scheduling, and persistent service endpoints, all defined in declarative YAML configuration. Policies such as auto-shutdown and utilization-based termination improve resource utilization and help ML teams avoid overpaying for GPUs; Electronic Arts, for example, reported significant cost savings. Support for multi-cloud and hybrid environments lets teams route jobs to the most cost-effective backend, so dstack can cover the ML lifecycle from development through training to inference.
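To make the two cost levers above concrete, here is a minimal Python sketch of the underlying ideas: routing a job to the cheapest backend that meets its GPU requirement, and flagging an instance for termination when its GPU utilization stays low. This is not dstack's implementation or API. The `Offer` type, the function names, the backend name `cloud-b`, and all prices are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Offer:
    """A hypothetical GPU offer from one backend (not a dstack type)."""
    backend: str
    gpu: str
    gpu_count: int
    price_per_hour: float  # USD; made-up numbers for illustration


def cheapest_offer(offers: Sequence[Offer], gpu: str, min_gpus: int) -> Optional[Offer]:
    """Return the lowest-priced offer that satisfies the GPU requirement, or None."""
    matching = [o for o in offers if o.gpu == gpu and o.gpu_count >= min_gpus]
    return min(matching, key=lambda o: o.price_per_hour, default=None)


def should_terminate(utilization: Sequence[float], threshold: float, window: int) -> bool:
    """True if the last `window` utilization samples (percent) are all below `threshold`."""
    if window <= 0 or len(utilization) < window:
        return False  # not enough history to decide
    return all(u < threshold for u in utilization[-window:])


# Illustrative inputs only; real offers and metrics would come from the backends.
offers = [
    Offer("runpod", "A100", 1, 1.60),
    Offer("cloud-b", "A100", 1, 2.10),
    Offer("runpod", "H100", 1, 2.90),
]
best = cheapest_offer(offers, gpu="A100", min_gpus=1)
idle = should_terminate([80.0, 5.0, 3.0, 2.0], threshold=10.0, window=3)
```

With these inputs, `best` is the cheaper of the two A100 offers and `idle` is true because the last three samples are all under 10 percent. In practice the same decisions are expressed declaratively in dstack's YAML configuration rather than written by hand; the sketch only shows the logic such policies encode.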