Partner Spotlight: Orchestrating large-scale agent training on Lambda with dstack and RAGEN

Post Details

Company

Lambda

Date Published

June 5, 2025

Author

dstack

Word Count

1,011

Language

English

Hacker News Points

-

Source URL

lambda.ai/blog/agent-training-on-lambda-with-dstack-and-ragen

Summary

Lambda and dstack offer teams running on Lambda a streamlined alternative to Kubernetes and Slurm, with native support for training workloads, development environments, and persistent services. With Lambda's 1-Click Clusters and dstack's orchestration, developers can focus on building rather than on setting up machine learning infrastructure. The RAGEN framework trains large language models as reasoning agents in complex, multi-turn environments and introduces fine-grained reward signals to improve agent reliability and performance. With dstack, users can launch a Ray cluster with the RAGEN environment, submit Ray tasks from their local machine, and recover training after a failure or cluster restart.
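
Recovering training after a failure or cluster restart generally depends on periodic checkpointing. The following is a minimal sketch of that idea only, not the mechanism RAGEN or dstack actually uses: the JSON checkpoint file, the `train` loop, the `reward_sum` field, and the `fail_at` parameter are all hypothetical names introduced for illustration.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "reward_sum": 0.0}


def save_checkpoint(path, state):
    """Write to a temp file, then rename, so a crash mid-write cannot
    leave a half-written checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def train(path, total_steps, fail_at=None):
    """Stand-in training loop that resumes from the checkpoint at `path`.

    `fail_at` (hypothetical) simulates a node failure at a given step.
    """
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["step"] += 1
        state["reward_sum"] += 1.0  # placeholder for a real training update
        save_checkpoint(path, state)
    return state
```

Calling `train` a second time with the same path picks up from the last saved step instead of starting over, which is the behavior a restarted job would need. In a real cluster the checkpoint would have to live on storage that survives the restart.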