Introducing Instant Clusters: On-Demand Multi-Node AI Compute
Blog post from RunPod
RunPod has launched Instant Clusters, an on-demand service for deploying networked multi-node GPU clusters that lets users connect up to 8 nodes for a total of 64 NVIDIA H100 GPUs. The service addresses the growing demand for scalable infrastructure driven by large-scale models such as DeepSeek R1 and LLaMA 405B, which require more computational power than a single server can provide.

Instant Clusters deploy rapidly, with no sales negotiations, lengthy integration processes, or long-term contracts, and billing is by the second. Users can manage their clusters through RunPod's UI, run distributed jobs with existing frameworks such as Slurm and PyTorch, and benefit from high-speed interconnects for efficient node-to-node communication.

The service supports a range of applications, including running inference on massive models, fine-tuning foundation models, and accelerating research across scientific fields, offering a nimble alternative to traditional bare-metal setups.
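To make the multi-node numbers concrete, here is a minimal, hypothetical sketch of how a distributed launcher in the PyTorch ecosystem (e.g. torchrun) assigns a unique global rank to each GPU process at the cluster's stated maximum of 8 nodes with 8 H100s each. This is an illustration of the rank arithmetic only, not RunPod's actual tooling; the function names are invented for the example.

```python
# Hypothetical sketch (not RunPod's tooling): rank assignment across an
# Instant Cluster at its maximum size of 8 nodes x 8 GPUs = 64 H100s.

NODES = 8           # maximum node count per the post
GPUS_PER_NODE = 8   # 8 nodes x 8 GPUs = 64 H100s total

def global_rank(node_rank: int, local_rank: int) -> int:
    """Global rank of a process: its node's offset plus its local GPU index."""
    return node_rank * GPUS_PER_NODE + local_rank

def world_size() -> int:
    """Total number of processes, one per GPU, across the cluster."""
    return NODES * GPUS_PER_NODE

# Every (node, GPU) pair maps to a distinct rank in [0, 64).
all_ranks = sorted(global_rank(n, g)
                   for n in range(NODES)
                   for g in range(GPUS_PER_NODE))
```

In practice a launcher exports these values to each process (e.g. as `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables), and the training framework uses them to set up collective communication between nodes.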