Serverless GPUs: RTX Pro 6000, H200, and B200 Now Available on Koyeb
Blog post from Koyeb
NVIDIA RTX Pro 6000, H200, and B200 GPUs are now available on Koyeb's serverless platform, bringing high-performance compute to memory-bound, latency-sensitive, and throughput-constrained workloads such as long-context and large-model serving. Pricing ranges from $2.20/hr to $5.50/hr, billed per second, so users can build, experiment, and autoscale without managing infrastructure.

Each card targets a different tier: the RTX Pro 6000, with 96 GB of VRAM and FP4 support, suits AI-driven rendering and analytics; the H200 offers 75% more GPU memory than its predecessor for large-scale AI models; and the B200, with 180 GB of HBM3e memory and 8 TB/s of bandwidth, is designed for ultra-large model inference.

Deployment is straightforward through a one-click catalog, pre-built containers, or GitHub integration, in a scalable, cost-efficient environment with built-in observability and reactive autoscaling. Recent expansions include a 24% price reduction and increased GPU stock, underscoring Koyeb's commitment to accessible AI infrastructure.
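To make the per-second billing concrete, here is a minimal sketch of how a short burst is priced. The $2.20/hr and $5.50/hr figures are the endpoints of the range stated above; mapping them to specific GPU tiers is an assumption for illustration, not Koyeb's published price list.

```python
# Hypothetical tier-to-price mapping: only the $2.20-$5.50/hr range
# is stated in the announcement; the assignment below is illustrative.
HOURLY_PRICE = {
    "rtx-pro-6000": 2.20,  # assumed low end of the stated range
    "b200": 5.50,          # assumed high end of the stated range
}

def burst_cost(gpu: str, seconds: int) -> float:
    """Cost of a `seconds`-long run under per-second billing."""
    return HOURLY_PRICE[gpu] * seconds / 3600

# A 10-minute experiment costs well under a dollar on either tier:
for gpu, price in HOURLY_PRICE.items():
    print(f"{gpu} at ${price}/hr for 600 s: ${burst_cost(gpu, 600):.4f}")
```

Because billing stops the moment the workload does, a 10-minute experiment costs roughly $0.37 to $0.92 rather than a full hour's rate.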