Introducing Next-Gen Training and Inference at Scale on Lambda Instances with NVIDIA Blackwell
Blog post from Lambda
NVIDIA Blackwell GPUs, featured in the NVIDIA HGX B200, are now available on-demand via Lambda Instances. Built for training and inference of large AI models, including trillion-parameter foundation models, they deliver a significant performance boost over previous generations: up to 2.25x the FP8 throughput of the NVIDIA HGX H100, 180GB of HBM3e memory per GPU, and native FP4 support, making them well suited to modern AI workloads.

Users can launch 8x NVIDIA Blackwell GPU instances instantly and pay per use, backed by production-ready infrastructure that supports high-throughput inference pipelines and low-latency model hosting. The Blackwell architecture pairs a second-generation Transformer Engine with advanced NVLink interconnects, enabling real-time inference on large language models.

Available without long-term commitments, this infrastructure offers a scalable, cost-efficient option for enterprises training and deploying models at scale.
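As a rough illustration of the pay-per-use launch flow, the sketch below constructs a launch request against Lambda's public Cloud API. The endpoint path follows Lambda's documented instance-operations convention, but the instance type name (`gpu_8x_b200`), region, and SSH key name here are placeholder assumptions; consult the current Lambda Cloud API documentation for the exact values.

```python
import json
import os
import urllib.request

# Assumed endpoint, following Lambda's public Cloud API conventions --
# verify the path and field names against the current API docs.
API_URL = "https://cloud.lambdalabs.com/api/v1/instance-operations/launch"

def build_launch_request(api_key: str,
                         instance_type: str = "gpu_8x_b200",  # hypothetical name
                         region: str = "us-east-1",           # hypothetical region
                         ssh_key: str = "my-key") -> urllib.request.Request:
    """Construct (but do not send) a request to launch an 8x GPU instance."""
    payload = {
        "region_name": region,
        "instance_type_name": instance_type,
        "ssh_key_names": [ssh_key],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_launch_request(api_key=os.environ.get("LAMBDA_API_KEY", "demo"))
# Sending is left to the caller: urllib.request.urlopen(req) would launch
# (and begin billing for) a real instance.
```

Because billing starts on launch, the sketch deliberately stops short of sending the request; in practice you would submit it, poll the instance status, and terminate the instance when the job finishes.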