Meet the New Standard for High-Performance, Low-Cost Inference: NVIDIA Dynamo 1.0 is now available to DigitalOcean Customers
Blog post from DigitalOcean
NVIDIA Dynamo 1.0, a high-performance inference serving framework for large-scale generative AI and reasoning models, has been released and is now available to DigitalOcean customers. The release offers a sevenfold increase in inference performance on NVIDIA GB200 NVL systems and enables significant cost reductions when integrated with DigitalOcean's Agentic Inference Cloud. Key technical advancements include KV-aware routing (sending a request to a worker that already holds the relevant key-value cache), disaggregated serving (running the prefill and decode phases on separate GPUs), and memory offloading (moving KV cache out of GPU memory into cheaper memory tiers), which together improve GPU utilization and reduce latency. The collaboration between NVIDIA and DigitalOcean has already produced cost savings and performance improvements for customers such as Workato, which achieved 67% higher throughput and significantly lower latency running Dynamo on DigitalOcean's Kubernetes platform. The partnership promises further advances in inference optimization, supported by new product releases and updates from NVIDIA GTC, including enhancements to DigitalOcean's infrastructure and additional AI tools.
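To make the idea behind KV-aware routing concrete, here is a minimal Python sketch. It is not Dynamo's API or its actual routing algorithm: the `Worker` class, the `route` function, the block size, and the cost formula are all hypothetical, chosen only to illustrate the general technique of preferring the worker that already caches the longest prefix of a prompt while still accounting for load.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # tokens per KV block (hypothetical value for illustration)


def block_hashes(tokens, block_size=BLOCK_SIZE):
    """Hash each full block of the prompt. Every hash chains in the previous
    one, so a block matches only if the entire prefix before it matches."""
    hashes, prev = [], 0
    full_len = len(tokens) - len(tokens) % block_size
    for i in range(0, full_len, block_size):
        prev = hash((prev, tuple(tokens[i:i + block_size])))
        hashes.append(prev)
    return hashes


@dataclass
class Worker:
    name: str
    active_requests: int = 0
    cached_blocks: set = field(default_factory=set)


def cached_prefix_blocks(worker, hashes):
    """Count how many leading blocks of the prompt this worker already caches."""
    count = 0
    for h in hashes:
        if h not in worker.cached_blocks:
            break
        count += 1
    return count


def route(workers, tokens, load_weight=1.0):
    """Pick the worker with the lowest cost, where cost is the number of
    blocks it would have to recompute plus a penalty for its current load."""
    hashes = block_hashes(tokens)

    def cost(worker):
        missing = len(hashes) - cached_prefix_blocks(worker, hashes)
        return missing + load_weight * worker.active_requests

    best = min(workers, key=cost)
    best.active_requests += 1
    best.cached_blocks.update(hashes)  # the worker caches the prompt after serving it
    return best


workers = [Worker("gpu-a"), Worker("gpu-b")]
shared_prefix = list(range(12))                       # e.g. a common system prompt
first = route(workers, shared_prefix)                 # cold start: tie, first worker wins
second = route(workers, shared_prefix + [99, 100, 101, 102])  # reuses the cached prefix
third = route(workers, list(range(100, 112)))         # unrelated prompt goes to the idle worker
print(first.name, second.name, third.name)
```

In this toy setup the second request lands on the same worker as the first because most of its KV blocks are already cached there, while the unrelated third request is steered to the idle worker. A production router would also have to handle cache eviction and request completion, which this sketch leaves out.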