Vantage Launches Kubernetes GPU Idle Costs: Calculate Efficiency for AI-Intensive Workloads
Blog post from Vantage
Vantage has introduced a feature that collects and reports Kubernetes GPU idle costs, helping users identify underutilized resources in AI-intensive workloads. It is available with Vantage Kubernetes agent version 1.0.26 or later and requires the NVIDIA GPU Operator to be installed on the cluster. Kubernetes efficiency reports previously calculated idle costs from CPU and RAM only; they now also incorporate GPU memory usage, giving a more complete view of resource utilization. The usage data is gathered with the NVIDIA DCGM Exporter, and reports update within 48 hours, once costs from the infrastructure provider have been ingested. The feature carries no additional cost and supports whole GPU requests. Compatibility is currently limited to NVIDIA GPUs on AWS infrastructure.
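To make the idea of a GPU idle cost concrete, here is a minimal Python sketch. It assumes idle cost is the share of a workload's GPU cost that corresponds to unused GPU memory; the post does not give Vantage's actual formula, so the calculation, the `GpuUsageSample` type, and both function names are hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass
class GpuUsageSample:
    """One workload's GPU figures for a billing period (hypothetical shape)."""
    gpu_cost_usd: float      # cost attributed to the whole GPUs the workload requested
    memory_used_mib: float   # average GPU memory in use over the period
    memory_total_mib: float  # total GPU memory on the requested GPUs


def gpu_idle_cost(sample: GpuUsageSample) -> float:
    """Return the part of the GPU cost attributed to unused GPU memory.

    Assumption: idle cost scales linearly with the unused memory fraction.
    """
    if sample.memory_total_mib <= 0:
        raise ValueError("memory_total_mib must be positive")
    # Clamp utilization to [0, 1] so noisy samples cannot yield negative idle cost.
    utilization = sample.memory_used_mib / sample.memory_total_mib
    utilization = min(max(utilization, 0.0), 1.0)
    return sample.gpu_cost_usd * (1.0 - utilization)


def efficiency_percent(total_cost_usd: float, idle_cost_usd: float) -> float:
    """Return the percentage of cost that was not idle."""
    if total_cost_usd <= 0:
        raise ValueError("total_cost_usd must be positive")
    return 100.0 * (total_cost_usd - idle_cost_usd) / total_cost_usd


if __name__ == "__main__":
    # Made-up figures: $100 of GPU cost, 4 GiB used out of 16 GiB.
    sample = GpuUsageSample(gpu_cost_usd=100.0, memory_used_mib=4096, memory_total_mib=16384)
    idle = gpu_idle_cost(sample)
    print(f"idle cost: ${idle:.2f}")
    print(f"efficiency: {efficiency_percent(sample.gpu_cost_usd, idle):.1f}%")
```

With these made-up figures, a workload using a quarter of its GPU memory would have three quarters of its GPU cost counted as idle. A real report would combine this with the CPU and RAM idle costs that the efficiency reports already include.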