Nemotron 3 Super Provider Pricing Comparison (2026)
Blog post from DeepInfra
Nemotron 3 Super is an open-weight reasoning model from NVIDIA with 120 billion parameters, designed for reasoning, tool use, and instruction following, which makes it suitable for production workloads. The cost of running it varies significantly by provider and workload. OpenRouter offers the lowest token pricing at $0.09 per million input tokens and $0.45 per million output tokens, while DeepInfra charges $0.10 and $0.50 respectively and adds prompt caching, JSON and function calling support, and private endpoint options. Because the model tends to produce verbose output and output tokens cost 5x as much as input tokens, token generation needs careful management to keep expenses from escalating. The post highlights DeepInfra as the preferred choice for production deployments that require structured outputs and secure, controlled environments, citing lower latency and predictable costs. It emphasizes choosing a provider based on specific application needs and production requirements rather than on token cost alone.
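To make the pricing arithmetic concrete, here is a minimal Python sketch that estimates spend from the per-million-token rates quoted above. The two price pairs come from the post; the workload figures (tokens per request, requests per month) and the names `Pricing`, `request_cost`, and `monthly_cost` are hypothetical and chosen only for illustration. Prompt caching discounts are not modeled, since the post gives no figures for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD price per one million tokens."""
    input_per_million: float
    output_per_million: float


# Rates quoted in the post (USD per million input / output tokens).
PROVIDERS = {
    "OpenRouter": Pricing(0.09, 0.45),
    "DeepInfra": Pricing(0.10, 0.50),
}


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the given token counts."""
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


def monthly_cost(pricing: Pricing, input_tokens: int, output_tokens: int,
                 requests: int) -> float:
    """Cost in USD of `requests` identical requests."""
    return request_cost(pricing, input_tokens, output_tokens) * requests


if __name__ == "__main__":
    # Hypothetical workload: these numbers are assumptions, not from the post.
    input_tokens, output_tokens, requests = 2_000, 1_500, 100_000

    for name, pricing in PROVIDERS.items():
        total = monthly_cost(pricing, input_tokens, output_tokens, requests)
        output_part = monthly_cost(pricing, 0, output_tokens, requests)
        print(f"{name}: ${total:.2f}/month, "
              f"{output_part / total:.0%} of it from output tokens")
```

Under this assumed workload, output tokens dominate the bill even though each request sends more input than it receives, which is the effect of the 5x ratio. Capping the generation length through whatever output limit the provider's API exposes is therefore the most direct cost lever.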