NVIDIA Nemotron 3 Super
Blog post from Daily
NVIDIA has introduced Nemotron 3 Super, an open-source large language model (LLM) intended to support voice AI development as an alternative to proprietary API services. Together with the previously released Nemotron 3 Nano and Nemotron Speech ASR, it forms a comprehensive toolkit for voice AI applications. In long-conversation benchmarks, Nemotron 3 Super performs on par with the latest GPT-5.4 models and outperforms widely used models such as GPT-4.1 and Gemini 2.5 Flash. Despite these results, its inference speed is not yet fully optimized, a consequence of its architecture, which combines transformer and Mamba layers. Voice agents depend on low latency and high accuracy for tasks like tool calling, which makes reasoning-heavy models difficult to use, but because Nemotron 3 Super is open source, it can be fine-tuned for these workloads. The model's architecture leaves room for improvements in tool calling, which could help address the latency issues that are critical for real-time AI systems. NVIDIA's DGX B200 hardware and the Modal AI cloud support deployment and testing of the model, and the open-source community is positioned to further refine and optimize it.