Together AI Brings NVIDIA Nemotron 3 to Developers on Day 0
Blog post from Together AI
NVIDIA's Nemotron 3 Super is a 120-billion-parameter hybrid model that combines Transformer and Mamba architectures and is designed for complex reasoning and multi-agent orchestration. It is optimized for high-throughput inference and is available on Together AI's Dedicated Inference platform, which handles infrastructure requirements such as GPU provisioning for users. The model features several innovations: a Hybrid Mixture-of-Experts architecture that activates only a subset of its parameters for each token, a 1-million-token context window for processing very long inputs, and multi-token prediction to accelerate output generation. These make it particularly suited to large document analysis, multi-step planning, and agent coordination. Nemotron 3 Super was trained with reinforcement learning and synthetic data. Its weights are open, and engineering teams can customize the model to fit their specific needs.
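To make the idea of "active parameters" in a Mixture-of-Experts model more concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not NVIDIA's implementation: the post does not give expert counts, routing details, or layer sizes, so `NUM_EXPERTS`, `TOP_K`, and the parameter counts below are hypothetical values chosen only for illustration.

```python
import math
import random


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and return (index, weight) pairs.

    Weights are a softmax over the selected logits only, so they sum to 1.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def active_fraction(num_experts, k, expert_params, shared_params):
    """Fraction of all parameters that are used for a single token."""
    total = shared_params + num_experts * expert_params
    active = shared_params + k * expert_params
    return active / total


# Hypothetical configuration, NOT the real Nemotron 3 Super settings.
NUM_EXPERTS = 8
TOP_K = 2

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in router scores
routes = top_k_route(logits, TOP_K)

for expert, weight in routes:
    print(f"token routed to expert {expert} with weight {weight:.3f}")

# With 8 experts of 10 units each plus 20 shared units, using 2 experts
# touches (20 + 2*10) / (20 + 8*10) of the parameters.
print(f"active fraction: {active_fraction(NUM_EXPERTS, TOP_K, 10, 20):.2f}")
```

The point of the sketch is that only the selected experts run for a given token, so the compute per token scales with the active parameters rather than with the full parameter count.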