NVIDIA Launches Nemotron 3 Super: Open 120B Model Delivers 5x Higher Throughput for Agentic AI Systems
Key Takeaways
- ▸Nemotron 3 Super achieves up to 5x higher throughput and up to 2x higher accuracy than its predecessor through a hybrid mixture-of-experts (MoE) architecture that combines Mamba and transformer layers
- ▸1-million-token context window and latent MoE techniques address key multi-agent challenges including context explosion and goal drift in long-running workflows
- ▸Open-weight release with comprehensive training data, methodologies, and recipes enables widespread adoption and customization across enterprises and AI-native companies
Summary
NVIDIA has launched Nemotron 3 Super, a 120-billion-parameter open-weight model with only 12 billion active parameters, designed specifically for complex multi-agent AI systems at scale. The model achieves up to 5x higher throughput and up to 2x higher accuracy than its predecessor through a hybrid mixture-of-experts (MoE) architecture that combines Mamba and transformer layers, along with newer techniques such as latent MoE and multi-token prediction. Nemotron 3 Super also features a 1-million-token context window that helps counter goal drift in long-running agent workflows, and it ranks at the top of multiple efficiency benchmarks, including the Artificial Analysis leaderboards.
The open-weight model is already being integrated by AI-native companies including Perplexity, CodeRabbit, Factory, and Greptile, as well as enterprise software platforms such as Palantir, Amdocs, and Siemens. Applications range from search and code generation to semiconductor design and telecom automation. NVIDIA is releasing the model together with its complete training methodology, including datasets totaling more than 10 trillion tokens, 15 reinforcement learning environments, and evaluation recipes, so that developers can deploy and customize it across workstations, data centers, and cloud environments.
The model ranks #1 on Artificial Analysis for efficiency and openness, and it powers NVIDIA's AI-Q research agent to top positions on the DeepResearch Bench leaderboards.
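The gap between 120 billion total parameters and 12 billion active parameters comes from sparse expert routing: for each token, a router selects a few experts and the rest stay idle. The sketch below illustrates generic top-k mixture-of-experts routing only. It is not NVIDIA's latent MoE or the actual Nemotron 3 Super router, and the function names, the eight toy experts, the router scores, and the choice of k=2 are all hypothetical values chosen for illustration.

```python
import math


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected logits only, so they sum to 1.
    This is a generic top-k gate, not the routing used in Nemotron 3 Super.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def active_fraction(total_params, active_params):
    """Share of the model's parameters used for a single token."""
    return active_params / total_params


# Figures reported in the article: 120B total, 12B active parameters.
fraction = active_fraction(120e9, 12e9)

# Hypothetical router scores for one token over 8 toy experts.
routes = top_k_route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.3, 0.2], k=2)

print(f"active share of parameters: {fraction:.0%}")
for expert, weight in routes:
    print(f"expert {expert} handles this token with weight {weight:.3f}")
```

With the article's figures, roughly one tenth of the parameters take part in processing any given token, which is the basic reason a sparse model can offer higher throughput than a dense model of the same total size.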
Editorial Opinion
Nemotron 3 Super represents a significant shift toward practical, efficient AI agents that can handle real-world complexity at reasonable cost. By open-sourcing not just the model weights but the entire training methodology and datasets, NVIDIA is democratizing agentic AI development and setting a new standard for transparency in frontier model releases. The focus on solving concrete pain points—context explosion, goal drift, and inference efficiency—suggests the company understands that agent viability depends on economics and reliability, not raw capability.



