BotBeat

NVIDIA · PRODUCT LAUNCH · 2026-03-11

NVIDIA Launches Nemotron 3 Super: Open 120B Model Delivers 5x Higher Throughput for Agentic AI Systems

Key Takeaways

  • Nemotron 3 Super achieves 5x higher throughput and 2x higher accuracy than previous Nemotron models through a hybrid MoE architecture combining Mamba and transformer layers
  • A 1-million-token context window and latent MoE techniques address key multi-agent challenges, including context explosion and goal drift in long-running workflows
  • The open-weight release, with comprehensive training data, methodologies, and recipes, enables widespread adoption and customization across enterprises and AI-native companies
Sources:
  • https://blogs.nvidia.com/blog/nemotron-3-super-agentic-ai/
  • https://developer.nvidia.com/blog/introducing-nemotron-3-super-an-open-hybrid-mamba-transformer-moe-for-agentic-reasoning/
  • https://research.nvidia.com/labs/nemotron/Nemotron-3-Super/
  • https://twitter.com/ctnzr/status/2031762077325406428
  • https://www.thedeepview.com/articles/nvidia-boosts-open-models-with-nemotron-3-super

Summary

NVIDIA has launched Nemotron 3 Super, a 120-billion-parameter open-weight model with only 12 billion active parameters, designed specifically for complex multi-agent AI systems at scale. The model achieves up to 5x higher throughput and up to 2x higher accuracy compared to its predecessor through a hybrid mixture-of-experts architecture combining Mamba and transformer layers, along with novel techniques like latent MoE and multi-token prediction. Nemotron 3 Super features a 1-million-token context window that prevents goal drift in long-running agent workflows and ranks top on multiple efficiency benchmarks including Artificial Analysis leaderboards.
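The efficiency claim rests on sparse activation: of the 120 billion total parameters, only about 12 billion (roughly one tenth) are exercised for any given token. As a minimal sketch of how a mixture-of-experts layer achieves this — with toy sizes and a top-1 router chosen purely for illustration, not NVIDIA's actual architecture or expert counts:

```python
import numpy as np

# Hypothetical numbers for illustration: the article states 120B total
# parameters with 12B active, i.e. ~1/10 of expert weights used per token.
NUM_EXPERTS = 10   # assumed expert count, not from the article
TOP_K = 1          # experts activated per token (assumed)
D_MODEL = 64       # toy hidden size

rng = np.random.default_rng(0)
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02
           for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS)) * 0.02

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token to its top-k experts; only those weights are touched."""
    logits = x @ router                           # (tokens, experts)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]  # indices of chosen experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        gate = np.exp(logits[t, top[t]])
        gate /= gate.sum()                        # softmax over chosen experts
        for k, e in enumerate(top[t]):
            out[t] += gate[k] * (x[t] @ experts[e])
    return out

tokens = rng.standard_normal((4, D_MODEL))
y = moe_layer(tokens)
# Fraction of expert parameters active per token, mirroring 12B of 120B:
active_fraction = TOP_K / NUM_EXPERTS
print(active_fraction)  # 0.1
```

The design point is that compute per token scales with the active parameters (TOP_K experts) rather than the total parameter count, which is how a 120B model can run with the inference cost of a much smaller dense model.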

The open-weight model is already being integrated by AI-native companies including Perplexity, CodeRabbit, Factory, and Greptile, as well as enterprise software platforms like Palantir, Amdocs, and Siemens for applications ranging from search and code generation to semiconductor design and telecom automation. NVIDIA is releasing the model with complete training methodology, including over 10 trillion tokens of datasets, 15 reinforcement learning environments, and evaluation recipes, enabling developers to deploy and customize it across workstations, data centers, and cloud environments.

  • Model ranks #1 on Artificial Analysis for efficiency and openness, and powers NVIDIA's AI-Q research agent to top positions on DeepResearch Bench leaderboards

Editorial Opinion

Nemotron 3 Super represents a significant shift toward practical, efficient AI agents that can handle real-world complexity at reasonable cost. By open-sourcing not just the model weights but the entire training methodology and datasets, NVIDIA is democratizing agentic AI development and setting a new standard for transparency in frontier model releases. The focus on solving concrete pain points—context explosion, goal drift, and inference efficiency—suggests the company understands that agent viability depends on economics and reliability, not raw capability.

Large Language Models (LLMs) · Natural Language Processing (NLP) · Generative AI · AI Agents · Machine Learning · MLOps & Infrastructure · AI Hardware · Product Launch · Open Source

