NVIDIA Releases Nemotron-Cascade 2: 30B Open Model Achieves IMO Gold Medal with Remarkable Parameter Efficiency
Key Takeaways
- Nemotron-Cascade 2 achieves gold-medal performance on IMO 2025 and IOI 2025 with only 3B activated parameters, a roughly 20× efficiency gain over comparable frontier models
- NVIDIA's Cascade RL framework, extended with multi-domain on-policy distillation, enables effective post-training across complex reasoning and agentic domains with minimal performance regression
- Full model weights, SFT data, and RL training data are open-sourced, democratizing access to state-of-the-art reasoning capabilities
Summary
NVIDIA has released Nemotron-Cascade 2, a 30-billion-parameter mixture-of-experts model that activates only 3 billion parameters per token, yet achieves gold-medal performance on the 2025 International Mathematical Olympiad (IMO) and International Olympiad in Informatics (IOI). It is only the second open-weight LLM to reach this level, doing so with roughly 20× fewer activated parameters than comparable frontier models. This result challenges the conventional wisdom that frontier-level reasoning requires massive model sizes.
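The 30B-total / 3B-activated gap comes from mixture-of-experts routing: each token is sent to only a few experts, so most weights sit idle on any given forward pass. The sketch below illustrates the idea with a toy top-k router; all sizes and names are hypothetical and chosen only so the activated fraction mirrors the reported ~10% ratio, not NVIDIA's actual architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical config: 20 experts, 2 active per token -> 10% of expert
# weights used per forward pass, mirroring the 30B-total / 3B-activated ratio.
n_experts, top_k = 20, 2
d_model, d_ff = 64, 256
params_per_expert = 2 * d_model * d_ff  # up- and down-projection matrices

rng = np.random.default_rng(1)
experts_w_in = rng.normal(size=(n_experts, d_model, d_ff)) * 0.02
experts_w_out = rng.normal(size=(n_experts, d_ff, d_model)) * 0.02
router_w = rng.normal(size=(d_model, n_experts)) * 0.02

def moe_forward(x):
    """Route one token to its top-k experts; only their weights are touched."""
    gate = softmax(x @ router_w)                # routing probabilities
    chosen = np.argsort(gate)[-top_k:]          # indices of the top-k experts
    out = np.zeros_like(x)
    for i in chosen:
        h = np.maximum(x @ experts_w_in[i], 0)  # ReLU FFN expert
        out += gate[i] * (h @ experts_w_out[i])
    return out, chosen

x = rng.normal(size=d_model)
y, chosen = moe_forward(x)
total = n_experts * params_per_expert
active = top_k * params_per_expert
print(f"activated fraction per token: {active / total:.2f}")  # 0.10
```

Because compute scales with activated rather than total parameters, a model like this can carry a large knowledge capacity while paying the inference cost of a much smaller dense model.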
The technical advance builds on NVIDIA's Cascade RL framework, which has been substantially expanded to cover broader reasoning and agentic domains. A key innovation is multi-domain on-policy distillation, which uses intermediate teacher models during training to recover performance regressions while preserving strong gains across diverse reasoning tasks. NVIDIA has released the complete model weights, supervised fine-tuning (SFT) dataset, and reinforcement learning training data, enabling community researchers to build on this work. Across mathematics, code reasoning, and instruction-following benchmarks, Nemotron-Cascade 2 outperforms both the larger Qwen3.5-35B and Nemotron-3-Super-120B models.
The model's edge over significantly larger competitors on key benchmarks suggests that intelligence density depends more on post-training methodology than on parameter count alone.
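The core of on-policy distillation, as opposed to classic offline distillation, is that the teacher scores trajectories the *student itself* sampled, and the student minimizes a divergence to the teacher on those tokens. The following is a minimal numerical sketch of that loss (reverse KL on student-sampled positions); the function names are my own and this is an illustration of the general technique, not NVIDIA's implementation.

```python
import numpy as np

def softmax(logits, axis=-1):
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def on_policy_distill_loss(student_logits, teacher_logits):
    """Reverse-KL distillation loss on a student-sampled trajectory.

    Both arguments are (seq_len, vocab) logits that each model assigns
    at every position of the sequence the student generated. The loss is
    KL(student || teacher), averaged over positions, so gradients push
    the student toward the teacher exactly where the student explores.
    """
    p_s = softmax(student_logits)
    log_p_s = np.log(p_s + 1e-12)
    log_p_t = np.log(softmax(teacher_logits) + 1e-12)
    kl = (p_s * (log_p_s - log_p_t)).sum(axis=-1)
    return float(kl.mean())

rng = np.random.default_rng(0)
seq_len, vocab = 8, 32
student = rng.normal(size=(seq_len, vocab))
teacher = student + 0.1 * rng.normal(size=(seq_len, vocab))

# Identical distributions give zero loss; any divergence gives a positive one.
assert abs(on_policy_distill_loss(student, student)) < 1e-9
assert on_policy_distill_loss(student, teacher) > 0
```

In a multi-domain setting, one would compute this loss against a different intermediate teacher per domain (math, code, agentic tasks) and mix it with the RL objective, which is one plausible way the "recover regressions without losing gains" behavior described above could be achieved.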
Editorial Opinion
Nemotron-Cascade 2 is a watershed moment for open AI research, showing that gold-medal mathematical reasoning doesn't require 100+ billion activated parameters. By focusing on smarter training methodology rather than scale, NVIDIA has challenged the cost barriers that have gatekept frontier-level reasoning. The decision to open-source the full training data, not just the weights, is commendable and raises the bar for reproducibility in the field.