Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

ArXiv ID: 2603.19220

Summary: Nemotron-Cascade 2 is an open 30B Mixture-of-Experts (MoE) model with 3B activated parameters that delivers best-in-class reasoning and strong agentic capabilities. Its key technical advances are Cascade RL applied across a broad range of reasoning and agentic domains, and multi-domain on-policy distillation from strong intermediate teacher models. On mathematical and coding tasks, its performance is competitive with that of much larger frontier models.
