Nemotron-3-Super-120B-A12B

NVIDIA Nemotron 3 Super is an open hybrid Mixture-of-Experts (MoE) model with 120B total parameters, activating just 12B per token for compute efficiency without sacrificing accuracy in complex multi-agent applications. Built on a hybrid Mamba-Transformer MoE architecture with multi-token prediction (MTP), it delivers over 50% higher token-generation throughput than leading open models. A 1M-token context window supports long-horizon agent coherence, cross-document reasoning, and multi-step task planning, and its latent MoE design evaluates four experts for roughly the inference cost of one, improving accuracy and generalization. Multi-environment RL training across more than ten environments yields leading scores on benchmarks including AIME 2025, TerminalBench, and SWE-Bench Verified. The model is fully open, with weights, datasets, and training recipes released under the NVIDIA Open License, allowing straightforward customization and secure deployment anywhere, from workstation to cloud.
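
The "A12B" suffix reflects the MoE pattern described above: all 120B parameters exist in memory, but routing selects experts totaling roughly 12B for any given token, so per-token compute scales with active rather than total parameters. The toy NumPy sketch below illustrates generic top-k expert routing to make that distinction concrete; it is not NVIDIA's latent MoE, and every dimension and name in it is made up for illustration.

```python
import numpy as np

# Toy MoE layer with generic top-k routing -- *not* NVIDIA's latent MoE.
# All sizes are illustrative only.
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

router = rng.normal(size=(d_model, n_experts))            # routing projection
experts = rng.normal(size=(n_experts, d_model, d_model))  # one weight matrix per expert

def moe_forward(x):
    """Route token vector x to its top-k experts and gate-mix their outputs."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]                        # chosen expert indices
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen experts
    # Only top_k of the n_experts weight matrices are touched per token:
    # this is why "active" parameters are a fraction of total parameters.
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

x = rng.normal(size=d_model)
y = moe_forward(x)
active = top_k * d_model * d_model
total = n_experts * d_model * d_model
print(f"output shape {y.shape}, active/total expert params = {active}/{total}")
```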

Providers: 1
Released: Mar 11, 2026
Input Modalities: text
Output Modalities: text
Task Use: coding

Available Providers (1)

Provider: Nebius Token Factory
Model ID: nvidia/nemotron-3-super-120b-a12b
Input Cost: $0.30/MTok
Output Cost: $0.90/MTok
Context: 256K tokens
Max Output: 32.8K tokens
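
Nebius Token Factory is assumed here to expose an OpenAI-compatible chat completions endpoint, as most hosted-inference providers do, so the model can be reached with the standard openai Python client. The base URL and environment variable name below are placeholders, not documented values; check the provider's docs for the actual endpoint.

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.studio.nebius.ai/v1/",  # assumed endpoint; verify in provider docs
    api_key=os.environ["NEBIUS_API_KEY"],         # assumed env var name
)

response = client.chat.completions.create(
    model="nvidia/nemotron-3-super-120b-a12b",    # model ID from the listing above
    messages=[
        {"role": "user", "content": "Write a Python function that merges two sorted lists."},
    ],
    max_tokens=1024,
)
print(response.choices[0].message.content)

# Rough cost at the listed rates ($0.30/MTok input, $0.90/MTok output):
# a 10,000-token prompt with a 2,000-token reply costs about
# 10_000/1e6 * 0.30 + 2_000/1e6 * 0.90 = $0.0048 per request.
```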

Capabilities

Reasoning
Tool Calling
Attachments
Open Weights
Structured Output
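
Since the listing advertises both tool calling and structured output, the model should accept the OpenAI-style `tools` and `response_format` parameters when served through a compatible endpoint. The sketch below shows a structured-output request constrained by a JSON schema; the schema itself is illustrative, and provider support for `json_schema` mode is an assumption worth verifying.

```python
import json
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.studio.nebius.ai/v1/",  # assumed endpoint, as above
    api_key=os.environ["NEBIUS_API_KEY"],
)

# Illustrative schema: constrain the reply to a small JSON object.
schema = {
    "type": "object",
    "properties": {
        "language": {"type": "string"},
        "complexity": {"type": "string"},
        "summary": {"type": "string"},
    },
    "required": ["language", "complexity", "summary"],
    "additionalProperties": False,
}

response = client.chat.completions.create(
    model="nvidia/nemotron-3-super-120b-a12b",
    messages=[
        {"role": "user", "content": "Classify this snippet: sorted(xs, key=len)"},
    ],
    # OpenAI-style structured output; json_schema support is assumed here.
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "code_classification", "schema": schema, "strict": True},
    },
)
print(json.loads(response.choices[0].message.content))
```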