NVIDIA Releases Nemotron 3 Ultra: 550B Open Model
NVIDIA has released Nemotron 3 Ultra as an open model: a 550B-parameter Mixture-of-Experts model (55B active parameters per token) built on a hybrid Mamba-2/Transformer architecture with a 1M-token context window. Pretrained on 20 trillion tokens in NVFP4, it delivers, according to NVIDIA, 5× faster inference and up to 30% lower cost than comparable open frontier models. Full weights, training recipes, base, post-trained, and reward checkpoints, and an NVFP4-quantized version are available on Hugging Face under the OpenMDW-1.1 open model license.
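To put those headline numbers in perspective, here is a rough back-of-the-envelope sketch in Python. The parameter counts come from the announcement; the bits-per-weight figure for NVFP4 is an assumption (4-bit values plus one 8-bit scale per 16-value block, about 4.5 bits per weight), and real checkpoints may keep some layers at higher precision, so treat the results as estimates rather than published specs.

```python
# Back-of-the-envelope figures derived from the numbers quoted in the announcement.
TOTAL_PARAMS = 550e9    # total parameters (from the announcement)
ACTIVE_PARAMS = 55e9    # parameters active per token (from the announcement)

# Assumption: NVFP4 stores 4-bit values plus one 8-bit scale per 16-value block,
# i.e. roughly 4.5 bits per weight. Actual checkpoints may differ.
BITS_PER_WEIGHT_NVFP4 = 4 + 8 / 16
BITS_PER_WEIGHT_BF16 = 16


def weight_gigabytes(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (weights only, no KV cache)."""
    return params * bits_per_weight / 8 / 1e9


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
nvfp4_gb = weight_gigabytes(TOTAL_PARAMS, BITS_PER_WEIGHT_NVFP4)
bf16_gb = weight_gigabytes(TOTAL_PARAMS, BITS_PER_WEIGHT_BF16)

print(f"Active per token: {active_fraction:.0%} of all parameters")
print(f"Approx. weights in NVFP4: {nvfp4_gb:.0f} GB")
print(f"Approx. weights in BF16:  {bf16_gb:.0f} GB")
```

Under these assumptions, only about a tenth of the model runs for any given token, and the NVFP4 weights come to roughly 309 GB versus about 1,100 GB in BF16.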
Why It Matters
Nemotron 3 Ultra is the most capable fully open model released to date, with Day 0 support from LangChain, Hugging Face Transformers, and the Nemotron Coalition. That combination directly reshapes the economics of large-scale agentic deployments, which can now run at a fraction of closed-frontier pricing.