NVIDIA Releases Nemotron 3 Ultra: 550B Open Model

NVIDIA releases Nemotron 3 Ultra, a 550B-parameter MoE model with a 1M-token context window, 5x faster inference than comparable open frontier models, and full open weights on Hugging Face under the OpenMDW-1.1 license.

1 min read | agenticonsult Intelligence

NVIDIA has released Nemotron 3 Ultra as an open model: a 550B-parameter Mixture-of-Experts (MoE) model with 55B active parameters, built on a hybrid Mamba-2/Transformer architecture with a 1M-token context window. Pretrained on 20 trillion tokens in NVFP4, it delivers 5× faster inference and up to 30% lower cost than comparable open frontier models. The full weights, training recipes, base, post-trained, and reward checkpoints, and an NVFP4-quantized version are all available on Hugging Face under the OpenMDW-1.1 open model license.
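
To put the headline numbers in perspective, the following is a rough back-of-the-envelope sketch in Python, not a figure from NVIDIA. It uses only the 550B total and 55B active parameter counts quoted above. The function names are illustrative, and the memory estimate assumes exactly 4 bits per weight for NVFP4 and 16 bits for a BF16 baseline, ignoring quantization scale factors, activations, and KV or state caches, so real footprints will be somewhat larger.

```python
# Rough sizing sketch based on the parameter counts quoted in the article.
# Assumption: 4 bits/weight for NVFP4 and 16 bits/weight for a BF16 baseline,
# with no allowance for scale factors, activations, or caches.

TOTAL_PARAMS = 550e9   # total parameters (from the article)
ACTIVE_PARAMS = 55e9   # active parameters (from the article)


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters that are active in the MoE."""
    return active / total


def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
nvfp4_gb = weight_memory_gb(TOTAL_PARAMS, 4)
bf16_gb = weight_memory_gb(TOTAL_PARAMS, 16)

print(f"Active fraction: {frac:.0%}")
print(f"Weights at 4 bits: ~{nvfp4_gb:.0f} GB")
print(f"Weights at 16 bits: ~{bf16_gb:.0f} GB")
```

Under these assumptions, about one tenth of the parameters are active, and the 4-bit weights occupy roughly a quarter of the space of a 16-bit copy. That is the basic reason a sparse, low-precision model can be cheaper to serve than its total parameter count suggests.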

Why It Matters

Nemotron 3 Ultra is the most capable fully open model released to date. With Day 0 support from LangChain, Hugging Face Transformers, and the Nemotron Coalition, it stands to reshape the economics of large-scale agentic deployments, offering them at a fraction of closed-frontier pricing.

This breaking-news item was assembled from the cited primary source with AI assistance. It is intended for rapid situational awareness; refer to the original publication for the definitive statement.