DeepSeek V4 Released: 1.6T Parameters, 1M Context, Open-Source

DeepSeek has released V4, a fully open-source language model with 1.6 trillion parameters and a native 1-million-token context window. The architecture combines three-pathway hybrid attention (Compressed Sparse Attention, Heavily Compressed Attention, and sliding-window attention), Manifold-Constrained Hyperconnections to prevent signal explosion at trillion-parameter scale, and the two-phase Muon optimizer. The result: 3.7× fewer FLOPs and a 10× smaller KV cache versus V3.2. V4 scored a perfect 120/120 on the 2025 Putnam exam and currently ranks #2 on the Artificial Analysis open-source leaderboard, matching or outperforming Opus 4.6 Max.
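To make the three-pathway idea concrete, here is a minimal PyTorch sketch of one hybrid attention layer. Only the pathway names come from the announcement; the class name, the strides, the mean-pooled compression, and the learned gate mix are all illustrative assumptions, not DeepSeek's published design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ThreePathwayAttention(nn.Module):
    """Illustrative sketch of a three-pathway hybrid attention block.

    Pathway names follow the V4 announcement; every concrete detail
    (strides, pooling, learned gates) is an assumption for exposition.
    """

    def __init__(self, d_model: int, n_heads: int, window: int = 512,
                 sparse_stride: int = 4, heavy_stride: int = 16):
        super().__init__()
        self.h, self.d = n_heads, d_model // n_heads
        self.window, self.sparse, self.heavy = window, sparse_stride, heavy_stride
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        self.gates = nn.Parameter(torch.zeros(3))  # learned pathway mix

    def _attend(self, q, k, v, mask):
        # q: (B, H, Tq, d); k, v: (B, H, Tk, d); mask: (Tq, Tk), True = visible
        scores = (q @ k.transpose(-2, -1)) / self.d ** 0.5
        scores = scores.masked_fill(~mask, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        return torch.nan_to_num(attn) @ v  # rows with no visible keys -> 0

    def forward(self, x):
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(B, T, self.h, self.d).transpose(1, 2)
                   for t in (q, k, v))
        pos = torch.arange(T, device=x.device)
        causal = pos[None, :] <= pos[:, None]

        # Pathway 1: sliding window over the most recent `window` tokens.
        local = causal & (pos[:, None] - pos[None, :] < self.window)
        p1 = self._attend(q, k, v, local)

        # Pathway 2: "compressed sparse" -- a strided subset of past keys.
        idx = torch.arange(0, T, self.sparse, device=x.device)
        p2 = self._attend(q, k[:, :, idx], v[:, :, idx], causal[:, idx])

        # Pathway 3: "heavily compressed" -- mean-pooled key/value blocks,
        # visible only once the whole block lies in the past.
        s = self.heavy
        Tp = (T // s) * s
        if Tp:
            kc = k[:, :, :Tp].reshape(B, self.h, Tp // s, s, self.d).mean(3)
            vc = v[:, :, :Tp].reshape(B, self.h, Tp // s, s, self.d).mean(3)
            block_end = torch.arange(1, Tp // s + 1, device=x.device) * s - 1
            p3 = self._attend(q, kc, vc, block_end[None, :] <= pos[:, None])
        else:
            p3 = torch.zeros_like(p1)

        w = F.softmax(self.gates, dim=0)
        y = (w[0] * p1 + w[1] * p2 + w[2] * p3).transpose(1, 2).reshape(B, T, -1)
        return self.out(y)


# Smoke test on random input.
x = torch.randn(2, 1024, 256)
y = ThreePathwayAttention(d_model=256, n_heads=8)(x)
assert y.shape == x.shape
```

A layout like this only ever materializes local, strided, or pooled keys, which is the general mechanism behind the FLOPs and KV-cache reductions the release claims; the actual compression and routing details are not public in this announcement.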

Why It Matters

A compute-constrained team releasing a frontier-tier model as fully open-source infrastructure, including kernel code, continues to narrow the capability gap between the open and closed ecosystems.