Google DeepMind AI Co-Mathematician Hits 48% on FrontierMath Tier 4

Google DeepMind's 'AI Co-Mathematician' (a stateful, multi-workstream agentic research workbench) scored 48% on FrontierMath Tier 4, the hardest tier of the FrontierMath mathematical reasoning benchmark, setting a new high-water mark.

1 min read | agenticonsult Intelligence

Google DeepMind's "AI Co-Mathematician" (a stateful, asynchronous, multi-workstream agentic research workbench) scored 48% on FrontierMath Tier 4, the hardest tier of the FrontierMath mathematical reasoning benchmark, setting a new high-water mark. Active sessions solved open problems and recovered overlooked citations, demonstrating generalization to expert research workflows in which sessions span days rather than minutes.

Why It Matters

A 48% score on the hardest FrontierMath tier (versus 39–54% for turn-based competitors) puts AI research assistance in expert-mathematician territory on formal tasks. The multi-workstream, session-persistent design is the architecture pattern to watch for long-horizon agentic work.

This breaking-news item was assembled from the cited primary source with AI assistance. It is intended for rapid situational awareness — refer to the original publication for the definitive statement.