DeepMind AI Co-Mathematician Scores 48% on FrontierMath Tier 4
Google DeepMind has unveiled an AI co-mathematician that achieves 48% accuracy on FrontierMath Tier 4 problems spanning group theory, Hamiltonian systems, and algebraic combinatorics, the highest score any AI system has recorded on this benchmark. The multi-agent system operates in two modes: an autonomous evaluation mode and a collaborative mode alongside human researchers.
Why It Matters
FrontierMath Tier 4 consists of formally verified, novel problems that cannot be pattern-matched from training data. A 48% score signals that AI is entering mathematical territory previously exclusive to specialist researchers, a capability boundary being crossed in real time and ahead of most forecast timelines.