GPT-5.2 Reaches Expert Level in Scientific Peer Review, Study Finds

A 45-scientist, 469-hour study across 82 papers found GPT-5.2 competitive with top-rated reviewers in Nature's peer review. Researchers recommend combining AI and human reviewers, noting AI reviewers improve while human reviewers do not.

agenticonsult Intelligence

A preprint study involving 45 scientists and 469 hours of evaluation across 82 papers found that GPT-5.2 is competitive even with top-rated reviewers in Nature's official peer review process, though the model retains identifiable weaknesses. The authors recommend combining AI and human reviewers. They also note that AI reviewer quality continues to improve while human reviewer quality shows no comparable improvement over time, which suggests the performance gap will widen in AI's favor. The paper is available at arxiv.org/abs/2605.20668.

Why It Matters

Expert-level performance in peer review at a major scientific journal is a concrete threshold of professional competence. If AI review quality keeps improving while human review quality stays flat, the gap compounds over time, with structural implications for how scientific vetting will function.

This breaking-news item was assembled from the cited primary source with AI assistance. It is intended for rapid situational awareness — refer to the original publication for the definitive statement.