GPT-5.2 Reaches Expert Level in Scientific Peer Review, Study Finds
A preprint study involving 45 scientists and 469 hours of evaluation across 82 papers found that GPT-5.2 is competitive even with top-rated reviewers in Nature's official peer review process, though the model retains identifiable weaknesses. The recommended practice is to combine AI and human reviewers. The authors note that AI reviewer quality continues to improve while human reviewer quality shows no comparable improvement over time, suggesting the performance gap will widen in AI's favor. Paper: arxiv.org/abs/2605.20668.
Why It Matters
Matching expert reviewers at a major scientific journal is a concrete threshold of professional competence. If AI review quality keeps improving across repeated review cycles while human review quality stays flat, the dynamic has structural implications for how scientific vetting will function.