Cursor Composer 2.5: 79.8% on SWE-Bench Multilingual at Under $1 per Task
Cursor has shipped Composer 2.5, which scores 79.8% on SWE-Bench Multilingual at approximately $1 per task, versus roughly $11 for comparable competitors. The model is built on the open Kimi K2.5 base and trained on 25× more synthetic tasks, with feedback given mid-task rather than a reward on the final output only. It is available only inside the Cursor IDE, with no public API. Separately, Cursor is training a from-scratch model on SpaceXAI's Colossus cluster (1M H100-equivalent GPUs).
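To make the distinction between final-output-only reward and mid-task feedback concrete, here is a toy sketch in Python. It is not Cursor's training setup, which is not described here. The `Step` type, the `check_passed` flag, and the `step_weight` value are all hypothetical, chosen only to show how the two reward schemes assign credit differently across one trajectory.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    action: str          # what the agent did at this step
    check_passed: bool   # hypothetical intermediate check (assumption)


def final_only_rewards(steps: List[Step], task_solved: bool) -> List[float]:
    """Only the last step carries a signal: 1.0 if the task was solved, else 0.0."""
    rewards = [0.0] * len(steps)
    if steps:
        rewards[-1] = 1.0 if task_solved else 0.0
    return rewards


def mid_task_rewards(steps: List[Step], task_solved: bool,
                     step_weight: float = 0.25) -> List[float]:
    """Every step gets a small signal from its own check, plus the final outcome."""
    rewards = [step_weight if s.check_passed else -step_weight for s in steps]
    if steps:
        rewards[-1] += 1.0 if task_solved else 0.0
    return rewards


trajectory = [
    Step("read the failing test", True),
    Step("edit an unrelated file", False),
    Step("fix the actual bug", True),
]

print(final_only_rewards(trajectory, task_solved=True))
print(mid_task_rewards(trajectory, task_solved=True))
```

In the final-only scheme the unhelpful second step is indistinguishable from the helpful first one, since both receive zero. With per-step feedback the second step is penalized directly, which is the kind of denser signal the phrase "mid-task feedback" suggests.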
Why It Matters
Composer 2.5 marks a cost-performance inflection: frontier-level coding benchmark results at commodity cost. The from-scratch training run on SpaceXAI's Colossus cluster signals Cursor's intent to own its model stack rather than depend on foundation lab APIs in the long term.