Claude Code Regression: Three Harness Issues, One Public Post-Mortem
Anthropic acknowledged a Claude Code quality regression that ran from approximately March 26 to April 10, traced to three distinct harness changes whose interaction suppressed output quality. Fixes shipped in v2.1.116+, along with a usage limit reset for all affected subscribers. A public post-mortem at anthropic.com/engineering/april-23-postmortem names each root cause explicitly.
What the Source Actually Says
The technical analysis surfaced by the AI development community identifies three separate root causes. First, default reasoning effort was downgraded from high to medium, a configuration change that reduced the model's thinking budget per turn. Second, a bug in the thinking block eviction logic caused Claude to evict reasoning blocks on every session turn rather than only evicting blocks older than one hour (the intended cache-optimization threshold); this bug ran undetected from March 26 to April 10. Third, a system prompt change intended to curb verbose responses had the unintended side effect of degrading code quality along with verbosity.
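To make the second root cause concrete, here is a minimal sketch of the eviction decision, assuming an age-based cache policy; the names (`ThinkingBlock`, `should_evict_*`) and structure are hypothetical illustrations, not Claude Code's actual implementation. The buggy predicate ignores block age entirely, so every reasoning block is dropped on every turn:

```python
from dataclasses import dataclass

EVICTION_AGE_SECONDS = 3600.0  # intended threshold: blocks older than one hour

@dataclass
class ThinkingBlock:
    created_at: float  # seconds since some epoch
    content: str

def should_evict_buggy(block: ThinkingBlock, now: float) -> bool:
    # Regression behavior: the age check is effectively bypassed,
    # so every reasoning block is evicted on every session turn.
    return True

def should_evict_fixed(block: ThinkingBlock, now: float) -> bool:
    # Intended behavior: evict only blocks past the one-hour threshold.
    return (now - block.created_at) > EVICTION_AGE_SECONDS
```

Under the intended policy, a block created five minutes ago survives the turn; under the buggy predicate it does not, which is why the model lost its accumulated reasoning context each turn.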
The three changes were individually subtle but compounded during the window in which they overlapped. Users reported "quality slippage over the past month," consistent with the March 26 start date, and Anthropic's post-mortem published the root cause breakdown explicitly rather than defaulting to a vague resolution notice. The regression also triggered a broader debate in the developer community about open harnesses: had every harness change been publicly visible, community inspection might have caught the regression before user reports prompted an internal investigation.
Strategic Take
The post-mortem practice matters as much as the fix. Publishing specific root causes — not just "issues were identified and resolved" — is the difference between transparency as a marketing exercise and transparency as an engineering norm. For teams evaluating AI coding tools, this incident reinforces that harness configuration is as load-bearing as model capability: a top-tier model with a misconfigured harness underperforms a smaller model with a well-tuned one.