VT Preprint: AI's Own Skills Beat Human-Defined Skill Files in SFT

A Virginia Tech preprint dated April 19, 2026 extracts "model-native skills" (latent axes of behavioral variation) directly from LLM residual-stream activations using sparse autoencoders, then uses those directions to select supervised fine-tuning (SFT) data. On Llama-3 8B, model-native SFT scores 39.6 versus 38.4 for the best human-defined skill file; on Qwen 2.5 3B, 57.7 versus 56. The bigger finding is a 20% gain on MATH-1 and a 41% gain on AMC from selecting SFT data via a "representation error curriculum," which prioritizes the activation directions where the model is currently weakest, rather than selecting for textual diversity.
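
The mechanics are easiest to see in miniature. The Python sketch below fits a toy sparse autoencoder on stand-in residual-stream activations, treats the decoder rows as candidate model-native skill directions, and then ranks candidate SFT examples by reconstruction error as one plausible proxy for the paper's "representation error." The shapes, names, training loop, and synthetic data are all illustrative assumptions on our part; the paper's released code is the reference implementation.

```python
# Hypothetical sketch of SAE-based skill extraction plus a representation-error
# curriculum. Synthetic activations stand in for a real forward pass over the
# base model; nothing here is taken from the paper's actual code.
import numpy as np

rng = np.random.default_rng(0)

# Per-example pooled residual-stream activations (stand-in random data).
n_examples, d_model, n_latents = 512, 256, 64
acts = rng.normal(size=(n_examples, d_model)).astype(np.float32)

# --- 1. Fit a minimal sparse autoencoder (ReLU encoder, L1 penalty) ---
W_enc = rng.normal(scale=0.02, size=(d_model, n_latents)).astype(np.float32)
W_dec = W_enc.T.copy()
b_enc = np.zeros(n_latents, dtype=np.float32)
lr, l1 = 1e-3, 1e-3

for _ in range(200):
    z = np.maximum(acts @ W_enc + b_enc, 0.0)   # sparse latent codes
    recon = z @ W_dec                           # reconstructed activations
    err = recon - acts
    # Gradients of 0.5*||err||^2 + l1*||z||_1 w.r.t. the parameters.
    dz = err @ W_dec.T
    dz[z <= 0] = 0.0                            # ReLU mask
    dz += l1 * (z > 0)                          # L1 subgradient on active units
    W_dec -= lr * (z.T @ err) / n_examples
    W_enc -= lr * (acts.T @ dz) / n_examples
    b_enc -= lr * dz.mean(axis=0)

# Each decoder row is treated as one candidate "model-native skill" direction.
skill_dirs = W_dec / np.linalg.norm(W_dec, axis=1, keepdims=True)

# --- 2. Representation-error curriculum ---
# Score each candidate SFT example by how poorly the SAE reconstructs its
# activation, i.e. how far it falls outside the skills the model already
# encodes, and spend the data budget on the worst-represented examples.
z = np.maximum(acts @ W_enc + b_enc, 0.0)
rep_error = np.linalg.norm(z @ W_dec - acts, axis=1)

budget = 128
selected = np.argsort(-rep_error)[:budget]
print("selected example indices:", selected[:10])
```

Ranking whole examples by reconstruction error is only one reading of "weakest directions"; a per-direction variant would aggregate error along each decoder row and upweight examples that load on the under-reconstructed latents.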

Why It Matters

According to this paper, the Anthropic/Google skill-MD paradigm, which this publication and many of our peers build on, is a local maximum. The successor approach works in activation space, not prompt space. The code is publicly available, so this is a reproducible challenge to current best practice, not speculative theory.