Alibaba's Happy Horse Ranks #1 on Artificial Analysis But Fails Tests

Alibaba's Happy Horse video generation model ranks approximately 100 points above Seedance 2.0 on the Artificial Analysis leaderboard but visibly violates physics and strays from prompts in independent head-to-head tests, raising benchmark contamination concerns.

1 min read | agenticonsult Intelligence

Alibaba's Happy Horse video generation model ranks approximately 100 points above Seedance 2.0 on the Artificial Analysis video leaderboard, but it violates physics and drifts from the prompt in independent real-world tests. The model is available for free on Alibaba's platform. Independent reviewers ran direct head-to-head comparisons against Seedance 2.0 on princess and zoom-shot prompts and found Seedance 2.0 clearly superior. The gap between leaderboard ranking and real-world performance makes benchmark contamination a plausible explanation.
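
For a sense of scale, the roughly 100-point gap can be translated into an expected head-to-head preference rate. The sketch below assumes the leaderboard score behaves like a standard Elo rating (base-10 logistic curve, 400-point scale). That is an assumption for illustration, not something the source states, and the function name `expected_win_rate` is hypothetical.

```python
def expected_win_rate(rating_gap: float, scale: float = 400.0) -> float:
    """Expected win rate of the higher-rated model under a standard Elo model.

    Assumption: the leaderboard score follows the usual Elo convention
    (base-10 logistic, 400-point scale). The source does not confirm this.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


if __name__ == "__main__":
    gap = 100.0  # approximate lead reported for Happy Horse over Seedance 2.0
    print(f"Expected preference rate at a {gap:.0f}-point gap: "
          f"{expected_win_rate(gap):.2f}")
```

Under that assumption, a 100-point lead implies the higher-ranked model would be preferred in about 64% of pairwise votes, which is what makes a clear loss in independent side-by-side tests notable.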

Why It Matters

Leaderboard-first model releases are becoming a reliable anti-pattern in 2026; treating self-reported Artificial Analysis rankings as marketing rather than ground truth is now the default stance for practitioners.

This breaking-news item was assembled from the cited primary source with AI assistance. It is intended for rapid situational awareness — refer to the original publication for the definitive statement.