Endless Terminals RL Task Dataset Hits 73k HF Downloads in One Month

Endless Terminals, an internship project by Kanishk Gandhi, autonomously generates terminal tasks for reinforcement learning, with agents trained using simple PPO on the scaled-up environments and no human annotation required. The dataset reached 73,000+ Hugging Face downloads in its first month and shows consistent downstream improvements on TerminalBench 2.0.

Why It Matters

Generating RL tasks without human annotation directly addresses the scaling bottleneck for RL-trained agents. If pipelines can produce diverse, high-quality training tasks on their own, the "data wall" that limits what RL agents can be trained to do becomes a solvable engineering problem, with implications for any team training agents on terminal or code-execution tasks.