LeWorldModel: First Pixel-Native JEPA — 15M Params, 48x Faster Planning

LeWorldModel, developed by Mila, NYU, Samsung SAIL, and Brown University (with no Meta authors), is the first JEPA (Joint Embedding Predictive Architecture) trained end-to-end from raw pixels. It uses just 15 million parameters, trains on a single GPU in a few hours, and achieves 48× faster planning than foundation-model-based world models while remaining competitive on 2D and 3D planning benchmarks. The architecture eliminates the exponential moving averages and pretrained encoders that previous JEPAs relied on to avoid representation collapse, reduces six hyperparameters to one, and fits on a laptop GPU.
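
The paper's training details aren't reproduced here, but the core JEPA idea, predicting the next observation's embedding rather than its pixels, is easy to sketch. Below is a minimal PyTorch rendering under stated assumptions: TinyEncoder, Predictor, the latent and action sizes, and the variance regularizer standing in for the paper's single anti-collapse hyperparameter are all illustrative, not LeWorldModel's actual design.

```python
# Hedged sketch of a JEPA-style training step: all module names and the
# anti-collapse regularizer below are illustrative assumptions, not the
# paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyEncoder(nn.Module):
    """Maps raw pixel frames to a compact latent vector."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, latent_dim)

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class Predictor(nn.Module):
    """Predicts the next latent from the current latent and an action."""
    def __init__(self, latent_dim=128, action_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1))

def jepa_step(encoder, predictor, opt, obs_t, action_t, obs_next, reg_weight=1.0):
    """One training step: predict the next frame's embedding in latent space.

    Without an EMA target network or stop-gradient, a bare prediction loss
    can collapse to a constant latent. The variance term below is a
    stand-in regularizer; the paper's single hyperparameter presumably
    plays this role, but its exact form is an assumption here.
    """
    z_t, z_next = encoder(obs_t), encoder(obs_next)
    pred = predictor(z_t, action_t)
    pred_loss = F.mse_loss(pred, z_next)
    # Keep per-dimension latent variance near 1 so embeddings can't collapse.
    reg = ((z_next.std(dim=0) - 1.0) ** 2).mean()
    loss = pred_loss + reg_weight * reg
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

A toy invocation on random frames, just to show the expected shapes:

```python
enc, pred = TinyEncoder(), Predictor()
opt = torch.optim.Adam(list(enc.parameters()) + list(pred.parameters()), lr=3e-4)
obs_t = torch.randn(16, 3, 64, 64)     # batch of current frames
obs_next = torch.randn(16, 3, 64, 64)  # batch of next frames
acts = torch.randn(16, 4)              # actions taken between them
loss = jepa_step(enc, pred, opt, obs_t, acts, obs_next)
```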

Why It Matters

If a 15M-parameter pixel-native world model can plan 48× faster than foundation-model baselines at competitive accuracy, the case for JEPA-based architectures as the substrate for physical AI agents becomes significantly more concrete, and the approach becomes accessible to researchers without hyperscale compute budgets.
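
The planning speedup is plausible once you see where the compute goes: with a latent predictor this small, a planner can score hundreds of candidate action sequences without ever generating a pixel. Below is a generic random-shooting sketch of that idea, not the paper's planner; plan_action, goal_z, and the horizon and candidate counts are illustrative, and the encoder/predictor interface is the toy one from the sketch above.

```python
# Hedged sketch: random-shooting planning over latent rollouts. Generic
# technique for illustration; not LeWorldModel's actual planner.
import torch

@torch.no_grad()
def plan_action(encoder, predictor, obs, goal_z, action_dim=4,
                horizon=8, num_candidates=256):
    """Sample candidate action sequences, roll them out entirely in latent
    space, and return the first action of the sequence whose final latent
    lands closest to the goal. Cost scales with the tiny predictor, not
    with pixel-space generation."""
    z0 = encoder(obs.unsqueeze(0)).repeat(num_candidates, 1)
    actions = torch.randn(num_candidates, horizon, action_dim)
    z = z0
    for t in range(horizon):
        z = predictor(z, actions[:, t])   # one cheap latent step per action
    dists = (z - goal_z).norm(dim=-1)     # distance of each rollout to goal
    return actions[dists.argmin(), 0]     # MPC-style: execute first action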