talkie-lm Released: 13B LLM Trained Solely on Pre-1931 Text

Researchers Nick Levine, Alec Rad, and David Duvenaud have released talkie-lm — a 13B-parameter language model trained exclusively on text predating 1931 — as a research tool for probing LLM generalization. The model is on-device capable and exhibits behavior consistent with its training cutoff: it defends the luminiferous aether hypothesis, expresses distrust of special relativity, and produces a helpless response when asked to arrange a sushi delivery in Philadelphia. The central research question is whether a model trained before the existence of computers can nonetheless learn to code.
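A minimal sketch of how one might reproduce these probes, assuming the weights ship as a Hugging Face-style text-generation checkpoint; the talkie-lm/talkie-lm-13b identifier and the exact prompts below are placeholders for illustration, not details from the release:

```python
# Hypothetical smoke test of the cutoff-consistent behavior described above.
# The checkpoint identifier is a placeholder; the release does not name a model hub ID.
from transformers import pipeline

generate = pipeline(
    "text-generation",
    model="talkie-lm/talkie-lm-13b",  # placeholder, assumed Hugging Face-style ID
    device_map="auto",                # place the 13B model on available hardware
)

prompts = [
    "Is light carried by the luminiferous aether?",
    "What is your opinion of Einstein's theory of special relativity?",
    "Please arrange a sushi delivery to my address in Philadelphia.",
]

for p in prompts:
    out = generate(p, max_new_tokens=60, do_sample=False)
    print(out[0]["generated_text"], "\n---")
```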

Why It Matters

talkie-lm is a rare direct experiment on whether LLM reasoning is separable from world knowledge: does the capability to generalize come from language patterns, from knowledge content, or from their interaction? It also opens research threads into what in-context learning actually learns, independent of factual currency.
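One way to pull those threads apart, sketched under the same placeholder-checkpoint assumption as above, is an in-context probe built on an arbitrary invented mapping: a correct continuation can only come from generalizing the pattern in the prompt, not from recalling any fact, pre- or post-1931.

```python
# Hypothetical in-context learning probe. The few-shot pairs follow a made-up
# string-reversal rule, so no training-era knowledge pins down the answer.
from transformers import pipeline

generate = pipeline(
    "text-generation",
    model="talkie-lm/talkie-lm-13b",  # placeholder identifier, as above
    device_map="auto",
)

few_shot = (
    "gruk -> kurg\n"
    "blim -> milb\n"
    "trop -> port\n"
    "snad ->"
)

completion = generate(few_shot, max_new_tokens=4, do_sample=False)
print(completion[0]["generated_text"])  # continuation reflects in-context generalization, if any
```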