GGUF Ecosystem Hits 176K Models; Monthly Model Creation Nearly Doubled Since March
Hugging Face CEO Clément Delangue reports that the platform now hosts 176,000 public GGUF models (GGUF being the quantized-model file format used primarily for local inference with llama.cpp). Monthly new-model creation averaged 5,100 between October 2025 and February 2026, then jumped to 9,200 in March (roughly +80% over that average) and held at 9,700 in April, confirming a new baseline rather than a one-off spike. Drivers include a wave of new open-weight model releases being quantized immediately on release, llama.cpp infrastructure improvements, and automated quantization pipelines. Separately, Ethan Mollick reported observing roughly 10% local AI installation penetration in a room of senior accountants, i.e. non-tech professionals at a non-tech firm.
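The growth figures above can be sanity-checked with a few lines of arithmetic. This sketch uses only the numbers quoted in the article, taken at face value:

```python
# Figures as reported in the article.
baseline_avg = 5100   # avg monthly new GGUF models, Oct 2025 - Feb 2026
march = 9200          # new models in March 2026
april = 9700          # new models in April 2026

# March's jump relative to the preceding five-month average.
march_growth = march / baseline_avg - 1       # ~0.80, i.e. +80%

# April relative to the old baseline: the "nearly doubled" claim.
april_multiple = april / baseline_avg         # ~1.90x

print(f"March vs. baseline: +{march_growth:.0%}")
print(f"April vs. baseline: {april_multiple:.2f}x")
```

Running this prints roughly "+80%" and "1.90x", which supports the headline's "nearly doubled" characterization of the new monthly rate against the October–February baseline.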
Why It Matters
Independent signals arriving simultaneously from the supply side (monthly GGUF model creation nearly doubling) and the demand side (10% professional penetration in a non-tech sector) confirm that local AI deployment is no longer a developer-only phenomenon; it is becoming a mainstream enterprise pattern.