GLM 5.1: First Open-Source ~1 Trillion Parameter Frontier Model Ships
Z.ai (Zhipu AI) released GLM 5.1, described as the first open-source model at roughly the one-trillion-parameter scale; its FP16 weights total about 1.5TB. EXO Labs demonstrated the model running on a cluster of four Mac Studios (512GB of memory each) connected via RDMA-over-Thunderbolt, using freshly converted MLX 4-bit weights (~400GB) and generating approximately 20 tokens per second. The hardware cost, roughly $40,000 in consumer Apple Silicon, suggests that models of this scale are now within reach of teams without hyperscaler access.
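The sizes quoted above can be sanity-checked with back-of-the-envelope arithmetic: weight storage is roughly parameter count times bits per weight, divided by eight. The sketch below is illustrative only. The function names are hypothetical, decimal units (1TB = 10^12 bytes) are assumed, and the 4-bit figure ignores any per-group scale metadata a quantized format may store. Under those assumptions, the rounded figures imply a parameter count somewhat below one trillion, which fits the "approximately" in the description.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# Assumptions (not from the source): decimal units, and exactly 16 or 4 bits
# per weight with no quantization metadata overhead.

TB = 1e12  # bytes, decimal
GB = 1e9   # bytes, decimal


def weight_bytes(num_params: float, bits_per_weight: float) -> float:
    """Approximate storage needed for the weights alone."""
    return num_params * bits_per_weight / 8


def implied_params(total_bytes: float, bits_per_weight: float) -> float:
    """Parameter count implied by a given weight footprint."""
    return total_bytes * 8 / bits_per_weight


# Parameter counts implied by the two quoted weight sizes.
params_from_fp16 = implied_params(1.5 * TB, 16)  # from ~1.5TB at FP16
params_from_q4 = implied_params(400 * GB, 4)     # from ~400GB at 4-bit

# Cluster capacity and per-machine cost from the quoted hardware.
cluster_memory_gb = 4 * 512          # four Mac Studios, 512GB each
fits_in_cluster = 400 <= cluster_memory_gb
cost_per_machine = 40_000 / 4        # rough cost split evenly across machines

if __name__ == "__main__":
    print(f"Implied params (FP16 size):  {params_from_fp16 / 1e9:.0f}B")
    print(f"Implied params (4-bit size): {params_from_q4 / 1e9:.0f}B")
    print(f"Cluster memory: {cluster_memory_gb}GB, 4-bit weights fit: {fits_in_cluster}")
    print(f"Approximate cost per machine: ${cost_per_machine:,.0f}")
```

The ~400GB of 4-bit weights use only about a fifth of the cluster's 2,048GB of combined memory. The remainder is available for the KV cache, activations, and the operating system, although the source does not say how that memory was actually allocated.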
Why It Matters
An open-source model at roughly trillion-parameter scale that runs on a $40,000 cluster of consumer hardware puts frontier-class AI within reach of teams outside Big Tech. EXO's RDMA-over-Thunderbolt demo shows that multi-Mac clustering can work as an inference architecture for large open-source models, at least at the roughly 20 tokens per second reported here.