Google Gemma 4 E2B/E4B Enables Agent Skills on Edge Devices via LiteRT-LM
Google's Gemma 4 E2B and E4B are the first models to bundle function calling and thinking within a 2–4B effective-parameter footprint. Presented at AI Engineer by Google AI Edge tech lead Cormac Brick, the models run agent skills on Android, iOS, macOS, Linux, Windows, and IoT via the open-source LiteRT-LM runtime, using progressive-disclosure skill loading to preserve reasoning quality in constrained contexts. The models are Apache 2.0 licensed, and the companion Google AI Edge Gallery app is open source.
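Progressive disclosure means only a compact index of skill names and summaries occupies the model's base context, with a skill's full instructions injected on demand once the model selects it. A minimal sketch of the pattern (the skill names, summaries, and helper functions here are illustrative assumptions, not the LiteRT-LM API):

```python
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    summary: str        # always in context: one cheap line per skill
    instructions: str   # loaded only when the model selects this skill

# Hypothetical example skills for illustration.
SKILLS = {
    "unit_convert": Skill(
        "unit_convert",
        "Convert values between measurement units.",
        "Parse the value and source/target units, apply the conversion "
        "factor, and return the result with the target unit attached.",
    ),
    "summarize": Skill(
        "summarize",
        "Summarize a document in a few sentences.",
        "Read the document, extract the key claims, and compress them "
        "into at most three sentences in the document's own register.",
    ),
}

def base_prompt() -> str:
    """Compact skill index shown to the model on every turn."""
    lines = [f"- {s.name}: {s.summary}" for s in SKILLS.values()]
    return "Available skills:\n" + "\n".join(lines)

def expand_skill(name: str) -> str:
    """Full instructions, injected only after the model picks a skill."""
    return SKILLS[name].instructions
```

The payoff on small-context edge models: the always-present index stays a few dozen tokens regardless of how many skills are registered, so reasoning quality is not crowded out by instructions the current task never uses.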
Why It Matters
On-device agentic AI just crossed a meaningful capability threshold. Models in the 2–4B range now support real skill workflows, not just summarization, enabling privacy-preserving, low-latency agent tasks without cloud calls. NPU acceleration on Qualcomm hardware delivers roughly 10x CPU throughput.
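Function calling is what turns these local models into agents: the model emits a structured tool-call request, the host app executes it, and the result is fed back. A minimal dispatch loop under assumed conventions (the JSON call format and tool names are hypothetical, not the LiteRT-LM or Gemma wire format):

```python
import json

def get_battery_level() -> str:
    # Stand-in for a real on-device query, e.g. an Android BatteryManager call.
    return "87%"

# Registry mapping tool names the model may emit to local implementations.
TOOLS = {"get_battery_level": get_battery_level}

def handle_model_output(output: str) -> str:
    """If the model emitted a JSON tool call, execute it locally;
    otherwise pass the plain-text response through unchanged."""
    try:
        call = json.loads(output)
    except json.JSONDecodeError:
        return output  # ordinary text response, no tool call
    fn = TOOLS[call["name"]]
    return fn(**call.get("arguments", {}))
```

Because both the model and the tools run on-device, the entire request/execute/respond loop completes without any network round trip, which is where the privacy and latency benefits come from.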