Google Gemma 4 Runs Natively on iPhone with Full Offline AI Inference

Published: April 15, 2026 at 01:19 AM EDT

Source: Hacker News

Overview

On‑device AI has been a talking point for years, but Google’s latest move makes it harder to dismiss. Gemma 4, Google’s open‑source model family, now runs directly on iPhones with full local inference and offline capability. Edge AI deployment is no longer a future priority; it is happening now.

Benchmark Comparison

Early benchmarks place the 31B‑parameter variant of Gemma 4 alongside Qwen 3.5’s 27B model. The matchup is reasonably close, even though Gemma carries roughly 4B more parameters. Each model has trade‑offs, and neither wins on every task.

Model Variants for Mobile

The more compelling story is the smaller variants, E2B and E4B. These are clearly engineered for mobile deployment, prioritizing efficiency over raw capability. Google’s own app nudges users toward the E2B variant because it is faster, lighter, and better suited to real‑world on‑device conditions, where memory and thermal limits matter.
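To make the memory constraint concrete, here is a rough back‑of‑the‑envelope sketch in Python. It estimates only the space needed to hold the weights (parameters × bits per weight ÷ 8) and ignores the KV cache, activations, and runtime overhead, so real usage would be higher. The 31B figure comes from the benchmark comparison above; the 2B and 4B figures are assumptions read off the variant names E2B and E4B, not confirmed parameter counts, and the 4‑bit default is likewise an illustrative assumption rather than the format the app actually ships.

```python
# Rough weight-memory estimate: parameters x bits per weight / 8 bytes.
# Ignores KV cache, activations, and runtime overhead.

GIB = 1024 ** 3


def weight_memory_gib(params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the model weights, in GiB."""
    return params * bits_per_weight / 8 / GIB


# 31e9 is taken from the article. The 2e9 and 4e9 values are ASSUMED from the
# variant names "E2B" and "E4B"; they are not confirmed parameter counts.
MODELS = {
    "Gemma 4 31B": 31e9,
    "E4B (assumed ~4e9 params)": 4e9,
    "E2B (assumed ~2e9 params)": 2e9,
}


def footprint_table(bits_per_weight: int = 4) -> dict[str, float]:
    """Weight footprint per model, rounded to two decimals (GiB)."""
    return {
        name: round(weight_memory_gib(params, bits_per_weight), 2)
        for name, params in MODELS.items()
    }


if __name__ == "__main__":
    for name, gib in footprint_table().items():
        print(f"{name}: ~{gib} GiB of weights at 4-bit")
```

Under these assumptions the small variants need on the order of one to two GiB for weights, while the 31B model needs well over ten, which is consistent with the app steering phone users toward E2B.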

Getting Started

  1. Download the Google AI Edge Gallery from the App Store.
  2. Open the app, select your preferred model variant, and start running inference directly on your device. No API calls. No cloud dependency.

The app includes:

  • Text interface for prompt‑based generation.
  • Integrated image recognition and voice interaction.
  • Extensible Skills framework, positioning the app as a platform for on‑device AI experimentation rather than a simple demo.

Technical Details

Gemma 4 routes inference through the iPhone’s GPU. In practice, responses arrive with notably low latency, indicating that consumer hardware can sustain this class of workload without visible performance degradation. This low‑latency, offline capability is a strong argument for the commercial viability of local AI deployment.
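A latency claim like this is usually backed by two numbers: time to first token and sustained tokens per second. The following Python sketch shows one way such a measurement could be taken over any streaming token source. It is not the app's own instrumentation; `fake_generate` is a hypothetical stand‑in for a real on‑device model, and `measure_stream` is an illustrative helper name.

```python
import time
from typing import Callable, Iterable, Iterator


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a token stream; report time-to-first-token and tokens/second."""
    start = clock()
    first = None
    count = 0
    for _ in tokens:
        now = clock()
        if first is None:
            first = now  # timestamp of the first token to arrive
        count += 1
    total = clock() - start
    return {
        "tokens": count,
        "time_to_first_token_s": (first - start) if first is not None else None,
        "tokens_per_second": count / total if total > 0 else 0.0,
    }


def fake_generate(prompt: str, delay_s: float = 0.01) -> Iterator[str]:
    """HYPOTHETICAL stand-in for a model: yields words with a fixed delay."""
    for word in f"echo: {prompt}".split():
        time.sleep(delay_s)
        yield word


if __name__ == "__main__":
    print(measure_stream(fake_generate("hello from the device")))
```

Swapping `fake_generate` for a real streaming inference call would give comparable figures across devices and model variants, including after the phone has warmed up, which is where thermal throttling would show.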

Implications for Enterprise

Offline capability changes the calculus for enterprise use cases such as:

  • Field applications where connectivity is unreliable.
  • Healthcare settings with strict data‑privacy requirements.
  • Any scenario where sending data to the cloud is prohibited.
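The scenarios above reduce to a simple routing rule: if data may not leave the device, or there is no connection, inference has to run locally. The Python sketch below illustrates that rule. The `Request` type and `choose_backend` function are hypothetical names introduced for illustration and are not part of any Google API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    online: bool                 # is a network connection currently available?
    data_may_leave_device: bool  # does policy allow sending this data off-device?


def choose_backend(req: Request) -> str:
    """Return "local" or "cloud" for a request (hypothetical policy)."""
    if not req.data_may_leave_device:
        return "local"  # the privacy rule wins regardless of connectivity
    if not req.online:
        return "local"  # no connectivity: on-device is the only option
    return "cloud"      # otherwise a hosted model is permitted


if __name__ == "__main__":
    print(choose_backend(Request(online=True, data_may_leave_device=False)))
```

Before capable on‑device models, the two "local" branches had no good answer; that is the change in calculus the list describes.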

Conclusion

Gemma 4 on iPhone is more than a technical proof‑of‑concept; it’s a clear signal that the on‑device AI era has arrived. For Google, the Gemma genie is definitely out of the bottle.
