Accelerating on-device AI: A look at Arm and Google AI Edge optimization

Published: May 26, 2026 at 03:35 AM EDT

Source: Google Developers Blog

Integration Overview

The integration of Arm Scalable Matrix Extension 2 (SME2) with the Google AI Edge software stack turns the CPU into a powerful matrix‑compute accelerator, enabling high‑performance, on‑device generative AI.
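As a rough illustration of what matrix-compute acceleration means here, the sketch below computes a matrix product as a running sum of outer products accumulated into a tile. Arm's matrix extensions are built around this outer-product-accumulate pattern. The function name `matmul_outer_product` and the pure-Python lists are assumptions made for illustration; this is not how LiteRT, XNNPACK, or KleidiAI implement their kernels.

```python
def matmul_outer_product(a, b):
    """Compute a @ b by accumulating one outer product per step of the shared dimension."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions of a and b do not match")

    # The accumulator stands in for a tile that stays resident while updates stream in.
    tile = [[0.0] * cols for _ in range(rows)]
    for k in range(inner):
        column = [a[i][k] for i in range(rows)]  # k-th column of a
        row = b[k]                               # k-th row of b
        for i in range(rows):
            for j in range(cols):
                tile[i][j] += column[i] * row[j]  # rank-1 (outer-product) update
    return tile


if __name__ == "__main__":
    print(matmul_outer_product([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]))
```

Each pass of the outer loop touches every element of the tile once, which is the kind of work a hardware matrix unit can do in a single instruction rather than in nested scalar loops.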

Convert, Optimize, and Deploy Pipeline

Using Stability AI’s stable-audio-open-small model as a case study, the pipeline follows three main steps:

  1. Convert – Transform the model into a format compatible with the Edge stack.
  2. Optimize – Apply acceleration tools such as LiteRT, XNNPACK, and KleidiAI to generate hardware‑specific kernels.
  3. Deploy – Run the optimized model on Arm‑powered devices.
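The three steps above can be sketched in Python as follows. This is a minimal sketch, not the pipeline from the case study: it assumes the `ai-edge-torch` and `ai-edge-litert` packages are installed, that the model is available as a PyTorch module, and that inputs are passed as NumPy arrays. The helper names `convert_to_litert` and `run_litert` are hypothetical. The optimize step has no explicit call here: the sketch assumes the runtime's default CPU path applies XNNPACK, and that KleidiAI kernels are selected there when the hardware supports them.

```python
def convert_to_litert(module, sample_args, out_path):
    """Convert: trace a PyTorch module and write a LiteRT (.tflite) model file."""
    import ai_edge_torch  # assumption: the ai-edge-torch package is installed

    # sample_args is a tuple of example input tensors used to trace the module.
    edge_model = ai_edge_torch.convert(module.eval(), sample_args)
    edge_model.export(out_path)
    return out_path


def run_litert(model_path, inputs, num_threads=4):
    """Deploy: run the converted model on the CPU with the LiteRT interpreter."""
    # Assumption: the ai-edge-litert package is installed.
    from ai_edge_litert.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path, num_threads=num_threads)
    interpreter.allocate_tensors()

    # Feed each input array to the matching input tensor, in declaration order.
    for detail, value in zip(interpreter.get_input_details(), inputs):
        interpreter.set_tensor(detail["index"], value)

    interpreter.invoke()
    return [interpreter.get_tensor(d["index"]) for d in interpreter.get_output_details()]
```

A larger model such as the one in the case study would likely be converted one submodule at a time, with `convert_to_litert` called once per submodule and `run_litert` chained across the resulting files.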

Performance Results

  • Speedup: Faster audio generation.
  • Memory reduction: Lower memory usage.
  • Quality: High audio quality is maintained on Arm‑based mobile devices and laptops.