Accelerating on-device AI: A look at Arm and Google AI Edge optimization
Source: Google Developers Blog
Integration Overview
Integrating Arm Scalable Matrix Extension 2 (SME2) with the Google AI Edge software stack lets the CPU itself act as a matrix‑compute accelerator, enabling high‑performance, on‑device generative AI.
Convert, Optimize, and Deploy Pipeline
Using Stability AI’s stable-audio-open-small model as a case study, the pipeline follows three main steps:
- Convert – Transform the model into a format compatible with the Edge stack.
- Optimize – Apply acceleration through LiteRT, XNNPACK, and KleidiAI so that the model runs on hardware‑specific kernels.
- Deploy – Run the optimized model on Arm‑powered devices.
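The convert and deploy steps above might look roughly like the following sketch. It assumes the `ai-edge-torch` and `ai-edge-litert` Python packages, which this summary does not name explicitly, and it uses a tiny stand-in network rather than stable-audio-open-small. The file name, layer sizes, and thread count are hypothetical. Nothing here selects SME2 or KleidiAI kernels explicitly; whether they are used depends on how the runtime was built and on the device.

```python
MODEL_PATH = "tiny_block.tflite"  # hypothetical output file name


def convert_to_litert(path=MODEL_PATH):
    """Convert a small stand-in PyTorch module to a .tflite file."""
    # Deferred imports: the conversion toolchain is only needed on the
    # development machine, not on the device that runs the model.
    import torch
    import ai_edge_torch

    # Stand-in network (assumption). The case study converts the Stable
    # Audio submodules, which are not reproduced here.
    model = torch.nn.Sequential(
        torch.nn.Linear(64, 128),
        torch.nn.GELU(),
        torch.nn.Linear(128, 64),
    ).eval()
    sample_inputs = (torch.randn(1, 64),)

    # Trace the module with the sample inputs and write the flatbuffer.
    edge_model = ai_edge_torch.convert(model, sample_inputs)
    edge_model.export(path)
    return path


def run_litert(path=MODEL_PATH, num_threads=4):
    """Run one inference on the CPU with the LiteRT interpreter."""
    import numpy as np
    from ai_edge_litert.interpreter import Interpreter

    interpreter = Interpreter(model_path=path, num_threads=num_threads)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Random input matching the shape and dtype the model expects.
    x = np.random.rand(*inp["shape"]).astype(inp["dtype"])
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    return interpreter.get_tensor(out["index"])
```

With both packages installed, `run_litert(convert_to_litert())` would convert the stand-in model and return one output array from the interpreter.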
Performance Results
- Speedup: Over 2× faster audio generation.
- Memory reduction: 4× lower memory usage.
- Quality: High audio quality is maintained on Arm‑based mobile devices and laptops.
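Figures like the ones above are ratios between a baseline and an optimized run. A minimal sketch of how such ratios could be measured is shown here; the helper names are hypothetical, and this is not the measurement method used in the source, which the summary does not describe.

```python
import statistics
import time


def median_latency(fn, repeats=5, warmup=1):
    """Return the median wall-clock time of fn() in seconds."""
    # Warm-up runs are discarded so one-time setup cost is not counted.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def ratio(baseline, optimized):
    """Return how many times smaller the optimized value is."""
    if optimized <= 0:
        raise ValueError("optimized value must be positive")
    return baseline / optimized
```

The same `ratio` helper applies to latency in seconds and to peak memory in bytes: a result of 2.0 corresponds to "2× faster", and 4.0 to "4× lower memory usage".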