[Paper] Batch Denoising for AIGC Service Provisioning in Wireless Edge Networks

Published: November 24, 2025 at 09:24 PM EST
3 min read
Source: arXiv
Overview

The paper tackles a pressing challenge for next‑generation mobile services: delivering high‑quality AI‑generated content (AIGC) such as images from edge servers to users within strict latency budgets. By introducing a batch‑denoising technique and jointly optimizing generation and transmission, the authors show how to boost perceived quality while respecting end‑to‑end delay constraints in wireless edge networks.

Key Contributions

  • Batch denoising framework – Groups denoising steps of diffusion‑based image generators into batches to exploit parallelism on edge GPUs, cutting per‑step latency.
  • STACKING algorithm – A low‑complexity, model‑agnostic optimizer that decides how many denoising steps to batch together, leveraging the insight that early steps matter more for final image quality.
  • Joint generation‑transmission optimization – Extends the batch solution to allocate wireless bandwidth among concurrent AIGC requests, maximizing average service quality under a shared delay budget.
  • Extensive simulations – Demonstrates up to 30 % quality improvement (measured by FID/PSNR) and 20 % latency reduction compared with baseline sequential denoising and naïve bandwidth allocation.
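
The core latency argument behind batch denoising can be sketched with a toy cost model: executing k denoising steps as one GPU batch amortizes launch overhead, so a batched schedule finishes well before the sequential one. All constants below are illustrative assumptions, not figures from the paper.

```python
# Hypothetical latency model: running k denoising steps as one GPU batch
# costs less than k sequential steps (parallelism amortizes kernel overhead).
SEQ_STEP_MS = 8.0        # assumed time for one sequential denoising step
BATCH_OVERHEAD_MS = 2.0  # assumed fixed launch overhead per batch
PARALLEL_STEP_MS = 3.0   # assumed marginal time per step inside a batch

def batch_latency_ms(batch_size: int) -> float:
    """Latency of executing `batch_size` denoising steps as one batch."""
    return BATCH_OVERHEAD_MS + PARALLEL_STEP_MS * batch_size

def schedule_latency_ms(batch_sizes: list[int]) -> float:
    """Total generation latency for a batch schedule covering all steps."""
    return sum(batch_latency_ms(b) for b in batch_sizes)

# 20 steps sequentially vs. grouped into four batches of 5:
sequential = 20 * SEQ_STEP_MS                 # 160.0 ms
batched = schedule_latency_ms([5, 5, 5, 5])   # 4 * (2 + 15) = 68.0 ms
```

Under this model the trade-off is clear: larger batches cut latency but, per the paper's profiling, hurt quality most when applied to the early steps.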

Methodology

  1. Problem formulation – The authors model AIGC service as two coupled stages:

    • Content generation on an edge server using a diffusion model (multiple denoising steps)
    • Content transmission over a wireless link

    The objective is to maximize the average quality of generated images while keeping total latency (generation + transmission) below a preset threshold.

  2. Batch denoising insight – Empirical profiling shows that denoising steps can be executed in parallel on modern GPUs if grouped, and that the first few steps have a disproportionate impact on the final image.

  3. STACKING algorithm

    • Takes the total number of denoising steps T and a delay budget D.
    • Iteratively decides batch sizes, giving larger batches to later steps (where quality sensitivity is lower) and smaller batches to early steps.
    • Uses a simple greedy search that runs in O(T) time and does not require the explicit form of the quality function (e.g., FID, PSNR).
  4. Bandwidth allocation – With the optimal batch schedule fixed, the remaining problem reduces to a convex resource‑allocation task: distribute the available wireless bandwidth among simultaneous AIGC sessions to meet their individual delay constraints while maximizing the weighted sum of qualities. Standard convex solvers (e.g., interior‑point) are employed.
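
The greedy idea in step 3 can be sketched as follows. This is a minimal illustration in the spirit of STACKING, not the paper's exact algorithm: it starts fully sequential (best quality) and merges batches from the tail, where quality sensitivity is lowest, until the schedule fits the delay budget. The latency model passed in is an assumption.

```python
def stacking_schedule(total_steps: int, delay_budget_ms: float,
                      batch_latency_ms) -> list[int]:
    """Greedy batch schedule (illustrative sketch of the STACKING idea).

    Begins with one step per batch, then repeatedly merges the two last
    batches -- i.e., grows batch sizes at the *later* steps, where image
    quality is least sensitive -- until total generation latency fits
    within `delay_budget_ms`.
    """
    batches = [1] * total_steps  # one step per batch: slowest, best quality

    def total(bs: list[int]) -> float:
        return sum(batch_latency_ms(b) for b in bs)

    while len(batches) > 1 and total(batches) > delay_budget_ms:
        last = batches.pop()     # merge the two trailing batches
        batches[-1] += last
    return batches

# Example with an assumed affine per-batch cost of 2 + 3*k ms:
schedule = stacking_schedule(20, 80.0, lambda b: 2.0 + 3.0 * b)
# Early steps stay unbatched; the tail is merged into one large batch.
```

Each merge removes one batch overhead, so the loop performs at most T-1 merges, matching the O(T) complexity claimed for the algorithm.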
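
The bandwidth-allocation step can be illustrated with a simplified linear-rate model (rate = bandwidth x spectral efficiency); the paper instead solves a convex program with standard solvers, so this is only a feasibility-plus-weighted-surplus sketch, with all field names and parameters hypothetical.

```python
def allocate_bandwidth(total_bw_hz: float, sessions: list[dict],
                       spectral_eff: float = 2.0) -> list[float]:
    """Split bandwidth across concurrent AIGC sessions (simplified sketch).

    Assumes a linear rate model: rate = bandwidth * spectral_eff (bit/s/Hz).
    Each session dict holds {'bits': payload size, 'deadline_s': remaining
    delay after generation, 'weight': quality weight}. Every session first
    receives the minimum bandwidth that meets its deadline; any surplus is
    shared in proportion to the quality weights.
    """
    minimum = [s['bits'] / (s['deadline_s'] * spectral_eff) for s in sessions]
    surplus = total_bw_hz - sum(minimum)
    if surplus < 0:
        raise ValueError("delay budgets infeasible at this total bandwidth")
    total_weight = sum(s['weight'] for s in sessions)
    return [m + surplus * s['weight'] / total_weight
            for m, s in zip(minimum, sessions)]
```

In the paper's formulation the per-session quality is a concave function of allocated resources, which is what makes the reduced problem convex and solvable by interior-point methods.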

Results & Findings

| Metric | Baseline (sequential) | Naïve bandwidth split | Proposed STACKING + joint allocation |
|---|---|---|---|
| Avg. image quality (FID ↓) | 45.2 | 43.8 | 31.7 |
| Avg. latency (ms) | 210 | 190 | 165 |
| Computational overhead (CPU % per request) | 12 % | 10 % | 8 % |
  • Quality gains stem mainly from keeping early‑step batches small, preserving the denoising phases that influence the final image most.
  • Latency reductions are achieved by parallel GPU execution and smarter bandwidth sharing, keeping the total service time within the target (e.g., 200 ms for interactive AR).
  • The algorithm scales linearly with the number of concurrent users, making it suitable for dense edge deployments.

Practical Implications

  • Edge AI platforms (e.g., NVIDIA Jetson, AMD Instinct) can integrate batch‑denoising kernels to squeeze extra throughput without hardware upgrades.
  • Mobile app developers building real‑time AI photo filters, AR overlays, or on‑device content synthesis can rely on edge servers that meet sub‑200 ms response times, improving user experience.
  • Network operators can embed the joint allocation logic into their MEC (Multi‑Access Edge Computing) orchestration layers, automatically adjusting radio resources for AIGC workloads based on current load and QoS targets.
  • Cost efficiency – By reducing per‑request GPU time, providers can serve more users per edge node, lowering CAPEX/OPEX for AI services.

Limitations & Future Work

  • The study focuses on image diffusion models; extending batch denoising to large language models or video generation may require different parallelism strategies.
  • Channel variability (fast fading, mobility) is abstracted as a static bandwidth pool; incorporating stochastic wireless dynamics could refine the allocation step.
  • Real‑world deployment would need hardware‑specific profiling to validate that the assumed parallel speed‑up holds across diverse edge devices.
  • Future research directions include adaptive batch sizing based on runtime quality feedback and joint optimization with edge caching for repeated content requests.

Authors

  • Jinghang Xu
  • Kun Guo
  • Wei Teng
  • Chenxi Liu
  • Wei Feng

Paper Information

  • arXiv ID: 2511.19847v1
  • Categories: cs.DC
  • Published: November 25, 2025
  • PDF: Download PDF