[Paper] Deep Reinforcement Learning-driven Edge Offloading for Latency-constrained XR pipelines
Source: arXiv - 2603.16823v1
Overview
Extended reality (XR) applications (AR glasses, VR headsets, mixed-reality collaboration tools) must render frames within a strict motion-to-photon budget of roughly 20 ms to avoid motion sickness, all while running on battery-limited wearables. This paper proposes a battery-aware edge-offloading framework that decides, in real time, whether an XR workload should be processed locally or sent to a nearby edge server. A lightweight deep-reinforcement-learning (DRL) controller continuously balances latency constraints against battery consumption, delivering smoother user experiences without draining the device.
Key Contributions
- Joint latency‑energy model that captures motion‑to‑photon (MTP) latency, workload quality, and battery dynamics in a single decision‑making objective.
- Online DRL policy (≈ 0.5 ms inference cost) that adapts execution placement on‑the‑fly under varying network bandwidth and device power states.
- Battery‑life extension of up to 163 % compared with a latency‑optimal local‑only baseline, while keeping ≥ 90 % of frames within the MTP latency budget in stable networks.
- Robustness to network degradation: compliance stays above 80 % even when bandwidth is severely limited.
- Extensive experimental validation on a prototype XR pipeline (camera capture → SLAM → rendering) using commodity edge hardware and off‑the‑shelf headsets.
Methodology
- System Model – The XR pipeline is split into three stages: sensor capture, compute‑heavy perception (e.g., SLAM, AI‑based object detection), and rendering. Each stage can run locally or be offloaded to an edge node.
- Latency‑Energy Objective – The authors formulate a cost function that penalizes missed MTP deadlines and battery drain, weighted by user‑defined preferences (e.g., “favor battery” vs. “favor latency”).
- State Representation – The DRL agent observes a compact state vector: current battery level, recent MTP latency, estimated network throughput, and workload size.
- Action Space – Two actions: Local (process everything on device) or Offload (send compute‑intensive stages to edge).
- Learning Algorithm – A lightweight Deep Q‑Network (DQN) with a few fully‑connected layers is trained online using experience replay. The reward reflects the objective function, encouraging actions that keep latency under the 20 ms MTP threshold while preserving battery.
- Implementation – The policy runs on the XR device’s CPU (≈ 2 % utilization) and communicates with an edge server over Wi‑Fi or 5G. The edge node executes the offloaded workload in a containerized environment to keep latency predictable.
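One plausible form of the per-frame cost implied by the Latency-Energy Objective above, written with our own symbols (the paper does not publish its exact formula), is:

$$
C_t = w_\ell \cdot \max(0,\; \ell_t - \ell_{\text{MTP}}) \;+\; w_e \cdot \Delta E_t
$$

where $\ell_t$ is the measured motion-to-photon latency of frame $t$, $\ell_{\text{MTP}} = 20$ ms is the MTP deadline, $\Delta E_t$ is the battery energy drained during the frame, and $w_\ell, w_e$ are the user-defined preference weights ("favor latency" vs. "favor battery"). The DRL reward would then be $r_t = -C_t$, so the agent is rewarded for keeping frames under deadline at low energy cost.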
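The decision loop described in the Methodology bullets can be sketched compactly. The following is a minimal illustration, not the paper's implementation: it uses a toy linear Q-function in place of the paper's small fully-connected DQN (and omits experience replay), and the weights, state values, and energy units are assumed for illustration. Only the 20 ms MTP threshold, the four-element state vector, and the binary Local/Offload action space come from the paper.

```python
import random

# Assumed constants -- only MTP_BUDGET_MS is stated in the paper;
# the preference weights are user-defined and illustrative here.
MTP_BUDGET_MS = 20.0   # motion-to-photon deadline from the paper
W_LATENCY = 1.0        # assumed penalty weight for missed deadlines
W_ENERGY = 0.5         # assumed penalty weight for energy drain

LOCAL, OFFLOAD = 0, 1  # the paper's binary action space

def reward(latency_ms, energy_cost):
    """Negative cost: penalize missed MTP deadlines and battery drain."""
    deadline_miss = max(0.0, latency_ms - MTP_BUDGET_MS)
    return -(W_LATENCY * deadline_miss + W_ENERGY * energy_cost)

class LinearQ:
    """Toy linear Q-function standing in for the paper's small DQN."""
    def __init__(self, n_features=4, n_actions=2, lr=0.01):
        self.w = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr

    def q(self, state, action):
        return sum(wi * si for wi, si in zip(self.w[action], state))

    def act(self, state, epsilon=0.1):
        if random.random() < epsilon:                      # explore
            return random.randrange(len(self.w))
        qs = [self.q(state, a) for a in range(len(self.w))]
        return qs.index(max(qs))                           # exploit

    def update(self, state, action, r, next_state, gamma=0.9):
        # One-step TD update toward r + gamma * max_a' Q(s', a').
        target = r + gamma * max(self.q(next_state, a)
                                 for a in range(len(self.w)))
        td_error = target - self.q(state, action)
        self.w[action] = [wi + self.lr * td_error * si
                          for wi, si in zip(self.w[action], state)]

# State vector from the paper: battery level, recent MTP latency (ms),
# estimated throughput (Mbps), workload size -- values are illustrative.
state = [0.8, 15.0, 25.0, 1.2]
policy = LinearQ()
action = policy.act(state, epsilon=0.0)   # greedy: LOCAL or OFFLOAD
policy.update(state, action, reward(18.0, 4.0), state)
```

In the real system this loop would run once per scheduling decision (the paper reports < 0.5 ms inference cost), with the observed latency and energy of the executed frame fed back as the reward.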
Results & Findings
| Scenario | Battery Lifetime (hrs) | MTP Compliance (%) |
|---|---|---|
| Local‑only (latency‑optimal) | 1.0 (baseline) | 95 |
| Proposed DRL‑offload (stable Wi‑Fi) | 2.63 (+163 %) | 92 |
| Proposed DRL‑offload (5 Mbps limit) | 2.1 | 84 |
| Heuristic offload (static rule) | 1.7 | 78 |
- Latency: The DRL policy keeps average MTP latency under 20 ms in > 90 % of frames when bandwidth ≥ 10 Mbps; degradation is graceful as bandwidth drops.
- Overhead: Policy inference adds < 0.5 ms per decision, negligible compared to the XR frame budget.
- Adaptivity: When the battery dips below 20 %, the agent automatically shifts to more local processing to avoid sudden shutdowns, demonstrating closed‑loop energy awareness.
Practical Implications
- Longer Field Sessions: AR/VR developers can ship devices that stay operational for 2–3 × longer without sacrificing interactive smoothness—critical for enterprise training, remote assistance, or gaming marathons.
- Network‑Aware Apps: The DRL controller can be embedded in SDKs (e.g., Unity, Unreal) to let apps automatically adapt to Wi‑Fi/5G fluctuations, reducing the need for manual QoS tuning.
- Edge‑First Architecture: Service providers can design lightweight edge functions (SLAM, AI inference) knowing that a smart offloading layer will keep latency guarantees, making edge compute a viable alternative to on‑device accelerators.
- Battery‑Centric UX Metrics: Product managers now have a concrete metric (battery‑latency trade‑off) to benchmark XR experiences, moving beyond “average FPS” or “peak power” alone.
Limitations & Future Work
- Simplified Action Space – The current binary decision (local vs. offload) does not explore partial offloading (e.g., offload only SLAM but render locally).
- Network Model – Experiments focus on Wi‑Fi and a single 5G slice; more heterogeneous networks (cellular handover, congested edge) could affect stability.
- Generalization – The DRL policy is trained on a specific XR pipeline; transferring to drastically different workloads (e.g., volumetric video) may require retraining or meta‑learning techniques.
- Security & Privacy – Offloading raw sensor data raises privacy concerns that the paper does not address; future work could integrate encrypted inference or on‑device preprocessing.
Overall, the paper demonstrates that a modest DRL‑based offloading engine can dramatically stretch battery life while keeping XR latency within human‑perceptible bounds, paving the way for more immersive, untethered experiences.
Authors
- Sourya Saha
- Saptarshi Debroy
Paper Information
- arXiv ID: 2603.16823v1
- Categories: cs.CV
- Published: March 17, 2026