[Paper] CABLE: Cloud-Assisted Bandwidth-efficient LMM-based Encoding for V2X Systems
Source: arXiv - 2606.19258v1
Overview
Cloud-hosted large multimodal models (LMMs) can provide strong open-vocabulary perception for Vehicle-to-Everything (V2X) systems, but naively transmitting full-resolution frames from edge to cloud causes severe communication overhead and high cloud-side prefill latency. We present CABLE, a cloud-assisted bandwidth-efficient LMM-based encoding framework for edge-cloud perception. CABLE propagates the previous cloud segmentation mask on the edge using ego-motion compensation, refines it with residual-motion cues, and consolidates disconnected regions via a corridor envelope to form a robust region of interest (ROI). Only ROI-masked images are uploaded, while the cloud segmentation output is fed back as the prior for the next frame, forming a mask-to-ROI-to-LMM feedback loop. Experiments on five datasets (nuScenes, WOD-ZB, Waymo, KITTI, and CADC) show consistent communication savings while largely preserving perception, achieving $73$ to $87\%$ ROI pixel-coverage reduction with $5$ to $8\times$ estimated LMM prefill speedup at a modest detection-quality trade-off relative to full-frame inference.
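The edge-side steps described above (propagate the previous mask, refine it, consolidate regions, mask the image) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the function names are hypothetical, ego-motion compensation is reduced to an integer pixel shift, the residual-motion cue is assumed to arrive as a precomputed boolean map, and the corridor envelope is replaced by a plain bounding box around all active pixels.

```python
import numpy as np

def propagate_mask(prev_mask, shift_xy):
    """Shift the previous mask by (dx, dy) pixels.
    Assumption: a pure translation stands in for ego-motion compensation."""
    dx, dy = shift_xy
    h, w = prev_mask.shape
    out = np.zeros_like(prev_mask)
    # Source window that stays inside the frame after shifting.
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    if y0 < y1 and x0 < x1:
        out[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = prev_mask[y0:y1, x0:x1]
    return out

def refine_with_residual(mask, residual):
    """Add pixels flagged by a residual-motion map (assumed precomputed)."""
    return mask | residual

def envelope(mask):
    """Merge disconnected regions into one box.
    Assumption: a bounding box stands in for the paper's corridor envelope."""
    out = np.zeros_like(mask)
    ys, xs = np.nonzero(mask)
    if ys.size:
        out[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = True
    return out

def edge_step(prev_mask, shift_xy, residual, frame):
    """One edge-side iteration: build the ROI and zero out everything else."""
    roi = envelope(refine_with_residual(propagate_mask(prev_mask, shift_xy), residual))
    masked = frame.copy()
    masked[~roi] = 0  # only ROI pixels survive for upload
    reduction = 1.0 - roi.mean()  # fraction of pixels removed
    return roi, masked, reduction
```

In a loop over frames, the segmentation mask returned by the cloud for the uploaded `masked` image would be passed back in as `prev_mask` for the next call, which mirrors the mask-to-ROI-to-LMM feedback loop in the abstract. The `reduction` value corresponds to the pixel-coverage reduction metric only in this simplified sense.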
Key Contributions
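Based on the abstract, the paper's main contributions are:
- An edge-side ROI construction pipeline that propagates the previous cloud segmentation mask with ego-motion compensation, refines it with residual-motion cues, and consolidates disconnected regions via a corridor envelope.
- A mask-to-ROI-to-LMM feedback loop in which only ROI-masked images are uploaded and the cloud segmentation output serves as the prior for the next frame.
- An evaluation on five datasets (nuScenes, WOD-ZB, Waymo, KITTI, and CADC) reporting $73$ to $87\%$ ROI pixel-coverage reduction and $5$ to $8\times$ estimated LMM prefill speedup at a modest detection-quality trade-off.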
Methodology
Please refer to the full paper for detailed methodology.
Practical Implications
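This research targets edge-cloud perception for V2X systems: by uploading only ROI-masked images instead of full-resolution frames, it aims to reduce both edge-to-cloud communication overhead and cloud-side LMM prefill latency, at a modest cost in detection quality.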
Authors
- Haohua Que
- Zhipeng Bao
- Qianyi Wu
- Handong Yao
Paper Information
- arXiv ID: 2606.19258v1
- Categories: cs.CV, cs.RO
- Published: June 17, 2026