[Paper] RedunCut: Measurement-Driven Sampling and Accuracy Performance Modeling for Low-Cost Live Video Analytics
Source: arXiv - 2512.24386v1
Overview
Live video analytics (LVA) powers everything from traffic‑monitoring dashboards to drone‑based inspection pipelines, but running state‑of‑the‑art vision models on every frame quickly becomes prohibitively expensive. The RedunCut paper proposes a smarter way to pick the "right‑sized" model for each video segment on the fly, cutting compute cost by up to 62 % while still meeting a user‑specified accuracy target.
Key Contributions
- Measurement‑driven sampling planner – a runtime component that decides, via a cost‑benefit analysis, whether it is worth sampling candidate models at all and, if so, how many to run, avoiding wasteful over‑sampling (the decision rule is sketched after this list).
- Lightweight data‑driven accuracy model – a fast predictor that estimates per‑segment accuracy for each candidate model size, improving the selection decision without needing ground‑truth labels.
- Robustness to diverse workloads – demonstrated on road‑vehicle, drone, and surveillance footage, covering multiple model families (e.g., YOLO, EfficientDet) and tasks (object detection, classification).
- Empirical savings of 14‑62 % in compute at fixed accuracy across all tested scenarios, even when only a short history of past runs is available or when video content drifts over time.
- No model retraining required – RedunCut works with existing black‑box models, making it drop‑in compatible with current LVA pipelines.
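The planner's cost‑benefit test can be read as a simple inequality. Below is a minimal sketch, assuming the decision reduces to comparing expected payoff against sampling overhead; the function and variable names are ours, and the paper's planner additionally chooses how many probe samples to take:

```python
def should_sample(expected_savings_per_frame, frames_remaining, sampling_cost):
    """Cost-benefit gate (illustrative, not the paper's exact formulation).

    expected_savings_per_frame: estimated compute saved per frame if a cheaper
        model turns out to be accurate enough (from recent measurements).
    frames_remaining: frames left in the current segment.
    sampling_cost: compute needed to probe the candidate models.
    """
    return expected_savings_per_frame * frames_remaining > sampling_cost
```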
Methodology
- Segment‑wise decision loop – The video stream is broken into short segments (e.g., a few seconds). For each segment RedunCut must pick a model size (small, medium, large, …).
- Planner stage – Before any sampling, a lightweight planner weighs the compute expected to be saved by switching to a cheaper model against the overhead of running a few candidate models to gather statistics. It uses recent runtime measurements (latency, confidence distributions) to decide the optimal number of samples.
- Sampling stage – If the planner decides sampling is worthwhile, RedunCut runs a small subset of candidate models on a few frames, collects confidence scores, and feeds them to the accuracy predictor.
- Accuracy predictor – Trained offline on a modest labeled dataset, this model learns the relationship between observable statistics (e.g., average confidence, entropy) and the true accuracy of each candidate model for the current video domain. It runs in microseconds, so it adds no noticeable overhead (an offline training sketch appears after the pipeline summary below).
- Model selection – The predictor outputs an estimated accuracy for each candidate; RedunCut then picks the smallest model that meets the user‑specified accuracy target. The chosen model processes the rest of the segment, and the loop repeats for the next segment (the full loop is sketched below).
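Putting the stages together, the per‑segment loop might be organized as follows. This is a minimal sketch under assumed interfaces: `candidates`, `predictor.estimate_accuracy`, `planner.should_sample`, and `planner.num_probe_frames` are illustrative names rather than the paper's API, and the real system probes only a subset of candidates:

```python
import numpy as np

def confidence_entropy(confidences, eps=1e-9):
    """A cheap, label-free uncertainty statistic over detection confidences."""
    p = np.clip(np.asarray(confidences, dtype=float), eps, 1.0)
    return float(-(p * np.log(p)).mean())

def process_segment(frames, candidates, predictor, planner, accuracy_target):
    """One pass of the segment-wise decision loop (illustrative sketch).

    candidates: models ordered cheapest-first, each exposing
        .run(frame) -> (outputs, confidences) and a relative .cost attribute.
    """
    chosen = candidates[-1]  # fall back to the largest model if sampling is skipped
    if planner.should_sample(len(frames)):
        probes = frames[: planner.num_probe_frames]
        for model in candidates:  # cheapest first, so the first hit is cheapest
            confs = [c for f in probes for c in model.run(f)[1]] or [0.0]
            feats = [float(np.mean(confs)), confidence_entropy(confs)]
            if predictor.estimate_accuracy(model, feats) >= accuracy_target:
                chosen = model
                break
    # The selected model processes the remainder of the segment.
    return [chosen.run(f)[0] for f in frames]
```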
The whole pipeline is designed to be measurement‑driven: every decision is grounded in actual runtime data rather than static heuristics, which lets the system adapt to changing lighting, motion, or scene composition.
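The paper specifies only that the predictor is lightweight and data‑driven; the sketch below shows what its offline training could look like under our own assumptions (confidence‑based features and a gradient‑boosted regressor are our choices, and the tiny inline dataset is a toy placeholder):

```python
from sklearn.ensemble import GradientBoostingRegressor

# Toy placeholder rows, one per (segment, candidate-model) pair:
# features = [mean confidence, confidence entropy, model-size index],
# label    = accuracy measured against ground truth during offline profiling.
X = [[0.91, 0.35, 0], [0.88, 0.52, 1], [0.73, 1.10, 0], [0.95, 0.21, 2]]
y = [0.84, 0.90, 0.69, 0.95]

predictor = GradientBoostingRegressor(n_estimators=100, max_depth=3)
predictor.fit(X, y)

# At runtime: estimate accuracy from observable statistics alone, no labels.
estimated_accuracy = predictor.predict([[0.90, 0.40, 1]])[0]
```

Any small regressor would do here; the point is that inference takes microseconds, so prediction adds negligible overhead to the loop.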
Results & Findings
| Dataset / Task | Accuracy Target | Compute Reduction vs. Baseline | Observations |
|---|---|---|---|
| Road‑vehicle (YOLOv5) – object detection | 90 % mAP | 62 % lower FLOPs | Sampling overhead stayed < 5 % of total cost |
| Drone footage (EfficientDet) – detection | 85 % mAP | 48 % lower FLOPs | Predictor remained accurate despite rapid viewpoint changes |
| Surveillance (ResNet‑50) – classification | 92 % top‑1 | 14 % lower FLOPs | Gains modest but consistent; planner avoided unnecessary sampling |
| Limited history (≤ 5 min) | 90 % mAP | 30‑55 % reduction | System quickly converged to reliable estimates |
| Concept drift (weather change) | 90 % mAP | ≈ 40 % reduction | Planner re‑evaluated sampling frequency, keeping cost low |
Overall, RedunCut stayed within ±1 % of the baseline's accuracy while delivering sizable compute savings across all tested scenarios.
Practical Implications
- Cost‑effective edge deployments – Operators of smart‑city cameras or drone fleets can run heavier models only when needed, extending battery life and reducing cloud‑ingress bandwidth.
- Simplified pipeline integration – Because RedunCut treats models as black boxes, existing inference services (TensorRT, ONNX Runtime, etc.) can be wrapped with the planner without modifying the models themselves (a wrapper sketch follows this list).
- Dynamic SLAs – Service providers can expose "accuracy‑as‑a‑service" contracts; RedunCut automatically throttles compute to meet the promised accuracy while minimizing spend.
- Rapid prototyping – Data scientists can experiment with new model families without re‑engineering the runtime; RedunCut will automatically discover the most cost‑effective size for each video domain.
- Scalable cloud billing – For SaaS video analytics platforms, per‑frame compute reductions translate directly into lower GPU hours and more predictable billing for customers.
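As an illustration of the black‑box integration point, the sketch below wraps pre‑exported ONNX models as opaque candidates for the loop sketched earlier. The file names, cost values, and the candidate interface are hypothetical; only the onnxruntime calls are real API:

```python
import onnxruntime as ort

class OnnxCandidate:
    """Wrap one pre-exported model size as an opaque, swappable candidate."""

    def __init__(self, path, cost):
        self.session = ort.InferenceSession(path)
        self.input_name = self.session.get_inputs()[0].name
        self.cost = cost  # relative compute cost, used for cheapest-first order

    def run(self, frame):
        # frame: preprocessed float32 tensor matching the model's input shape.
        # The model is never modified or retrained -- it stays a black box.
        # Mapping raw outputs to (detections, confidences) is model-specific
        # and omitted here.
        return self.session.run(None, {self.input_name: frame})

# Three sizes of the same detector, exported ahead of time (paths hypothetical).
candidates = [
    OnnxCandidate("detector_small.onnx", cost=1.0),
    OnnxCandidate("detector_medium.onnx", cost=2.6),
    OnnxCandidate("detector_large.onnx", cost=6.0),
]
```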
Limitations & Future Work
- Reliance on short‑term statistics – In highly erratic scenes (e.g., sudden flashes), the confidence‑based predictor may mis‑estimate accuracy, leading to occasional over‑aggressive model downsizing.
- Initial warm‑up cost – The planner needs a brief observation window to gather reliable measurements; during this period compute savings are modest.
- Model family granularity – RedunCut assumes a discrete set of pre‑trained model sizes; extending it to continuous scaling (e.g., dynamic channel pruning) is left for future research.
- Broader task coverage – Experiments focused on detection and classification; applying the same ideas to segmentation, pose estimation, or multimodal video‑audio pipelines remains an open avenue.
The authors suggest exploring adaptive learning of the accuracy predictor on‑the‑fly and integrating reinforcement‑learning‑based planners to further tighten the cost‑accuracy trade‑off.
Authors
- Gur‑Eyal Sela
- Kumar Krishna Agrawal
- Bharathan Balaji
- Joseph Gonzalez
- Ion Stoica
Paper Information
- arXiv ID: 2512.24386v1
- Categories: cs.CV, cs.DC
- Published: December 30, 2025
- PDF: https://arxiv.org/pdf/2512.24386v1