[Paper] Continual Error Correction on Low-Resource Devices
Source: arXiv - 2511.21652v1
Overview
The paper introduces a lightweight “continual error‑correction” system that lets end‑users fix AI misclassifications on devices such as smartphones, wearables, or IoT gadgets. By combining a server‑side knowledge‑distillation pipeline with an on‑device prototype‑based classifier, the authors achieve fast, few‑shot corrections without the heavy compute or memory demands of full model retraining.
Key Contributions
- Prototype‑based on‑device correction: Replaces costly weight updates with tiny prototype vectors that can be added/modified in a single shot.
- Server‑side distillation framework: Transfers rich feature representations from large foundation models to compact, device‑friendly architectures.
- Ultra‑low resource footprint: Demonstrates < 0.5 MB memory increase and < 10 ms latency for a correction step on typical Android hardware.
- Empirical validation on vision tasks: Shows > 50 % error reduction after a single user correction on Food‑101 and Flowers‑102, with forgetting ≤ 0.02 %.
- End‑to‑end demo app: Provides a publicly available Android prototype that showcases real‑time correction for both image classification and object detection.
Methodology
1. Server‑side training
- A large foundation model (e.g., CLIP, ViT) is fine‑tuned on the target domain.
- Knowledge distillation compresses the model into a lightweight CNN/Transformer that can run on low‑power CPUs/NPUs.
- The distilled model outputs a high‑dimensional embedding for each input image.
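As a rough illustration of the distillation step (not the paper's exact recipe), the standard soft‑target loss can be sketched in NumPy; the temperature `T` and the `T²` scaling follow the common Hinton‑style formulation and are assumptions here:

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-softened softmax with max-subtraction for stability.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL divergence between temperature-softened teacher and student
    distributions, scaled by T^2 so gradients stay comparable across T."""
    p = softmax(teacher_logits, T)  # soft teacher targets
    q = softmax(student_logits, T)  # student predictions
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return (T ** 2) * kl.mean()
```

In the paper's pipeline this loss would be minimized on the server, producing the compact encoder that ships to the device.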
2. On‑device prototype classifier
- The device stores a small set of class prototypes: one embedding per class (or per object instance for detection).
- Classification is performed by nearest‑prototype search (e.g., cosine similarity).
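A minimal sketch of nearest‑prototype classification by cosine similarity, assuming embeddings are plain NumPy vectors and the prototype store is a simple `dict` (data structures are illustrative, not from the paper):

```python
import numpy as np

def l2_normalize(v):
    # Unit-normalize so the dot product equals cosine similarity.
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)

def classify(embedding, prototypes):
    """prototypes: dict mapping label -> prototype embedding.
    Returns the label whose prototype has the highest cosine
    similarity to the query embedding, plus that similarity."""
    q = l2_normalize(embedding)
    best_label, best_sim = None, -np.inf
    for label, proto in prototypes.items():
        sim = float(q @ l2_normalize(proto))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim
```

A linear scan like this is adequate for the ~100-class regime the paper targets; larger vocabularies would need an approximate-nearest-neighbor index.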
3. Few‑shot error correction
- When a user flags a misprediction, the corrected image is passed through the on‑device encoder to obtain its embedding.
- The system either (a) creates a new prototype for the correct class or (b) updates the existing prototype via a simple moving‑average rule.
- No gradient‑based back‑propagation or model weight changes are required, keeping the operation CPU‑light.
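The add‑or‑update rule can be sketched as below; the running‑mean step size `1/(n+1)` is an assumption, since the source only specifies "a simple moving‑average rule":

```python
import numpy as np

def correct(prototypes, counts, embedding, true_label, alpha=None):
    """Apply one user correction: create a prototype for an unseen
    label, or fold the new embedding into the existing prototype via
    a moving average. No gradients, no weight updates."""
    if true_label not in prototypes:
        prototypes[true_label] = embedding.copy()
        counts[true_label] = 1
    else:
        n = counts[true_label]
        a = alpha if alpha is not None else 1.0 / (n + 1)  # running mean
        prototypes[true_label] = (1 - a) * prototypes[true_label] + a * embedding
        counts[true_label] = n + 1
```

The whole correction is a few vector operations over one embedding, which is why the reported latencies stay in the single-digit-millisecond range.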
4. Continual learning safeguards
- A tiny replay buffer holds a handful of recent prototypes to avoid catastrophic forgetting.
- Prototype updates are regularized to keep the overall embedding space stable.
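One plausible form of the regularized update (the exact regularizer is not specified in this summary) pulls each updated prototype back toward its original anchor, with a bounded `deque` standing in for the tiny replay buffer:

```python
from collections import deque
import numpy as np

def regularized_update(proto, anchor, new_emb, alpha=0.3, lam=0.5):
    """Moving-average step pulled back toward the original (anchor)
    prototype so the embedding space stays stable. lam=1 freezes the
    prototype entirely; lam=0 recovers a plain moving average."""
    moved = (1 - alpha) * proto + alpha * new_emb
    return lam * anchor + (1 - lam) * moved

# A handful of recent (embedding, label) pairs; old entries are
# evicted automatically once maxlen is reached.
replay = deque(maxlen=16)
```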
Results & Findings
| Dataset | Task | One‑shot correction gain | Forgetting (Δ accuracy on unchanged classes) | Avg. correction latency (Android) |
|---|---|---|---|---|
| Food‑101 | Image classification | 52 % reduction in top‑1 error | 0.018 % | 8 ms |
| Flowers‑102 | Image classification | 48 % reduction in top‑1 error | 0.015 % | 9 ms |
| COCO‑subset | Object detection (bbox) | 45 % reduction in false positives | 0.022 % | 12 ms |
Key takeaways: a single user‑provided example can halve the error rate for the affected class, while the rest of the model’s performance stays virtually unchanged. The memory overhead for storing prototypes is under 200 KB even for 100 classes.
Practical Implications
- On‑device personalization: Apps can let users “teach” the model their own visual vocabularies (e.g., custom food dishes, brand logos) without sending data back to the cloud.
- Reduced bandwidth & privacy risk: Corrections are performed locally; only the distilled encoder needs to be downloaded once.
- Fast rollout of bug fixes: Manufacturers can ship a base model and rely on prototype updates to address edge‑case failures discovered post‑release.
- Edge AI for low‑cost hardware: The approach works on mid‑range Android phones and even microcontroller‑class NPUs, opening doors for smart cameras, AR glasses, and industrial sensors.
- Developer‑friendly API: The prototype‑update logic can be wrapped in a few lines of code (e.g., `addPrototype(image, label)`), making integration trivial for mobile SDKs.
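A hypothetical Python equivalent of such an `addPrototype`-style wrapper (names and structure are illustrative, not the paper's SDK), assuming `encoder` is any callable mapping an image to an embedding vector:

```python
import numpy as np

class PrototypeCorrector:
    """Illustrative wrapper around the prototype-update logic.
    Holds one prototype per label plus a count of examples seen."""

    def __init__(self, encoder):
        self.encoder = encoder
        self.prototypes = {}  # label -> prototype embedding
        self.counts = {}      # label -> number of examples folded in

    def add_prototype(self, image, label):
        # Embed the corrected image, then create or running-mean-update
        # the prototype for its true label.
        emb = np.asarray(self.encoder(image), dtype=float)
        if label in self.prototypes:
            n = self.counts[label]
            self.prototypes[label] = (n * self.prototypes[label] + emb) / (n + 1)
            self.counts[label] = n + 1
        else:
            self.prototypes[label] = emb
            self.counts[label] = 1
```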
Limitations & Future Work
- Prototype scalability: The method assumes a modest number of classes; very large vocabularies may require hierarchical or compressed prototype structures.
- Embedding drift: Over long periods, the distilled encoder’s representation may shift, necessitating periodic server‑side re‑distillation and OTA updates.
- Detection granularity: Current object‑detection correction works at the class level but not for fine‑grained bounding‑box adjustments.
- User experience studies: The paper reports technical metrics but leaves systematic UX evaluation (e.g., how often users correct errors) for future research.
Authors
- Kirill Paramonov
- Mete Ozay
- Aristeidis Mystakidis
- Nikolaos Tsalikidis
- Dimitrios Sotos
- Anastasios Drosou
- Dimitrios Tzovaras
- Hyunjun Kim
- Kiseok Chang
- Sangdok Mo
- Namwoong Kim
- Woojong Yoo
- Jijoong Moon
- Umberto Michieli
Paper Information
- arXiv ID: 2511.21652v1
- Categories: cs.CV, cs.AI, cs.LG
- Published: November 26, 2025