[Paper] Catastrophic Forgetting Resilient One-Shot Incremental Federated Learning
Source: arXiv - 2602.17625v1
Overview
The paper introduces One‑Shot Incremental Federated Learning (OSI‑FL), a novel framework that lets a federation of edge devices train a shared model in a single communication round while still handling new data that arrives over time. By leveraging frozen vision‑language embeddings and a server‑side diffusion model, OSI‑FL dramatically cuts communication costs and mitigates the dreaded “catastrophic forgetting” problem that plagues incremental learning.
Key Contributions
- One‑shot communication – Clients transmit only compact, category‑specific embeddings (instead of raw data or full model updates), allowing the entire federation to converge in a single round.
- Synthetic data generation – A pre‑trained diffusion model on the server expands those embeddings into realistic images that approximate each client’s data distribution.
- Selective Sample Retention (SSR) – An on‑the‑fly sampling strategy keeps the p most informative synthetic samples per class‑task pair, providing a lightweight replay buffer that curbs forgetting when new tasks arrive.
- Unified incremental setting – OSI‑FL works for both class‑incremental (new categories appear) and domain‑incremental (same categories but new visual styles) scenarios.
- Empirical superiority – Across three standard vision benchmarks, OSI‑FL outperforms traditional multi‑round FL and existing one‑shot FL baselines in both accuracy and forgetting metrics.
Methodology
1. Client‑side – frozen VLM embeddings
- Each participant runs a pre‑trained vision‑language model (e.g., CLIP) in inference mode.
- For every class present locally, the client extracts a category embedding (a high‑dimensional vector) and sends it to the server.
- No gradients, raw images, or model parameters leave the device.
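The summary does not specify how a single category embedding is formed from many local images; a common choice, assumed here, is the unit-normalized mean of the frozen per-image features. A minimal NumPy sketch (the function name and aggregation rule are illustrative, not the paper's API):

```python
import numpy as np

def category_embeddings(features, labels):
    """Aggregate frozen per-image embeddings into one compact vector
    per locally present class (here: the unit-normalized class mean)."""
    out = {}
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        emb = features[idx].mean(axis=0)
        out[c] = emb / np.linalg.norm(emb)  # CLIP-style features are usually unit-norm
    return out

# Example: six fake 512-dim "CLIP" features covering classes 0 and 1
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 512)).astype(np.float32)
payload = category_embeddings(feats, [0, 0, 0, 1, 1, 1])
print(sorted(payload))   # [0, 1]
print(payload[0].shape)  # (512,)
```

Only this small `payload` dictionary leaves the device, which is what makes the one-shot round so cheap.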
2. Server‑side – diffusion‑based data synthesis
- The server hosts a pre‑trained diffusion model (e.g., Stable Diffusion).
- Using the received embeddings as conditioning signals, the diffusion model generates synthetic images that approximate the client’s data distribution for each class.
3. Training the global model
- The server aggregates all synthetic samples and trains a global vision model (e.g., ResNet) in the usual supervised manner.
- When a new task (new classes or new domains) arrives, the server repeats steps 1–2, adding fresh synthetic data to the training pool.
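The server-side loop above can be sketched as follows. The `synthesize` and `train` functions are stand-ins (not the paper's code) for the diffusion model and the supervised trainer; the point is how each incremental task adds one round of synthetic data to a growing pool:

```python
def synthesize(embeddings):
    # Stand-in for the diffusion model: one synthetic record per
    # received (class, embedding) pair.
    return [{"cls": c, "data": e} for c, e in embeddings.items()]

def train(model, pool):
    # Stand-in for supervised training of the global vision model.
    model["seen"] = sorted({s["cls"] for s in pool})
    return model

model, pool = {}, []
for task_embeddings in [{0: "e0", 1: "e1"},   # task 1: classes 0 and 1
                        {2: "e2"}]:           # task 2: class 2 arrives later
    pool += synthesize(task_embeddings)       # steps 1-2, one round per task
    model = train(model, pool)                # step 3, on old + new synthetic data

print(model["seen"])  # [0, 1, 2]
```

Because earlier synthetic samples stay in the pool (pruned by SSR, described next), the global model keeps seeing old classes while learning new ones.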
4. Selective Sample Retention (SSR)
- After each training epoch, the server computes the loss for every synthetic sample.
- For each (class, task) pair, it retains the top‑p highest‑loss samples (i.e., the hardest, most informative examples).
- These retained samples are re‑used in subsequent training cycles, acting as a compact replay buffer that preserves knowledge of earlier tasks without storing the full dataset.
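The SSR selection rule can be sketched in a few lines of Python (field names and the `ssr_retain` helper are illustrative, assuming per-sample losses have already been computed during the epoch):

```python
def ssr_retain(samples, p):
    """SSR sketch: keep the p highest-loss synthetic samples per
    (class, task) pair as a compact replay buffer.
    `samples` is a list of dicts with 'cls', 'task', and 'loss' keys."""
    buckets = {}
    for s in samples:
        buckets.setdefault((s["cls"], s["task"]), []).append(s)
    retained = []
    for group in buckets.values():
        group.sort(key=lambda s: s["loss"], reverse=True)  # hardest first
        retained += group[:p]
    return retained

samples = [
    {"cls": 0, "task": 1, "loss": 0.9},
    {"cls": 0, "task": 1, "loss": 0.1},
    {"cls": 0, "task": 1, "loss": 0.5},
    {"cls": 1, "task": 1, "loss": 0.3},
]
kept = ssr_retain(samples, p=2)
print(sorted(s["loss"] for s in kept))  # [0.3, 0.5, 0.9]
```

Sorting by descending loss implements the "keep the hard examples" heuristic: easy samples the model already fits contribute little to preserving old-task knowledge.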
Key advantage: The whole pipeline requires only one communication round per incremental update, making it suitable for bandwidth‑constrained or privacy‑sensitive environments.
Results & Findings
| Dataset | Setting | Baseline acc. (multi‑round FL) | OSI‑FL acc. (Ours) | Forgetting ↓ |
|---|---|---|---|---|
| CIFAR‑100 | Class‑incremental (10 tasks) | 68.2 % | 77.5 % | 12 % |
| ImageNet‑R | Domain‑incremental (5 domains) | 61.4 % | 70.1 % | 9 % |
| Tiny‑ImageNet | Mixed (new classes + domains) | 63.7 % | 71.8 % | 11 % |
- Accuracy gains: 7–10 % over the strongest baselines.
- Forgetting (the drop in performance on earlier tasks): reduced by roughly half.
- Communication overhead: decreased from hundreds of megabytes (full model/gradient exchange) to a few kilobytes (embedding vectors).
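A back-of-envelope calculation makes the "few kilobytes" claim concrete. Assuming 512-dimensional float32 embeddings (a CLIP ViT-B/32-sized vector; the exact encoder and dimension are assumptions, not stated in the summary) and ten locally present classes per client:

```python
# Payload per client per round under the stated assumptions.
dim, bytes_per_float, n_classes = 512, 4, 10
payload_kb = dim * bytes_per_float * n_classes / 1024
print(payload_kb)  # 20.0 (KB) -- versus hundreds of MB for full model exchange
```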
Ablation study highlights
- SSR alone contributes ~3 % accuracy improvement.
- Diffusion‑generated data accounts for the bulk of the gain.
Practical Implications
- Edge‑AI deployments – Smart cameras, mobile phones, or IoT sensors can take part in federated training without ever transmitting raw images, preserving user privacy and complying with data‑locality regulations.
- Rapid model updates – New product lines or visual styles (e.g., seasonal UI themes) can be incorporated with a single round of communication, dramatically reducing OTA‑update latency.
- Cost‑effective scaling – Service providers can support thousands of clients on low‑bandwidth links (cellular, satellite) because the payload consists of only a handful of embedding vectors per class.
- Replay‑free continual learning – SSR provides a lightweight alternative to large replay buffers, which is attractive for on‑device continual‑learning pipelines where storage is at a premium.
Tip for developers:
Integrate OSI‑FL by plugging in any off‑the‑shelf vision‑language model (e.g., CLIP, BLIP) and diffusion model (e.g., Stable Diffusion), then use the supplied SSR module to manage the synthetic replay set.
Limitations & Future Work
- Synthetic fidelity – The quality of generated data depends on the diffusion model’s ability to capture the client’s distribution; rare or highly domain‑specific visual features may be under‑represented.
- Embedding privacy – Although embeddings are far less sensitive than raw images, they can still leak information (e.g., via inversion attacks). Formal privacy guarantees such as differentially private federated learning (DP‑FL) were not explored.
- Scalability of SSR – The retention factor p must be tuned; a too‑small buffer may miss critical variations, while a too‑large buffer erodes the “one‑shot” communication advantage.
- Non‑vision modalities – The current design assumes visual data; extending OSI‑FL to text, audio, or multimodal streams remains an open challenge.
Future Research Directions
- Integrate differential privacy into the embedding transmission pipeline.
- Develop adaptive strategies for selecting the retention factor p in SSR.
- Evaluate OSI‑FL on real‑world federated deployments (e.g., autonomous‑vehicle fleets).
- Explore extensions to non‑visual modalities such as text, audio, and multimodal data streams.
Authors
- Obaidullah Zaland
- Zulfiqar Ahmad Khan
- Monowar Bhuyan
Paper Information
| Field | Details |
|---|---|
| arXiv ID | 2602.17625v1 |
| Categories | cs.LG, cs.DC |
| Published | February 19, 2026 |