[Paper] FedSIR: Spectral Client Identification and Relabeling for Federated Learning with Noisy Labels

Published: April 22, 2026
5 min read
Source: arXiv (2604.20825v1)

Overview

Federated learning (FL) lets many devices train a shared model without ever sending raw data to a central server. However, when the local datasets contain mislabeled examples—a common issue in real‑world edge deployments—the global model can quickly deteriorate. The paper FedSIR introduces a multi‑stage framework that detects which clients are likely to have noisy labels, automatically relabels suspect samples, and adapts the federated training loop to be robust against the remaining noise.

Key Contributions

  • Spectral client diagnostics – Uses the eigen‑structure of class‑wise feature embeddings to separate “clean” from “noisy” clients with only a few extra communication rounds.
  • Cross‑client relabeling – Clean clients supply reliable class direction vectors; noisy clients project their data onto these directions and residual subspaces to generate corrected labels.
  • Noise‑aware training pipeline – Combines a logit‑adjusted loss, knowledge‑distillation from clean client models, and a distance‑aware aggregation rule to stabilize FL updates under label corruption.
  • Comprehensive empirical validation – Shows consistent gains over the strongest baselines on popular FL benchmarks (CIFAR‑10/100, FEMNIST) across a range of noise rates (20‑60 %).
  • Open‑source implementation – Full code released, enabling reproducibility and easy integration into existing FL pipelines.

Methodology

FedSIR operates in three stages, each designed to keep the communication budget low:

  1. Spectral Consistency Check

    • After an initial round of local training, each client extracts class‑wise feature matrices (e.g., the output of the penultimate layer).
    • The server performs a lightweight singular‑value decomposition (SVD) on the aggregated class subspaces.
    • Clients whose subspaces align well with the global dominant directions are flagged as clean; large deviations indicate potential label noise.
  2. Reference‑Based Relabeling

    • Clean clients broadcast the dominant eigen‑vectors (the “spectral references”) for each class.
    • Noisy clients project their samples onto these references. If a sample’s projection aligns poorly with its original label but strongly with another class direction, the label is flipped.
    • Residual components (orthogonal to the dominant subspace) are also examined to catch subtle mis‑alignments, providing a second‑chance correction.
  3. Noise‑Aware Federated Optimization

    • Logit‑adjusted loss: Shifts the classifier logits according to estimated class‑wise noise rates, reducing bias toward noisy classes.
    • Knowledge distillation: Clean clients act as teachers; their softened predictions are used to regularize noisy client updates.
    • Distance‑aware aggregation: The server weights each client’s model update by the spectral distance measured in stage 1, giving more influence to cleaner participants.
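The Stage‑1 spectral consistency check can be sketched in a few lines. This is a minimal illustration, not the paper's exact procedure: the feature dimension (16), the number of dominant directions (k = 3), and the use of mean squared cosines of principal angles as the alignment score are all our assumptions.

```python
import numpy as np

def class_subspace(feats, k=3):
    """Orthonormal basis for the top-k principal directions of a
    (n_samples, dim) class-wise feature matrix."""
    _, _, vt = np.linalg.svd(feats - feats.mean(axis=0), full_matrices=False)
    return vt[:k].T                                   # (dim, k)

def subspace_alignment(basis_a, basis_b):
    """Mean squared cosine of the principal angles between two subspaces:
    ~1.0 for identical spans, ~k/dim for unrelated random subspaces."""
    cosines = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return float(np.mean(cosines ** 2))

# Toy demo: clean clients share a low-dimensional class structure,
# a noisy client does not.
rng = np.random.default_rng(0)
dim, k = 16, 3
true_basis, _ = np.linalg.qr(rng.standard_normal((dim, k)))

def sample_clean(n):
    return rng.standard_normal((n, k)) @ true_basis.T \
        + 0.05 * rng.standard_normal((n, dim))

global_ref   = class_subspace(sample_clean(400), k)
clean_client = class_subspace(sample_clean(100), k)
noisy_client = class_subspace(rng.standard_normal((100, dim)), k)

clean_score = subspace_alignment(global_ref, clean_client)
noisy_score = subspace_alignment(global_ref, noisy_client)
```

A server could then flag clients whose score falls below a threshold as candidates for relabeling, which is the hand‑off into Stage 2.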

These steps repeat for a few communication rounds, progressively refining both the label quality and the global model.
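The Stage‑2 correction can be sketched as a projection‑and‑flip rule. Everything here is a simplified stand‑in for the paper's method: the `references` array of unit class directions plays the role of the broadcast spectral references, and the `margin` parameter is an assumed conservatism knob (flip only when another class clearly fits better).

```python
import numpy as np

def relabel(features, labels, references, margin=0.2):
    """Flip a sample's label to the class whose reference direction
    explains it best, but only if that fit beats the current label's
    fit by at least `margin` (a hypothetical safety threshold)."""
    scores = np.abs(features @ references.T)          # (n, C) projection magnitudes
    best = scores.argmax(axis=1)
    idx = np.arange(len(labels))
    flip = scores[idx, best] > scores[idx, labels] + margin
    out = labels.copy()
    out[flip] = best[flip]
    return out

# Toy demo: two classes with orthogonal reference directions in R^4.
refs = np.eye(4)[:2]                                  # class 0 -> e0, class 1 -> e1
feats = np.array([[1.00, 0.05, 0.0, 0.0],             # clearly class 0
                  [0.05, 1.00, 0.0, 0.0],             # clearly class 1
                  [0.90, 0.10, 0.0, 0.0]])            # looks like class 0 ...
labels = np.array([0, 1, 1])                          # ... but labeled 1
corrected = relabel(feats, labels, refs)              # last label gets flipped to 0
```

The paper's residual‑subspace second chance would add a further check on the component orthogonal to the dominant directions; the sketch above covers only the dominant‑direction case.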

Results & Findings

| Dataset / Noise | Baseline (FedAvg) | State‑of‑the‑art (e.g., FedAvg‑Robust) | FedSIR |
|---|---|---|---|
| CIFAR‑10, 40 % symmetric noise | 58.2 % | 66.7 % | 73.4 % |
| CIFAR‑100, 30 % asymmetric noise | 42.1 % | 48.9 % | 55.6 % |
| FEMNIST, 20 % client‑wise noise | 71.5 % | 77.2 % | 82.0 % |

  • Robustness to high noise: Even at 60 % symmetric noise, FedSIR retains >60 % accuracy, whereas competing methods drop below 45 %.
  • Communication efficiency: The extra spectral diagnostics add <0.5 MB per round (tiny compared to typical model updates).
  • Ablation studies confirm that each component (spectral detection, relabeling, logit‑adjusted loss) contributes roughly 3–5 % absolute gain, and the combination yields the full boost.
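One of the ablated components, the distance‑aware aggregation rule, is easy to illustrate. As a minimal sketch (our own choice of weighting, not necessarily the paper's exact rule), client updates could be averaged with softmax weights over their spectral alignment scores, where the `temperature` is an assumed knob:

```python
import numpy as np

def aggregation_weights(spectral_scores, temperature=5.0):
    """Softmax over per-client alignment scores: cleaner clients
    (higher alignment) receive exponentially more weight."""
    z = np.asarray(spectral_scores, dtype=float) * temperature
    z -= z.max()                                      # numerical stability
    w = np.exp(z)
    return w / w.sum()

# Toy demo: two clean clients push the (flattened) update toward +1,
# one noisy client pushes toward -1.
updates = np.stack([np.full(4,  1.0),
                    np.full(4,  1.0),
                    np.full(4, -1.0)])
scores = [0.95, 0.90, 0.30]                           # from the Stage-1 check
w = aggregation_weights(scores)
merged = np.tensordot(w, updates, axes=1)             # weighted average update
```

With these scores the noisy client's influence is down‑weighted to a few percent, so the merged update stays close to the clean clients' direction.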

Practical Implications

  • Edge AI deployments – Devices (smartphones, IoT sensors) often receive crowdsourced or user‑generated labels that are noisy. FedSIR can be dropped into existing FL stacks (TensorFlow Federated, PySyft) to automatically clean up those labels without requiring a central data audit.
  • Reduced need for manual curation – By flagging noisy clients early, operators can decide whether to exclude problematic devices or request additional supervision, saving costly data‑labeling cycles.
  • Improved model reliability – Applications such as federated medical imaging or autonomous‑vehicle perception, where mislabeled samples can have safety implications, benefit from the extra robustness.
  • Scalable to heterogeneous environments – The spectral check works even when clients have different model architectures or data distributions, making it suitable for cross‑silo FL scenarios (e.g., multiple hospitals).

Limitations & Future Work

  • Assumption of class‑wise feature linearity – The spectral method presumes that clean data form coherent low‑dimensional subspaces; highly non‑linear class manifolds may weaken detection accuracy.
  • Extra local computation – Performing SVD on feature matrices can be costly for very large models or limited‑resource devices; the authors suggest approximate methods as a next step.
  • Static noise rates – The current logit‑adjusted loss uses a fixed estimate of noise per class; adapting this estimate online could further improve performance.
  • Broader noise models – Experiments focus on symmetric and class‑dependent noise; extending to instance‑dependent or adversarial label attacks remains open.

Overall, FedSIR offers a pragmatic, communication‑light solution for making federated learning resilient to noisy labels—an increasingly common hurdle as FL moves from research labs into production‑grade ecosystems.

Authors

  • Sina Gholami
  • Abdulmoneam Ali
  • Tania Haghighi
  • Ahmed Arafa
  • Minhaj Nur Alam

Paper Information

  • arXiv ID: 2604.20825v1
  • Categories: cs.LG, cs.AI, cs.CV, cs.DC, eess.SP
  • Published: April 22, 2026
  • PDF: available on arXiv