[Paper] SRFed: Mitigating Poisoning Attacks in Privacy-Preserving Federated Learning with Heterogeneous Data
Source: arXiv - 2602.16480v1
Overview
Federated Learning (FL) lets many devices train a shared model without ever sending raw data to a central server. While this protects user privacy, it also opens two attack vectors: a curious server that tries to reverse‑engineer private data, and malicious clients that inject poisoned updates to sabotage the model. SRFed introduces a new framework that simultaneously thwarts both threats, works efficiently on heterogeneous (Non‑IID) data, and avoids the heavy computation and communication costs of prior defenses.
Key Contributions
- Decentralized Efficient Functional Encryption (DEFE): A lightweight functional‑encryption scheme that lets clients encrypt model updates locally and enables the server to decrypt only the aggregated result—no third‑party key manager required.
- Byzantine‑robust aggregation for Non‑IID data: A layer‑wise projection and clustering technique that detects and discards poisoned updates even when client data distributions differ dramatically.
- End‑to‑end privacy & robustness guarantees: Formal proofs that DEFE prevents server‑side inference attacks while the aggregation step tolerates a bounded fraction of Byzantine (malicious) clients.
- Practical efficiency: Empirical evaluation shows SRFed reduces encryption/decryption overhead by up to 60 % and communication cost by ~30 % compared with state‑of‑the‑art privacy‑preserving FL baselines.
- Comprehensive benchmark: Experiments on image (CIFAR‑10/100) and language (Sentiment140) datasets under various poisoning scenarios (label flipping, model replacement) demonstrate superior accuracy and robustness.
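One of the benchmarked poisoning scenarios, label flipping, is easy to simulate. The sketch below is an illustrative recipe (not necessarily the paper's exact attack setup): a fraction of training labels is deterministically remapped to a different class.

```python
import numpy as np

def flip_labels(y, n_classes, frac=0.2, seed=0):
    """Simulate a label-flipping attack: remap a fraction of labels
    to a different class. Illustrative benchmark recipe only."""
    rng = np.random.default_rng(seed)
    y = y.copy()
    idx = rng.choice(len(y), size=int(frac * len(y)), replace=False)
    y[idx] = (y[idx] + 1) % n_classes   # shift to the next class, always != original
    return y

y = np.repeat(np.arange(10), 10)        # toy 10-class label vector
y_poisoned = flip_labels(y, n_classes=10, frac=0.2)
```

With `frac=0.2`, exactly 20 % of the labels differ from the originals, matching the 20 % Byzantine setting used in the CIFAR-10 experiment.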
Methodology
- Client‑side encryption: Each participant trains a local model on its private data, then encrypts the model parameters using DEFE. DEFE is “functional” because the ciphertext is tied to a specific aggregation function (e.g., weighted sum), allowing the server to compute the aggregated model without learning any individual update.
- Non‑interactive decryption: The server collects all encrypted updates and performs a single decryption step that yields the aggregated model. No round‑trip communication with a key‑distribution authority is needed.
- Defensive aggregation:
- Layer‑wise projection: Updates are projected onto a low‑dimensional subspace that captures the dominant directions of benign updates, reducing the influence of outliers.
- Clustering‑based analysis: Projected updates are clustered; clusters that are too small or far from the majority are flagged as suspicious and excluded from the final sum.
- Model update: The server broadcasts the decrypted, cleaned aggregate back to clients for the next training round. The cycle repeats until convergence.
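DEFE's concrete construction is not reproduced in this summary, but its defining property, that the server can decrypt only the aggregate and never an individual update, can be illustrated with a simpler pairwise-masking scheme in the spirit of SecAgg. The code below is a stand-in sketch of that property, not the paper's actual scheme; all function names are illustrative.

```python
import numpy as np

def pairwise_masks(n_clients, dim, seed=0):
    """Derive cancelling pairwise masks: each pair (i, j) shares a random
    vector that client i adds and client j subtracts."""
    rng = np.random.default_rng(seed)
    masks = np.zeros((n_clients, dim))
    for i in range(n_clients):
        for j in range(i + 1, n_clients):
            m = rng.normal(size=dim)
            masks[i] += m      # client i adds +m
            masks[j] -= m      # client j adds -m; the pair cancels in the sum
    return masks

def encrypt(update, mask):
    """A masked update reveals nothing about the individual update."""
    return update + mask

n, d = 5, 8
updates = np.random.default_rng(1).normal(size=(n, d))
masks = pairwise_masks(n, d, seed=42)
ciphertexts = [encrypt(u, m) for u, m in zip(updates, masks)]

# The server sums the ciphertexts; all masks cancel, leaving only the aggregate.
aggregate = np.sum(ciphertexts, axis=0)
```

Unlike real functional encryption, this toy scheme fails if a client drops out mid-round; DEFE avoids that fragility (and the key-distribution round-trips) by construction.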
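The projection-and-clustering defense can be sketched in a simplified form: project flattened updates onto their dominant singular directions, then discard updates far from the majority. Here an SVD provides the projection and a median-distance rule stands in for the paper's clustering step; the threshold and dimensions are illustrative assumptions.

```python
import numpy as np

def filter_updates(updates, k=2, thresh=2.5):
    """Project updates onto the top-k principal directions, then drop
    updates far from the median in that subspace. A simplified stand-in
    for SRFed's layer-wise projection and clustering."""
    X = updates - updates.mean(axis=0)
    # Top-k right singular vectors span the dominant update directions.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Z = X @ Vt[:k].T                        # low-dimensional projection
    med = np.median(Z, axis=0)
    dist = np.linalg.norm(Z - med, axis=1)  # distance from the majority
    keep = dist <= thresh * np.median(dist) + 1e-12
    return updates[keep], keep

rng = np.random.default_rng(0)
benign = rng.normal(0.0, 0.1, size=(18, 50))
poisoned = rng.normal(5.0, 0.1, size=(2, 50))   # large malicious shift
all_updates = np.vstack([benign, poisoned])
clean, keep = filter_updates(all_updates)
```

In this toy run the two poisoned updates dominate the first singular direction and sit far from the median of the projected cloud, so they are excluded before averaging.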
Results & Findings
| Dataset | Attack Type | Baseline (e.g., SecAgg+Krum) | SRFed | SRFed Gain over Baseline |
|---|---|---|---|---|
| CIFAR‑10 (Non‑IID) | Label‑flip (20 % Byzantine) | 62 % | 71 % | +9 % |
| CIFAR‑100 (Non‑IID) | Model‑replacement (10 % Byzantine) | 48 % | 57 % | +9 % |
| Sentiment140 (text) | Gradient‑poisoning (15 % Byzantine) | 78 % | 84 % | +6 % |
- Privacy: Simulated server inference attacks (gradient inversion) recover < 1 % of original training samples with SRFed, compared to > 15 % with standard FL.
- Efficiency: Encryption time per client drops from ~120 ms (Paillier‑based) to ~45 ms; total communication per round reduces from 12 MB to 8.5 MB for a 10‑client setup.
- Scalability: SRFed maintains robustness when the number of clients scales to 100, with only a modest increase in computation (< 10 % per round).
Practical Implications
- Edge AI deployments: Companies rolling out on‑device models (e.g., predictive keyboards, health monitors) can adopt SRFed to guarantee that a compromised device cannot poison the global model, while still protecting user data from a potentially curious cloud aggregator.
- Regulatory compliance: The framework aligns with GDPR and emerging AI‑privacy regulations by providing provable data minimization—servers never see raw updates.
- Cost‑effective security: Because DEFE eliminates the need for a trusted third‑party key server and reduces bandwidth, SRFed can be integrated into existing FL pipelines with minimal infrastructure changes.
- Robustness in real‑world data: Many production FL scenarios involve highly skewed data (e.g., different user behavior patterns). SRFed’s layer‑wise projection works directly on such Non‑IID distributions, making it far more reliable than generic Byzantine‑robust aggregators that assume IID data.
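Skewed (Non-IID) client splits of the kind described above are commonly simulated with a Dirichlet partition over class labels, where a smaller concentration parameter yields more skew. The sketch below follows that common benchmark recipe, not necessarily the paper's exact setup.

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha=0.1, seed=0):
    """Split sample indices across clients with label skew controlled by
    alpha; smaller alpha means more skewed (Non-IID) partitions."""
    rng = np.random.default_rng(seed)
    parts = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        # Fraction of class c assigned to each client.
        p = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(p)[:-1] * len(idx)).astype(int)
        for part, chunk in zip(parts, np.split(idx, cuts)):
            part.extend(chunk.tolist())
    return parts

labels = np.repeat(np.arange(10), 100)     # toy 10-class dataset
parts = dirichlet_partition(labels, n_clients=5, alpha=0.1)
```

With `alpha=0.1`, most clients end up holding samples from only a few classes, which is exactly the regime where IID-assuming Byzantine-robust aggregators tend to misclassify benign minority clients as attackers.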
Limitations & Future Work
- Bounded Byzantine fraction: The theoretical guarantees hold up to a certain proportion of malicious clients (≈ 30 %). Extremely high attack rates could still degrade performance.
- Key management overhead for very large fleets: While DEFE is decentralized, initializing functional keys across millions of devices may require hierarchical bootstrapping, which the authors note as future engineering work.
- Extension to heterogeneous model architectures: SRFed currently assumes all clients share the same model topology. Adapting the projection‑clustering step to heterogeneous architectures (e.g., personalized FL) is an open research direction.
SRFed demonstrates that strong privacy and Byzantine robustness can coexist without prohibitive costs, paving the way for safer, scalable federated learning in production environments.
Authors
- Yiwen Lu
Paper Information
- arXiv ID: 2602.16480v1
- Categories: cs.CR, cs.DC
- Published: February 18, 2026