[Paper] PRISM-FCP: Byzantine-Resilient Federated Conformal Prediction via Partial Sharing
Source: arXiv - 2602.18396v1
Overview
A new framework called PRISM‑FCP tackles a long‑standing blind spot in federated learning: how to keep uncertainty estimates reliable when some participants act maliciously (Byzantine attacks). By combining partial model sharing with a robust conformal calibration step, the authors show that you can preserve tight prediction intervals and defend against poisoned updates across the entire pipeline, whereas prior methods defended only the calibration stage and left training unprotected.
Key Contributions
- End‑to‑end Byzantine resilience: protects both the training phase and the conformal calibration stage, closing a gap in existing federated conformal prediction (FCP) pipelines.
- Partial parameter sharing: each client uploads only M out of D model parameters per round, reducing the adversary’s influence by a factor of M/D and cutting communication bandwidth.
- Statistical‑margin based calibration: converts raw nonconformity scores into “characterization vectors,” computes distance‑based maliciousness scores, and selectively down‑weights or discards suspicious contributions before estimating the quantile.
- Theoretical guarantees: derives mean‑square‑error (MSE) bounds that shrink proportionally with the sharing ratio M/D, and proves that coverage guarantees remain intact under bounded Byzantine fractions.
- Extensive empirical validation: experiments on synthetic benchmarks and the UCI Superconductivity dataset demonstrate nominal coverage (≈95 %) and significantly tighter intervals compared with vanilla FCP, even when up to 30 % of clients are Byzantine.
Methodology
1. Partial Sharing during Training
- The global model has D parameters. In each communication round, a client randomly selects M of them (e.g., 20 % of the total) and sends only those updates to the server.
- The server aggregates the sparse updates (e.g., via robust mean or median) and broadcasts the reconstructed full model back to clients.
- Because an attacker can only tamper with the M parameters it sends, the expected energy of its perturbation is scaled down by M/D, leading to lower overall MSE.
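The round described above can be sketched in a few lines. This is a minimal simulation, not the authors' implementation: the parameter counts, client count, and the choice of coordinate-wise median as the robust aggregator (the paper mentions "robust mean or median") are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 100, 20          # total parameters, parameters shared per round (M/D = 0.2)
n_clients = 10

def client_update(true_delta, byzantine=False):
    """Each client shares only M randomly chosen coordinates of its update."""
    idx = rng.choice(D, size=M, replace=False)
    vals = true_delta[idx].copy()
    if byzantine:
        vals += rng.normal(0.0, 10.0, size=M)   # attacker can poison only its M coordinates
    return idx, vals

def server_aggregate(updates):
    """Coordinate-wise median over whichever clients shared each coordinate."""
    buckets = [[] for _ in range(D)]
    for idx, vals in updates:
        for j, v in zip(idx, vals):
            buckets[j].append(v)
    agg = np.zeros(D)
    for j, bucket in enumerate(buckets):
        if bucket:
            agg[j] = np.median(bucket)
    return agg

true_delta = rng.normal(size=D)                 # the "honest" global update
updates = [client_update(true_delta, byzantine=(i < 3)) for i in range(n_clients)]
agg = server_aggregate(updates)
mse = np.mean((agg - true_delta) ** 2)
```

Because each attacker touches only M of the D coordinates per round, its expected contribution to the aggregation error is scaled by M/D, which is the intuition behind the paper's MSE bound.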
2. Robust Conformal Calibration
- After training, each client computes a nonconformity score for its local validation set (e.g., absolute residual).
- Scores are embedded into a low‑dimensional “characterization vector” (e.g., using a simple kernel or PCA).
- Pairwise distances between vectors from different clients are used to assign a maliciousness score: outliers receive higher scores.
- The server then down‑weights or filters scores with high maliciousness before estimating the conformal quantile (the threshold that yields the desired coverage).
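A toy version of this filtering step, under stated assumptions: the paper leaves the characterization map open ("a simple kernel or PCA"), so here each client's score sample is summarized by a small vector of empirical quantiles, and the outlier cutoff uses a median-absolute-deviation rule of my choosing rather than anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def characterize(scores, qs=(0.1, 0.5, 0.9)):
    """Summarize a client's nonconformity scores as a small quantile vector."""
    return np.quantile(scores, qs)

# Nine honest clients report |residual| scores from similar distributions;
# one Byzantine client reports inflated scores to widen everyone's intervals.
client_scores = [np.abs(rng.normal(0.0, 1.0, 200)) for _ in range(9)]
client_scores.append(np.abs(rng.normal(0.0, 1.0, 200)) * 50.0)

vecs = np.stack([characterize(s) for s in client_scores])
dists = np.linalg.norm(vecs[:, None, :] - vecs[None, :, :], axis=-1)
malice = np.median(dists, axis=1)        # distance-based maliciousness score

# Filter: keep clients within 3 MADs of the median maliciousness score.
med = np.median(malice)
mad = np.median(np.abs(malice - med)) + 1e-12
keep = np.abs(malice - med) / mad <= 3.0

pooled = np.concatenate([s for s, k in zip(client_scores, keep) if k])
alpha = 0.05
qhat = np.quantile(pooled, 1 - alpha)    # robust conformal threshold
```

The attacker's characterization vector sits far from every honest one, so its maliciousness score is large and it is dropped before the quantile is estimated; the threshold then stays near the honest 95th-percentile residual instead of being inflated.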
3. Prediction Phase
- The final global model produces point predictions.
- The calibrated quantile is added to and subtracted from each point prediction, forming an interval that, by construction, covers the true outcome with the target probability (e.g., 95 %) even in the presence of Byzantine participants.
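As a sanity check on the prediction step, here is a standard split-conformal sketch (not the federated pipeline itself): calibrate the quantile on held-out residuals, then verify that symmetric intervals around fresh point predictions cover roughly 95 % of outcomes. The data and noise levels are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def predict_interval(model_pred, qhat):
    """Symmetric conformal interval around the point prediction."""
    return model_pred - qhat, model_pred + qhat

# Toy regression: point predictions with Gaussian error around the truth.
y_true = rng.normal(0.0, 1.0, 5000)
y_pred = y_true + rng.normal(0.0, 0.5, 5000)

# Split conformal: calibrate on the first half, evaluate on the second.
cal, test = slice(0, 2500), slice(2500, 5000)
qhat = np.quantile(np.abs(y_true[cal] - y_pred[cal]), 0.95)

lo, hi = predict_interval(y_pred[test], qhat)
coverage = np.mean((y_true[test] >= lo) & (y_true[test] <= hi))
```

Empirical coverage lands near the 95 % target; PRISM-FCP's contribution is making the `qhat` estimate robust when the calibration scores come from partly malicious clients.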
Results & Findings
| Dataset | Byzantine % | Coverage (target 95 %) | Avg. Interval Width |
|---|---|---|---|
| Synthetic (linear) | 0 % | 95.2 % | 0.84 |
| Synthetic (linear) | 30 % | 94.8 % | 0.87 |
| UCI Superconductivity | 0 % | 95.1 % | 1.12 |
| UCI Superconductivity | 30 % | 94.9 % | 1.15 |
- Coverage stays on target across all Byzantine levels, confirming the theoretical guarantee.
- Interval inflation (a common symptom of Byzantine attacks) stays below a 5 % increase, whereas vanilla FCP's intervals widen by more than 30 % under the same attack strength.
- Communication savings: with M/D = 0.2, total transmitted bytes drop by 80 % without sacrificing predictive performance.
- Robustness to attack strategies: the authors test both random noise injection and targeted gradient poisoning; PRISM‑FCP consistently outperforms baselines.
Practical Implications
- Edge‑AI & IoT deployments: Devices with limited bandwidth can participate in federated learning while still receiving trustworthy uncertainty estimates—critical for safety‑critical applications (e.g., predictive maintenance, medical diagnostics).
- Model‑as‑a‑service platforms: Service providers can offer calibrated confidence intervals to downstream users without exposing the system to malicious clients that might otherwise inflate risk metrics.
- Regulatory compliance: In domains where calibrated uncertainty is mandated (e.g., finance, healthcare), PRISM‑FCP provides a provably robust way to meet coverage requirements in a distributed setting.
- Simplified engineering: The partial‑sharing scheme integrates with existing federated optimization pipelines (FedAvg, FedProx) and only adds a lightweight calibration step, making adoption straightforward for ML engineers.
Limitations & Future Work
- Random partial selection may discard informative parameters, potentially slowing convergence on highly non‑convex models (e.g., deep nets). Adaptive selection strategies could mitigate this.
- The maliciousness scoring relies on distance metrics that assume roughly homogeneous data distributions across clients; heterogeneous data (non‑IID) could blur the distinction between honest outliers and attacks.
- Experiments focus on regression tasks; extending PRISM‑FCP to classification (e.g., conformal prediction sets) remains an open direction.
- Formal analysis of adaptive Byzantine attackers that learn the partial‑sharing pattern over time is not covered; future work could explore game‑theoretic defenses.
Bottom line: PRISM‑FCP offers a practical, communication‑efficient recipe for delivering reliable uncertainty quantification in federated environments—even when some participants try to sabotage the system. For developers building distributed AI services, it’s a compelling addition to the robustness toolbox.
Authors
- Ehsan Lari
- Reza Arablouei
- Stefan Werner
Paper Information
- arXiv ID: 2602.18396v1
- Categories: cs.LG, eess.SP, math.PR, stat.AP, stat.ML
- Published: February 20, 2026