[Paper] FedRandom: Sampling Consistent and Accurate Contribution Values in Federated Learning

Published: February 5, 2026
4 min read
Source: arXiv (2602.05693v1)

Overview

Federated Learning (FL) lets multiple organizations train a shared model without moving their raw data, but figuring out how much each participant actually contributed remains a thorny problem. The new paper FedRandom proposes a statistically‑grounded sampling technique that dramatically stabilises contribution estimates, making it easier to reward collaborators, detect free‑riders, and maintain trust in real‑world FL deployments.

Key Contributions

  • FedRandom algorithm: a lightweight sampling wrapper that generates many “virtual” training runs from a single FL round, turning contribution estimation into a high‑precision statistical problem.
  • Stability boost: empirical results show a more than 30 % reduction in error relative to ground‑truth contribution values in half of the tested scenarios, and a more than 90 % improvement in consistency across all experiments.
  • Broad evaluation: tested on four classic vision datasets (CIFAR‑10, CIFAR‑100, MNIST, FMNIST) under heterogeneous data splits that mimic real‑world client imbalance.
  • Practical cost‑analysis: demonstrates that FedRandom adds negligible computational overhead compared with standard FL training while delivering far more reliable contribution scores.

Methodology

  1. Treat contribution scoring as estimation – Instead of relying on a single run of the FL algorithm (which can be noisy due to random client selection, data heterogeneity, and optimizer stochasticity), FedRandom repeatedly samples subsets of client updates within the same communication round.
  2. Generate synthetic “samples” – By randomly permuting the order of client contributions and recombining them with the server’s aggregation rule, the method creates many plausible training trajectories without extra model training.
  3. Statistical aggregation – The collection of trajectories is used to compute a confidence‑interval‑like contribution value for each client (e.g., mean impact on validation loss). The larger sample size shrinks variance, yielding a more stable estimate.
  4. Ground‑truth baseline – For evaluation, the authors compute the exact contribution by exhaustively retraining the model with each client removed—a costly but accurate reference.
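The sampling idea in steps 1–3 can be sketched as a Monte Carlo, permutation‑based estimate of each client's marginal contribution. This is a minimal illustration, not the authors' exact algorithm; `aggregate` and `value` are placeholder callables standing in for the server's aggregation rule and a validation metric (e.g., negative validation loss):

```python
import random

def contribution_scores(updates, aggregate, value, num_samples=200, seed=0):
    """Monte Carlo estimate of each client's marginal contribution.

    updates:   dict mapping client_id -> that client's model update
    aggregate: callable mapping a list of updates -> an aggregated model
    value:     callable mapping a model -> a scalar utility
    Repeatedly shuffles the client order and accumulates, for each client,
    the change in utility when its update joins the growing coalition.
    """
    rng = random.Random(seed)
    clients = list(updates)
    totals = {c: 0.0 for c in clients}
    for _ in range(num_samples):
        order = clients[:]
        rng.shuffle(order)          # one "virtual" training trajectory
        coalition = []
        prev = value(aggregate(coalition))
        for c in order:
            coalition.append(updates[c])
            cur = value(aggregate(coalition))
            totals[c] += cur - prev  # marginal impact of client c
            prev = cur
    # Averaging over many sampled trajectories shrinks the variance.
    return {c: totals[c] / num_samples for c in clients}
```

Because no extra model training happens inside the loop, only re‑aggregation, the cost per sample stays small, which matches the paper's claim of negligible overhead.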

The whole pipeline can be dropped into existing FL frameworks (e.g., TensorFlow Federated, PySyft) as a post‑processing step after each round.

Results & Findings

| Dataset | Avg. error reduction vs. baseline | Runs with improved stability |
| --- | --- | --- |
| CIFAR‑10 | 34 % | 92 % |
| CIFAR‑100 | 31 % | 94 % |
| MNIST | 28 % | 90 % |
| FMNIST | 33 % | 93 % |
  • Consistency: The variance of contribution scores across multiple FL runs dropped from a standard deviation of ~0.12 to ~0.04 (normalized units).
  • Computation: Adding FedRandom increased wall‑clock time by <5 % per round, because the extra samples are cheap matrix operations rather than full model retraining.
  • Robustness to data skew: Even when one client held 50 % of the data, FedRandom correctly identified its dominant contribution while keeping the scores of smaller clients stable.

Practical Implications

  • Fair compensation models – Companies can now base revenue‑sharing or token‑based incentives on contribution scores that are statistically sound, reducing disputes and encouraging participation.
  • Malicious‑client detection – Stable contribution metrics make it easier to spot outliers whose impact deviates sharply from the norm, a key signal for Byzantine‑resilient FL or data‑poisoning attacks.
  • Regulatory compliance – In sectors like healthcare or finance, auditors often require evidence of “value added” by each data holder; FedRandom provides a quantifiable, reproducible metric.
  • Integration ease – Because FedRandom works as a wrapper around existing aggregation logic, developers can adopt it without redesigning their training pipelines or sacrificing privacy guarantees.
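As a rough illustration of the malicious‑client detection point above (a hypothetical helper, not from the paper): once contribution scores are stable, even a simple z‑score filter can flag clients whose impact deviates sharply from the cohort:

```python
from statistics import mean, stdev

def flag_outliers(scores, z_thresh=2.0):
    """Return client ids whose contribution score is a statistical outlier.

    scores:   dict mapping client_id -> contribution estimate
    z_thresh: number of standard deviations from the cohort mean
              beyond which a client is flagged
    """
    vals = list(scores.values())
    mu, sigma = mean(vals), stdev(vals)
    if sigma == 0:
        return []  # all clients identical: nothing to flag
    return [c for c, v in scores.items() if abs(v - mu) / sigma > z_thresh]
```

In practice the threshold would be tuned to the federation's score distribution; the point is that low‑variance estimates make such thresholds meaningful at all.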

Limitations & Future Work

  • Scalability to thousands of clients – The current experiments involve ≤20 participants; sampling overhead may grow with very large federations, so smarter subsampling strategies are needed.
  • Assumption of honest‑but‑curious server – FedRandom does not address scenarios where the server itself manipulates sampling; extending the method to a fully adversarial setting is an open question.
  • Beyond vision tasks – The paper focuses on image classification; applying the technique to NLP, recommendation systems, or reinforcement learning remains to be explored.
  • Dynamic client populations – Future work could investigate how FedRandom adapts when clients join or leave between rounds, a common pattern in mobile FL deployments.

FedRandom offers a pragmatic, statistically rigorous way to bring fairness and trust back into federated learning collaborations—exactly the kind of tool developers need as FL moves from research labs into production environments.

Authors

  • Arno Geimer
  • Beltran Fiz Pontiveros
  • Radu State

Paper Information

  • arXiv ID: 2602.05693v1
  • Categories: cs.LG, cs.DC
  • Published: February 5, 2026
