[Paper] Data Heterogeneity-Aware Client Selection for Federated Learning in Wireless Networks

Published: December 30, 2025 (10:21 AM EST)
4 min read
Source: arXiv - 2512.24286v1

Overview

Federated Learning (FL) promises on‑device model training without moving raw data to the cloud, but real‑world deployments in wireless networks stumble over two practical hurdles: limited bandwidth/computation and data heterogeneity—the fact that each device’s local dataset can look very different from the others’. This paper delivers a rigorous analysis of how such heterogeneity hurts global model accuracy and then proposes a data‑aware client selection and resource‑allocation scheme that trims training time, cuts energy use, and boosts test performance.

Key Contributions

  • Theoretical insight: Derives a closed‑form bound linking client data heterogeneity to the global model’s generalization error, exposing why naïve client selection can cause extra training rounds.
  • Joint optimization formulation: Casts the problem of minimizing learning latency and energy consumption, subject to a target generalization error, as a mixed-integer program.
  • CSRA algorithm: Introduces a Client Selection and Resource Allocation (CSRA) framework that leverages convex relaxation and successive convex approximation to solve the problem efficiently.
  • Comprehensive evaluation: Shows through extensive simulations that CSRA outperforms baseline FL strategies (random selection, uniform resource allocation) in test accuracy, latency, and energy usage.
  • Practical guidelines: Provides actionable criteria for edge orchestrators to prioritize clients not just by channel quality or compute power, but also by the statistical “distance” of their local data.
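The last guideline can be made concrete with a small sketch. This is illustrative only: the weights, inputs, and the linear scoring form below are assumptions for exposition, not the paper's actual selection rule (which comes out of the optimization in the Methodology section).

```python
# Illustrative client scoring: rank clients by channel quality, compute
# capacity, and the statistical "distance" of their local data. The weights
# are assumptions; the divergence penalty is scaled up because divergence is
# in [0, 1] while SNR is in dB.

def client_score(snr_db: float, cpu_ghz: float, divergence: float,
                 w_ch: float = 0.4, w_cpu: float = 0.2, w_div: float = 10.0) -> float:
    """Higher is better; statistically divergent data is penalized."""
    return w_ch * snr_db + w_cpu * cpu_ghz - w_div * divergence

clients = {
    "dev-a": (18.0, 2.0, 0.10),  # good channel, near-IID data
    "dev-b": (22.0, 1.5, 0.90),  # best channel, highly divergent data
    "dev-c": (12.0, 2.4, 0.05),  # weaker channel, near-IID data
}
ranked = sorted(clients, key=lambda k: client_score(*clients[k]), reverse=True)
```

Note how `dev-b` drops to last despite the best channel: a signal-strength-only scheduler would have picked it first, which is exactly the failure mode the paper's analysis exposes.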

Methodology

  1. Modeling data heterogeneity:
    • Each client *k* holds a local dataset with distribution *P_k*.
    • The authors quantify heterogeneity via a distribution divergence (e.g., the Wasserstein distance) between *P_k* and the global data distribution *P*.
  2. Generalization error analysis:
    • Starting from standard FL convergence results, they add a term that grows with the average divergence across selected clients, yielding an explicit error bound.
  3. Optimization problem:
    • Objective: Minimize a weighted sum of total training latency (communication + computation) and total energy consumption.
    • Constraints: (i) The derived heterogeneity‑aware error bound must stay below a preset threshold; (ii) each client’s bandwidth and CPU limits; (iii) a fixed number of communication rounds.
  4. Solution via CSRA:
    • Client selection: Binary variables indicate whether a client participates. These are relaxed to continuous values, solved via convex programming, then rounded.
    • Resource allocation: Given a selected set, the algorithm allocates transmission power and CPU cycles using closed‑form expressions from KKT conditions.
    • The process iterates: update selection → re‑allocate resources → converge to a near‑optimal solution.
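The iterate-between-selection-and-allocation structure can be sketched in a few lines. Everything below is a toy stand-in: the equal-split bandwidth rule and the latency-plus-divergence cost are placeholders for the paper's KKT-based closed forms and relaxed convex objective, which are not reproduced here.

```python
# Toy CSRA-style loop: start from the "everyone selected" relaxed solution,
# allocate resources, then iteratively drop the worst-cost client until the
# per-round budget is met (a crude rounding of the relaxed selection).

def allocate(selected, total_bw):
    """Split bandwidth equally among selected clients (stand-in for the
    paper's KKT-derived closed-form allocation)."""
    n = max(len(selected), 1)
    return {k: total_bw / n for k in selected}

def round_cost(bw, divergence, data_size):
    """Latency proxy: transmit time plus a penalty growing with divergence."""
    return data_size / bw + divergence

def csra(clients, total_bw, budget, max_iter=10):
    """clients maps name -> (divergence, data_size)."""
    selected = set(clients)  # relaxed solution: all clients participate
    for _ in range(max_iter):
        bw = allocate(selected, total_bw)
        costs = {k: round_cost(bw[k], *clients[k]) for k in selected}
        worst = max(costs, key=costs.get)
        if costs[worst] <= budget or len(selected) == 1:
            break
        selected.remove(worst)  # "round down" the least useful client
    return selected, allocate(selected, total_bw)

selected, bw = csra({"a": (0.1, 5.0), "b": (2.0, 5.0), "c": (0.05, 5.0)},
                    total_bw=3.0, budget=6.0)
```

In this toy run, client `b` is dropped for its high divergence, after which the freed bandwidth is re-split between `a` and `c`, mirroring the update-selection-then-reallocate cycle described above.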

Results & Findings

Metric                               Random Selection   Uniform Resource Allocation   CSRA (proposed)
Test accuracy (after 100 rounds)     78.3 %             80.1 %                        84.7 %
Average latency per round (ms)       210                185                           132
Energy per device per round (J)      0.48               0.44                          0.31

  • Higher accuracy: By avoiding clients whose data are too divergent from the global target, CSRA reduces the number of required communication rounds.
  • Latency cut: Selecting clients with good channel conditions and allocating just enough power/computation prevents bottlenecks.
  • Energy savings: Tailored resource budgets avoid over‑provisioning, extending battery life on edge devices.

The simulations span varied network sizes (10–200 clients) and heterogeneity levels, consistently confirming CSRA’s advantage.

Practical Implications

  • Edge orchestrators: Instead of blind round‑robin or signal‑strength‑only scheduling, operators can embed a lightweight heterogeneity estimator (e.g., a few statistical sketches of local data) into the client‑selection logic.
  • Developer toolkits: FL libraries (TensorFlow Federated, PySyft) could expose APIs for reporting data‑distribution metrics, enabling CSRA‑style schedulers to run on the server side.
  • Battery‑constrained IoT: Devices can stay in low‑power mode longer because the algorithm avoids pulling in “noisy” clients that would force extra rounds.
  • Regulatory compliance: By explicitly accounting for data diversity, CSRA can help meet fairness or bias‑mitigation requirements in federated AI deployments.
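One way such a "lightweight heterogeneity estimator" could work, sketched under assumptions: the paper does not prescribe this exact metric, but a normalized label histogram per client, compared to the global label mix via total-variation distance, is about the cheapest statistical sketch a device could report.

```python
from collections import Counter

def label_histogram(labels, num_classes):
    """Normalized per-class frequency of a client's local labels."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]

def tv_distance(p, q):
    """Total-variation distance: half the L1 gap between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

# A uniform 4-class global mix, one near-IID client, one label-skewed client.
global_mix = [0.25, 0.25, 0.25, 0.25]
near_iid = label_histogram([0, 1, 2, 3, 0, 1, 2, 3], num_classes=4)
skewed = label_histogram([0, 0, 0, 0, 0, 0, 1, 1], num_classes=4)
```

A server-side scheduler could threshold or rank clients on this distance; note that shipping even a label histogram leaks information, which is why the paper's future-work section points at differentially private sketches.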

Limitations & Future Work

  • Heterogeneity estimation overhead: The current approach assumes each client can compute and transmit a divergence metric; scaling this to thousands of ultra‑low‑power sensors may be costly.
  • Static channel model: Simulations use quasi‑static wireless links; real‑time fading and mobility could affect the convexity assumptions.
  • Single‑objective weighting: The latency‑energy trade‑off is captured by a fixed weight; adaptive weighting based on service‑level agreements remains unexplored.
  • Future directions: Extending CSRA to hierarchical FL (edge‑cloud cascades), incorporating privacy‑preserving heterogeneity metrics (e.g., differential‑private sketches), and testing on real‑world testbeds (5G/6G edge nodes) are promising next steps.

Authors

  • Yanbing Yang
  • Huiling Zhu
  • Wenchi Cheng
  • Jingqing Wang
  • Changrun Chen
  • Jiangzhou Wang

Paper Information

  • arXiv ID: 2512.24286v1
  • Categories: cs.DC
  • Published: December 30, 2025