[Paper] Data Heterogeneity-Aware Client Selection for Federated Learning in Wireless Networks
Source: arXiv - 2512.24286v1
Overview
Federated Learning (FL) promises on‑device model training without moving raw data to the cloud, but real‑world deployments in wireless networks stumble over two practical hurdles: limited bandwidth/computation and data heterogeneity—the fact that each device’s local dataset can look very different from the others’. This paper delivers a rigorous analysis of how such heterogeneity hurts global model accuracy and then proposes a data‑aware client selection and resource‑allocation scheme that trims training time, cuts energy use, and boosts test performance.
Key Contributions
- Theoretical insight: Derives a closed‑form bound linking client data heterogeneity to the global model’s generalization error, exposing why naïve client selection can cause extra training rounds.
- Joint optimization formulation: Casts the problem of minimizing learning latency plus energy consumption, subject to a target generalization error, as a mixed-integer program.
- CSRA algorithm: Introduces a Client Selection and Resource Allocation (CSRA) framework that leverages convex relaxation and successive convex approximation to solve the problem efficiently.
- Comprehensive evaluation: Shows through extensive simulations that CSRA outperforms baseline FL strategies (random selection, uniform resource allocation) in test accuracy, latency, and energy usage.
- Practical guidelines: Provides actionable criteria for edge orchestrators to prioritize clients not just by channel quality or compute power, but also by the statistical “distance” of their local data.
Methodology
- Modeling data heterogeneity:
  - Each client k holds a local dataset with distribution P_k.
  - The authors quantify heterogeneity using a distribution divergence (e.g., the Wasserstein distance) between P_k and the global data distribution P.
- Generalization error analysis:
  - Starting from standard FL convergence results, they add a term that grows with the average divergence across the selected clients, yielding an explicit error bound.
- Optimization problem:
  - Objective: Minimize a weighted sum of total training latency (communication + computation) and total energy consumption.
  - Constraints: (i) the derived heterogeneity-aware error bound must stay below a preset threshold; (ii) each client's bandwidth and CPU limits; (iii) a fixed number of communication rounds.
- Solution via CSRA:
  - Client selection: Binary variables indicate whether a client participates. These are relaxed to continuous values, solved via convex programming, then rounded.
  - Resource allocation: Given a selected set, the algorithm allocates transmission power and CPU cycles using closed-form expressions derived from the KKT conditions.
  - The process iterates: update selection → re-allocate resources → repeat until convergence to a near-optimal solution.
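The divergence-guided selection step can be sketched with a small, stdlib-only toy. Assumptions (not taken from the paper): labels are integer class indices, heterogeneity is measured as the 1-D Wasserstein distance between each client's empirical label histogram and the pooled (global) one, and a greedy divergence budget stands in for the paper's mixed-integer CSRA solver:

```python
from collections import Counter

def label_hist(labels, num_classes):
    """Empirical label distribution of one client's local dataset."""
    counts = Counter(labels)
    n = len(labels)
    return [counts.get(k, 0) / n for k in range(num_classes)]

def wasserstein_1d(p, q):
    """W1 distance between two discrete distributions on {0..K-1}:
    the sum of absolute CDF differences (unit spacing)."""
    cdf_p = cdf_q = dist = 0.0
    for pk, qk in zip(p, q):
        cdf_p += pk
        cdf_q += qk
        dist += abs(cdf_p - cdf_q)
    return dist

def select_clients(client_labels, num_classes, budget):
    """Greedy heuristic (illustration only, not the paper's solver):
    admit clients in order of increasing divergence from the pooled
    distribution while the selected set's average divergence stays
    within `budget`."""
    pooled = [l for labels in client_labels for l in labels]
    p_global = label_hist(pooled, num_classes)
    divs = sorted(
        (wasserstein_1d(label_hist(lab, num_classes), p_global), k)
        for k, lab in enumerate(client_labels)
    )
    selected, total = [], 0.0
    for d, k in divs:
        if (total + d) / (len(selected) + 1) > budget:
            break
        selected.append(k)
        total += d
    return selected
```

For example, with two roughly balanced clients and one single-class client, a tight budget keeps the balanced pair and drops the outlier, which is exactly the behavior the error bound motivates.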
Results & Findings
| Metric | Random Selection | Uniform Resource Allocation | CSRA (proposed) |
|---|---|---|---|
| Test accuracy (after 100 rounds) | 78.3 % | 80.1 % | 84.7 % |
| Average latency per round (ms) | 210 | 185 | 132 |
| Energy per device per round (J) | 0.48 | 0.44 | 0.31 |
- Higher accuracy: By avoiding clients whose data are too divergent from the global target, CSRA reduces the number of required communication rounds.
- Latency cut: Selecting clients with good channel conditions and allocating just enough power/computation prevents bottlenecks.
- Energy savings: Tailored resource budgets avoid over‑provisioning, extending battery life on edge devices.
The simulations span varied network sizes (10–200 clients) and heterogeneity levels, consistently confirming CSRA’s advantage.
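To make the latency/energy trade-off concrete, the weighted per-round objective can be written out with textbook wireless and compute models (assumed here for illustration; the paper's exact system model may differ): Shannon-rate uplink time for communication and a CMOS-style f²-scaled energy term for local computation:

```python
import math

def round_cost(clients, bandwidth_hz, model_bits, lam):
    """Weighted latency + energy objective for one synchronous FL round.
    `lam` in [0, 1] trades latency against energy; each client dict
    carries snr, cycles, freq_hz, kappa (chip constant), tx_power_w.
    These models are standard textbook choices, not the paper's verbatim ones."""
    latencies, energy = [], 0.0
    for c in clients:
        rate = bandwidth_hz * math.log2(1.0 + c["snr"])        # uplink bits/s
        t_comm = model_bits / rate                             # upload time (s)
        t_comp = c["cycles"] / c["freq_hz"]                    # local training time (s)
        latencies.append(t_comm + t_comp)
        e_comp = c["kappa"] * c["cycles"] * c["freq_hz"] ** 2  # compute energy (J)
        e_comm = c["tx_power_w"] * t_comm                      # transmit energy (J)
        energy += e_comp + e_comm
    # synchronous FL: the round ends when the slowest selected client finishes
    return lam * max(latencies) + (1.0 - lam) * energy
```

This makes visible why selection and allocation interact: dropping one straggler shrinks the `max` latency term, while trimming per-client power and frequency shrinks the energy term.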
Practical Implications
- Edge orchestrators: Instead of blind round‑robin or signal‑strength‑only scheduling, operators can embed a lightweight heterogeneity estimator (e.g., a few statistical sketches of local data) into the client‑selection logic.
- Developer toolkits: FL libraries (TensorFlow Federated, PySyft) could expose APIs for reporting data‑distribution metrics, enabling CSRA‑style schedulers to run on the server side.
- Battery‑constrained IoT: Devices can stay in low‑power mode longer because the algorithm avoids pulling in “noisy” clients that would force extra rounds.
- Regulatory compliance: By explicitly accounting for data diversity, CSRA can help meet fairness or bias‑mitigation requirements in federated AI deployments.
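The "statistical sketch" idea from the first bullet could look like the following (hypothetical wire format and function names; the paper prescribes no specific API): each client reports a coarse, rounded label histogram, and the orchestrator scores it against the target distribution with total-variation distance:

```python
import json
from collections import Counter

def make_sketch(labels, num_classes, decimals=2):
    """Client side: a tiny summary of the local label distribution,
    rounded to limit the precision leaked upstream.
    (Hypothetical format -- illustration only.)"""
    n = len(labels)
    counts = Counter(labels)
    hist = [round(counts.get(k, 0) / n, decimals) for k in range(num_classes)]
    return json.dumps({"n": n, "hist": hist})

def total_variation(sketch_json, p_global):
    """Server side: total-variation distance between a reported sketch
    and the target (global) distribution; 0 means identical."""
    hist = json.loads(sketch_json)["hist"]
    return 0.5 * sum(abs(p - q) for p, q in zip(hist, p_global))
```

A scheduler could then rank clients by this score alongside channel quality, which is the kind of lightweight server-side hook an FL library API might expose.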
Limitations & Future Work
- Heterogeneity estimation overhead: The current approach assumes each client can compute and transmit a divergence metric; scaling this to thousands of ultra‑low‑power sensors may be costly.
- Static channel model: Simulations use quasi‑static wireless links; real‑time fading and mobility could affect the convexity assumptions.
- Single‑objective weighting: The latency‑energy trade‑off is captured by a fixed weight; adaptive weighting based on service‑level agreements remains unexplored.
- Future directions: Extending CSRA to hierarchical FL (edge‑cloud cascades), incorporating privacy‑preserving heterogeneity metrics (e.g., differential‑private sketches), and testing on real‑world testbeds (5G/6G edge nodes) are promising next steps.
Authors
- Yanbing Yang
- Huiling Zhu
- Wenchi Cheng
- Jingqing Wang
- Changrun Chen
- Jiangzhou Wang
Paper Information
- arXiv ID: 2512.24286v1
- Categories: cs.DC
- Published: December 30, 2025