[Paper] JSAM: Privacy Straggler-Resilient Joint Client Selection and Incentive Mechanism Design in Differentially Private Federated Learning
Source: arXiv - 2602.21844v1
Overview
Federated learning (FL) lets many devices train a shared model without exposing raw data, but adding differential privacy (DP) to protect users introduces a hidden “privacy cost” that can deter participation. The paper proposes JSAM, a joint client-selection and incentive mechanism that balances privacy compensation against training effectiveness while staying within the server’s budget.
Key Contributions
- Joint Optimization Framework: Formulates a Bayesian‑optimal problem that simultaneously decides who to sample and how much to pay each client for their privacy loss.
- Dimensionality Reduction: Shows that the original 2N‑dimensional problem (selection + compensation for N clients) can be collapsed to a tractable three‑dimensional one, enabling fast computation even for large FL populations.
- Privacy‑Aware Selection Policy: Proves that the optimal strategy excludes highly privacy‑sensitive “stragglers” and preferentially selects tolerant clients, contrary to traditional unbiased sampling.
- Counter‑Intuitive Cost Insight: Demonstrates that the least privacy‑sensitive clients may actually incur the highest total compensation because they are selected most often.
- Empirical Validation: Experiments on MNIST and CIFAR‑10 show up to 15 % higher test accuracy versus unbiased selection, with comparable or lower total incentive spend across heterogeneous data distributions.
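The privacy-aware selection policy can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's algorithm: `tau` (the sensitivity cutoff) and `q` (the per-client selection probability) are hypothetical names standing in for the quantities JSAM optimizes.

```python
import random

def threshold_select(sensitivities, tau, q, rng=random.Random(0)):
    """Select clients whose privacy sensitivity is at most tau,
    each with probability q; highly sensitive 'stragglers' are excluded."""
    return [i for i, s in enumerate(sensitivities)
            if s <= tau and rng.random() < q]

# Ten clients with linearly increasing privacy sensitivity.
sens = [0.1 * i for i in range(10)]
selected = threshold_select(sens, tau=0.5, q=1.0)  # → [0, 1, 2, 3, 4, 5]
```

Unlike unbiased sampling, the four most privacy-sensitive clients are never drawn, which is exactly the bias the paper proves optimal under a fixed budget.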
Methodology
- Modeling Privacy Costs: Each client reports a privacy sensitivity parameter capturing how costly privacy loss is to them (equivalently, how much DP noise they require). The server translates this into a monetary compensation per participation.
- Bayesian Optimization: Assuming a prior distribution over client sensitivities, the server solves a Bayesian‑optimal control problem that maximizes expected model utility (accuracy) subject to a total budget.
- Analytical Reduction: By exploiting the structure of DP noise and the linearity of expected utility, the authors derive closed‑form conditions that reduce the search space to three variables: overall selection probability, a threshold privacy level, and the budget allocation factor.
- Algorithmic Implementation: The reduced problem is solved via a lightweight iterative scheme (essentially a projected gradient descent) that runs in seconds on a standard server.
- Evaluation Setup: Simulated FL with 100 clients, varying data heterogeneity (IID vs. non‑IID), and different DP budgets (ε values). Baselines include uniform random selection and existing incentive mechanisms that ignore privacy heterogeneity.
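The reduced three-variable problem admits a very small solver. The sketch below shows generic projected gradient ascent over a box-constrained three-vector, under the assumption (mine, not the paper's) of a concave surrogate utility; the variable names and toy objective are illustrative only.

```python
import numpy as np

def project(x, lo, hi):
    """Project onto the box [lo, hi] coordinate-wise."""
    return np.clip(x, lo, hi)

def solve_reduced(utility_grad, x0, lo, hi, lr=0.05, iters=500):
    """Projected gradient ascent over the three reduced variables:
    x = (selection probability, sensitivity threshold, budget factor)."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = project(x + lr * utility_grad(x), lo, hi)
    return x

# Toy concave surrogate utility with maximizer at (0.6, 0.4, 0.8);
# its gradient is simply -(x - target).
target = np.array([0.6, 0.4, 0.8])
grad = lambda x: -(x - target)
x_opt = solve_reduced(grad, x0=[0.5, 0.5, 0.5], lo=0.0, hi=1.0)
```

Because the search space is three-dimensional regardless of the number of clients N, each iteration is O(1) in N, which is what makes the "seconds on a standard server" claim plausible.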
Results & Findings
| Metric | Uniform Selection | Prior Incentive Schemes | JSAM |
|---|---|---|---|
| Test Accuracy (CIFAR‑10, non‑IID) | 71.2 % | 73.5 % | 78.1 % |
| Avg. Incentive Spend per Round | $0.45 | $0.48 | $0.46 |
| Fraction of High‑Sensitivity Clients Selected | 30 % | 28 % | 12 % |
| Convergence Rounds to 75 % Accuracy | 120 | 98 | 84 |
- Higher accuracy stems from focusing training on clients whose data is both informative and less noisy (low DP noise).
- Budget efficiency is maintained because the server avoids paying large compensations to privacy stragglers.
- Robustness to heterogeneity: Even when data distribution is highly skewed, JSAM’s adaptive threshold keeps performance gains stable.
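The "low DP noise" point can be made concrete with the standard Gaussian mechanism, whose noise standard deviation scales inversely with the privacy budget ε. This is the textbook (ε, δ) calibration, not necessarily the paper's exact noise model:

```python
import math

def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise std for the (epsilon, delta)-DP Gaussian mechanism
    (classic calibration, valid for epsilon <= 1):
    sigma = sensitivity * sqrt(2 * ln(1.25 / delta)) / epsilon."""
    return sensitivity * math.sqrt(2 * math.log(1.25 / delta)) / epsilon

# A more privacy-tolerant client (larger epsilon) perturbs its
# updates with less noise, so its contributions are more informative.
sigma_tolerant = gaussian_sigma(1.0, epsilon=0.9, delta=1e-5)
sigma_sensitive = gaussian_sigma(1.0, epsilon=0.3, delta=1e-5)
```

Here the sensitive client's updates carry three times the noise of the tolerant client's, which is why biasing selection toward tolerant clients improves accuracy at equal budget.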
Practical Implications
- For FL Platform Operators: JSAM offers a plug‑and‑play module that can be integrated into existing FL orchestration stacks (e.g., TensorFlow Federated, PySyft) to automatically adjust client sampling and payment policies based on reported privacy preferences.
- Cost‑Effective Incentives: Companies can allocate a fixed incentive budget while still encouraging participation from the most useful devices, reducing waste on clients that would demand prohibitive privacy compensation.
- Regulatory Alignment: By explicitly quantifying privacy loss and compensating accordingly, JSAM helps satisfy emerging data‑protection regulations (e.g., GDPR, CCPA) that require transparent handling of personal data risks.
- Edge‑AI Deployments: In IoT or mobile scenarios where battery and bandwidth are scarce, selecting fewer but higher‑utility clients shortens training rounds, saving energy and network usage.
- Open‑Source Potential: The three‑dimensional formulation is lightweight enough to run on edge servers, opening the door for community‑driven libraries that democratize privacy‑aware incentive design.
Limitations & Future Work
- Assumed Truthful Reporting: JSAM presumes clients honestly disclose their privacy sensitivity; strategic misreporting could undermine optimality.
- Static Sensitivity Model: The current framework treats privacy preferences as fixed per client; real‑world preferences may evolve with context (e.g., location, time of day).
- Scalability to Millions: While the reduced problem is efficient, the paper evaluates on the order of a hundred clients; extending to massive device fleets may require hierarchical or federated incentive coordination.
- Broader DP Mechanisms: Experiments focus on Gaussian DP; exploring other mechanisms (e.g., Rényi DP) could broaden applicability.
Future research directions include designing truthful mechanisms that incentivize honest privacy reporting, incorporating dynamic preference learning, and testing JSAM in real‑world FL deployments such as keyboard prediction or smart‑home analytics.
Authors
- Ruichen Xu
- Ying‑Jun Angela Zhang
- Jianwei Huang
Paper Information
- arXiv ID: 2602.21844v1
- Categories: cs.LG, cs.DC, cs.GT
- Published: February 25, 2026