[Paper] AdFL: In-Browser Federated Learning for Online Advertisement
Source: arXiv - 2602.06336v1
Overview
The paper introduces AdFL, a federated‑learning framework that runs entirely inside web browsers to learn users’ ad‑preference signals without ever sending raw personal data to a central server. By keeping the training on the client side and only aggregating lightweight model updates, AdFL lets publishers comply with privacy regulations (e.g., GDPR) while still delivering more relevant, higher‑performing ads.
Key Contributions
- In‑browser FL engine built on standard Web APIs (Web Workers, IndexedDB, Fetch, Crypto) – no plug‑ins or native installations required.
- Publisher‑hosted coordination server that orchestrates model distribution, aggregation, and optional differential‑privacy (DP) noise injection.
- Model‑agnostic design: any ML model that consumes browser‑available features (viewability, click‑through, dwell time, page content) can be plugged into AdFL.
- Proof‑of‑concept ad‑viewability predictor achieving up to 92.59 % AUC on real traffic from a site with ~40 K daily visitors.
- Empirical evaluation of DP showing only modest accuracy loss (≈2–4 % AUC drop) while providing strong privacy guarantees.
- Performance profiling demonstrating that local training and upload complete in a few milliseconds, well within typical page‑load budgets.
Methodology
- Data Collection in the Browser – JavaScript hooks capture lightweight signals (e.g., whether an ad entered the viewport, click events, time spent on the page). These are stored locally in IndexedDB.
- Local Model Training – A small neural network (or any compatible model) is trained on the client using TensorFlow.js. Training runs in a Web Worker to avoid blocking the UI.
- Differential‑Privacy Noise – Before sending the model’s weight updates to the server, Gaussian noise calibrated to a chosen ε‑budget is added client‑side.
- Server‑Side Aggregation – The publisher’s AdFL server receives encrypted updates, averages them (FedAvg), and optionally applies secure aggregation primitives. The updated global model is then pushed back to browsers.
- Inference – Each browser loads the latest global model and uses it to rank candidate ads in real time, selecting the most likely to be viewable or clicked.
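The client-side steps above (local training followed by DP noise injection) can be sketched in dependency-free JavaScript. The paper itself trains with TensorFlow.js inside a Web Worker; the tiny logistic model, the function names, and the sensitivity/δ parameters below are illustrative assumptions, not the authors' implementation:

```javascript
// Minimal sketch of one AdFL client round: train a small logistic model on
// locally stored examples, then add Gaussian noise calibrated to (epsilon,
// delta) before uploading the weight delta.

const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// One pass of SGD over locally stored viewability examples.
// Each example: { x: [features], y: 0|1 }.
function trainLocally(weights, examples, lr = 0.1) {
  const w = weights.slice();
  for (const { x, y } of examples) {
    const p = sigmoid(x.reduce((s, xi, i) => s + xi * w[i], 0));
    for (let i = 0; i < w.length; i++) w[i] -= lr * (p - y) * x[i];
  }
  return w;
}

// Gaussian mechanism: sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon.
function gaussianSigma(epsilon, delta, sensitivity = 1.0) {
  return (Math.sqrt(2 * Math.log(1.25 / delta)) * sensitivity) / epsilon;
}

function privatizeUpdate(update, epsilon, delta) {
  const sigma = gaussianSigma(epsilon, delta);
  return update.map((u) => {
    // Box-Muller transform for a standard normal sample; 1 - random() avoids log(0).
    const n =
      Math.sqrt(-2 * Math.log(1 - Math.random())) *
      Math.cos(2 * Math.PI * Math.random());
    return u + sigma * n;
  });
}

// One client round: train, compute the weight delta, noise it for upload.
function clientRound(globalWeights, examples, epsilon = 1.0, delta = 1e-5) {
  const local = trainLocally(globalWeights, examples);
  const deltaW = local.map((wi, i) => wi - globalWeights[i]);
  return privatizeUpdate(deltaW, epsilon, delta);
}
```

In the real system the examples would come from IndexedDB and the noised delta would be POSTed to the coordination server via Fetch.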
The whole pipeline leverages existing browser capabilities, meaning the solution works on Chrome, Edge, Firefox, and Safari without extra extensions.
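The server-side aggregation and in-browser ranking steps can be sketched the same way. FedAvg is the aggregation rule the paper names; the function names, the example-count weighting, and the logistic scoring below are illustrative assumptions:

```javascript
// Sketch of the server-side FedAvg step and the in-browser ad-ranking step.
// Client updates are weight deltas; the server averages them, weighted by each
// client's example count, and applies the result to the global model.

function fedAvg(globalWeights, updates, counts) {
  const total = counts.reduce((a, b) => a + b, 0);
  return globalWeights.map((w, i) => {
    const avgDelta =
      updates.reduce((s, u, c) => s + u[i] * counts[c], 0) / total;
    return w + avgDelta;
  });
}

const sigmoid = (z) => 1 / (1 + Math.exp(-z));

// In the browser, the latest global model scores candidate ads by predicted
// viewability; each candidate carries a browser-available feature vector.
function rankAds(weights, candidates) {
  return candidates
    .map((ad) => ({
      ...ad,
      score: sigmoid(ad.features.reduce((s, f, i) => s + f * weights[i], 0)),
    }))
    .sort((a, b) => b.score - a.score);
}
```

The updated global model returned by `fedAvg` is what gets pushed back to browsers for the next round of local training and ad selection.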
Results & Findings
| Metric | Non‑DP Variant | DP Variant (ε = 1.0) |
|---|---|---|
| AUC (ad viewability) | 92.59 % | 89.3 % |
| Training latency per client | ~8 ms | ~9 ms |
| Upload size per round | 12 KB | 12 KB (noise added) |
| Convergence (rounds) | 12 | 14 |
- Accuracy: The DP‑protected model (ε = 1.0) retains 89.3 % AUC, a drop of roughly 3 percentage points, which is still competitive for ad‑ranking tasks.
- Speed: End‑to‑end training + upload fits comfortably within typical page‑render windows (< 100 ms).
- Scalability: Experiments on two disjoint datasets (≈40 K daily users) show stable convergence and no noticeable server bottleneck.
Practical Implications
- Privacy‑first monetization – Publishers can continue to sell targeted ads while demonstrably respecting user data rights, reducing legal risk under GDPR, CCPA, etc.
- Zero‑install deployment – Because AdFL runs on native browser APIs, it can be rolled out via a simple script tag, lowering integration effort for ad tech stacks.
- Real‑time personalization – Model updates happen in the background; the latest global model is instantly available for ad selection, enabling dynamic, context‑aware bidding.
- Cross‑publisher collaboration – The same FL infrastructure could be shared among multiple sites, allowing industry‑wide models without exposing any party’s raw user logs.
- Differential privacy as a service – Publishers can tune ε to balance privacy guarantees against revenue impact, offering transparent privacy levels to users.
Limitations & Future Work
- Model size constraints – Browser memory and compute limits restrict the complexity of models; very deep networks may be infeasible without model compression.
- Network reliability – FL assumes participants can reliably upload updates; high churn or poor connectivity could slow convergence.
- Privacy budget management – The paper explores a single ε value; real‑world deployments will need mechanisms to track cumulative privacy loss across many training rounds.
- Broader ad formats – Experiments focus on viewability; extending to video, native, or programmatic RTB scenarios remains an open challenge.
- Robustness to adversarial updates – Future work should investigate secure aggregation and Byzantine‑resilient algorithms to guard against malicious clients.
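On the privacy‑budget point, a minimal accountant assuming simple sequential composition (ε and δ add up across rounds) might look like the following. This is a hypothetical sketch, not a mechanism from the paper, and real deployments would want tighter bounds such as advanced composition or RDP accounting:

```javascript
// Hypothetical per-client privacy accountant under basic sequential
// composition: each training round spends (eps, delta); the client stops
// participating once either budget would be exceeded.

function makeAccountant(epsilonBudget, deltaBudget) {
  let spentEps = 0;
  let spentDelta = 0;
  return {
    // Records the round's cost if it fits in the remaining budget.
    charge(eps, delta) {
      if (spentEps + eps > epsilonBudget || spentDelta + delta > deltaBudget) {
        return false; // refuse to participate in this round
      }
      spentEps += eps;
      spentDelta += delta;
      return true;
    },
    remaining() {
      return {
        epsilon: epsilonBudget - spentEps,
        delta: deltaBudget - spentDelta,
      };
    },
  };
}
```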
AdFL demonstrates that sophisticated machine‑learning‑driven ad personalization can be reconciled with stringent privacy mandates—all without asking users to install extra software.
Authors
- Ahmad Alemari
- Pritam Sen
- Cristian Borcea
Paper Information
- arXiv ID: 2602.06336v1
- Categories: cs.CR, cs.DC, cs.LG
- Published: February 6, 2026