[Paper] Modeling Epidemiological Dynamics Under Adversarial Data and User Deception
Source: arXiv - 2602.20134v1
Overview
Self‑reported health data, such as vaccination status, mask‑wearing, or social‑distancing habits, has become a cornerstone of modern epidemiological models. The new paper by Su et al. tackles a thorny problem: what happens when people deliberately misreport this information? Using a game‑theoretic "signaling game" framework, the authors show how public‑health authorities can still keep an epidemic in check even when a sizable fraction of the population feeds them deceptive data.
Key Contributions
- Game‑theoretic model of data deception – Formalizes the interaction between individuals (who may lie) and health authorities (who must infer the truth) as a signaling game.
- Analytical equilibrium characterization – Derives conditions for separating (truthful) and pooling (deceptive) equilibria and quantifies their impact on disease spread.
- Policy‑design guidelines – Demonstrates how tailored incentives and verification mechanisms can bound the damage caused by misinformation.
- Robustness insights – Shows that even under pervasive dishonesty, carefully designed sender/receiver strategies can keep infection levels low.
- Cross‑disciplinary toolkit – Bridges epidemiology, AI, and game theory, offering a reusable framework for other domains that rely on user‑generated data.
Methodology
- Signaling‑Game Setup – The game has two roles:
- Senders: Individuals decide whether to report their true behavior (e.g., “I’m vaccinated”) or to lie, based on personal payoffs (avoiding penalties, gaining benefits).
- Receiver: The public‑health authority updates its epidemiological model using the received signals, then decides on non‑pharmaceutical interventions (NPIs) such as mask mandates, alongside measures like vaccination campaigns.
- Utility Functions – Both parties have explicit cost/benefit structures:
- Senders weigh the immediate gain from deception against the long‑term risk of a larger outbreak.
- Receiver balances the accuracy of its model (lower infection risk) against the cost of stricter policies.
- Equilibrium Analysis – The authors solve for Bayesian Nash equilibria (a toy payoff sketch follows this list):
- Separating equilibrium: Honest reporting is the dominant strategy; the authority can perfectly infer true behavior.
- Pooling equilibrium: All types send the same (possibly false) signal; the authority must rely on prior beliefs and statistical inference.
- Simulation Layer – The theoretical results are plugged into a standard SEIR (Susceptible‑Exposed‑Infectious‑Recovered) model, allowing the authors to simulate infection trajectories under different equilibrium regimes and policy levers (e.g., fines, verification tests).
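To make the sender's trade‑off and the separating/pooling distinction concrete, here is a minimal Python sketch with linear payoffs. The parameter names (deception_gain, penalty, verification_prob, and so on) and the payoff forms are illustrative assumptions, not the utility functions specified in the paper.

```python
# Toy sketch of the reporting decision in the signaling game.
# All payoff shapes and parameter names are illustrative assumptions.

def sender_utility(truthful: bool,
                   deception_gain: float,     # private benefit of misreporting
                   penalty: float,            # fine if a false report is caught
                   verification_prob: float,  # chance the report is audited
                   outbreak_cost: float,      # personal cost of a large outbreak
                   p_outbreak_truthful: float,
                   p_outbreak_lie: float) -> float:
    """Expected payoff of one sender for a truthful vs. deceptive report."""
    if truthful:
        return -outbreak_cost * p_outbreak_truthful
    # Lying yields a private gain but risks an audit penalty and slightly
    # raises the chance of a larger outbreak.
    return (deception_gain
            - verification_prob * penalty
            - outbreak_cost * p_outbreak_lie)


def reporting_regime(**params) -> str:
    """'separating' if truth-telling is a best response under these toy
    payoffs, otherwise 'pooling' (all types send the favourable signal)."""
    honest = sender_utility(True, **params)
    lie = sender_utility(False, **params)
    return "separating" if honest >= lie else "pooling"


if __name__ == "__main__":
    base = dict(deception_gain=1.0, penalty=5.0, outbreak_cost=10.0,
                p_outbreak_truthful=0.10, p_outbreak_lie=0.12)
    # With 25% audits the expected fine outweighs the gain from lying.
    print(reporting_regime(verification_prob=0.25, **base))  # separating
    # With only 5% audits, deception pays off for every sender.
    print(reporting_regime(verification_prob=0.05, **base))  # pooling
```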
Results & Findings
- Separating equilibria drive infections close to zero over time—essentially the ideal scenario where truthful data lets the authority fine‑tune interventions.
- Pooling equilibria (high deception) still permit effective control if the authority adopts robust strategies (illustrated in the sketch after this list):
- Introducing modest verification (random testing, digital certificates) reduces the incentive to lie.
- Adaptive NPIs that hedge against worst‑case behavior (e.g., universal masking) keep the reproduction number below 1 even when data quality is low.
- The model quantifies a tolerance threshold: up to ~30% systematic misreporting can be absorbed without catastrophic spikes, provided the right mix of incentives and verification is in place.
- Sensitivity analysis shows that the cost of false positives (unnecessary restrictions) is outweighed by the benefit of preventing a surge when deception rates rise.
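As a rough illustration of these regimes, the toy SEIR loop below lets the authority set NPI stringency from reported compliance, with and without hedging against worst‑case misreporting. The parameter values, the linear policy rule, and the transmission discounts are assumptions chosen for illustration, not the paper's calibration.

```python
# Toy SEIR loop: the authority calibrates NPIs from *reported* compliance.
# All numbers and functional forms are illustrative assumptions.

def simulate(days, N, beta0, sigma, gamma, true_compliance, deception_rate, hedge):
    """Return peak infectious count over the run.

    deception_rate: fraction of non-compliant people who report compliance.
    hedge: worst-case misreporting the authority subtracts from reports.
    """
    S, E, I, R = N - 10.0, 0.0, 10.0, 0.0
    peak = I
    for _ in range(days):
        # Self-reports inflate compliance when non-compliant people lie.
        reported = true_compliance + deception_rate * (1.0 - true_compliance)
        # The authority deflates reports by its assumed worst case, then
        # tightens NPIs as believed compliance drops (simple linear rule).
        believed = max(0.0, reported - hedge)
        stringency = 1.0 - believed
        # Transmission falls with *true* compliance and with imposed NPIs.
        beta = beta0 * (1.0 - 0.5 * true_compliance) * (1.0 - 0.8 * stringency)
        new_E = beta * S * I / N
        new_I = sigma * E
        new_R = gamma * I
        S, E, I, R = S - new_E, E + new_E - new_I, I + new_I - new_R, R + new_R
        peak = max(peak, I)
    return peak


if __name__ == "__main__":
    common = dict(days=240, N=1_000_000, beta0=0.4, sigma=1 / 3, gamma=0.2,
                  true_compliance=0.4)
    # Truthful reports (separating-like): interventions are well calibrated.
    print(f"truthful:           peak I = {simulate(deception_rate=0.0, hedge=0.0, **common):,.0f}")
    # Half of non-compliant people lie and the authority trusts the reports.
    print(f"deceived, trusting: peak I = {simulate(deception_rate=0.5, hedge=0.0, **common):,.0f}")
    # Same deception, but the authority hedges against worst-case misreporting.
    print(f"deceived, hedged:   peak I = {simulate(deception_rate=0.5, hedge=0.5, **common):,.0f}")
```

Running the hedged rule on truthful data (deception_rate=0.0, hedge=0.5) would impose stricter NPIs than necessary, which is the false‑positive cost the sensitivity analysis weighs against preventing a surge.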
Practical Implications
- Policy designers can use the framework to calibrate fines, subsidies, or digital verification tools that make truthful reporting the rational choice for most citizens.
- Public‑health dashboards that ingest self‑reported data (e.g., COVID‑19 symptom trackers) can embed the game‑theoretic correction layer to adjust forecasts in real time (a minimal sketch follows this list).
- Developers of health‑tech platforms (contact‑tracing apps, vaccination passports) gain a concrete method to assess how much “noise” their data pipelines can tolerate before model reliability degrades.
- AI/ML pipelines that train on user‑generated health data can incorporate the equilibrium‑based priors to mitigate bias introduced by strategic deception.
- Corporate wellness programs that rely on employee self‑reports can design incentive structures that align personal benefits with accurate data collection, improving both health outcomes and analytics quality.
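One way such a correction layer could look in practice is sketched below: a self‑reported coverage estimate is deflated under an assumed prior over the misreporting rate before it feeds a forecast. The beta prior, the inversion formula, and the function name are hypothetical placeholders; in the paper's framework the equilibrium analysis would supply the deception prior.

```python
import numpy as np

def corrected_coverage(reported_coverage: float,
                       misreport_prior_samples: np.ndarray) -> tuple[float, float]:
    """Deflate self-reported coverage under an assumed misreporting model.

    Illustrative model: a fraction d of truly uncovered people report being
    covered, so reported = true + d * (1 - true); inverting gives
    true = (reported - d) / (1 - d).
    """
    d = misreport_prior_samples
    true = np.clip((reported_coverage - d) / (1.0 - d), 0.0, 1.0)
    # Return a point estimate and a cautious (pessimistic) lower bound.
    return float(true.mean()), float(np.percentile(true, 5))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical prior over the misreporting rate; an equilibrium analysis
    # (or audit data) would supply this distribution in practice.
    prior = rng.beta(2, 8, size=10_000)  # mean ~0.2
    mean_cov, cautious_cov = corrected_coverage(0.75, prior)
    print(f"reported 75% -> corrected ~ {mean_cov:.0%}, cautious ~ {cautious_cov:.0%}")
```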
Limitations & Future Work
- Simplified utility assumptions – Real‑world motivations (political beliefs, misinformation exposure) are richer than the linear cost/benefit functions used.
- Static population types – The model treats “honest” vs. “dishonest” types as fixed; in practice, behavior may evolve as policies change.
- Verification cost modeling – The paper assumes a generic verification expense; future work could integrate specific technologies (PCR testing, blockchain‑based certificates) and their scalability constraints.
- Empirical validation – While simulations are convincing, applying the framework to real epidemic data (e.g., COVID‑19 self‑reporting platforms) would solidify its practical utility.
Bottom line: By treating self‑reported health data as a strategic signal rather than a static input, this research equips developers, policymakers, and AI engineers with a principled way to safeguard epidemiological models against deception—turning a potential vulnerability into a manageable design parameter.
Authors
- Yiqi Su
- Christo Kurisummoottil Thomas
- Walid Saad
- Bud Mishra
- Naren Ramakrishnan
Paper Information
- arXiv ID: 2602.20134v1
- Categories: cs.GT, cs.AI
- Published: February 23, 2026