[Paper] Modeling Epidemiological Dynamics Under Adversarial Data and User Deception

Published: February 23, 2026 at 01:45 PM EST
4 min read
Source: arXiv - 2602.20134v1

Overview

Self‑reported health data—think vaccination status, mask‑wearing, or social‑distancing habits—has become a cornerstone for modern epidemiological models. The new paper by Su et al. tackles a thorny problem: what happens when people deliberately misreport this information? Using a game‑theoretic “signaling game” framework, the authors show how public‑health authorities can still keep an epidemic in check even when a sizable fraction of the population feeds them deceptive data.

Key Contributions

  • Game‑theoretic model of data deception – Formalizes the interaction between individuals (who may lie) and health authorities (who must infer the truth) as a signaling game.
  • Analytical equilibrium characterization – Derives conditions for separating (truthful) and pooling (deceptive) equilibria and quantifies their impact on disease spread.
  • Policy‑design guidelines – Demonstrates how tailored incentives and verification mechanisms can bound the damage caused by misinformation.
  • Robustness insights – Shows that even under pervasive dishonesty, carefully designed sender/receiver strategies can keep infection levels low.
  • Cross‑disciplinary toolkit – Bridges epidemiology, AI, and game theory, offering a reusable framework for other domains that rely on user‑generated data.

Methodology

  1. Signaling‑Game Setup

    • Senders: Individuals decide whether to report their true behavior (e.g., “I’m vaccinated”) or to lie, based on personal payoffs (avoiding penalties, gaining benefits).
    • Receiver: The public‑health authority updates its epidemiological model using the received signals, then decides on non‑pharmaceutical interventions (NPIs) such as mask mandates and vaccination campaigns.
  2. Utility Functions – Both parties have explicit cost/benefit structures:

    • Senders weigh the immediate gain from deception against the long‑term risk of a larger outbreak.
    • Receiver balances the accuracy of its model (lower infection risk) against the cost of stricter policies.
  3. Equilibrium Analysis – The authors solve for Bayesian Nash equilibria:

    • Separating equilibrium: Honest reporting is the dominant strategy; the authority can perfectly infer true behavior.
    • Pooling equilibrium: All types send the same (possibly false) signal; the authority must rely on prior beliefs and statistical inference.
  4. Simulation Layer – The theoretical results are plugged into a standard SEIR (Susceptible‑Exposed‑Infectious‑Recovered) model, allowing the authors to simulate infection trajectories under different equilibrium regimes and policy levers (e.g., fines, verification tests).
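
The separating‑vs‑pooling logic above can be sketched in a toy binary signaling game. Everything in this snippet — the payoff values, the always‑caught audit model, and the function names — is an illustrative assumption, not the paper's actual formulation:

```python
# Toy binary signaling game: a sender of type t ("safe" or "risky") reports a
# message m; an audit catches misreports with certainty and levies `penalty`.
# All payoff values here are illustrative assumptions, not the paper's model.

TYPES = ["safe", "risky"]
MESSAGES = ["safe", "risky"]

def sender_payoff(t, m, penalty, benefit):
    """Claiming 'safe' yields `benefit`; misreporting costs `penalty` if audited."""
    gain = benefit if m == "safe" else 0.0
    lie_cost = penalty if m != t else 0.0
    return gain - lie_cost

def is_separating(penalty, benefit):
    """Truthful reporting is a best response for every type iff lying never pays."""
    return all(
        sender_payoff(t, t, penalty, benefit)
        >= max(sender_payoff(t, m, penalty, benefit) for m in MESSAGES)
        for t in TYPES
    )

print(is_separating(penalty=2.0, benefit=1.0))  # True: lying gains 1 but costs 2
print(is_separating(penalty=0.5, benefit=1.0))  # False: risky types pool on "safe"
```

When the expected audit penalty exceeds the benefit of claiming "safe," honesty is dominant and the authority can read signals at face value; below that threshold both types send the same message and the signal carries no information — the pooling regime described above.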

Results & Findings

  • Separating equilibria drive infections close to zero over time—essentially the ideal scenario where truthful data lets the authority fine‑tune interventions.
  • Pooling equilibria (high deception) still permit effective control if the authority adopts robust strategies:
    • Introducing modest verification (random testing, digital certificates) reduces the incentive to lie.
    • Adaptive NPIs that hedge against worst‑case behavior (e.g., universal masking) keep the reproduction number below 1 even when data quality is low.
  • The model quantifies a tolerance threshold: up to ~30% systematic misreporting can be absorbed without catastrophic spikes, provided the right mix of incentives and verification is in place.
  • Sensitivity analysis shows that the cost of false positives (unnecessary restrictions) is outweighed by the benefit of preventing a surge when deception rates rise.
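
To make the robustness finding concrete, here is a minimal SEIR loop in which the authority throttles transmission based on observed (under‑reported) prevalence, with a worst‑case floor standing in for hedged NPIs. All rates, the policy rule, and the floor value are assumptions for illustration, not the paper's calibration:

```python
# Minimal Euler-stepped SEIR sketch in which a `deception` fraction hides
# infections from the authority's prevalence estimate. Every parameter and
# the adaptive-NPI rule are illustrative assumptions, not the paper's values.

def simulate_seir(deception=0.0, days=200, beta0=0.4, sigma=0.2, gamma=0.1):
    S, E, I, R = 0.99, 0.0, 0.01, 0.0
    peak_I = I
    for _ in range(days):
        observed_I = I * (1.0 - deception)  # misreporting hides some infections
        # Adaptive NPI: throttle transmission based on *observed* prevalence,
        # with a worst-case floor (hedging) so beta never exceeds a cautious cap.
        beta = beta0 * max(0.25, 1.0 - 20.0 * observed_I)
        dS = -beta * S * I
        dE = beta * S * I - sigma * E
        dI = sigma * E - gamma * I
        dR = gamma * I
        S, E, I, R = S + dS, E + dE, I + dI, R + dR
        peak_I = max(peak_I, I)
    return peak_I

# The weaker the feedback signal, the larger the infection peak:
print(simulate_seir(deception=0.0) < simulate_seir(deception=0.9))  # True
```

The worst‑case floor plays the role of the hedged strategies in the results above: even when most of the signal is hidden, transmission never runs fully uncontrolled, which is why moderate misreporting can be absorbed.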

Practical Implications

  • Policy designers can use the framework to calibrate fines, subsidies, or digital verification tools that make truthful reporting the rational choice for most citizens.
  • Public‑health dashboards that ingest self‑reported data (e.g., COVID‑19 symptom trackers) can embed the game‑theoretic correction layer to adjust forecasts in real time.
  • Developers of health‑tech platforms (contact‑tracing apps, vaccination passports) gain a concrete method to assess how much “noise” their data pipelines can tolerate before model reliability degrades.
  • AI/ML pipelines that train on user‑generated health data can incorporate the equilibrium‑based priors to mitigate bias introduced by strategic deception.
  • Corporate wellness programs that rely on employee self‑reports can design incentive structures that align personal benefits with accurate data collection, improving both health outcomes and analytics quality.
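
As a sketch of what a "correction layer" for self‑reported data might look like, the following inverts a simple misreporting model to recover true coverage from a reported rate. The observation model and the deception priors are hypothetical, not taken from the paper:

```python
# Sketch of a correction layer for self-reported vaccination coverage.
# The observation model and the deception priors are illustrative assumptions.

def corrected_coverage(reported_rate, p_lie_unvax=0.2, p_lie_vax=0.0):
    """
    Invert the assumed observation model:
        reported = true * (1 - p_lie_vax) + (1 - true) * p_lie_unvax
    where p_lie_unvax is the chance an unvaccinated person claims vaccination,
    and p_lie_vax the (usually negligible) reverse error.
    """
    denom = 1.0 - p_lie_vax - p_lie_unvax
    if abs(denom) < 1e-9:
        raise ValueError("priors leave the true rate unidentifiable")
    true_rate = (reported_rate - p_lie_unvax) / denom
    return min(1.0, max(0.0, true_rate))  # clamp to a valid probability

# A dashboard showing 80% self-reported coverage, under a 20% over-claim
# prior, corresponds to 75% true coverage.
print(round(corrected_coverage(0.80), 2))  # 0.75
```

In a real pipeline the priors would come from the equilibrium analysis (or from audit data) rather than being fixed constants, but the shape of the adjustment — deflate reported compliance by the estimated incentive to over‑claim — is the same.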

Limitations & Future Work

  • Simplified utility assumptions – Real‑world motivations (political beliefs, misinformation exposure) are richer than the linear cost/benefit functions used.
  • Static population types – The model treats “honest” vs. “dishonest” types as fixed; in practice, behavior may evolve as policies change.
  • Verification cost modeling – The paper assumes a generic verification expense; future work could integrate specific technologies (PCR testing, blockchain‑based certificates) and their scalability constraints.
  • Empirical validation – While simulations are convincing, applying the framework to real epidemic data (e.g., COVID‑19 self‑reporting platforms) would solidify its practical utility.

Bottom line: By treating self‑reported health data as a strategic signal rather than a static input, this research equips developers, policymakers, and AI engineers with a principled way to safeguard epidemiological models against deception—turning a potential vulnerability into a manageable design parameter.

Authors

  • Yiqi Su
  • Christo Kurisummoottil Thomas
  • Walid Saad
  • Bud Mishra
  • Naren Ramakrishnan

Paper Information

  • arXiv ID: 2602.20134v1
  • Categories: cs.GT, cs.AI
  • Published: February 23, 2026