[Paper] On Evolution-Based Models for Experimentation Under Interference
Source: arXiv - 2511.21675v1
Overview
This paper tackles a core problem for anyone building data‑driven products that run experiments on connected users—how to estimate causal effects when a treatment on one unit (e.g., a user, a sensor, a device) can spill over to others. Instead of trying to reconstruct the full, often hidden, interaction network, the authors show that it is enough to model how the distribution of outcomes evolves over time under different treatment assignments. This “evolution‑based” view opens a new route to reliable causal inference in the presence of interference.
Key Contributions
- Evolution‑based identification: Proves that population‑level causal effects can be identified from low‑dimensional recursive equations governing the outcome distribution across experimental rounds, without knowing the exact network topology.
- Axiomatic exposure‑mapping framework: Formalizes the conditions under which the empirical outcome distribution follows a simple evolution mapping, providing a clean theoretical lens for interference.
- Distributional difference‑in‑differences: Introduces a novel analogue of DiD that works on distributions rather than on individual unit trajectories, leveraging parallel evolution patterns across treatment arms.
- Causal message passing (CMP): Presents a concrete algorithm for dense graphs that propagates “causal messages” through the network, estimating heterogeneous spillover effects efficiently.
- Extension to influencer‑type networks: Shows how the same ideas apply when a few “influencer” nodes dominate the interference dynamics, a setting common in social media and IoT deployments.
- Identification limits: Characterizes scenarios (strong temporal trends, endogenous interference) where the evolution‑based approach fails, guiding practitioners on when to apply it.
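The distributional difference-in-differences idea can be sketched numerically. Under a "parallel evolution" assumption, the drift in the control arm's quantiles stands in for how the treated arm would have drifted untreated; the effect is then read off quantile by quantile. The Gaussian location-shift setup below is purely illustrative, not the paper's estimator:

```python
import numpy as np

def distributional_did(treated_t0, treated_t1, control_t0, control_t1,
                       qs=(0.25, 0.5, 0.75)):
    """Quantile-level difference-in-differences on outcome distributions.

    Assumes 'parallel evolution': absent treatment, each quantile of the
    treated arm would have drifted by the same amount as the control arm's.
    """
    qs = np.asarray(qs)
    drift = np.quantile(control_t1, qs) - np.quantile(control_t0, qs)
    counterfactual = np.quantile(treated_t0, qs) + drift   # treated arm, no treatment
    return np.quantile(treated_t1, qs) - counterfactual    # effect per quantile

rng = np.random.default_rng(0)
control_t0 = rng.normal(0.0, 1.0, 10_000)
control_t1 = rng.normal(0.5, 1.0, 10_000)                # common drift of +0.5
treated_t0 = rng.normal(0.2, 1.0, 10_000)
treated_t1 = rng.normal(0.2 + 0.5 + 1.0, 1.0, 10_000)    # drift + true effect of +1.0
effect = distributional_did(treated_t0, treated_t1, control_t0, control_t1)
```

In this toy setting every quantile recovers the true effect of +1.0 up to sampling noise; the paper's method targets the same kind of quantity without observing individual trajectories.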
Methodology
- Exposure Mapping: Each unit’s outcome is assumed to depend on its own treatment and a summary of its neighbors’ treatments (the “exposure”). The authors define a set of axioms that guarantee the exposure can be captured by a low‑dimensional statistic.
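A standard example of such a low-dimensional exposure statistic is the fraction of a unit's neighbors that are treated. A minimal sketch (the adjacency matrix here is an assumption for illustration; the paper's point is precisely that it need not be observed):

```python
import numpy as np

def exposure(adj, a):
    """Fraction of each unit's neighbors that are treated.

    adj: 0/1 adjacency matrix, a: 0/1 treatment vector.
    Isolated units get exposure 0.
    """
    deg = adj.sum(axis=1)
    return np.divide(adj @ a, deg, out=np.zeros(len(a)), where=deg > 0)

adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]])
a = np.array([1, 0, 1])
e = exposure(adj, a)   # node 0 has neighbors {1, 2}, one of them treated
```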
- Evolution Mapping: They model the distribution of outcomes after each experimental round as a function of the previous distribution and the current treatment vector. This yields a recursive equation of the form
  $$\mathbb{P}\big(Y^{(t+1)} \mid A^{(t+1)}\big) = \mathcal{F}\Big(\mathbb{P}\big(Y^{(t)} \mid A^{(t)}\big),\; A^{(t+1)}\Big),$$
  where $A^{(t)}$ denotes the treatment assignment at round $t$.
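To make the recursion concrete, here is a toy version where the "distribution" is summarized by its mean and the mapping $\mathcal{F}$ is linear in the previous mean and the current treated fraction. The dynamics and coefficient values are invented for illustration; fitting the mapping then reduces to a regression across rounds:

```python
import numpy as np

# Toy evolution mapping on a summary statistic: the outcome mean at round
# t+1 is a linear function of the previous mean and the treated fraction.
rng = np.random.default_rng(1)
T, n = 30, 5_000
alpha, beta, gamma = 0.1, 0.7, 0.8        # assumed ground-truth dynamics
means, fracs = [], []
mu = 0.0
for t in range(T):
    p = rng.uniform(0.2, 0.8)             # randomized treated fraction this round
    mu = alpha + beta * mu + gamma * p    # evolution of the population mean
    y = rng.normal(mu, 1.0, n)            # observed outcomes this round
    means.append(y.mean())
    fracs.append(p)

# Estimate the mapping by regressing mean_{t+1} on (1, mean_t, p_{t+1}).
X = np.column_stack([np.ones(T - 1), means[:-1], fracs[1:]])
coef, *_ = np.linalg.lstsq(X, np.array(means[1:]), rcond=None)
```

The fitted `coef` recovers `(alpha, beta, gamma)` up to sampling noise, which is the sense in which the evolution mapping is learnable from round-level data alone.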
- Randomized Sampling of Interference Channels: Because treatments are randomized, each round implicitly samples different hidden interference pathways. By aggregating across many random assignments, the evolution mapping can be estimated consistently.
- Causal Message Passing (CMP): For dense networks, the authors derive a message‑passing algorithm that updates beliefs about each node’s counterfactual outcome using only local information, dramatically reducing computational cost.
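A toy mean-field update in the spirit of message passing on a dense graph: each node's belief about its counterfactual outcome combines its own treatment with the average belief of its neighbors. This is an illustrative sketch with invented coefficients, not the paper's exact CMP recursion:

```python
import numpy as np

def message_passing_round(adj, a, belief, tau=1.0, rho=0.5):
    """One belief update: direct effect (tau * a) plus a spillover term
    proportional to the average incoming message from neighbors."""
    deg = np.maximum(adj.sum(axis=1), 1)
    neighbor_avg = (adj @ belief) / deg
    return tau * a + rho * neighbor_avg

n = 200
rng = np.random.default_rng(2)
adj = (rng.random((n, n)) < 0.8).astype(float)   # dense graph, avg degree ~0.8 n
np.fill_diagonal(adj, 0)
a = rng.integers(0, 2, n).astype(float)          # random treatment assignment
belief = np.zeros(n)
for _ in range(20):                              # iterate toward a fixed point
    belief = message_passing_round(adj, a, belief)
```

Because the neighbor-averaging step is a contraction for `rho < 1`, the iteration converges; treated nodes end up with higher beliefs than untreated ones, but both carry a spillover component.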
- Estimator Construction: Using the estimated evolution mapping, they back‑solve for the counterfactual distribution that would have arisen under any alternative treatment scenario, analogous to solving a system of linear equations in classic DiD.
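Once the evolution mapping is fitted, counterfactual scenarios amount to rolling the recursion forward under an alternative treatment schedule. A sketch with a linear mean dynamic and placeholder coefficients standing in for estimates:

```python
import numpy as np

def counterfactual_path(coef, mu0, treated_fracs):
    """Roll the fitted recursion mean_{t+1} = a + b*mean_t + c*p_t forward
    under a chosen sequence of treated fractions."""
    a, b, c = coef
    path, mu = [], mu0
    for p in treated_fracs:
        mu = a + b * mu + c * p
        path.append(mu)
    return np.array(path)

coef = (0.1, 0.7, 0.8)                                      # pretend these were estimated
full_rollout = counterfactual_path(coef, 0.0, [1.0] * 20)   # treat everyone
no_rollout   = counterfactual_path(coef, 0.0, [0.0] * 20)   # treat no one
global_effect = full_rollout[-1] - no_rollout[-1]           # long-run total effect
```

The long-run contrast converges to `c / (1 - b)` (here 8/3), illustrating how spillovers amplify the per-round effect through the recursion.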
All steps rely on observable data (treatment assignments and outcomes) and avoid any need to infer the full adjacency matrix.
Results & Findings
- Theoretical guarantees: Under the stated axioms, the evolution mapping is identifiable and the CMP estimator is consistent and asymptotically normal.
- Simulation studies: In synthetic dense graphs (average degree ≈ 0.8 × |V|) and influencer‑centric graphs (5 % of nodes as influencers), CMP recovers heterogeneous spillover effects with < 5 % bias, outperforming baseline methods that assume no interference or that use naïve network reconstruction.
- Real‑world case study: Applied to a large‑scale A/B test on a social platform where users receive a new recommendation algorithm, the method uncovers a positive indirect effect: users not directly treated still experience a 2 % lift in engagement due to exposure to treated friends. Traditional analysis missed this effect entirely.
- Robustness checks: The approach remains stable when the true network is partially observed, as long as the randomization scheme satisfies the “implicit sampling” condition.
Practical Implications
- Product experimentation: Engineers can run standard randomized experiments and, by collecting outcome data over multiple rounds, obtain reliable estimates of both direct and spillover effects without building a full interaction graph.
- Feature rollout strategies: Knowing the magnitude of indirect benefits (or harms) enables smarter staged rollouts—e.g., targeting influencers first to maximize network‑wide impact.
- Policy & regulation compliance: In domains like healthcare or finance where privacy restricts network data collection, the evolution‑based method offers a privacy‑preserving alternative for causal analysis.
- Scalable tooling: The CMP algorithm scales linearly with the number of edges, making it feasible for millions of users on modern cloud infrastructure.
- Integration with existing pipelines: The method can be wrapped around existing A/B testing frameworks (e.g., Optimizely, LaunchDarkly) by adding a “round” dimension to experiments and logging treatment assignments per round.
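Concretely, the only logging change the multi-round design requires is a `round` field on each assignment record. A minimal sketch with illustrative field names (not a schema from any particular A/B framework):

```python
import json
import datetime

def log_assignment(unit_id, experiment, round_idx, arm, outcome=None):
    """Serialize one per-round assignment record; field names are
    hypothetical, chosen to mirror a typical A/B logging schema."""
    record = {
        "unit_id": unit_id,
        "experiment": experiment,
        "round": round_idx,          # the extra dimension the method needs
        "arm": arm,
        "outcome": outcome,
        "logged_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    return json.dumps(record)

line = log_assignment("u_42", "new_recs_v2", 3, "treatment", outcome=0.87)
```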
Limitations & Future Work
- Strong temporal trends: If outcomes drift dramatically over time for reasons unrelated to the treatment (e.g., seasonality), the recursive evolution model may conflate drift with spillovers, breaking identification.
- Endogenous interference: When the interference structure itself changes in response to the treatment (e.g., users form new connections after seeing a new feature), the static exposure‑mapping assumptions no longer hold.
- Sparse networks: While the paper extends to influencer models, performance degrades in very sparse graphs where the low‑dimensional evolution assumption is weaker.
- Future directions: The authors suggest (1) incorporating covariate‑adjusted evolution mappings to handle time‑varying confounders, (2) developing diagnostics to detect violations of the temporal‑trend assumption, and (3) extending the framework to continuous‑time settings common in streaming data environments.
Authors
- Sadegh Shirani
- Mohsen Bayati
Paper Information
- arXiv ID: 2511.21675v1
- Categories: stat.ML, cs.LG, cs.SI, econ.EM
- Published: November 26, 2025