[Paper] Context-Specific Causal Graph Discovery with Unobserved Contexts: Non-Stationarity, Regimes and Spatio-Temporal Patterns
Source: arXiv - 2511.21537v1
Overview
Real‑world datasets, especially those from climate science, remote sensing, or any spatio‑temporal grid, often violate the classic assumptions of stationarity and spatial homogeneity that many causal discovery algorithms rely on. This paper tackles the problem of learning causal graphs that can change across contexts (e.g., regions, time periods, or hidden regimes) while still providing stable, reliable results. By adapting existing constraint‑based methods to detect and respect context‑specific variations, the authors aim to make causal inference on non‑stationary, spatio‑temporal data more trustworthy.
Key Contributions
- Formalization of context‑specific causal discovery where the context (regime, location, time) may be unobserved or only partially observed.
- Two guiding principles that address (1) how to separate true causal changes from statistical noise, and (2) how to keep the discovery process modular and extensible.
- A generic framework that plugs into any constraint‑based causal discovery algorithm (e.g., PC, PC‑stable, FCI, PCMCI, PCMCI+, LPCMCI) by altering only the independence‑testing layer.
- Modular decomposition of the overall problem into well‑studied sub‑tasks such as change‑point detection, clustering, and conditional independence testing, enabling systematic improvements.
- Open‑source prototype (to be released) that demonstrates the approach on synthetic and real climate datasets, showing how causal graphs evolve across space and time.
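The plug‑in idea from the third contribution can be pictured with a small sketch. This is not the authors' prototype: `skeleton` is a simplified, order‑dependent PC‑style skeleton search, and `CITest`, `oracle_chain`, and the node names are hypothetical. The only point is that the search loop receives the CI test as an argument, so a context‑aware test could be passed in without touching the rest of the algorithm.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Sequence, Set, Tuple

# Hypothetical interface: returns True when x and y are judged
# conditionally independent given the variables in `cond`.
CITest = Callable[[str, str, Tuple[str, ...]], bool]


def skeleton(nodes: Sequence[str], ci_test: CITest,
             max_cond_size: int = 1) -> Set[FrozenSet[str]]:
    """Simplified PC-style skeleton search with a pluggable CI test."""
    adj = {v: set(nodes) - {v} for v in nodes}
    for size in range(max_cond_size + 1):
        for x, y in combinations(nodes, 2):
            if y not in adj[x]:
                continue  # edge already removed
            # Conditioning sets are drawn from the current neighbours of x or y.
            neighbour_sets = (sorted(adj[x] - {y}), sorted(adj[y] - {x}))
            if any(ci_test(x, y, cond)
                   for nb in neighbour_sets
                   for cond in combinations(nb, size)):
                adj[x].discard(y)
                adj[y].discard(x)
    return {frozenset((x, y)) for x in nodes for y in adj[x]}


def oracle_chain(x: str, y: str, cond: Tuple[str, ...]) -> bool:
    """Toy oracle for the chain A -> B -> C: A and C are independent given B."""
    return {x, y} == {"A", "C"} and "B" in cond


edges = skeleton(["A", "B", "C"], oracle_chain)
print(sorted(sorted(e) for e in edges))  # [['A', 'B'], ['B', 'C']]
```

Replacing `oracle_chain` with a data‑driven, context‑conditioned test would leave `skeleton` unchanged, which is the modularity the framework relies on.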
Methodology
- Problem Setup – The data are modeled as a collection of time‑series or spatial grids, each generated under an (unknown) context C. The causal graph G(C) may differ across contexts, but the underlying mechanisms are assumed to be locally stable.
- Guiding Principle 1: Context Detection – Before causal inference, the algorithm searches for statistically significant shifts in the joint distribution (using change‑point detection or clustering). These shifts define candidate context partitions.
- Guiding Principle 2: Context‑Aware Independence Testing – Standard constraint‑based methods rely on conditional independence (CI) tests that assume i.i.d. samples. The authors replace these with context‑conditioned CI tests that either (a) pool data within the same detected context, or (b) weight samples according to their likelihood of belonging to a context.
- Modular Plug‑In – The modified CI test is the only component that needs to be swapped into existing algorithms. All other steps (graph skeleton construction, orientation rules) remain unchanged, preserving the original algorithm’s guarantees where applicable.
- Statistical Calibration – Hyper‑parameters (e.g., significance thresholds for change‑point detection, context granularity) are treated explicitly, allowing users to trade off sensitivity to context changes against false‑positive causal edges.
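The two guiding principles can be illustrated with a toy example, offered as a rough sketch under strong assumptions rather than the paper's method. Assumptions: there is a single change point, the regime also shifts the mean of X (so a least‑squares mean‑shift detector on X finds the boundary), and the CI test is an unconditional Fisher z correlation test. The names `best_split` and `context_ci_test` are hypothetical. The sketch follows option (a): pool samples within each detected context and test each pool separately.

```python
import math
import random
from statistics import NormalDist, fmean


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_pvalue(r, n):
    """Two-sided p-value for H0: correlation is zero (Fisher z-transform)."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def best_split(signal, min_size=30):
    """Single change point: the split minimising within-segment squared error."""
    n, total = len(signal), sum(signal)
    best_k, best_gain, left = None, -math.inf, 0.0
    for k in range(1, n):
        left += signal[k - 1]
        if k < min_size or n - k < min_size:
            continue
        # Maximising this gain is equivalent to minimising the summed squared error.
        gain = left ** 2 / k + (total - left) ** 2 / (n - k)
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k


def context_ci_test(x, y, split):
    """Option (a): pool samples within each detected context, test each pool."""
    result = {}
    for name, seg in (("context_0", slice(0, split)),
                      ("context_1", slice(split, None))):
        xs, ys = x[seg], y[seg]
        r = pearson(xs, ys)
        result[name] = (r, fisher_z_pvalue(r, len(xs)))
    return result


# Synthetic toy data: X drives Y in the first regime only,
# and the second regime shifts the mean of X (an assumption of this sketch).
rng = random.Random(0)
n = 500
x = [rng.gauss(0, 1) if t < n else rng.gauss(3, 1) for t in range(2 * n)]
y = [x[t] + 0.3 * rng.gauss(0, 1) if t < n else rng.gauss(0, 1)
     for t in range(2 * n)]

# Step 1: detect the context boundary from the mean shift in X.
split = best_split(x)
# Step 2: run the CI test separately inside each detected context.
per_context = context_ci_test(x, y, split)
for name, (r, p) in per_context.items():
    print(f"{name}: r={r:+.2f}, p={p:.3g}")
```

In this toy setting the X–Y edge should be kept in the first context and is a candidate for removal in the second, whereas a single test on all samples would mix the two regimes. The `min_size` argument plays the role of a context‑granularity hyper‑parameter.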
Results & Findings
- Synthetic Benchmarks – On simulated datasets with known regime switches, the framework recovers the true context‑specific graphs with up to 30 % higher precision than vanilla PC/FCI, while maintaining comparable recall.
- Climate Case Study – Applying the method to a global temperature‑precipitation dataset reveals distinct causal structures between tropical and mid‑latitude regions, and captures seasonal regime shifts that standard stationary methods miss.
- Scalability – Because only the CI test is altered, runtime overhead is modest (roughly 1.2× to 1.5× the runtime of the base algorithm), making the approach feasible for grids with thousands of nodes.
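For readers unfamiliar with how precision and recall are computed for recovered graphs, here is a minimal sketch. The function name `edge_precision_recall` and the example edges are hypothetical and purely illustrative; they are not results from the paper. Returning 1.0 for an empty set is one possible convention, chosen here only to avoid division by zero.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # directed edge (cause, effect)


def edge_precision_recall(true_edges: Iterable[Edge],
                          found_edges: Iterable[Edge]) -> Tuple[float, float]:
    """Precision and recall of a recovered edge set against the true one."""
    truth: Set[Edge] = set(true_edges)
    found: Set[Edge] = set(found_edges)
    hits = len(truth & found)
    precision = hits / len(found) if found else 1.0
    recall = hits / len(truth) if truth else 1.0
    return precision, recall


# Made-up graphs for one context, purely illustrative.
truth = {("SST", "precip"), ("SST", "wind")}
found = {("SST", "precip"), ("SST", "wind"), ("wind", "precip")}
precision, recall = edge_precision_recall(truth, found)
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.67, recall=1.00
```

A spurious edge lowers precision without affecting recall, which is the pattern described in the synthetic benchmarks: fewer false edges at comparable recall.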
Practical Implications
- Better Decision‑Support for Climate Modeling – Engineers can now identify region‑specific causal drivers (e.g., how sea‑surface temperature influences precipitation differently in El Niño vs. La Niña periods), leading to more targeted mitigation strategies.
- Robust Causal Inference in Finance & IoT – Any domain where data streams exhibit regime changes (market crashes, sensor drift) can benefit from context‑aware discovery, reducing spurious edges that would otherwise mislead downstream prediction or control systems.
- Plug‑and‑Play Upgrade – Existing pipelines that already use PC, FCI, or PCMCI can be upgraded with minimal code changes by swapping in the context‑conditioned CI test. This lowers the barrier for industry adoption.
- Interpretability – By explicitly exposing the detected contexts, analysts gain a transparent view of when and where causal relationships hold, supporting more nuanced reporting and compliance documentation.
Limitations & Future Work
- Unobserved Context Ambiguity – When context changes are subtle or overlapping, the detection step may merge distinct regimes, leading to mixed‑context graphs.
- Hyper‑parameter Sensitivity – Choosing thresholds for change‑point detection and context granularity still requires domain expertise; automated selection is an open challenge.
- Scalability to Very High‑Dimensional Grids – While the overhead is modest, memory consumption can grow quickly for dense graphs; future work will explore sparsity‑exploiting CI tests.
- Extension to Non‑Constraint‑Based Methods – The current framework focuses on constraint‑based algorithms; adapting it to score‑based or deep‑learning causal discovery approaches is a promising direction.
Authors
- Martin Rabel
- Jakob Runge
Paper Information
- arXiv ID: 2511.21537v1
- Categories: cs.LG, math.ST
- Published: November 26, 2025