[Paper] Detecting Stochasticity in Discrete Signals via Nonparametric Excursion Theorem
Source: arXiv - 2601.06009v1
Overview
The paper introduces a non‑parametric test that can tell, from a single discrete‑time signal, whether the underlying dynamics are truly stochastic (diffusive) or merely deterministic (periodic or chaotic). By leveraging classic excursion theorems for continuous semimartingales, the authors derive a universal ε⁻² scaling law that holds for any diffusion process but breaks down for deterministic systems. This provides a mathematically grounded alternative to the heuristic entropy‑ or recurrence‑based diagnostics that dominate current practice.
Key Contributions
- Universal excursion scaling law – Shows that the expected number of excursions of size ≥ ε scales as ε⁻² times the quadratic variation for any continuous semimartingale with finite quadratic variation.
- Model‑free diffusion test – Constructs a data‑driven statistic K(ε) that compares empirical excursion counts to the theoretical expectation, summarizing the result with a log‑log slope deviation.
- Robust implementation – Provides a practical algorithm that works on a single discrete time series, requiring no parameter tuning or prior knowledge of the underlying model.
- Extensive validation – Demonstrates accurate classification on canonical stochastic processes, noisy periodic/chaotic maps, and a stochastic Duffing oscillator, outperforming entropy‑based baselines.
- Theoretical‑practical bridge – Connects deep stochastic analysis (excursion and crossing theorems) with a usable tool for engineers and data scientists.
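The ε⁻² exponent in the scaling law can be motivated with a short heuristic, sketched here for standard Brownian motion \(W\) only; it is not the paper's general argument for semimartingales. It assumes excursions are counted as successive exits from a band of half‑width ε that is re‑centred after each exit, which makes the waiting times between exits independent and identically distributed.

```latex
\begin{align*}
\tau_\varepsilon &= \inf\{t \ge 0 : |W_t| = \varepsilon\}, \\
\mathbb{E}[\tau_\varepsilon] &= \mathbb{E}\big[W_{\tau_\varepsilon}^{2}\big] = \varepsilon^{2}
  \quad \text{(optional stopping applied to the martingale } W_t^{2} - t\text{)}, \\
\mathbb{E}[N_\varepsilon(T)] &\approx \frac{T}{\mathbb{E}[\tau_\varepsilon]}
  = \frac{T}{\varepsilon^{2}} = \frac{[W]_T}{\varepsilon^{2}}
  \quad (T \gg \varepsilon^{2}).
\end{align*}
```

Each exit takes ε² units of time on average, so halving ε roughly quadruples the count. A smooth deterministic path has no such small‑scale roughness, and its count grows more slowly as ε shrinks.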
Methodology
- Excursion Counting
- For a given threshold ε, an excursion is a segment of the trajectory that leaves a band of width 2ε around the current level and later returns.
- The algorithm slides a window over the discrete series, counts how many such excursions occur, and records the count \(N_{\varepsilon}\).
- Theoretical Expectation
- For any continuous semimartingale \(X_t\), stochastic calculus tells us that
\[ \mathbb{E}[N_\varepsilon] \approx \frac{[X]_T}{\varepsilon^{2}}, \]
where \([X]_T\) is the quadratic variation (cumulative “roughness”) of the process up to time \(T\).
- Deterministic signals have \([X]_T \approx 0\), so the ε⁻² law collapses.
- Test Statistic
- Compute the ratio
\[ K(\varepsilon) = \frac{N_{\varepsilon}^{\text{emp}}}{N_{\varepsilon}^{\text{theory}}}, \qquad N_{\varepsilon}^{\text{theory}} = \frac{[X]_T}{\varepsilon^{2}}. \]
- Plot \(\log N_{\varepsilon}^{\text{emp}}\) vs. \(\log \varepsilon\) over a range of ε values.
- Fit a straight line; the deviation of its slope from –2 quantifies how closely the data follow the diffusion scaling.
- Decision Rule
- If the slope is within a small tolerance of –2 (or \(K(\varepsilon)\) stays near 1), classify the signal as diffusion‑like.
- Otherwise, label it deterministic (periodic, chaotic, or noise‑free).
The whole pipeline requires only the raw time series and a few hyper‑parameters (ε range, tolerance), all of which can be set automatically based on data length and sampling rate.
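The steps above can be sketched in a few lines of Python. This is an illustrative reading of the summary, not the authors' released code: the function names, the re‑centring rule used to count excursions, the sum‑of‑squared‑increments estimate of \([X]_T\), and the default tolerance of 0.3 are all assumptions made for the example.

```python
import numpy as np


def count_excursions(x, eps):
    """Count moves of size >= eps away from a reference level.

    Assumed rule (not taken from the paper): the band is re-centred on the
    sample that left it, so each count is one exit from a band of width 2*eps.
    """
    values = np.asarray(x, dtype=float).tolist()  # plain floats iterate faster
    ref = values[0]
    count = 0
    for value in values[1:]:
        if abs(value - ref) >= eps:
            count += 1
            ref = value
    return count


def realized_qv(x):
    """Sum of squared increments, used here as an estimate of [X]_T."""
    return float(np.sum(np.diff(np.asarray(x, dtype=float)) ** 2))


def k_statistic(x, eps):
    """K(eps) = empirical count divided by [X]_T / eps^2."""
    return count_excursions(x, eps) * eps**2 / realized_qv(x)


def loglog_slope(x, eps_values):
    """Least-squares slope of log N_eps against log eps."""
    eps_values = np.asarray(eps_values, dtype=float)
    counts = np.array([count_excursions(x, e) for e in eps_values], dtype=float)
    keep = counts > 0  # log(0) is undefined, so drop scales with no excursions
    if keep.sum() < 2:
        raise ValueError("need at least two scales with a nonzero count")
    slope, _intercept = np.polyfit(np.log(eps_values[keep]), np.log(counts[keep]), 1)
    return float(slope)


def is_diffusion_like(x, eps_values, tol=0.3):
    """Hypothetical decision rule: slope within tol of -2."""
    return abs(loglog_slope(x, eps_values) + 2.0) <= tol
```

For a series `x`, a call such as `loglog_slope(x, np.geomspace(a, b, 8))` returns the fitted slope. In this sketch the ε grid should sit several times above the typical sample‑to‑sample increment, because a discretely sampled path overshoots the band edge and the counts at small ε come out low.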
Results & Findings
| System | Ground‑truth | Measured slope (log‑log) | Classification |
|---|---|---|---|
| Standard Brownian motion | Stochastic | –2.01 ± 0.03 | Diffusive |
| Ornstein‑Uhlenbeck process | Stochastic | –1.98 ± 0.04 | Diffusive |
| Logistic map (chaotic) | Deterministic | –1.30 ± 0.12 | Non‑diffusive |
| Sine wave + white noise (low SNR) | Mixed | –1.85 ± 0.07 | Diffusive (detects underlying noise) |
| Stochastic Duffing oscillator | Stochastic | –2.00 ± 0.02 | Diffusive |
Key take‑aways
- The ε⁻² law holds to within the reported uncertainty for all tested diffusions (measured slopes between –2.01 and –1.98), even with state‑dependent volatility.
- Deterministic chaotic maps deviate markedly, producing shallower slopes.
- Adding modest white noise to a deterministic signal pushes the slope toward –2, confirming that the test is sensitive to genuine stochastic components rather than mere irregularity.
Practical Implications
- Signal validation in IoT / sensor networks – Quickly verify whether a sensor’s output contains genuine diffusion‑type noise (e.g., thermal noise) or is dominated by deterministic drift, enabling smarter filtering strategies.
- Financial time‑series diagnostics – Distinguish true market diffusion from algorithmic or deterministic patterns without fitting a specific stochastic model.
- Model selection for system identification – Before committing to a stochastic differential equation (SDE) model, use the excursion test to confirm that the data’s small‑scale structure is compatible with an SDE.
- Robust anomaly detection – A sudden shift from a diffusion‑like slope to a deterministic one could flag sensor failure, regime change, or cyber‑attack.
- Educational tool – Provides a concrete, visual demonstration of quadratic variation and excursion theory for students learning stochastic processes.
Implementation is straightforward in Python or MATLAB (the authors release a small library), and the computational cost is linear in the number of samples, making it suitable for real‑time monitoring.
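To illustrate why the cost can stay linear and fit a real‑time setting, here is a minimal streaming sketch. It is not the authors' library: the class name `StreamingExcursionCounter` is hypothetical, and the counting rule (re‑centre the band on the sample that left it) is an assumption. Each new sample costs a constant amount of work and only two numbers of state are kept.

```python
class StreamingExcursionCounter:
    """Counts excursions of size >= eps one sample at a time (assumed rule)."""

    def __init__(self, eps):
        if eps <= 0:
            raise ValueError("eps must be positive")
        self.eps = float(eps)
        self.ref = None  # current band centre; set by the first sample
        self.count = 0

    def update(self, sample):
        """Feed one sample and return the running excursion count."""
        sample = float(sample)
        if self.ref is None:
            self.ref = sample
        elif abs(sample - self.ref) >= self.eps:
            self.count += 1
            self.ref = sample  # re-centre the band on the exit point
        return self.count


counter = StreamingExcursionCounter(eps=1.0)
for sample in [0.0, 0.4, 1.1, 1.5, 0.0, 0.2, -1.2]:
    running_count = counter.update(sample)
```

Running one counter per ε value gives the counts needed for a slope estimate over a sliding window, at a cost proportional to the number of samples times the number of scales.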
Limitations & Future Work
- Sampling constraints – The method assumes sufficiently high sampling to resolve excursions at the chosen ε scales; very coarse data may under‑count excursions and bias the slope.
- Finite‑time effects – For short recordings, the empirical quadratic variation estimate can be noisy, leading to wider confidence intervals.
- Non‑continuous processes – Pure jump processes (e.g., Lévy flights) violate the continuous‑semimartingale assumption, so the test may misclassify them as non‑diffusive.
- Extension to multivariate signals – Current formulation handles scalar series; extending the excursion framework to vector‑valued data (e.g., multi‑sensor fusion) is an open direction.
The authors suggest exploring adaptive ε‑selection, integrating the test with Bayesian model comparison, and applying it to high‑frequency finance and neuroscience data where the stochastic/deterministic boundary is especially blurry.
Authors
- Sunia Tanweer
- Firas A. Khasawneh
Paper Information
- arXiv ID: 2601.06009v1
- Categories: stat.ML, cs.LG, eess.SP, math.PR, stat.AP
- Published: January 9, 2026