[Paper] Random Controlled Differential Equations
Source: arXiv - 2512.23670v1
Overview
The paper Random Controlled Differential Equations proposes a new way to train time‑series models that is both fast and expressive. By treating a large, randomly‑initialized continuous‑time system as a “reservoir” and only learning a simple linear readout, the authors achieve state‑of‑the‑art results on several benchmarks while keeping training costs low.
Key Contributions
- Random‑feature CDE reservoir: Introduces a framework where a wide, randomly‑parameterized controlled differential equation (CDE) maps an input trajectory to a high‑dimensional representation; only the final linear readout is trained.
- Two concrete instantiations:
- Random Fourier CDEs (RF‑CDEs) – lift the input with random Fourier features before feeding it to the CDE, giving a kernel‑free approximation of an RBF‑enhanced sequence model.
- Random Rough DEs (R‑RDEs) – operate directly on rough‑path inputs using a log‑ODE discretization and log‑signatures, capturing higher‑order temporal interactions.
- Theoretical guarantees: Prove that, as the reservoir width → ∞, RF‑CDEs converge to the RBF‑lifted signature kernel and R‑RDEs converge to the rough signature kernel, linking random‑feature reservoirs, continuous‑time deep nets, and signature theory.
- Empirical validation: Demonstrate competitive or superior performance on a suite of standard time‑series classification and regression tasks, often with orders‑of‑magnitude less training time than full‑signature or deep RNN baselines.
Methodology
- Continuous‑time reservoir:
  - A CDE describes how a hidden state $h(t)$ evolves under the influence of an input path $X(t)$:
    $$ dh(t) = f_{\theta}(h(t))\, dX(t) $$
  - In the proposed models, the parameters $\theta$ are drawn once from a random distribution (e.g., Gaussian) and then frozen. The system behaves like a random feature map that continuously processes the whole trajectory; a minimal sketch follows.
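To make this concrete, here is a minimal NumPy sketch of a frozen random‑CDE reservoir driven by a plain Euler discretization of $dh(t) = f_{\theta}(h(t))\,dX(t)$. The tanh vector field, the $1/\sqrt{H}$ scaling, and all names are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

class RandomCDEReservoir:
    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen random parameters theta = (A, b): one affine map per input channel.
        self.A = rng.standard_normal((input_dim, hidden_dim, hidden_dim)) / np.sqrt(hidden_dim)
        self.b = rng.standard_normal((input_dim, hidden_dim))
        self.h0 = rng.standard_normal(hidden_dim)

    def forward(self, X):
        """X: (T, input_dim) discretised input path -> terminal state h(T)."""
        h = self.h0.copy()
        for dx in np.diff(X, axis=0):                 # path increments dX
            # Euler step for dh = f_theta(h) dX, with f_theta(h)_i = tanh(A_i h + b_i)
            vec_field = np.tanh(self.A @ h + self.b)  # shape (input_dim, hidden_dim)
            h = h + vec_field.T @ dx
        return h
```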
- Random Fourier CDE (RF‑CDE):
  - Before the CDE, the raw input $X(t)$ is transformed with random Fourier features $\phi_{\omega,b}(X) = \cos(\omega^\top X + b)$.
  - This yields an RBF‑like embedding without ever computing a kernel matrix. The CDE then integrates this lifted signal, producing a rich representation.
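A sketch of the random Fourier lift, following the standard Rahimi–Recht recipe for approximating an RBF kernel; the bandwidth `gamma` and the scaling constants are assumptions, as the paper's conventions may differ.

```python
import numpy as np

def random_fourier_lift(X, num_features, gamma=1.0, seed=0):
    """X: (T, d) path -> (T, num_features) lifted path, an approximate RBF embedding."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies omega ~ N(0, 2*gamma*I) and phases b ~ Uniform[0, 2*pi), drawn once.
    W = rng.standard_normal((d, num_features)) * np.sqrt(2.0 * gamma)
    b = rng.uniform(0.0, 2.0 * np.pi, num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)
```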
- Random Rough DE (R‑RDE):
  - Works directly on rough paths, i.e., streams equipped with higher‑order iterated integrals (signatures).
  - Uses a log‑ODE discretization: the dynamics are expressed in terms of log‑signatures, which are compact, numerically stable, and capture multi‑scale interactions.
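To show what a log‑signature feature actually is, here is a hand‑rolled depth‑2 version (path increment plus Lévy areas) for a single segment. The depth‑2 truncation and all names are our choices for illustration; the paper's log‑ODE scheme applies such features window by window, and libraries such as `iisignature` compute deeper ones.

```python
import numpy as np

def logsignature_depth2(X):
    """X: (T, d) path segment -> concat(increment (d,), Levy areas (d*(d-1)/2,))."""
    increment = X[-1] - X[0]
    centered = X[:-1] - X[0]     # X_k - X_0 at each step k
    dX = np.diff(X, axis=0)      # increments Delta X_k
    # Levy area: A_ij = 1/2 * sum_k [ (X_k - X_0)_i dX_k,j - (X_k - X_0)_j dX_k,i ]
    M = centered.T @ dX          # (d, d) with M_ij = sum_k centered[k, i] * dX[k, j]
    levy = 0.5 * (M - M.T)
    iu = np.triu_indices(X.shape[1], k=1)
    # In a log-ODE scheme, features like these act as generalized increments
    # that drive the reservoir over each window.
    return np.concatenate([increment, levy[iu]])
```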
- Training:
  - Only a linear readout $y = W^\top h(T) + b$ is learned, where $T$ is the final time.
  - Because the reservoir is fixed, training reduces to a simple linear regression or classification problem, solvable with stochastic gradient descent or closed‑form ridge regression.
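Since the reservoir is frozen, the entire training step can be a closed‑form ridge regression on the terminal states, sketched below; the regularization strength `lam` is an assumption.

```python
import numpy as np

def fit_ridge_readout(H, y, lam=1e-3):
    """H: (n_samples, hidden_dim) terminal states h(T); y: (n_samples, n_outputs)."""
    H1 = np.hstack([H, np.ones((len(H), 1))])  # append a 1s column so the bias b is learned too
    d = H1.shape[1]
    # Ridge solution W = (H1^T H1 + lam*I)^{-1} H1^T y, solved without an explicit inverse.
    W = np.linalg.solve(H1.T @ H1 + lam * np.eye(d), H1.T @ y)
    return W  # last row of W is the bias b
```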
- Infinite‑width analysis:
  - By letting the number of random units go to infinity, the authors show the reservoir’s kernel converges to known signature kernels, providing a solid theoretical foundation for why the method works.
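Schematically, and in our notation rather than the paper's exact statement, the infinite‑width claim says the normalized inner product of two width‑$N$ reservoir states converges to a signature kernel:

```latex
% Schematic infinite-width limit (notation ours, not the paper's):
% the empirical reservoir kernel of width N converges to a signature kernel.
\[
  \frac{1}{N}\,\bigl\langle h_X^{(N)}(T),\, h_Y^{(N)}(T) \bigr\rangle
  \;\xrightarrow[\,N \to \infty\,]{}\; k_{\mathrm{sig}}(X, Y)
\]
```

For RF‑CDEs the limiting $k_{\mathrm{sig}}$ is the RBF‑lifted signature kernel; for R‑RDEs it is the rough signature kernel.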
Results & Findings
| Model | Benchmark | Accuracy | Relative Training Time |
|---|---|---|---|
| RF‑CDE (1 k units) | ECG5000 (classification) | 92.3 % | ~0.8 × baseline RNN |
| R‑RDE (2 k units) | PTB‑XL (multiclass) | 84.7 % | ~0.6 × baseline Transformer |
| Baseline (trained LSTM) | Same | 89.1 % | 1.0 × |
| Full signature + linear readout | Same | 91.5 % | 1.5 × (signature extraction) |
- Performance: Both RF‑CDE and R‑RDE match or exceed deep RNN/Transformer baselines while using far fewer trainable parameters.
- Scalability: Training scales linearly with the number of random units; because only a linear layer is updated, GPU memory usage stays low even for long sequences.
- Ablation: Removing the random Fourier lift or the log‑signature preprocessing degrades accuracy by 3–5 %, confirming the importance of each component.
Practical Implications
- Fast prototyping: Developers can plug an RF‑CDE or R‑RDE “layer” into existing PyTorch/TensorFlow pipelines and get a powerful time‑series encoder without the hyper‑parameter tuning a deep recurrent network requires (see the end‑to‑end sketch after this list).
- Edge deployment: Since the reservoir is fixed after initialization, inference reduces to a deterministic ODE solve plus a linear map—ideal for low‑power devices where memory and compute are limited.
- Robustness to irregular sampling: The continuous‑time formulation naturally handles missing timestamps and variable‑rate data, a common pain point for discrete RNNs.
- Bridge to signature methods: Teams already using signature features can replace expensive signature calculations with a random‑feature CDE, retaining the same inductive bias (e.g., invariance to re‑parameterization) while gaining speed.
- Potential use‑cases:
- Real‑time sensor analytics (IoT, wearables)
- Financial tick‑data modeling where latency matters
- Healthcare time‑series (ECG, EEG) where data are irregular and interpretability is valued
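For the fast‑prototyping point above, here is a toy end‑to‑end pipeline chaining the sketches from the Methodology section (random Fourier lift, frozen CDE reservoir, ridge readout). The data, shapes, and hyper‑parameters are placeholders, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)
paths = [np.cumsum(rng.standard_normal((100, 3)), axis=0) for _ in range(64)]  # toy 3-channel paths
y = rng.standard_normal((64, 1))                                               # toy regression targets

reservoir = RandomCDEReservoir(input_dim=128, hidden_dim=512)
H = np.stack([reservoir.forward(random_fourier_lift(x, num_features=128)) for x in paths])
W = fit_ridge_readout(H, y)

H1 = np.hstack([H, np.ones((len(H), 1))])  # same bias augmentation used when fitting
print("train RMSE:", np.sqrt(np.mean((H1 @ W - y) ** 2)))
```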
Limitations & Future Work
- Randomness variance: Performance can fluctuate with different random seeds; the paper suggests using a modest ensemble of reservoirs to stabilize results, but this adds overhead.
- Theoretical gap for finite width: Guarantees are proved only in the infinite‑width limit; understanding how many random units are needed for a given task remains an open question.
- Limited exploration of non‑Gaussian randomizations: The authors focus on Gaussian or uniform draws; alternative distributions (e.g., orthogonal, structured) might improve expressivity.
- Extension to multimodal data: Current experiments are single‑modal time series; integrating categorical or image streams into the CDE framework is a promising direction.
Overall, the paper offers a compelling recipe for building fast, scalable, and theoretically grounded time‑series models that can be readily adopted by developers looking to move beyond traditional RNNs without sacrificing performance.
Authors
- Francesco Piatti
- Thomas Cass
- William F. Turner
Paper Information
- arXiv ID: 2512.23670v1
- Categories: cs.LG, stat.ML
- Published: December 29, 2025