[Paper] DAO-GP: Drift-Aware Online Non-Linear Gaussian-Process Regression
Source: arXiv - 2512.08879v1
Overview
The paper introduces DAO‑GP, a drift‑aware online Gaussian‑Process (GP) regression framework that adapts automatically to changing data distributions without manual hyper‑parameter tuning. By combining drift detection, sparse updates, and a principled decay mechanism, DAO‑GP delivers accurate, uncertainty‑aware predictions even in highly non‑stationary environments, making it a strong candidate for real‑time AI systems.
Key Contributions
- Fully adaptive, hyperparameter‑free online GP – eliminates the need for costly manual tuning or periodic re‑training.
- Built‑in drift detection & adaptation – automatically gauges drift severity (abrupt, incremental, gradual) and adjusts model behavior on the fly.
- Sparse, decayed representation – maintains a compact set of inducing points with a decay strategy that discards stale information, keeping memory usage low.
- Single‑pass online learning – each observation is processed exactly once, which rules out data snooping and preserves true online‑learning guarantees.
- Comprehensive empirical validation – extensive benchmarks on synthetic and real‑world streams show DAO‑GP matches or outperforms state‑of‑the‑art parametric (e.g., online linear models) and non‑parametric (e.g., standard online GPs, kernel recursive least squares) baselines across a variety of drift scenarios.
Methodology
DAO‑GP extends classic Gaussian‑Process regression with three key engineering layers:
- Drift‑aware gating – a lightweight statistical test (e.g., Page‑Hinkley or ADWIN) monitors prediction residuals. When drift is detected, the model switches to a “high‑adaptation” mode that temporarily relaxes the influence of older data.
- Sparse inducing‑point management – instead of storing every observation, DAO‑GP maintains a dynamic dictionary of representative points. New points are added only if they improve the posterior variance beyond a threshold, keeping the dictionary size bounded.
- Decay‑based forgetting – each inducing point carries a weight that exponentially decays with time unless reinforced by recent data. This provides a principled way to forget outdated information without explicit windowing.
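The gating step can be illustrated with a Page‑Hinkley test, one of the example detectors named above. Below is a minimal sketch of the standard Page‑Hinkley statistic applied to a residual stream; the `delta` and `threshold` defaults are illustrative choices, not the paper's settings:

```python
class PageHinkley:
    """Page-Hinkley test for detecting an upward shift in the mean of a
    stream (e.g., prediction residuals). Illustrative sketch only."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerance for small fluctuations
        self.threshold = threshold  # alarm threshold (lambda)
        self.mean = 0.0             # running mean of the stream
        self.cum = 0.0              # cumulative deviation statistic m_t
        self.min_cum = 0.0          # minimum of m_t seen so far
        self.n = 0

    def update(self, x):
        """Feed one observation; return True when drift is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold
```

In a DAO‑GP‑style loop, a `True` return would trigger the "high‑adaptation" mode that down‑weights older data.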
All three components are integrated into the standard GP predictive equations, yielding closed‑form updates that cost O(m²) time per step, where m is the current number of inducing points (typically far smaller than the total number of samples). No gradient‑based hyper‑parameter optimization is required; the model self‑regulates its smoothness and noise levels through the drift‑aware gating logic.
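The variance‑gated admission and decay‑based forgetting can be combined into a toy online sparse GP. The sketch below is not the paper's algorithm: class and parameter names (`var_thresh`, `decay`, `w_min`) are assumptions, and it re‑solves the full m×m system per prediction for clarity (O(m³)) where the paper maintains incremental O(m²) updates:

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """Squared-exponential kernel between row-stacked inputs a and b."""
    d = a[:, None, :] - b[None, :, :]
    return np.exp(-0.5 * np.sum(d ** 2, axis=-1) / ls ** 2)

class DecayedSparseGP:
    """Toy online GP with a decayed inducing-point dictionary (illustrative)."""

    def __init__(self, noise=0.1, var_thresh=0.05, decay=0.99, w_min=0.1):
        self.noise = noise            # observation noise variance
        self.var_thresh = var_thresh  # admit points the model is uncertain about
        self.decay = decay            # per-step exponential weight decay
        self.w_min = w_min            # prune points whose weight falls below this
        self.X = np.empty((0, 1))     # inducing inputs
        self.y = np.empty(0)          # their targets
        self.w = np.empty(0)          # decaying relevance weights

    def predict(self, x):
        """Posterior mean and variance at a scalar input x."""
        x = np.atleast_2d(x)
        if len(self.y) == 0:
            return 0.0, 1.0  # prior mean and variance
        K = rbf(self.X, self.X) + self.noise * np.eye(len(self.y))
        k = rbf(x, self.X)
        mean = float(k @ np.linalg.solve(K, self.y))
        var = float(1.0 - k @ np.linalg.solve(K, k.T))
        return mean, max(var, 0.0)

    def update(self, x, y):
        """One online step: decay weights, gate admission, prune stale points."""
        mean, var = self.predict(x)
        if len(self.w):
            # decay all weights, but reinforce points similar to the new input
            sim = rbf(np.atleast_2d(x), self.X).ravel()
            self.w = np.maximum(self.w * self.decay, sim)
        # admit the new point only where posterior variance is still high
        if var > self.var_thresh:
            self.X = np.vstack([self.X, np.atleast_2d(x)])
            self.y = np.append(self.y, y)
            self.w = np.append(self.w, 1.0)
        # forget points whose weight has decayed away
        keep = self.w >= self.w_min
        self.X, self.y, self.w = self.X[keep], self.y[keep], self.w[keep]
        return mean, var
```

Feeding a long stream keeps the dictionary bounded: dense regions stop admitting points once their variance drops below the threshold, and unreinforced points are eventually pruned.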
Results & Findings
- Predictive accuracy – Across 12 benchmark streams (including electricity demand, sensor networks, and financial tick data), DAO‑GP reduced mean absolute error by 8‑15 % compared with the best existing online GP and by up to 30 % against online linear baselines under severe drift.
- Uncertainty calibration – Reliability diagrams show DAO‑GP’s predictive intervals maintain proper coverage (≈95 % for 95 % intervals) even after abrupt distribution shifts, whereas fixed‑hyperparameter GPs become over‑confident.
- Memory & latency – The inducing‑point set stabilized around 50–150 points regardless of stream length, leading to constant‑time inference (< 2 ms per sample on a modest CPU) and a memory footprint < 1 MB.
- Adaptation dynamics – Visualizations of the drift detector’s signal align closely with known change points in the data, confirming that DAO‑GP reacts promptly (within a few samples) to both sudden and gradual drifts.
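The calibration claim above corresponds to a standard reliability computation: the fraction of targets falling inside the model's central predictive interval, which should match the nominal level. A small helper (illustrative, not the paper's evaluation code) under Gaussian predictive distributions:

```python
import numpy as np
from statistics import NormalDist

def empirical_coverage(y_true, mean, std, level=0.95):
    """Fraction of targets inside the central `level` Gaussian predictive
    interval; a well-calibrated forecaster scores close to `level`."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # ~1.96 for a 95 % interval
    y_true, mean, std = map(np.asarray, (y_true, mean, std))
    return float(np.mean(np.abs(y_true - mean) <= z * std))
```

Tracking this quantity over a sliding window is one way to see the over‑confidence of fixed‑hyperparameter GPs after a shift: their reported coverage drops well below the nominal 95 %.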
Practical Implications
- Edge & IoT analytics – Devices with limited RAM can run DAO‑GP to continuously model sensor readings (e.g., temperature, vibration) while automatically handling sensor drift or environmental changes.
- Financial & trading systems – Real‑time price prediction models can stay calibrated without manual re‑training, reducing latency and operational risk during market regime shifts.
- A/B testing & personalization – Online recommendation engines can adapt to evolving user behavior, maintaining accurate confidence estimates that are crucial for exploration‑exploitation strategies.
- DevOps monitoring – DAO‑GP can serve as a plug‑and‑play anomaly detector for time‑series logs, automatically adjusting to workload spikes or configuration changes without constant hyper‑parameter retuning.
Limitations & Future Work
- Scalability to ultra‑high‑dimensional inputs – While the sparse inducing‑point scheme curbs memory, kernel computations still suffer in very high dimensions; integrating random feature approximations could help.
- Drift detector sensitivity – The current statistical test may generate false alarms in highly noisy streams; exploring adaptive thresholds or ensemble drift detectors is a promising direction.
- Benchmark diversity – Most experiments focus on univariate or low‑dimensional regression; extending validation to multi‑output or spatio‑temporal tasks would strengthen the claim of generality.
- Theoretical guarantees – Formal regret bounds under drift conditions remain an open question; future work could aim to provide provable performance limits.
Authors
- Mohammad Abu‑Shaira
- Ajita Rattani
- Weishi Shi
Paper Information
- arXiv ID: 2512.08879v1
- Categories: cs.LG, cs.AI
- Published: December 9, 2025