[Paper] DAO-GP: Drift-Aware Online Non-Linear Regression Gaussian Process

Published: December 9, 2025 at 01:12 PM EST
4 min read
Source: arXiv - 2512.08879v1

Overview

The paper introduces DAO‑GP, a drift‑aware online Gaussian‑Process (GP) regression framework that automatically adapts to changing data distributions without the need for manual hyper‑parameter tuning. By embedding drift detection, sparse updates, and a principled decay mechanism, DAO‑GP delivers accurate, uncertainty‑aware predictions even in highly non‑stationary environments—making it a strong candidate for real‑time AI systems.

Key Contributions

  • Fully adaptive, hyper‑parameter‑free online GP – eliminates costly manual tuning and periodic re‑training.
  • Built‑in drift detection & adaptation – automatically detects drift, gauges its type (abrupt, incremental, gradual), and adjusts model behavior on the fly.
  • Sparse, decayed representation – maintains a compact set of inducing points with a decay strategy that discards stale information, keeping memory usage low.
  • Robustness to data snooping – the algorithm processes each observation exactly once, preserving true online learning guarantees.
  • Comprehensive empirical validation – extensive benchmarks on synthetic and real‑world streams show DAO‑GP matches or outperforms state‑of‑the‑art parametric (e.g., online linear models) and non‑parametric (e.g., standard online GPs, kernel recursive least squares) baselines across a variety of drift scenarios.

Methodology

DAO‑GP extends classic Gaussian‑Process regression with three key engineering layers:

  1. Drift‑aware gating – a lightweight statistical test (e.g., Page‑Hinkley or ADWIN) monitors prediction residuals. When drift is detected, the model switches to a “high‑adaptation” mode that temporarily relaxes the influence of older data.
  2. Sparse inducing‑point management – instead of storing every observation, DAO‑GP maintains a dynamic dictionary of representative points. New points are added only if they improve the posterior variance beyond a threshold, keeping the dictionary size bounded.
  3. Decay‑based forgetting – each inducing point carries a weight that exponentially decays with time unless reinforced by recent data. This provides a principled way to forget outdated information without explicit windowing.
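The gating component (step 1 above) can be illustrated with a Page‑Hinkley residual monitor, one of the tests the paper names as an example. This is a minimal sketch, not the paper's implementation: the class name, `delta`, and `threshold` values are illustrative assumptions.

```python
# Hypothetical sketch of a Page-Hinkley monitor over prediction
# residuals; delta and threshold are illustrative, not the paper's.

class PageHinkley:
    """Flags drift when the cumulative deviation of residuals
    from their running mean rises far above its past minimum."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerance for small fluctuations
        self.threshold = threshold  # alarm level
        self.mean = 0.0             # running mean of residuals
        self.cum = 0.0              # cumulative deviation
        self.min_cum = 0.0          # minimum seen so far
        self.n = 0

    def update(self, residual):
        self.n += 1
        self.mean += (residual - self.mean) / self.n
        self.cum += residual - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        # Drift when the cumulative sum climbs well above its minimum.
        return (self.cum - self.min_cum) > self.threshold


ph = PageHinkley()
# Stable residuals around zero, then an abrupt upward shift at index 40.
stream = [0.1, -0.1, 0.05, -0.05] * 10 + [3.0] * 10
alarms = [i for i, r in enumerate(stream) if ph.update(r)]
```

On this synthetic stream the monitor stays quiet through the stable phase and fires within a couple of samples of the shift, which is the behavior that would trigger DAO‑GP's "high‑adaptation" mode.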

All three components are integrated into the standard GP predictive equations, yielding closed‑form updates that can be computed in O(m²) time per step, where m is the current number of inducing points and is typically far smaller than the total number of samples. No gradient‑based hyper‑parameter optimization is required; the model self‑regulates its smoothness and noise levels through the drift‑aware gating logic.
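The interplay of sparse dictionary management and decay‑based forgetting (steps 2 and 3 above) can be sketched as a bounded, weighted dictionary. This is a simplified illustration under assumptions: `RATE`, `MAX_POINTS`, and `NOVELTY` are made‑up constants, and a plain distance test stands in for the posterior‑variance criterion the paper actually uses.

```python
# Illustrative sketch (not the paper's exact update rules): a bounded
# inducing-point dictionary whose weights decay unless reinforced.

RATE = 0.9        # per-step exponential decay factor (assumed)
MAX_POINTS = 4    # dictionary budget (assumed)
NOVELTY = 1.0     # admit a point only if far from existing ones (assumed)

points, weights = [], []

def observe(x):
    """One online step: decay all weights, reinforce or admit, evict."""
    for i in range(len(weights)):
        weights[i] *= RATE                       # stale points fade out
    if points:
        i = min(range(len(points)), key=lambda j: abs(x - points[j]))
        if abs(x - points[i]) < NOVELTY:
            weights[i] = 1.0                     # reinforced by recent data
            return
    points.append(x)                             # novel point admitted
    weights.append(1.0)
    if len(points) > MAX_POINTS:                 # enforce the budget:
        j = min(range(len(weights)), key=weights.__getitem__)
        points.pop(j); weights.pop(j)            # evict the stalest point

# Old regime around 0-10, then drift to a regime around 100-115:
for x in [0.0, 0.2, 10.0, 10.3, 100.0, 105.0, 110.0, 115.0]:
    observe(x)
```

After the simulated drift, the low‑weight points from the old regime are evicted and the dictionary holds only post‑drift points, mirroring how DAO‑GP forgets outdated information without explicit windowing.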

Results & Findings

  • Predictive accuracy – Across 12 benchmark streams (including electricity demand, sensor networks, and financial tick data), DAO‑GP reduced mean absolute error by 8‑15 % compared with the best existing online GP and by up to 30 % against online linear baselines under severe drift.
  • Uncertainty calibration – Reliability diagrams show DAO‑GP’s predictive intervals maintain proper coverage (≈95 % for 95 % intervals) even after abrupt distribution shifts, whereas fixed‑hyperparameter GPs become over‑confident.
  • Memory & latency – The inducing‑point set stabilized around 50–150 points regardless of stream length, leading to constant‑time inference (< 2 ms per sample on a modest CPU) and a memory footprint < 1 MB.
  • Adaptation dynamics – Visualizations of the drift detector’s signal align closely with known change points in the data, confirming that DAO‑GP reacts promptly (within a few samples) to both sudden and gradual drifts.

Practical Implications

  • Edge & IoT analytics – Devices with limited RAM can run DAO‑GP to continuously model sensor readings (e.g., temperature, vibration) while automatically handling sensor drift or environmental changes.
  • Financial & trading systems – Real‑time price prediction models can stay calibrated without manual re‑training, reducing latency and operational risk during market regime shifts.
  • A/B testing & personalization – Online recommendation engines can adapt to evolving user behavior, maintaining accurate confidence estimates that are crucial for exploration‑exploitation strategies.
  • DevOps monitoring – DAO‑GP can serve as a plug‑and‑play anomaly detector for time‑series logs, automatically adjusting to workload spikes or configuration changes without constant hyper‑parameter retuning.

Limitations & Future Work

  • Scalability to ultra‑high‑dimensional inputs – While the sparse inducing‑point scheme curbs memory, kernel computations still suffer in very high dimensions; integrating random feature approximations could help.
  • Drift detector sensitivity – The current statistical test may generate false alarms in highly noisy streams; exploring adaptive thresholds or ensemble drift detectors is a promising direction.
  • Benchmark diversity – Most experiments focus on univariate or low‑dimensional regression; extending validation to multi‑output or spatio‑temporal tasks would strengthen the claim of generality.
  • Theoretical guarantees – Formal regret bounds under drift conditions remain an open question; future work could aim to provide provable performance limits.

Authors

  • Mohammad Abu‑Shaira
  • Ajita Rattani
  • Weishi Shi

Paper Information

  • arXiv ID: 2512.08879v1
  • Categories: cs.LG, cs.AI
  • Published: December 9, 2025