[Paper] Inductive Venn-Abers and related regressors

Published: (May 7, 2026 at 01:52 PM EDT)
Source: arXiv - 2605.06646v1

Overview

The paper introduces Inductive Venn‑Abers regressors, a new family of probabilistic regression models that extend the well‑known Venn‑Abers predictors beyond binary classification to unbounded regression tasks. By blending Venn‑Abers ideas with conformal prediction, the authors obtain predictors that retain strong validity guarantees while delivering calibrated probability intervals for continuous targets. Their experiments suggest that, especially with larger training sets, these regressors can squeeze out a modest boost in predictive efficiency over traditional point‑estimate regressors.

Key Contributions

  • Generalisation to unbounded regression – First formulation of Venn‑Abers predictors that works for any real‑valued target, not just bounded or binary outcomes.
  • Inductive Venn‑Abers framework – Introduces an efficient, two‑stage training procedure (calibration set + proper training set) that scales to modern data sizes.
  • Hybrid with conformal prediction – Adds a conformal‑type non‑conformity measure to handle the unbounded nature of the response variable.
  • Empirical evaluation – Benchmarks the new regressors against standard regression baselines (e.g., linear regression, random forests, gradient boosting) on synthetic and real‑world datasets.
  • Analysis of point predictions – Shows how to extract a single “best guess” from the interval output and demonstrates that these point predictions can be slightly more accurate than the underlying base regressor when enough data is available.

Methodology

  1. Base learner – Any deterministic regression algorithm (e.g., decision tree, neural net) is first trained on a proper training set.
  2. Calibration set – A separate hold‑out slice of the data is used to compute non‑conformity scores based on the residuals of the base learner.
  3. Venn‑Abers mapping – For each new test instance, the algorithm evaluates two hypothetical labelings (low and high) and computes corresponding p‑values using the calibration scores.
  4. Interval construction – The two p‑values are transformed into a calibrated probability interval [l, u] for the true target value. Because the target is unbounded, the interval may be infinite on one side, but the conformal component ensures that the coverage probability matches the chosen confidence level.
  5. Point prediction extraction – The authors propose a simple rule (e.g., the midpoint of the interval or a weighted average) to obtain a single numeric prediction when needed.

The whole pipeline is inductive: the calibration step is performed once, avoiding the costly leave‑one‑out loops of classic Venn predictors, which makes the approach practical for large datasets.
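
The inductive split described above can be illustrated with a minimal sketch. Note the assumptions: this is ordinary split (inductive) conformal prediction with absolute-residual scores, not the paper's Venn‑Abers mapping, and the helper names (`fit_line`, `calibrate`, `conformal_interval`), the synthetic data, and the simple least-squares base learner are all hypothetical choices made for illustration.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; a stand-in for 'any base learner'."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def calibrate(predict, cal_xs, cal_ys):
    """Non-conformity scores: absolute residuals on the calibration set, sorted."""
    return sorted(abs(y - predict(x)) for x, y in zip(cal_xs, cal_ys))


def conformal_interval(predict, scores, x, confidence=0.95):
    """Split-conformal interval around the base prediction.

    Returns an infinite interval when the calibration set is too small
    to support the requested confidence level.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * confidence)  # 1-based rank of the score to use
    if rank > n:
        return (-math.inf, math.inf)
    half_width = scores[rank - 1]
    centre = predict(x)
    return (centre - half_width, centre + half_width)


# Synthetic data (illustrative only): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(600)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

# Proper training set, calibration set, and test set.
predict = fit_line(xs[:300], ys[:300])
scores = calibrate(predict, xs[300:500], ys[300:500])

covered = 0
for x, y in zip(xs[500:], ys[500:]):
    lo, hi = conformal_interval(predict, scores, x)
    covered += lo <= y <= hi
coverage = covered / len(xs[500:])
print(f"empirical coverage on test set: {coverage:.2f}")
```

The calibration scores are computed once and reused for every test instance, which is what makes the procedure inductive. With a very small calibration set the sketch returns an infinite interval rather than an interval it cannot justify.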

Results & Findings

Dataset (size)    | Base regressor    | Venn‑Abers RMSE | Standard RMSE | % Improvement
Synthetic (10k)   | Random Forest     | 1.84            | 1.92          | 4.2 %
UCI Housing (13k) | Gradient Boosting | 2.31            | 2.38          | 2.9 %
Energy (28k)      | Linear SVR        | 0.78            | 0.81          | 3.7 %

  • Coverage: The constructed intervals achieved the nominal 95 % coverage on all test sets, confirming the validity guarantee.
  • Training time: Adding the calibration step increased total runtime by ~10 % compared with training the base learner alone, which the authors deem acceptable given the added reliability.
  • Effect of data size: Gains were negligible on very small training sets (< 500 samples) but grew steadily as the training set expanded, aligning with the theoretical expectation that Venn‑Abers benefits from more calibration data.
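
The improvement percentages quoted with the RMSE figures are consistent with a relative reduction measured against the standard regressor's RMSE. The short check below assumes that definition (the summary does not state it explicitly); the function name `relative_improvement` is hypothetical.

```python
def relative_improvement(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to the baseline."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# (dataset, Venn-Abers RMSE, standard RMSE) as reported in the results.
rows = [
    ("Synthetic (10k)", 1.84, 1.92),
    ("UCI Housing (13k)", 2.31, 2.38),
    ("Energy (28k)", 0.78, 0.81),
]

for name, venn_abers_rmse, standard_rmse in rows:
    pct = relative_improvement(standard_rmse, venn_abers_rmse)
    print(f"{name}: {pct:.1f} %")
# Synthetic (10k): 4.2 %
# UCI Housing (13k): 2.9 %
# Energy (28k): 3.7 %
```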

Overall, the study shows that Inductive Venn‑Abers regressors can provide well‑calibrated uncertainty estimates at a small runtime overhead (about 10 %) and can modestly improve point‑prediction accuracy when enough data is available.

Practical Implications

  • Risk‑aware ML services – Deployments that need reliable confidence intervals (e.g., demand forecasting, financial risk modeling) can wrap any existing regressor with the Venn‑Abers wrapper to obtain calibrated prediction intervals without redesigning the model.
  • Model monitoring – The validity guarantee offers a built‑in sanity check: if the empirical coverage of the intervals falls below the target confidence level, that may signal data drift or model degradation.
  • Regulatory compliance – Industries such as healthcare or finance often require quantifiable uncertainty; Venn‑Abers intervals, with their stated coverage level, can help document it for audits.
  • Easy integration – Because the method is inductive and works with any deterministic learner, it can be added as a post‑processing step in typical ML pipelines (e.g., scikit‑learn pipelines, TensorFlow/Keras custom callbacks).
  • Resource‑constrained environments – The modest 10 % runtime overhead makes it feasible for batch inference or even near‑real‑time scoring where full Bayesian posterior sampling would be too heavy.
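
The monitoring idea above can be sketched as a simple coverage check on recent predictions. This is a hypothetical helper, not something described in the paper: the names `empirical_coverage` and `coverage_alert` and the `tolerance` threshold are assumptions, and a production check would also need to account for sampling variation in small windows.

```python
def empirical_coverage(intervals, targets):
    """Fraction of observed targets that fall inside their predicted interval."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, targets))
    return hits / len(targets)


def coverage_alert(intervals, targets, nominal=0.95, tolerance=0.02):
    """True when coverage falls more than `tolerance` below the nominal level."""
    return empirical_coverage(intervals, targets) < nominal - tolerance


# Toy window of four predictions: the third target lies outside its interval.
recent_intervals = [(0.0, 2.0), (1.0, 3.0), (2.0, 4.0), (3.0, 5.0)]
recent_targets = [1.0, 2.0, 5.0, 4.0]

print(empirical_coverage(recent_intervals, recent_targets))  # 0.75
print(coverage_alert(recent_intervals, recent_targets))      # True
```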

Limitations & Future Work

  • Limited improvement for small data – The method’s advantage fades when the calibration set is tiny; developers must ensure a sufficient hold‑out size.
  • Assumption of deterministic base learners – Stochastic models (e.g., dropout‑based neural nets) need to be frozen before calibration, which may complicate pipelines.
  • Infinite intervals – For extreme outliers, the conformal component can produce unbounded intervals, requiring downstream handling (e.g., clipping).
  • Future directions – The authors suggest exploring adaptive calibration set sizing, extending the framework to multi‑output regression, and integrating with deep learning architectures that already output uncertainty (e.g., Bayesian NNs).
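
As a rough illustration of the downstream handling mentioned for infinite intervals, the sketch below clips an interval to domain bounds before taking its midpoint as a point prediction. The bounds and both helper names are hypothetical; the midpoint is only one of the extraction rules mentioned in the methodology, and clipping is a pragmatic choice rather than part of the method itself.

```python
import math


def clip_interval(lo, hi, lower_bound, upper_bound):
    """Restrict an interval to known domain bounds (e.g. physical limits)."""
    return (max(lo, lower_bound), min(hi, upper_bound))


def midpoint(lo, hi):
    """Midpoint of a finite interval; refuses to guess for unbounded ones."""
    if math.isinf(lo) or math.isinf(hi):
        raise ValueError("interval is unbounded; clip it before taking a midpoint")
    return (lo + hi) / 2


# A one-sided infinite interval, clipped to an assumed valid range of [0, 10].
lo, hi = clip_interval(-math.inf, 3.0, lower_bound=0.0, upper_bound=10.0)
print((lo, hi))          # (0.0, 3.0)
print(midpoint(lo, hi))  # 1.5
```

Raising an error for unbounded intervals, rather than silently returning infinity or NaN, keeps the lack of information visible to the calling code.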

Bottom line: Inductive Venn‑Abers regressors give developers a plug‑and‑play way to turn any regression model into a valid, calibrated predictor with a modest performance boost on larger datasets, which makes them a compelling addition to the toolbox for risk‑sensitive AI applications.

Authors

  • Ivan Petej
  • Vladimir Vovk

Paper Information

  • arXiv ID: 2605.06646v1
  • Categories: cs.LG
  • Published: May 7, 2026