[Paper] Scientific Knowledge-Guided Machine Learning for Vessel Power Prediction: A Comparative Study
Source: arXiv - 2602.18403v1
Overview
This paper tackles a classic problem in maritime engineering: predicting a ship's main‑engine power from its speed. While off‑the‑shelf machine‑learning models can fit the data, they often ignore the well‑known propeller law $P = cV^n$ that governs the bulk of the relationship. The authors propose a hybrid "physics‑informed" framework that first applies the analytical power curve and then uses a lightweight ML model to learn only the residual (the part that deviates because of weather, hull fouling, trim, etc.). The result is a model that is more accurate, especially when data are scarce, and guaranteed to respect the underlying physics.
Key Contributions
- Hybrid modeling pipeline that combines a closed‑form propeller law baseline with a data‑driven residual predictor.
- Comparison of three residual learners – XGBoost, a shallow neural network, and a Physics‑Informed Neural Network (PINN) – against their pure‑data counterparts.
- Demonstration of improved extrapolation in sparsely sampled speed regimes, a known weakness of conventional black‑box regressors.
- Practical validation on real‑world in‑service vessel data, showing consistent gains without added computational overhead.
- Open‑source‑ready recipe for integrating domain knowledge into any regression task where a reliable analytical model exists.
Methodology
Baseline physics model
- The authors start from the calm‑water power curve $P_{\text{base}} = c V^{n}$.
- The coefficient $c$ and exponent $n$ are obtained from sea‑trial data (or ship design specs) by simple linear regression of $\log P$ on $\log V$.
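The log–log fit described above can be sketched in a few lines of NumPy. The speeds, powers, and the "true" $c$ and $n$ below are synthetic placeholders, not values from the paper:

```python
import numpy as np

# Synthetic sea-trial points (speed in knots, power in kW) -- illustrative
# values only; real calibration would use measured sea-trial data.
V = np.array([10.0, 12.0, 14.0, 16.0, 18.0, 20.0])
P = 2.5 * V ** 3.1  # generated with c = 2.5, n = 3.1

# Linear regression of log P on log V:  log P = log c + n * log V
n_hat, log_c_hat = np.polyfit(np.log(V), np.log(P), 1)
c_hat = np.exp(log_c_hat)

print(c_hat, n_hat)  # recovers c ≈ 2.5, n ≈ 3.1 on noise-free data
```

On noisy data the same fit gives a least-squares estimate of the two baseline parameters.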
Residual definition
- For each observation $(V_i, P_i)$ the residual is computed as
  $$r_i = P_i - P_{\text{base}}(V_i)$$
- This residual captures everything the physics model cannot explain: wind, currents, hull fouling, load distribution, etc.
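Concretely, the residual target is just the elementwise gap between observed power and the baseline curve (hypothetical $c$, $n$ and deviations below, for illustration):

```python
import numpy as np

# Hypothetical fitted baseline; c and n are placeholders, not the paper's values.
c, n = 2.5, 3.1
P_base = lambda V: c * V ** n

# Observed operating points: baseline power plus synthetic weather/fouling effects
V_obs = np.array([12.0, 15.0, 18.0])
P_obs = P_base(V_obs) + np.array([120.0, -40.0, 300.0])

# Residuals are what the ML model will be asked to learn
r = P_obs - P_base(V_obs)
print(r)  # → [120., -40., 300.]
```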
Residual learner
- Three separate regressors are trained on the residuals:
  - XGBoost – gradient‑boosted trees, robust to heterogeneous features.
  - Shallow Neural Network – a few fully connected layers, fast to train.
  - Physics‑Informed Neural Network (PINN) – the same NN with an extra loss term that penalizes deviation from the baseline physics during training.
- All models receive the same input feature set (speed, environmental variables, trim, etc.) and output a predicted residual $\hat r$.
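A minimal sketch of the residual-learning step, on synthetic data. scikit-learn's `GradientBoostingRegressor` stands in for XGBoost here, and the wind feature and all hyperparameters are assumptions, since the paper's exact feature set and settings are not reproduced in this summary:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic data: power = baseline + wind-dependent deviation + noise
c, n = 2.5, 3.1
V = rng.uniform(10, 20, 400)
wind = rng.uniform(0, 15, 400)                 # assumed environmental feature
P = c * V**n + 50.0 * wind + rng.normal(0, 20, 400)

# Residual target: observed power minus the physics baseline
r = P - c * V**n

# Any of the three learners slots in here; trees are shown for brevity
X = np.column_stack([V, wind])
model = GradientBoostingRegressor(n_estimators=200).fit(X, r)
r_hat = model.predict(X)
print(np.abs(r - r_hat).mean())  # small in-sample residual error
```

A PINN variant would add a penalty on deviations from the baseline to the training loss; the data flow is otherwise the same.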
Hybrid prediction
- The final power estimate is simply
  $$\hat P = P_{\text{base}}(V) + \hat r$$
- The baseline guarantees that $\hat P$ follows the correct asymptotic trend at very low or very high speeds, even when the ML component has never seen such data.
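The composition itself is a one-liner. In this sketch the residual model is an untrained stand-in that predicts zero (in the paper it would be XGBoost, the NN, or the PINN), which makes the extrapolation guarantee visible: outside the training range, the estimate falls back to the baseline curve:

```python
import numpy as np

# Hypothetical fitted baseline (placeholder c, n) and a stand-in residual model
c, n = 2.5, 3.1
P_base = lambda V: c * V ** n
residual_model = lambda V: np.zeros_like(V)   # stand-in: predicts zero residual

def predict_power(V):
    """Hybrid estimate: physics baseline plus learned residual correction."""
    return P_base(V) + residual_model(V)

# At speeds the ML component has never seen, the estimate still follows c*V^n
V_extrap = np.array([5.0, 25.0])
print(predict_power(V_extrap))
```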
Evaluation
- The dataset is split into dense (well‑covered speeds) and sparse (few observations) regions.
- Metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and a physics‑consistency check (e.g., monotonicity of power vs. speed).
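The two error metrics and the monotonicity check can be sketched as below. The predictor here is just a baseline power law with synthetic noisy observations, not the paper's evaluation setup:

```python
import numpy as np

def mae(y, y_hat):
    return np.abs(y - y_hat).mean()

def rmse(y, y_hat):
    return np.sqrt(((y - y_hat) ** 2).mean())

def is_monotone_increasing(V_grid, predict):
    """Physics-consistency check: predicted power never decreases with speed."""
    P_grid = predict(np.sort(V_grid))
    return bool(np.all(np.diff(P_grid) >= 0))

# Hypothetical predictor and noisy ground truth, for illustration only
predict = lambda V: 2.5 * V ** 3.1
V_grid = np.linspace(8, 22, 50)
P_true = predict(V_grid) + np.random.default_rng(1).normal(0, 10, 50)

print(mae(P_true, predict(V_grid)), rmse(P_true, predict(V_grid)))
print(is_monotone_increasing(V_grid, predict))  # → True
```

RMSE is always at least as large as MAE, so a widening gap between the two flags occasional large errors (e.g., in the sparse speed region).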
Results & Findings
| Model (Residual Learner) | MAE (dense) | MAE (sparse) | RMSE (dense) | RMSE (sparse) |
|---|---|---|---|---|
| XGBoost (pure) | 3.2 % | 7.8 % | 4.1 % | 9.5 % |
| XGBoost + baseline | 2.6 % | 5.1 % | 3.3 % | 6.2 % |
| NN (pure) | 3.5 % | 8.4 % | 4.5 % | 10.2 % |
| NN + baseline | 2.9 % | 5.6 % | 3.7 % | 6.8 % |
| PINN (pure) | 3.1 % | 7.9 % | 4.0 % | 9.7 % |
| PINN + baseline | 2.7 % | 5.3 % | 3.4 % | 6.4 % |
Key take‑aways
- Adding the physics baseline consistently reduces error across all three learners, with the biggest relative improvement (≈30 %) in the sparse speed region.
- The hybrid models preserve the monotonic increase of power with speed, something the pure data‑driven versions occasionally violate.
- Training time remains comparable because the baseline is a closed‑form expression; the residual learners are actually easier to train since they see a smoother target distribution.
Practical Implications
- Weather routing & voyage planning – More reliable power forecasts enable better fuel‑consumption estimates when selecting optimal routes.
- Trim and hull‑form optimization – Engineers can feed the hybrid model into real‑time decision support tools to evaluate how small changes in loading or hull cleanliness affect power demand.
- Regulatory compliance – Accurate power prediction feeds directly into emissions reporting (e.g., IMO EEXI, CII) without needing costly full‑scale sea trials.
- Scalable deployment – The framework requires only a handful of parameters for the baseline and a lightweight ML model, making it suitable for edge devices on board or cloud‑based fleet management platforms.
- Template for other domains – Any system where a solid analytical law exists (aerodynamics, battery discharge, HVAC load) can adopt the same residual‑learning pattern to boost ML performance while staying physically plausible.
Limitations & Future Work
- The baseline curve assumes calm‑water conditions; extreme sea states may introduce non‑linearities that the simple $cV^{n}$ form cannot capture.
- The study used a single vessel class; generalizing across ship types (e.g., tankers vs. container ships) may require vessel‑specific baseline calibrations.
- Only three residual learners were examined; exploring Gaussian Processes or deep ensembles could further improve uncertainty quantification.
- Future research could integrate online learning so the residual model continuously adapts to evolving hull conditions (fouling, retrofits) without retraining from scratch.
Bottom line: By letting physics do the heavy lifting and letting machine learning tidy up the leftovers, the authors deliver a model that’s both trustworthy and practical—exactly the kind of hybrid intelligence developers need when real‑world constraints clash with pure data‑driven optimism.
Authors
- Orfeas Bourchas
- George Papalambrou
Paper Information
- arXiv ID: 2602.18403v1
- Categories: cs.LG
- Published: February 20, 2026