[Paper] Learning vertical coordinates via automatic differentiation of a dynamical core
Source: arXiv - 2512.17877v1
Overview
This paper introduces a novel way to let a weather model's vertical grid learn its own shape instead of relying on hand‑tuned analytic formulas. By embedding a neural‑network‑based coordinate transformation inside a fully differentiable dynamical core, the authors adjust the grid by gradient descent to minimise simulation error, substantially reducing the spurious motions that normally appear over steep terrain.
Key Contributions
- Learnable terrain‑following coordinate – the authors propose NEUVE (NEUral Vertical Enhancement), a monotonic neural‑network mapping that defines the vertical coordinate as a function of terrain height.
- End‑to‑end differentiable solver – they build a fully differentiable implementation of the 2‑D non‑hydrostatic Euler equations on an Arakawa C‑grid, enabling gradient‑based optimisation of the grid itself.
- Exact metric computation via AD – automatic differentiation (AD) is used to obtain precise geometric metric terms (e.g., Jacobians, metric coefficients), eliminating truncation errors that arise from finite‑difference approximations of coordinate derivatives.
- Coupled physics‑numerics optimisation – the loss (e.g., mean‑squared error against a reference solution) is back‑propagated through the time‑integration scheme to the coordinate parameters, jointly tuning physics and numerics.
- Empirical gains – on standard benchmark cases the learned coordinates reduce the mean‑squared error by factors of 1.4–2 and remove the characteristic vertical‑velocity striations that plague traditional hybrid/SLEVE grids over steep mountains.
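The core trick behind the learnable coordinate is that integrating a strictly positive function always yields a monotonic mapping. A minimal pure‑Python sketch of that idea (the tiny stand‑in "network" `f_theta`, its weights, and the trapezoid quadrature are illustrative assumptions, not the paper's architecture):

```python
import math

def softplus(u):
    """Smooth, strictly positive activation: log(1 + e^u)."""
    return math.log1p(math.exp(u))

def f_theta(x, z, w=0.7, b=0.1):
    """Stand-in for the neural network f_theta(x, z); softplus keeps it > 0."""
    return softplus(w * math.sin(x) * math.cos(z) + b)

def eta(x, z, n=200):
    """Vertical coordinate as the integral of f_theta from 0 to z (trapezoid rule).
    Because the integrand is strictly positive, eta is strictly increasing in z."""
    h = z / n
    total = 0.5 * (f_theta(x, 0.0) + f_theta(x, z))
    for k in range(1, n):
        total += f_theta(x, k * h)
    return h * total

# Monotonicity check: eta increases with height at fixed horizontal position x.
levels = [eta(0.3, z) for z in (0.5, 1.0, 1.5, 2.0)]
assert all(a < b for a, b in zip(levels, levels[1:]))
```

Whatever parameters the optimiser chooses, the positivity constraint alone guarantees that model levels never fold over one another.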
Methodology
- Parametric vertical mapping – The vertical coordinate \( \eta \) is expressed as an integral of a neural‑network output \( f_\theta(x, z) \) that is constrained to be positive, guaranteeing a monotonic mapping from physical height to model levels.
- Differentiable dynamical core – The non‑hydrostatic Euler equations are discretised on an Arakawa C‑grid. All operators (fluxes, pressure gradients, metric terms) are implemented with JAX/PyTorch‑style AD‑compatible code, so the entire forward simulation is a computational graph.
- Metric term calculation – Instead of approximating derivatives of the coordinate transformation with finite differences, the Jacobian \( \partial(x, z)/\partial(x, \eta) \) and related metric coefficients are obtained directly from AD, giving machine‑precision values.
- Training loop – A loss function (typically the L2 norm of the difference between simulated and reference fields after a fixed integration time) is differentiated w.r.t. the neural‑network parameters \( \theta \). Stochastic gradient descent (or Adam) updates \( \theta \) to minimise the loss.
- Evaluation – After training, the learned coordinate is frozen and used in a conventional (non‑gradient) simulation to assess error reduction and visual artefacts.
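The metric‑term point above can be illustrated with a minimal forward‑mode AD built from dual numbers: differentiating a coordinate transform through AD reproduces the analytic derivative to machine precision, while a finite difference carries truncation error. The toy transform \( z(\eta) = \eta + 0.2 \sin \eta \) and the step size are illustrative choices, not the paper's:

```python
import math

class Dual:
    """Minimal forward-mode AD: numbers a + b*eps with eps**2 = 0."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)
    __rmul__ = __mul__

def dsin(x):
    """Chain rule for sin on dual numbers."""
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def z_of_eta(eta):
    """Toy terrain-following transform z(eta) = eta + 0.2*sin(eta) (illustrative)."""
    if isinstance(eta, Dual):
        return eta + 0.2 * dsin(eta)
    return eta + 0.2 * math.sin(eta)

# Metric term dz/deta at eta = 1.0 via AD: exact to machine precision.
ad = z_of_eta(Dual(1.0, 1.0)).dot
exact = 1.0 + 0.2 * math.cos(1.0)

# Centred finite difference with step h carries O(h^2) truncation error.
h = 1e-3
fd = (z_of_eta(1.0 + h) - z_of_eta(1.0 - h)) / (2 * h)

assert abs(ad - exact) < 1e-14          # AD matches the analytic derivative
assert abs(fd - exact) > abs(ad - exact)  # finite difference does not
```

In the paper's JAX/PyTorch‑style implementation the same effect is achieved by the framework's built‑in AD rather than hand‑rolled duals, but the precision argument is identical.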
Results & Findings
| Test case | Baseline (hybrid/SLEVE) MSE | NEUVE‑learned MSE | Error reduction |
|---|---|---|---|
| 2‑D mountain wave (steep slope) | 1.0 (normalised) | ≈ 0.55 | 1.8× |
| Non‑linear gravity‑wave packet | 0.78 | ≈ 0.45 | 1.7× |
| Convective burst over terrain | 0.62 | ≈ 0.44 | 1.4× |
- Visual quality: The characteristic vertical‑velocity “striation” patterns disappear in the NEUVE runs, yielding smoother fields near the surface.
- Stability: Time‑step restrictions remain comparable to traditional grids, indicating that the learned coordinate does not introduce hidden CFL violations.
- Generalisation: Coordinates learned on one topography transferred reasonably well to similar but unseen terrain shapes, suggesting the network captures generic smoothing principles rather than over‑fitting to a single case.
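The error‑reduction column in the table is simply the ratio of baseline to learned MSE; a quick arithmetic check of the reported factors:

```python
# (baseline MSE, NEUVE MSE) for the three benchmark cases in the table above.
cases = {
    "mountain wave":    (1.00, 0.55),
    "gravity wave":     (0.78, 0.45),
    "convective burst": (0.62, 0.44),
}
reductions = {name: base / learned for name, (base, learned) in cases.items()}

# Rounded to one decimal these match the 1.8x, 1.7x, 1.4x figures reported above.
assert round(reductions["mountain wave"], 1) == 1.8
assert round(reductions["gravity wave"], 1) == 1.7
assert round(reductions["convective burst"], 1) == 1.4
```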
Practical Implications
- Reduced manual tuning: Model developers can replace heuristic decay parameters (e.g., in hybrid or SLEVE coordinates) with a single training phase, freeing them from costly trial‑and‑error calibration.
- Improved forecast accuracy in complex terrain: Weather and climate models that operate over mountainous regions (e.g., alpine forecasting, wildfire spread modelling) can benefit from lower numerical noise without redesigning the whole dynamical core.
- Plug‑and‑play component: Because the NEUVE mapping is a thin wrapper around the vertical coordinate, it can be swapped into existing dynamical cores that already support differentiable programming (e.g., JAX‑based or TensorFlow‑based models).
- Potential for adaptive grids: The same framework could be extended to online adaptation, where the coordinate evolves during a simulation to follow moving fronts or convection, opening a path toward fully adaptive mesh refinement with minimal overhead.
- Cross‑disciplinary reuse: The idea of learning metric terms via AD is applicable to ocean models, plasma simulations, or any PDE solver that uses curvilinear coordinates.
Limitations & Future Work
- Computational cost of training: The end‑to‑end differentiable solver is slower than a hand‑coded non‑AD version, making the training phase expensive for high‑resolution 3‑D models.
- Scope of benchmarks: Experiments are limited to 2‑D idealised tests; scaling to full‑physics, global‑scale models remains an open challenge.
- Interpretability of the learned mapping: While monotonicity is guaranteed, the neural network’s internal representation of “optimal smoothing” is opaque, which may hinder trust in operational settings.
- Stability guarantees: The paper shows empirical stability but does not provide a formal proof that the learned coordinate will always satisfy CFL or energy‑conservation constraints.
- Future directions include extending NEUVE to three dimensions, integrating it with physics‑parameterisation schemes (e.g., microphysics), and exploring meta‑learning approaches that can produce transferable coordinate generators across different model configurations.
Authors
- Tim Whittaker
- Seth Taylor
- Elsa Cardoso‑Bihlo
- Alejandro Di Luca
- Alex Bihlo
Paper Information
- arXiv ID: 2512.17877v1
- Categories: physics.ao-ph, cs.LG, physics.flu-dyn
- Published: December 19, 2025