[Paper] Learning vertical coordinates via automatic differentiation of a dynamical core

Published: December 19, 2025
4 min read
Source: arXiv (2512.17877v1)

Overview

This paper introduces a novel way to let a weather model's vertical grid learn its own shape instead of relying on hand‑tuned analytic formulas. By embedding a neural‑network‑based coordinate transformation inside a fully differentiable dynamical core, the authors automatically adjust the grid to minimise simulation error, dramatically reducing the spurious motions that normally appear over steep terrain.

Key Contributions

  • Learnable terrain‑following coordinate – the authors propose the NEUral Vertical Enhancement (NEUVE), a monotonic neural‑network mapping that defines the vertical coordinate as a function of terrain height.
  • End‑to‑end differentiable solver – they build a fully differentiable implementation of the 2‑D non‑hydrostatic Euler equations on an Arakawa C‑grid, enabling gradient‑based optimisation of the grid itself.
  • Exact metric computation via AD – automatic differentiation (AD) is used to obtain precise geometric metric terms (e.g., Jacobians, metric coefficients), eliminating truncation errors that arise from finite‑difference approximations of coordinate derivatives.
  • Coupled physics‑numerics optimisation – the loss (e.g., mean‑squared error against a reference solution) is back‑propagated through the time‑integration scheme to the coordinate parameters, jointly tuning physics and numerics.
  • Empirical gains – on standard benchmark cases the learned coordinates reduce the mean‑squared error by a factor of 1.4–2 and remove the characteristic “vertical velocity striations” that plague traditional hybrid/SLEVE grids over steep mountains.
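The monotonic-mapping idea behind the first contribution can be sketched in a few lines of plain Python (the softplus-style `f_theta` and its constants are invented stand-ins, not the authors' network): any strictly positive function, integrated upward, yields a vertical coordinate that is monotone by construction.

```python
import math

def f_theta(z, w=1.5, b=0.2):
    # Stand-in for the neural network output: softplus keeps it strictly
    # positive, which is the property that guarantees monotonicity.
    return math.log1p(math.exp(w * math.sin(z) + b))

def eta(z, n=200):
    # Vertical coordinate as the integral of a positive function
    # (trapezoidal rule); eta is then strictly increasing in z.
    h = z / n
    s = 0.5 * (f_theta(0.0) + f_theta(z))
    for k in range(1, n):
        s += f_theta(k * h)
    return s * h

levels = [eta(0.1 * i) for i in range(11)]
assert all(a < b for a, b in zip(levels, levels[1:]))  # monotone by construction
```

Swapping `f_theta` for a trained network changes the shape of the levels but not the monotonicity guarantee, which comes entirely from the positivity constraint.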

Methodology

  1. Parametric vertical mapping – The vertical coordinate \( \eta \) is expressed as an integral of a neural network output \( f_\theta(x,z) \) that is constrained to be positive, guaranteeing a monotonic mapping from physical height to model levels.
  2. Differentiable dynamical core – The non‑hydrostatic Euler equations are discretised on an Arakawa C‑grid. All operators (fluxes, pressure gradients, metric terms) are implemented in AD‑compatible (JAX/PyTorch‑style) code, so the entire forward simulation is a computational graph.
  3. Metric term calculation – Instead of approximating derivatives of the coordinate transformation with finite differences, the Jacobian \( \partial(x,z)/\partial(x,\eta) \) and related metric coefficients are obtained directly from AD, giving machine‑precision values.
  4. Training loop – A loss function (typically the L2 norm of the difference between simulated and reference fields after a fixed integration time) is differentiated with respect to the neural network parameters \( \theta \). Stochastic gradient descent (or Adam) updates \( \theta \) to minimise the loss.
  5. Evaluation – After training, the learned coordinate is frozen and used in a conventional (non‑gradient) simulation to assess error reduction and visual artefacts.
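Step 3 can be demonstrated without any framework using dual numbers, a minimal form of forward-mode AD (the toy transformation `z_of_eta` is invented for illustration; the paper's solver uses JAX/PyTorch-style AD instead): the dual-number derivative of the metric term matches the analytic value to machine precision, while a centred finite difference retains \( O(h^2) \) truncation error.

```python
import math

class Dual:
    """Minimal forward-mode AD: a value paired with its derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)
    __rmul__ = __mul__

def dsin(x):
    # Chain rule for sin applied to a dual number.
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def z_of_eta(eta):
    # Toy terrain-following transformation (illustrative, not the paper's).
    return eta + 0.3 * dsin(eta) if isinstance(eta, Dual) else eta + 0.3 * math.sin(eta)

eta0 = 0.7
exact = 1.0 + 0.3 * math.cos(eta0)                        # analytic dz/d(eta)
ad = z_of_eta(Dual(eta0, 1.0)).dot                        # forward-mode AD
h = 1e-3
fd = (z_of_eta(eta0 + h) - z_of_eta(eta0 - h)) / (2 * h)  # centred finite difference

assert abs(ad - exact) < 1e-12   # AD reproduces the metric term exactly
assert abs(fd - exact) > 1e-9    # FD leaves O(h^2) truncation error
```

This is the elimination of truncation error the Key Contributions refer to: the derivative of the coordinate transformation is computed by the chain rule, not approximated on the grid.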

Results & Findings

| Test case | Baseline (hybrid/SLEVE) MSE | NEUVE‑learned MSE | Error reduction |
| --- | --- | --- | --- |
| 2‑D mountain wave (steep slope) | 1.0 (normalised) | ≈ 0.55 | 1.8× |
| Non‑linear gravity‑wave packet | 0.78 | ≈ 0.45 | 1.7× |
| Convective burst over terrain | 0.62 | ≈ 0.44 | 1.4× |
  • Visual quality: The characteristic vertical‑velocity “striation” patterns disappear in the NEUVE runs, yielding smoother fields near the surface.
  • Stability: Time‑step restrictions remain comparable to traditional grids, indicating that the learned coordinate does not introduce hidden CFL violations.
  • Generalisation: Coordinates learned on one topography transferred reasonably well to similar but unseen terrain shapes, suggesting the network captures generic smoothing principles rather than over‑fitting to a single case.
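The stability point can be made concrete with the usual vertical CFL bound \( \Delta t \le C\,\min_k \Delta z_k / |w|_{\max} \); the sketch below (grid spacings and wind speed are invented numbers, not from the paper) shows that the smallest cell of a stretched grid, not its average spacing, sets the admissible time step.

```python
# Illustrative vertical CFL check: the smallest grid spacing sets the
# time-step limit. Spacings and wind speed below are invented numbers.
def max_stable_dt(spacings, w_max, courant=1.0):
    return courant * min(spacings) / w_max

uniform = [100.0] * 10                         # uniform 100 m levels
stretched = [40.0, 60.0, 80.0] + [110.0] * 7   # refined near the surface
w_max = 5.0                                    # m/s, invented

dt_uniform = max_stable_dt(uniform, w_max)      # 20.0 s
dt_stretched = max_stable_dt(stretched, w_max)  # 8.0 s
assert dt_stretched < dt_uniform
```

A learned coordinate therefore stays CFL-comparable to a traditional grid as long as it does not shrink its smallest cell much further, which is what the authors observe empirically.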

Practical Implications

  • Reduced manual tuning: Model developers can replace heuristic decay parameters (e.g., in hybrid or SLEVE coordinates) with a single training phase, freeing them from costly trial‑and‑error calibration.
  • Improved forecast accuracy in complex terrain: Weather and climate models that operate over mountainous regions (e.g., alpine forecasting, wildfire spread modelling) can benefit from lower numerical noise without redesigning the whole dynamical core.
  • Plug‑and‑play component: Because the NEUVE mapping is a thin wrapper around the vertical coordinate, it can be swapped into existing dynamical cores that already support differentiable programming (e.g., JAX‑based or TensorFlow‑based models).
  • Potential for adaptive grids: The same framework could be extended to online adaptation, where the coordinate evolves during a simulation to follow moving fronts or convection, opening a path toward fully adaptive mesh refinement with minimal overhead.
  • Cross‑disciplinary reuse: The idea of learning metric terms via AD is applicable to ocean models, plasma simulations, or any PDE solver that uses curvilinear coordinates.

Limitations & Future Work

  • Computational cost of training: The end‑to‑end differentiable solver is slower than a hand‑coded non‑AD version, making the training phase expensive for high‑resolution 3‑D models.
  • Scope of benchmarks: Experiments are limited to 2‑D idealised tests; scaling to full‑physics, global‑scale models remains an open challenge.
  • Interpretability of the learned mapping: While monotonicity is guaranteed, the neural network’s internal representation of “optimal smoothing” is opaque, which may hinder trust in operational settings.
  • Stability guarantees: The paper shows empirical stability but does not provide a formal proof that the learned coordinate will always satisfy CFL or energy‑conservation constraints.
  • Future directions include extending NEUVE to three dimensions, integrating it with physics‑parameterisation schemes (e.g., microphysics), and exploring meta‑learning approaches that can produce transferable coordinate generators across different model configurations.

Authors

  • Tim Whittaker
  • Seth Taylor
  • Elsa Cardoso‑Bihlo
  • Alejandro Di Luca
  • Alex Bihlo

Paper Information

  • arXiv ID: 2512.17877v1
  • Categories: physics.ao-ph, cs.LG, physics.flu-dyn
  • Published: December 19, 2025