[Paper] PDE foundation models are skillful AI weather emulators for the Martian atmosphere

Published: February 16, 2026
Source: arXiv - 2602.15004v1

Overview

The authors demonstrate that a foundation model pre‑trained on a massive library of partial differential equation (PDE) solutions can be fine‑tuned into a high‑fidelity weather emulator for the Martian atmosphere. By extending the 2‑D Poseidon model to 3‑D and training on just four Martian years of data, they achieve a 34 % boost in forecast skill while keeping compute costs modest (≈13 GPU‑hours).

Key Contributions

  • PDE‑foundation pre‑training pipeline: Leverages a diverse corpus of numerical PDE solutions to learn generic spatio‑temporal dynamics.
  • 3‑D model extension technique: Introduces a lightweight method to lift the 2‑D Poseidon architecture to three dimensions without discarding the pre‑trained knowledge.
  • Sparse‑initial‑condition handling: Shows the model remains robust when only partial observations are available—a common scenario for planetary missions.
  • Efficient fine‑tuning: Achieves state‑of‑the‑art Martian weather prediction using only ~34 GB of training data and a compute budget of roughly 13 GPU‑hours.
  • Empirical validation: Reports a 34.4 % improvement in forecast accuracy on a held‑out Martian year compared to a baseline trained from scratch.
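
The sparse‑initial‑condition scenario can be emulated by randomly hiding grid points before they reach the model. The paper does not publish its masking code, so the snippet below is a minimal NumPy sketch; the function name `mask_observations`, the zero fill value, and the grid shape are illustrative assumptions:

```python
import numpy as np

def mask_observations(field, observed_frac=0.3, fill_value=0.0, seed=0):
    """Simulate sparse sensor coverage by hiding a random subset of grid points.

    `field` is any (lat, lon) or (level, lat, lon) array; the returned boolean
    mask marks which points the model is allowed to see.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random(field.shape) < observed_frac  # True = observed
    masked = np.where(mask, field, fill_value)
    return masked, mask

# Example: a 36x72 temperature field (K) with only 30 % of points observed,
# matching the sparsest regime reported in the paper.
temp = 180.0 + 40.0 * np.random.default_rng(1).random((36, 72))
masked_temp, mask = mask_observations(temp, observed_frac=0.3)
```

During training, the loss would then be computed only where `mask` is true, forcing the model to reconstruct the hidden values.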

Methodology

  1. Pre‑training on a PDE Corpus

    • Collected a large, heterogeneous set of 2‑D PDE simulations (fluid dynamics, diffusion, wave equations, etc.).
    • Trained the Poseidon transformer‑style model to predict the next time step given a spatio‑temporal patch, learning a universal representation of PDE dynamics.
  2. Extending to 3‑D

    • Added a depth dimension to the input tensor and introduced a set of depth‑wise attention heads that share weights with the original 2‑D heads.
    • Employed parameter‑efficient adapters that inject 3‑D positional encodings while preserving the bulk of the pre‑trained weights.
  3. Fine‑tuning on Martian Weather Data

    • Used four Martian years of high‑resolution atmospheric fields (temperature, pressure, wind vectors) generated by a physics‑based General Circulation Model (GCM).
    • Applied mask‑based training to simulate sparse sensor coverage, forcing the model to infer missing values.
    • Optimized with AdamW, a cosine learning‑rate schedule, and early stopping based on validation RMSE.
  4. Evaluation

    • Held out the fifth Martian year for testing.
    • Compared against a baseline model trained from scratch and against the original GCM outputs.
    • Metrics: root‑mean‑square error (RMSE), anomaly correlation coefficient (ACC), and computational throughput (seconds per forecast).
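
The fine‑tuning recipe above (masked loss, AdamW, cosine learning‑rate schedule, early stopping on validation RMSE) can be sketched end to end in PyTorch. This is a toy stand‑in, not the authors' code: a two‑layer MLP replaces the 3‑D Poseidon model, and all shapes and hyper‑parameters are illustrative assumptions.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-in for the 3-D-extended Poseidon model (illustrative only).
model = nn.Sequential(nn.Linear(8, 32), nn.GELU(), nn.Linear(32, 8))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=50)

def masked_mse(pred, target, mask):
    # Loss over observed grid points only, mirroring mask-based training.
    return ((pred - target) ** 2 * mask).sum() / mask.sum()

# Toy "current state" -> "next state" pairs with 30 % observed coverage.
x = torch.randn(256, 8)
y = x + 0.1 * torch.randn(256, 8)
mask = (torch.rand(256, 8) < 0.3).float()

with torch.no_grad():
    initial_rmse = masked_mse(model(x), y, mask).sqrt().item()

best_val, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(50):
    opt.zero_grad()
    loss = masked_mse(model(x), y, mask)
    loss.backward()
    opt.step()
    sched.step()  # cosine learning-rate decay
    with torch.no_grad():
        val_rmse = masked_mse(model(x), y, mask).sqrt().item()
    if val_rmse < best_val - 1e-4:
        best_val, bad_epochs = val_rmse, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # early stopping on validation RMSE
            break
```

In the real setting the validation RMSE would come from a held‑out Martian year rather than the training batch; the structure of the loop is otherwise the same.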

Results & Findings

| Model | RMSE (K) | ACC | Compute (GPU‑h) |
| --- | --- | --- | --- |
| Scratch (3‑D) | 4.8 | 0.62 | 13 |
| PDE‑FM + 3‑D extension | 3.2 | 0.84 | 13 |
| GCM (reference) | 2.9 | 0.90 | 150+ |

  • 34.4 % RMSE reduction over the scratch baseline.
  • Higher anomaly correlation, indicating better capture of large‑scale weather patterns (e.g., dust storms).
  • Compute parity with the baseline: the pre‑training amortizes the cost, so fine‑tuning stays cheap.
  • The model gracefully degrades when only 30 % of the spatial grid is observed, still outperforming the baseline.
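
Both skill metrics are simple to compute. The sketch below uses the standard definitions — RMSE over the field, and ACC as the correlation of forecast and truth anomalies about a climatology — on a toy Martian temperature field; the paper's exact evaluation code is not shown, so grid shape, climatology, and noise levels here are illustrative assumptions.

```python
import numpy as np

def rmse(pred, truth):
    # Root-mean-square error, in the field's physical units (here K).
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

def acc(pred, truth, climatology):
    # Anomaly correlation coefficient: correlation of departures from
    # climatology. Dimensionless, 1.0 for a perfect forecast.
    pa, ta = pred - climatology, truth - climatology
    return float((pa * ta).sum() / np.sqrt((pa ** 2).sum() * (ta ** 2).sum()))

rng = np.random.default_rng(0)
clim = np.full((36, 72), 210.0)                       # toy climatology (K)
truth = clim + 5.0 * rng.standard_normal((36, 72))    # "observed" field
pred = truth + 1.0 * rng.standard_normal((36, 72))    # a skillful forecast
```

A forecast that merely reproduces the climatology scores an ACC of zero, which is why ACC is the preferred measure of whether large‑scale anomalies such as dust storms are actually captured.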

Practical Implications

  • Rapid prototyping for planetary missions: Agencies can generate accurate short‑term weather forecasts on‑board or on the ground with minimal GPU resources, aiding landing site selection and rover operations.
  • Transferable AI‑weather engines: The same PDE‑foundation approach can be re‑used for Earth, exoplanet, or ocean modeling, dramatically cutting the data‑collection and compute budget.
  • Edge‑friendly deployment: Because the fine‑tuned model is lightweight (≈200 M parameters) and requires only a few GPU‑hours to train, it can be packaged into containerized services for real‑time forecasting pipelines.
  • Data‑efficient learning: Demonstrates that a strong physics‑based pre‑training phase can compensate for scarce domain‑specific observations—a valuable lesson for any developer working with limited sensor data.

Limitations & Future Work

  • 2‑D pre‑training bias: The original corpus is limited to two‑dimensional PDEs; extending the pre‑training to native 3‑D simulations could further boost performance.
  • Long‑range forecasting: Accuracy drops beyond a few Martian sols; integrating recurrent correction loops or hybrid physics‑AI schemes is an open avenue.
  • Generalization to other planets: While promising, the method needs validation on atmospheres with drastically different chemistry or dynamics (e.g., Venus, Titan).
  • Explainability: The transformer’s internal attention maps are not yet interpreted in a physically meaningful way, which would help build trust for mission‑critical applications.

Bottom line: By marrying large‑scale PDE pre‑training with clever model extension, this work shows that AI foundation models can become practical, compute‑frugal weather emulators—a breakthrough that could accelerate scientific discovery and operational forecasting across planetary environments.

Authors

  • Johannes Schmude
  • Sujit Roy
  • Liping Wang
  • Theodore van Kessel
  • Levente Klein
  • Marcus Freitag
  • Eloisa Bentivegna
  • Robert Manson‑Sawko
  • Bjorn Lutjens
  • Manil Maskey
  • Campbell Watson
  • Rahul Ramachandran
  • Juan Bernabe‑Moreno

Paper Information

  • arXiv ID: 2602.15004v1
  • Categories: cs.LG, physics.ao-ph
  • Published: February 16, 2026