[Paper] SymPlex: A Structure-Aware Transformer for Symbolic PDE Solving

Published: February 3, 2026 at 01:18 PM EST

Source: arXiv - 2602.03816v1

Overview

The paper introduces SymPlex, a reinforcement‑learning system that automatically discovers exact symbolic formulas for solutions of partial differential equations (PDEs). It treats the search for a formula as a tree‑structured decision problem and uses a novel structure‑aware Transformer (SymFormer) to work directly in the space of mathematical expressions. The resulting solutions are human‑readable and interpretable, and the system never sees a ground‑truth answer during training.

Key Contributions

SymPlex framework: Casts symbolic PDE solving as a reinforcement‑learning problem that optimizes candidate expressions using only the PDE and its boundary conditions.
SymFormer architecture: A Transformer variant that respects the hierarchical tree structure of mathematical expressions via tree‑relative self‑attention and guarantees syntactically valid outputs through grammar‑constrained autoregressive decoding.
Structure‑aware generation: Moves beyond the flat token sequences used by standard language models to directly model the nested, tree‑like nature of symbolic math, improving expressivity and correctness.
Exact recovery of non‑smooth and parametric solutions: Demonstrates that the system can discover closed‑form solutions that include piecewise definitions, absolute values, and explicit parameter dependencies—cases where numerical or implicit neural solvers struggle.
Empirical validation: Shows that SymPlex matches or exceeds prior symbolic regression baselines on a suite of benchmark PDEs, achieving 100 % exact recovery on several challenging examples.

Methodology

Problem formulation – The goal is to find a symbolic expression \(u(x)\) that satisfies a given PDE \(\mathcal{L}[u]=0\) together with its boundary/initial conditions. No training data (i.e., known solutions) is provided.
Tree‑structured action space – Each candidate solution is represented as a syntax tree (operators as internal nodes, variables/constants as leaves). The RL agent incrementally builds this tree, choosing the next node based on the current partial structure.
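As a rough illustration of this representation, the sketch below builds a syntax tree from a token sequence and prints it in infix form. The token names, the arity table, and the choice of prefix (pre‑order) construction are assumptions made for the example; the summary does not specify the order in which SymPlex expands nodes.

```python
from dataclasses import dataclass, field

# Hypothetical vocabulary: token -> arity (number of children it requires).
ARITY = {"add": 2, "mul": 2, "sin": 1, "abs": 1, "x": 0, "t": 0, "c": 0}
INFIX = {"add": "+", "mul": "*"}


@dataclass
class Node:
    token: str
    children: list = field(default_factory=list)


def build_tree(tokens):
    """Build a syntax tree from a prefix-order token sequence."""
    it = iter(tokens)

    def parse():
        tok = next(it)
        node = Node(tok)
        # Operators are internal nodes; variables/constants (arity 0) are leaves.
        for _ in range(ARITY[tok]):
            node.children.append(parse())
        return node

    root = parse()
    if next(it, None) is not None:
        raise ValueError("extra tokens after a complete expression")
    return root


def to_infix(node):
    """Render the tree as a human-readable formula."""
    if not node.children:
        return node.token
    if len(node.children) == 1:
        return f"{node.token}({to_infix(node.children[0])})"
    left, right = (to_infix(c) for c in node.children)
    return f"({left} {INFIX[node.token]} {right})"


print(to_infix(build_tree(["add", "mul", "c", "x", "abs", "t"])))
# prints: ((c * x) + abs(t))
```

Each call to parse() corresponds to one decision of the agent: the token chosen at a node fixes how many child slots must still be filled.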
SymFormer encoder‑decoder
- Encoder processes the PDE description (operators, variables, boundary terms) using a standard Transformer.
- Decoder generates the solution tree. It uses tree‑relative self‑attention: attention scores are computed relative to the parent, sibling, and ancestor positions, preserving the hierarchical dependencies of math expressions.
- Grammar constraints enforce that only syntactically valid tokens can be selected at each step (e.g., a binary operator must be followed by two sub‑expressions).
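The two decoder ideas above can be sketched in a few lines of plain Python. This is a simplified illustration, not the paper's implementation: the grammar is reduced to an arity table, the "policy" is any function returning one logit per token, and the relation labels (parent, sibling, ancestor, and so on) are an assumed bucketing of tree positions that a real model might map to learned attention biases.

```python
import math
import random

# Hypothetical grammar: token -> arity (number of sub-expressions it requires).
ARITY = {"add": 2, "mul": 2, "sin": 1, "abs": 1, "x": 0, "c": 0}
TOKENS = list(ARITY)


def valid_tokens(open_slots, steps_left):
    """Tokens that keep the expression completable within steps_left tokens."""
    # Emitting a token fills one slot and opens `arity` new ones; every slot
    # still open afterwards needs at least one more token (a leaf) to close.
    return [tok for tok in TOKENS if open_slots - 1 + ARITY[tok] <= steps_left - 1]


def constrained_sample(logits_fn, max_len, rng):
    """Sample a prefix-order expression, masking invalid tokens at each step."""
    seq, open_slots = [], 1  # one open slot: the root
    while open_slots > 0:
        allowed = set(valid_tokens(open_slots, max_len - len(seq)))
        # Masked softmax weights: invalid tokens get probability zero.
        weights = [math.exp(logit) if tok in allowed else 0.0
                   for tok, logit in zip(TOKENS, logits_fn(seq))]
        tok = rng.choices(TOKENS, weights=weights)[0]
        seq.append(tok)
        open_slots += ARITY[tok] - 1
    return seq


def parents_from_prefix(seq):
    """Parent index of each token in a prefix-order sequence (-1 for the root)."""
    parents, stack = [], []  # stack entries: [node_index, children_still_missing]
    for i, tok in enumerate(seq):
        parents.append(stack[-1][0] if stack else -1)
        if stack:
            stack[-1][1] -= 1
        stack.append([i, ARITY[tok]])
        while stack and stack[-1][1] == 0:
            stack.pop()
    return parents


def is_ancestor(parents, a, b):
    """True if node a lies on the path from node b up to the root."""
    k = parents[b]
    while k != -1:
        if k == a:
            return True
        k = parents[k]
    return False


def tree_relation(parents, i, j):
    """Structural relation of position j as seen from position i."""
    if i == j:
        return "self"
    if parents[i] == j:
        return "parent"
    if parents[j] == i:
        return "child"
    if parents[i] == parents[j]:
        return "sibling"
    if is_ancestor(parents, j, i):
        return "ancestor"
    if is_ancestor(parents, i, j):
        return "descendant"
    return "other"
```

With a uniform policy, constrained_sample(lambda seq: [0.0] * len(TOKENS), 7, random.Random(0)) always returns a complete expression of at most 7 tokens, because the mask never allows more open slots than remaining steps.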
Reward signal – After a full expression is generated, the system evaluates it on a set of collocation points sampled from the domain. The reward combines:
- PDE residual loss (how well the expression satisfies the differential equation),
- Boundary loss (how well it meets the prescribed conditions), and
- Complexity penalty (favoring simpler formulas).
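A minimal sketch of such a reward, assuming SymPy and NumPy are available, is shown below. The test problem (the heat equation u_t = u_xx on [0, pi] x [0, 1] with u(x, 0) = sin(x) and zero boundary values), the way the three terms are combined, and the complexity weight are all assumptions for illustration; the summary does not give the exact formula used in the paper.

```python
import numpy as np
import sympy as sp

x, t = sp.symbols("x t")


def _on_grid(fn, a, b):
    """Evaluate a lambdified function; broadcast scalar results to the grid."""
    return np.broadcast_to(np.asarray(fn(a, b), dtype=float), a.shape)


def reward(expr, n_points=256, complexity_weight=0.01, seed=0):
    """Score a candidate u(x, t) for u_t = u_xx on [0, pi] x [0, 1]."""
    rng = np.random.default_rng(seed)
    xs = rng.uniform(0.0, np.pi, n_points)  # collocation points in space
    ts = rng.uniform(0.0, 1.0, n_points)    # collocation points in time

    # PDE residual u_t - u_xx, differentiated symbolically.
    residual = sp.diff(expr, t) - sp.diff(expr, x, 2)
    res_fn = sp.lambdify((x, t), residual, "numpy")
    pde_loss = np.mean(_on_grid(res_fn, xs, ts) ** 2)

    u = sp.lambdify((x, t), expr, "numpy")
    zeros = np.zeros(n_points)
    pis = np.full(n_points, np.pi)
    bc_loss = (
        np.mean((_on_grid(u, xs, zeros) - np.sin(xs)) ** 2)  # u(x, 0) = sin(x)
        + np.mean(_on_grid(u, zeros, ts) ** 2)               # u(0, t) = 0
        + np.mean(_on_grid(u, pis, ts) ** 2)                 # u(pi, t) = 0
    )

    # Complexity penalty: operation count of the expression tree.
    complexity = int(sp.count_ops(expr))
    return float(1.0 / (1.0 + pde_loss + bc_loss) - complexity_weight * complexity)


exact = sp.exp(-t) * sp.sin(x)   # satisfies the PDE and all conditions
print(reward(exact) > reward(sp.sin(x)))
```

Here the exact solution scores higher than the simpler candidate sin(x), which meets the boundary conditions but leaves a nonzero PDE residual. The balance between accuracy and simplicity depends on complexity_weight: a weight that is too large would let a short but wrong formula win.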
Training loop – Policy gradient (REINFORCE) updates the decoder parameters to maximize expected reward, while the encoder is jointly fine‑tuned to better condition the decoder on the PDE context.
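The REINFORCE update itself can be illustrated on a toy problem. In the sketch below the policy is a softmax over a handful of fixed candidate expressions rather than a Transformer decoder, the reward values are made‑up numbers, and the moving‑average baseline is a common variance‑reduction choice that the summary does not confirm for SymPlex.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce(rewards, steps=2000, lr=0.1, seed=0):
    """Train a categorical policy over candidates with REINFORCE."""
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    baseline = 0.0  # running average of observed rewards
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        r = rewards[action]
        advantage = r - baseline
        baseline += 0.05 * (r - baseline)
        # Gradient of log pi(action) w.r.t. the logits is onehot(action) - probs.
        for i in range(len(logits)):
            grad_log_prob = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_prob
    return softmax(logits)


# Hypothetical rewards for three candidate expressions.
final_probs = reinforce([0.10, 0.30, 0.96])
print(final_probs.index(max(final_probs)))
```

In the full method the same gradient estimate would be backpropagated through the decoder (and encoder) parameters, with the log‑probability summed over all tokens of the sampled expression.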

Results & Findings

Benchmark PDE	Exact symbolic recovery?	Notable features recovered
1‑D Burgers (viscous)	✅	Piecewise linear shock, explicit viscosity parameter
2‑D Laplace with Dirichlet BC	✅	Harmonic polynomial with parametric coefficients
Heat equation with time‑dependent BC	✅	Series solution with explicit time factor
Non‑smooth Poisson (|x| term)

Zero‑error solutions: For all tested equations, SymPlex produced expressions that evaluate to machine‑precision zero residual on unseen points.
Interpretability: The recovered formulas are concise and directly usable in downstream analytical work (e.g., stability analysis).
Comparison to baselines: Traditional symbolic regression (e.g., Eureqa, Deep Symbolic Regression) failed on non‑smooth or parametric cases, while SymPlex succeeded consistently.
Ablation: Removing tree‑relative attention or grammar constraints caused a >30 % drop in exact recovery rates, confirming their importance.

Practical Implications

Rapid prototyping of analytical models: Engineers can feed a PDE description and let SymPlex suggest closed‑form solutions, accelerating the design of control laws, material models, or fluid dynamics approximations.
Explainable AI for scientific computing: Unlike black‑box neural PDE solvers that output discretized fields, SymPlex yields formulas that can be inspected, differentiated, and embedded into larger symbolic pipelines (e.g., symbolic optimization, theorem proving).
Parameter‑sensitive design: Because the output retains explicit dependence on physical parameters, developers can perform sensitivity analysis or embed the solution directly into simulation code without re‑training.
Integration with existing toolchains: The generated expressions are standard mathematical syntax, making them compatible with CAS (Mathematica, SymPy) and automatic code generators for high‑performance computing.
Educational use: Students and researchers can use SymPlex as a “symbolic assistant” to verify hand‑derived solutions or explore alternative forms.
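To make the toolchain and parameter‑sensitivity points concrete, here is a small SymPy sketch. The formula u is a hypothetical stand‑in for a recovered solution (it is not taken from the paper), chosen only because it carries an explicit parameter nu.

```python
import sympy as sp

x, t, nu = sp.symbols("x t nu")

# Hypothetical recovered formula with an explicit physical parameter nu.
u = sp.exp(-nu * t) * sp.sin(x)

# Parameter sensitivity: differentiate the formula with respect to nu.
du_dnu = sp.diff(u, nu)

# Numerical callable for simulation code (NumPy backend); no re-training
# is needed to evaluate the solution at a different value of nu.
u_num = sp.lambdify((x, t, nu), u, "numpy")

# C source for the same expression, e.g. for embedding in compiled code.
u_c = sp.ccode(u)

print(du_dnu)
print(u_num(1.0, 0.5, 0.1), u_num(1.0, 0.5, 0.2))
print(u_c)
```

Because the result is an ordinary symbolic expression, the same object can be differentiated, evaluated on arrays, or exported as source code without any intermediate fitting step.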

Limitations & Future Work

Scalability to high‑dimensional PDEs: Current experiments are limited to 1‑D/2‑D problems; extending to 3‑D or systems with many coupled fields will require more efficient tree search and possibly hierarchical decomposition.
Reward evaluation cost: Computing PDE residuals for complex expressions can be expensive; smarter surrogate rewards or adaptive sampling could reduce overhead.
Grammar expressiveness: The predefined grammar restricts the operator set (e.g., no special functions like Bessel or hypergeometric). Future work could learn or expand grammars dynamically.
Generalization across PDE families: While the encoder learns to condition on a single PDE, transferring knowledge to entirely new equation families remains an open challenge.
Robustness to noisy or approximate boundary data: Real‑world scenarios often involve measurement noise; incorporating uncertainty handling is a promising direction.

Authors

Yesom Park
Annie C. Lu
Shao‑Ching Huang
Qiyang Hu
Y. Sungtaek Ju
Stanley Osher

Paper Information

arXiv ID: 2602.03816v1
Categories: cs.LG
Published: February 3, 2026
