[Paper] Think like a Scientist: Physics-guided LLM Agent for Equation Discovery

Published: February 12, 2026 at 01:49 PM EST
4 min read
Source: arXiv - 2602.12259v1

Overview

The paper presents KeplerAgent, a physics‑guided AI agent that mimics how scientists discover equations: first uncovering hidden physical properties (e.g., symmetries, conservation laws) and then using that insight to steer symbolic regression toward the correct formula. By coupling large language models (LLMs) with domain‑specific tools, the authors achieve markedly better equation‑discovery performance—especially when data are noisy—than pure‑LLM or classic regression approaches.

Key Contributions

  • Agentic reasoning pipeline that separates structure inference (symmetry, dimensional analysis, invariants) from symbolic regression.
  • Integration of LLMs with physics‑based toolkits (e.g., dimensional analysis libraries, invariance detectors) to generate priors for downstream regression engines.
  • Dynamic configuration of symbolic regression back‑ends (PySINDy, PySR) – the agent automatically selects function libraries and imposes structural constraints based on the inferred physics.
  • Comprehensive benchmark suite covering classic mechanics, thermodynamics, and electromagnetism problems, showing large gains in symbolic accuracy and noise robustness.
  • Open‑source implementation that demonstrates how to orchestrate LLM calls, tool execution, and regression in a reproducible workflow.
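The third contribution, dynamic back-end configuration, can be illustrated with a small sketch. The property names and config schema below are hypothetical (the paper's actual interface to PySINDy/PySR is not reproduced here); the point is the mapping from an inferred physical property to a restricted function library:

```python
# Hypothetical sketch: translating inferred physical properties into a
# symbolic-regression configuration. Property names and the config
# schema are illustrative, not the paper's actual API.

def build_sr_config(properties):
    """Map a set of inferred physics properties to a candidate function
    library for a regression back-end such as PySINDy or PySR."""
    library = {
        "polynomial_degrees": [1, 2, 3, 4],
        "unary_ops": ["sin", "cos", "exp", "log"],
    }

    if "even_symmetry" in properties:
        # f(-x) = f(x): drop odd powers and odd functions like sin.
        library["polynomial_degrees"] = [
            d for d in library["polynomial_degrees"] if d % 2 == 0
        ]
        library["unary_ops"] = [op for op in library["unary_ops"] if op != "sin"]

    if "periodic" in properties:
        # Periodic dynamics: keep trig terms, drop unbounded exp/log.
        library["unary_ops"] = [
            op for op in library["unary_ops"] if op in ("sin", "cos")
        ]

    return library

config = build_sr_config({"even_symmetry"})
print(config["polynomial_degrees"])  # [2, 4]
```

Pruning the library this way is what shrinks the regression search space, which the paper credits for both the accuracy and efficiency gains reported below.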

Methodology

  1. Problem Setup – Given a dataset of input variables X and observed outputs y, the goal is to recover an interpretable symbolic expression f(X).
  2. Scientific Reasoning Loop
    • LLM Prompting – The LLM is asked to hypothesize physical properties (e.g., “Is the system invariant under rotation?”).
    • Physics‑Based Tools – Specialized modules (dimensional analysis, symmetry detectors, conserved‑quantity calculators) verify or refine these hypotheses, producing concrete constraints such as “the equation must be homogeneous of degree 2 in length”.
    • Constraint Synthesis – The agent translates the constraints into a configuration for a symbolic regression engine: selecting candidate functions (e.g., sin, cos, polynomial terms) and adding algebraic restrictions (e.g., no odd powers).
  3. Symbolic Regression – The configured engine (PySINDy or PySR) searches the constrained space for the best‑fitting symbolic model, using standard sparsity‑promoting or evolutionary strategies.
  4. Iterative Refinement – If the resulting expression fails validation (e.g., violates a discovered invariant), the loop repeats, allowing the LLM to propose alternative hypotheses.
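The physics-based tools in step 2 can be surprisingly lightweight. A dimensional-analysis check, for instance, reduces to bookkeeping over unit exponents; the sketch below (a toy stand-in, not the paper's implementation, with an illustrative unit table) verifies that every term of a candidate equation carries the same dimension:

```python
# Toy dimensional-analysis tool: each quantity's units are exponents
# over (length, time, mass); a candidate equation is homogeneous only
# if all of its terms share one dimension. The unit table is illustrative.

DIM = {
    "x": (1, 0, 0),   # position: L
    "v": (1, -1, 0),  # velocity: L T^-1
    "t": (0, 1, 0),   # time: T
    "m": (0, 0, 1),   # mass: M
}

def term_dimension(term):
    """Dimension of a product term given as {variable: power}."""
    dim = (0, 0, 0)
    for var, power in term.items():
        dim = tuple(d + e * power for d, e in zip(dim, DIM[var]))
    return dim

def is_homogeneous(terms):
    """All terms in a candidate sum must share one dimension."""
    return len({term_dimension(t) for t in terms}) == 1

# x = v*t is homogeneous (both sides are lengths) ...
assert is_homogeneous([{"x": 1}, {"v": 1, "t": 1}])
# ... but x = v + t mixes lengths with velocities and times.
assert not is_homogeneous([{"x": 1}, {"v": 1}, {"t": 1}])
```

A check like this is what lets the agent emit hard constraints such as "the equation must be homogeneous of degree 2 in length" rather than relying on the LLM's unverified guess.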

The entire pipeline is orchestrated by a lightweight “agent” that tracks state, decides when to call tools, and aggregates evidence before committing to a final equation.
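The orchestration just described can be sketched as a short loop. Every function here is a hypothetical stub (standing in for the LLM call, the physics toolkit, and the regression engine); only the control flow, propose, verify, constrain, regress, validate, retry, reflects the paper's pipeline:

```python
# Minimal sketch of the agent's control flow. All function bodies are
# hypothetical stubs; only the loop structure mirrors the paper.

def propose_hypothesis(data, rejected):
    """Stub for the LLM call: suggest an untried physical property."""
    for hyp in ("even_symmetry", "periodic"):
        if hyp not in rejected:
            return hyp
    return None

def verify_with_tools(hypothesis, data):
    """Stub for the physics toolkit; here, pretend only periodicity holds."""
    return hypothesis == "periodic"

def run_regression(constraint, data):
    """Stub for the configured PySINDy/PySR call."""
    return f"model constrained by {constraint}"

def validates(model, data):
    """Stub: check the fitted model against discovered invariants."""
    return True

def agent_loop(data, max_rounds=5):
    rejected = set()
    for _ in range(max_rounds):
        hyp = propose_hypothesis(data, rejected)
        if hyp is None:
            return run_regression("no constraint", data)
        if not verify_with_tools(hyp, data):
            rejected.add(hyp)  # tool refuted the LLM's hypothesis
            continue
        model = run_regression(hyp, data)
        if validates(model, data):
            return model
        rejected.add(hyp)      # invariant violated: try another hypothesis
    return None

print(agent_loop([]))  # model constrained by periodic
```

Keeping the rejected-hypothesis set in the loop state is what allows the agent to "aggregate evidence" rather than re-propose refuted ideas.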

Results & Findings

| Benchmark | Baseline (LLM‑only) | Traditional SR (PySINDy) | KeplerAgent | Noise Level (σ) |
|---|---|---|---|---|
| Simple harmonic oscillator | 62 % exact | 78 % exact | 94 % exact | 0.01 |
| Pendulum (large‑angle) | 48 % exact | 71 % exact | 88 % exact | 0.05 |
| Heat diffusion | 55 % exact | 69 % exact | 90 % exact | 0.10 |
| Maxwell‑type system | 41 % exact | 63 % exact | 85 % exact | 0.08 |
  • Symbolic accuracy (percentage of runs recovering the ground‑truth formula) improves by 15‑30 % over the strongest non‑agent baselines.
  • Noise robustness: performance degrades gracefully; KeplerAgent maintains >80 % accuracy even when Gaussian noise amplitude is doubled, whereas baselines drop below 50 %.
  • Search efficiency: By pruning the candidate space, the regression step converges 2‑3× faster, reducing compute time from several minutes to under a minute on a single CPU core.

Practical Implications

  • Accelerated scientific modeling – Engineers can feed experimental data into KeplerAgent to obtain first‑principles‑like models without hand‑crafting feature libraries.
  • Embedded diagnostics – In control systems (e.g., robotics, aerospace), the agent can continuously infer governing dynamics from sensor streams, enabling adaptive controllers that respect physical constraints.
  • Reduced data‑hunger – Because the agent leverages physics priors, it needs fewer samples to converge, which is valuable in domains where data collection is expensive (e.g., material testing, biomedical experiments).
  • Tool‑chain extensibility – The modular design lets teams plug in domain‑specific analyzers (e.g., thermodynamic potentials, quantum symmetries), making the approach reusable across disciplines.
  • Explainability for AI‑augmented products – The symbolic output is human‑readable, facilitating regulatory compliance and stakeholder trust in AI‑driven decision systems.

Limitations & Future Work

  • Dependence on LLM quality – The agent’s initial hypotheses are only as good as the underlying LLM; mis‑identified symmetries can misguide the regression step.
  • Scalability to high‑dimensional systems – Current experiments focus on ≤5 variables; extending to large‑scale PDE discovery will require more sophisticated constraint propagation.
  • Tool integration overhead – Adding new physics modules entails custom wrappers; a standardized API for “physics‑as‑a‑service” would streamline adoption.
  • Future directions include (1) training domain‑adapted LLMs that are explicitly aware of physical units, (2) incorporating Bayesian uncertainty quantification into the constraint loop, and (3) applying the framework to real‑world industrial datasets (e.g., fluid‑flow diagnostics, battery degradation modeling).

Authors

  • Jianke Yang
  • Ohm Venkatachalam
  • Mohammad Kianezhad
  • Sharvaree Vadgama
  • Rose Yu

Paper Information

  • arXiv ID: 2602.12259v1
  • Categories: cs.AI, cs.LG
  • Published: February 12, 2026
  • PDF: Download PDF
