[Paper] MOOSEnger -- a Domain-Specific AI Agent for the MOOSE Ecosystem

Published: March 4, 2026 at 10:06 PM EST

Source: arXiv - 2603.04756v1

Overview

The paper introduces MOOSEnger, a conversational AI agent built specifically for the Multiphysics Object‑Oriented Simulation Environment (MOOSE). By turning natural‑language requests into valid MOOSE input files, it dramatically speeds up the traditionally tedious setup and debugging phases of multiphysics simulations.

Key Contributions

  • Domain‑specific AI agent that couples Retrieval‑Augmented Generation (RAG) with deterministic, MOOSE‑aware parsing and validation.
  • Core‑plus‑plugin architecture separating reusable agent infrastructure from a lightweight MOOSE plugin (HIT‑file parsing, syntax‑preserving ingestion, repair utilities).
  • Input pre‑check pipeline that automatically cleans hidden formatting artifacts, fixes malformed HIT structures, and resolves unknown object types via similarity search against a curated syntax registry.
  • Closed‑loop execution backend that runs the generated input through the actual MOOSE runtime (via MCP) and feeds solver diagnostics back into the conversation for iterative correction.
  • Comprehensive evaluation suite reporting RAG metrics (faithfulness, relevance, context precision/recall) and end‑to‑end execution success on a 125‑prompt benchmark covering five major physics domains.
  • Performance boost: 93 % of generated inputs run successfully, compared with only 8 % for a vanilla LLM baseline.
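The input pre-check described above can be illustrated with a short sketch. The paper does not publish MOOSEnger's implementation, so the function names, the set of hidden characters, and the bracket check below are assumptions meant only to convey the idea of cleaning formatting artifacts and catching malformed HIT block structure.

```python
def clean_hidden_artifacts(text: str) -> str:
    """Drop zero-width characters and normalize non-breaking spaces.

    Illustrative only: the real pre-check likely handles a broader set
    of artifacts.
    """
    out = []
    for ch in text:
        if ch in ("\u200b", "\u200c", "\u200d", "\ufeff"):
            continue  # zero-width characters: remove outright
        out.append(" " if ch == "\u00a0" else ch)  # NBSP -> plain space
    return "".join(out)


def brackets_balanced(text: str) -> bool:
    """Check that HIT-style block delimiters ([Mesh] ... []) pair up."""
    depth = 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:  # closing bracket with no opener
                return False
    return depth == 0
```

A stray zero-width space pasted from a web page, for example, would survive a visual inspection of the `.i` file but fail MOOSE's parser; `clean_hidden_artifacts` removes it before parsing is attempted.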

Methodology

  1. Retrieval‑Augmented Generation – When a user asks a question (e.g., “Set up a diffusion problem with a 10 mm slab”), the system first pulls the most relevant snippets from a curated MOOSE documentation and example repository.
  2. Deterministic Parsing – The retrieved text is fed to a HIT‑aware parser that respects MOOSE’s strict input syntax (the “.i” files). The parser builds an abstract representation of the simulation case.
  3. Pre‑check & Repair – A grammar‑constrained loop scans the representation for hidden characters, mismatched braces, or unknown object names. Unknown names are resolved by a similarity search against an application‑syntax registry, effectively “guessing” the intended MOOSE object.
  4. Validation & Smoke‑Testing – The repaired input is validated against MOOSE’s schema and optionally executed on a lightweight runtime (MCP). Solver messages (errors, warnings, convergence info) are captured.
  5. Iterative Feedback – Diagnostic messages are transformed into natural‑language hints and sent back to the LLM, which then refines the input. This loop repeats until the simulation passes the execution check.
  6. Evaluation – The authors log RAG quality metrics and the final pass/fail status for each prompt, enabling a transparent comparison with a baseline LLM that lacks the domain‑specific tooling.
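Steps 3 and 5 can be sketched together: resolve an unknown object name by similarity search against a syntax registry, and retry execution until the check passes. The registry contents, the cycle budget, and the `run_check`/`repair` callables are hypothetical stand-ins for MOOSEnger's actual application-syntax registry and MCP-backed runtime.

```python
from difflib import get_close_matches

# Hypothetical excerpt of an application-syntax registry.
REGISTRY = ["GeneratedMesh", "Diffusion", "DirichletBC", "Transient", "Steady"]


def resolve_object_type(name, registry=REGISTRY, cutoff=0.6):
    """Return the closest registered object name, or None if nothing is near."""
    matches = get_close_matches(name, registry, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def iterate_until_pass(input_text, run_check, repair, max_cycles=5):
    """Run the execution check; on failure, apply a repair and retry.

    run_check: callable returning (passed: bool, diagnostics: str).
    repair: callable turning (input_text, diagnostics) into a new attempt.
    """
    for cycle in range(1, max_cycles + 1):
        passed, diagnostics = run_check(input_text)
        if passed:
            return input_text, cycle
        input_text = repair(input_text, diagnostics)
    raise RuntimeError("no passing input within the cycle budget")
```

With a misspelled `Difusion` kernel, `resolve_object_type` recovers `Diffusion`, and the loop converges on the second cycle once the repair is applied; in the real system the diagnostics come from MOOSE's solver output rather than a stub.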

Results & Findings

| Metric | MOOSEnger | LLM-only baseline |
|---|---|---|
| Execution pass rate | 0.93 (110/118) | 0.08 (≈9/118) |
| Faithfulness (RAG) | 0.96 | 0.71 |
| Context precision | 0.94 | 0.62 |
| Context recall | 0.92 | 0.58 |
| Average correction cycles per prompt | 1.3 | 4.7 |

Interpretation: The deterministic parsing and execution‑in‑the‑loop feedback are the primary drivers of the success gap. Even when the LLM produces syntactically plausible text, without the pre‑check and runtime validation it frequently generates inputs that MOOSE cannot parse or that violate physics constraints.

Practical Implications

  • Faster onboarding – New users can spin up complex multiphysics cases by simply describing what they need, cutting weeks of manual input file authoring down to minutes.
  • Reduced debugging time – The automatic pre‑check catches hidden formatting bugs (e.g., stray Unicode characters) that often cause cryptic MOOSE errors, saving developers from tedious trial‑and‑error cycles.
  • Continuous integration – MOOSEnger can be embedded in CI pipelines to auto‑generate and validate simulation inputs whenever a new physics module is added, ensuring regressions are caught early.
  • Extensibility to other DSLs – The core‑plus‑plugin design shows a clear path for building similar agents for other domain‑specific languages (e.g., OpenFOAM dictionaries, Abaqus input files).
  • Improved reproducibility – By storing the conversational transcript alongside the generated .i file, teams gain a provenance trail that explains why a particular configuration was chosen.
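The provenance idea in the last bullet can be sketched as a sidecar file written next to the generated input. The file layout, field names, and hashing scheme below are invented for illustration; the paper does not specify how MOOSEnger stores transcripts.

```python
import hashlib
import json
from pathlib import Path


def save_with_provenance(workdir: Path, case: str, input_text: str, transcript):
    """Write `<case>.i` plus a `<case>.provenance.json` sidecar.

    transcript: list of {"role": ..., "content": ...} messages
    (hypothetical schema).
    """
    workdir.mkdir(parents=True, exist_ok=True)
    input_path = workdir / f"{case}.i"
    input_path.write_text(input_text)
    sidecar = {
        "input_file": input_path.name,
        # Content hash ties the transcript to the exact input it produced.
        "sha256": hashlib.sha256(input_text.encode()).hexdigest(),
        "transcript": transcript,
    }
    (workdir / f"{case}.provenance.json").write_text(json.dumps(sidecar, indent=2))
    return input_path
```

Keyed by content hash, the sidecar lets a reviewer confirm that a checked-in `.i` file is the one the recorded conversation actually generated.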

Limitations & Future Work

  • Scope of physics covered – The benchmark, while diverse, still represents a subset of MOOSE’s full capability set; exotic modules may need additional retrieval sources.
  • Dependency on curated docs – Retrieval quality hinges on the completeness and up‑to‑date nature of the documentation corpus; stale examples can mislead the agent.
  • Runtime cost – The smoke‑testing loop requires a local or remote MOOSE execution environment, which may be heavyweight for very large models.
  • Generalization – The similarity‑search repair mechanism works well for misspelled object names but may struggle with entirely novel user intents that lack a close example.

Future directions include expanding the retrieval corpus with community‑contributed notebooks, integrating lightweight surrogate solvers for faster feedback, and exposing a REST API so that IDE plugins or web front‑ends can leverage MOOSEnger directly.

Authors

  • Mengnan Li
  • Jason Miller
  • Zachary Prince
  • Alexander Lindsay
  • Cody Permann

Paper Information

  • arXiv ID: 2603.04756v1
  • Categories: cs.AI, cs.CE, cs.SE
  • Published: March 5, 2026