[Paper] El Agente Gráfico: Structured Execution Graphs for Scientific Agents
Source: arXiv:2602.17902v1
Overview
The paper introduces El Agente Gráfico, a single‑agent framework that couples large language models (LLMs) with a type‑safe execution environment and a dynamic knowledge‑graph “memory”.
- Typed scientific state – Represented as Python objects with explicit types.
- Graph persistence – Objects are stored in a graph database, enabling structured, auditable identifiers instead of fragile free‑form text prompts.
The authors demonstrate that this approach can drive complex, multi‑step scientific workflows more reliably than existing multi‑agent, prompt‑centric pipelines. Example applications include:
- Quantum‑chemistry benchmarking
- Conformer generation
- Metal‑organic framework (MOF) design
Key Contributions
- Structured Execution Graphs – Introduces an object‑graph mapper that converts computational state into typed Python objects, stored either in‑memory or in an external knowledge graph.
- Type‑Safe Context Management – Replaces raw textual context with symbolic, type‑checked identifiers, improving consistency and provenance tracking.
- Single‑Agent Architecture – Demonstrates that a single LLM‑driven agent, when paired with a reliable execution engine, can replace fragile multi‑agent orchestration.
- Benchmark Suite – Provides an automated benchmarking framework for university‑level quantum‑chemistry tasks, reproducing results previously obtained with a multi‑agent system.
- Domain Extensions – Shows the paradigm applied to two additional scientific domains—conformer ensemble generation and MOF design—using the knowledge graph as both memory and reasoning substrate.
- Open‑Source Prototype – Releases a reference implementation (Python library + Neo4j‑backed graph) to encourage community adoption and further research.
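The object‑graph mapping idea can be illustrated with a minimal sketch. It assumes an in‑memory store in place of a graph database, and every name in it (InMemoryGraph, save, the Molecule and Calculation fields) is hypothetical rather than taken from the paper's library. Each dataclass instance becomes a node with a symbolic ID, and a field that holds another typed object becomes an edge.

```python
from dataclasses import dataclass, fields, is_dataclass
from itertools import count


@dataclass
class Molecule:
    smiles: str
    charge: int = 0


@dataclass
class Calculation:
    method: str
    target: Molecule  # reference to another typed object


class InMemoryGraph:
    """Hypothetical in-memory stand-in for a graph database."""

    def __init__(self):
        self.nodes = {}     # symbolic ID -> {"label": ..., "props": {...}}
        self.edges = []     # (source ID, relationship name, target ID)
        self._ids = {}      # id(obj) -> symbolic ID, one node per object
        self._objects = []  # keep saved objects alive so id() stays unique
        self._counter = count(1)

    def save(self, obj):
        """Store a dataclass instance as a node; nested dataclasses become edges."""
        if id(obj) in self._ids:
            return self._ids[id(obj)]
        label = type(obj).__name__
        node_id = f"{label}:{next(self._counter)}"
        self._ids[id(obj)] = node_id
        self._objects.append(obj)
        props = {}
        self.nodes[node_id] = {"label": label, "props": props}
        for f in fields(obj):
            value = getattr(obj, f.name)
            if is_dataclass(value):
                self.edges.append((node_id, f.name.upper(), self.save(value)))
            else:
                props[f.name] = value
        return node_id


graph = InMemoryGraph()
water = Molecule(smiles="O")
calc_id = graph.save(Calculation(method="B3LYP", target=water))
water_id = graph.save(water)  # already stored, so the same ID comes back
print(calc_id, water_id, graph.edges)
```

The symbolic IDs returned by save are the kind of short, typed handle that can be placed in a prompt instead of the full object contents.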
Methodology
- Abstraction Layer – Scientific concepts (e.g., molecules, calculations, results) are defined as Python classes with explicit type annotations.
- Object‑Graph Mapper (OGM) – Instances of these classes are automatically serialized into nodes/relationships in a graph database (Neo4j). The OGM maintains a bidirectional link between in‑memory objects and persisted graph entities.
- LLM Decision Engine – An LLM (e.g., GPT‑4) receives a concise, typed prompt that references objects by their symbolic IDs (e.g., Molecule:123). The model decides which tool to invoke next (e.g., geometry optimization, TD‑DFT).
- Typed Execution Engine – A thin Python wrapper validates the LLM’s suggested action against the object’s type signature, then dispatches the appropriate external tool (Gaussian, ORCA, RDKit, etc.).
- Provenance Capture – Every tool invocation, input, and output is recorded as graph edges, enabling full audit trails and reproducible pipelines.
- Evaluation – The authors built three pipelines (quantum‑chemistry benchmarking, conformer ensemble generation, and MOF design), each executing dozens of parallel jobs and comparing success rates, runtime, and reproducibility against a prior multi‑agent baseline.
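The validate‑then‑dispatch step described above might look roughly like the following. This is a sketch under stated assumptions, not the paper's engine: the TOOLS registry, the tool decorator, the dispatch function, and the action dictionary format are all hypothetical, and the two tools are stubs that do not call any external program.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Molecule:
    smiles: str


@dataclass
class Geometry:
    molecule: Molecule
    optimized: bool = False


TOOLS = {}  # hypothetical registry: tool name -> callable


def tool(func):
    """Register a function as a tool under its own name."""
    TOOLS[func.__name__] = func
    return func


@tool
def optimize_geometry(target: Molecule) -> Geometry:
    # Stub: a real wrapper would launch an external program here.
    return Geometry(molecule=target, optimized=True)


@tool
def excited_states(target: Geometry) -> dict:
    # Stub standing in for a TD-DFT job on an optimized geometry.
    return {"geometry_optimized": target.optimized}


def dispatch(action, store):
    """Check a proposed action against the tool's annotations, then run it."""
    name = action["tool"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    func = TOOLS[name]
    obj = store[action["target"]]  # resolve the symbolic ID to an object
    expected = get_type_hints(func)["target"]
    if not isinstance(obj, expected):
        raise TypeError(
            f"{name} expects {expected.__name__}, got {type(obj).__name__}"
        )
    return func(obj)


store = {"Molecule:123": Molecule(smiles="O")}
geometry = dispatch({"tool": "optimize_geometry", "target": "Molecule:123"}, store)

try:
    # A mistaken proposal: excited_states needs a Geometry, not a Molecule.
    dispatch({"tool": "excited_states", "target": "Molecule:123"}, store)
    rejected = False
except TypeError as err:
    rejected = True
    print("rejected:", err)
```

The type check catches a proposal whose target has the wrong type before any job is launched. It cannot catch a proposal that is well‑typed but scientifically inappropriate.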
Results & Findings
| Domain | Success Rate (vs. baseline) | Avg. Runtime Reduction | Provenance Overhead |
|---|---|---|---|
| Quantum‑chemistry benchmark (≈30 tasks) | 96 % (↑ 8 pts) | 22 % faster | < 2 % |
| Conformer ensemble generation (100 mol.) | 94 % (↑ 10 pts) | 18 % faster | < 3 % |
| MOF design (20 candidate frameworks) | 92 % (↑ 12 pts) | 25 % faster | < 2 % |
- Robustness – The single‑agent system completed all pipelines without the deadlocks or context‑drift issues that plagued the multi‑agent version.
- Scalability – Parallel execution of up to 12 concurrent jobs was handled cleanly, with the knowledge graph efficiently indexing intermediate results.
- Auditability – Researchers could query the graph to retrieve the exact sequence of decisions, inputs, and tool versions that produced any result, facilitating reproducibility.
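As an illustration of the kind of audit query this enables, the sketch below walks provenance records backwards from a result to the steps that produced it. It assumes a simplified linear pipeline in which each record has one input and one output. The record fields, the lineage function, and the version strings are hypothetical placeholders, and a plain Python list stands in for the graph database.

```python
def lineage(result_id, records):
    """Return the tool invocations that led to result_id, oldest first."""
    by_output = {r["output"]: r for r in records}
    chain, seen, current = [], set(), result_id
    # Follow output -> input links until an object with no recorded producer.
    while current in by_output and current not in seen:
        seen.add(current)  # guard against accidental cycles
        record = by_output[current]
        chain.append(record)
        current = record["input"]
    chain.reverse()
    return chain


# Placeholder provenance records; versions are not real tool versions.
records = [
    {"tool": "optimize_geometry", "version": "placeholder-1",
     "input": "Molecule:1", "output": "Geometry:2"},
    {"tool": "excited_states", "version": "placeholder-2",
     "input": "Geometry:2", "output": "Spectrum:3"},
]

steps = lineage("Spectrum:3", records)
print([s["tool"] for s in steps])  # ['optimize_geometry', 'excited_states']
```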
Practical Implications
- Developer‑Friendly Automation – Exposing a typed API (instead of raw prompt engineering) lets developers embed LLM‑driven decision logic directly into existing CI/CD pipelines for scientific software.
- Tool Orchestration Platforms – Cloud providers and workflow engines such as Airflow or Prefect can adopt the OGM pattern to give LLMs a reliable control plane for launching domain‑specific tools.
- Regulatory & Auditing Needs – Industries like pharmaceuticals, materials, and chemicals can satisfy compliance requirements because every computational step is recorded in a queryable graph.
- Reduced Maintenance – A single, well‑defined agent eliminates the need to synchronize multiple prompt‑tuned bots, lowering operational overhead and simplifying debugging.
- Extensibility – New scientific domains can be onboarded by defining additional typed classes and registering corresponding tool wrappers; no redesign of the core agent is required.
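The extensibility point can be made concrete with a small sketch. It assumes a registry keyed by state class, and all names (REGISTRY, register, tools_for, Framework, build_label) are hypothetical. The core lookup stays fixed, and onboarding a domain only adds a class and a decorated wrapper.

```python
from dataclasses import dataclass

REGISTRY = {}  # maps a state class to the tool wrappers that accept it


def register(input_type):
    """Decorator that registers a tool wrapper for one typed state class."""
    def decorator(func):
        REGISTRY.setdefault(input_type, []).append(func)
        return func
    return decorator


def tools_for(obj):
    """Core lookup: names of the tools applicable to obj."""
    return [f.__name__ for f in REGISTRY.get(type(obj), [])]


# --- everything below is the new domain's code; the core above is unchanged ---

@dataclass
class Framework:
    metal: str
    linker: str


@register(Framework)
def build_label(framework: Framework) -> str:
    # Placeholder wrapper; a real one would launch domain software.
    return f"{framework.metal}-{framework.linker}"


candidate = Framework(metal="Zn", linker="BDC")
print(tools_for(candidate))    # ['build_label']
print(build_label(candidate))  # Zn-BDC
```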
Limitations & Future Work
- LLM Dependency: The system still relies on the quality of the underlying LLM; hallucinations in tool selection can propagate errors despite type checks.
- Graph Overhead: While modest, persisting every intermediate object may become costly for extremely large datasets (e.g., high‑throughput screening of millions of compounds).
- Tool Integration Scope: Current prototypes support a limited set of quantum‑chemistry packages; broader adoption will require wrappers for more diverse scientific software.
- User‑Facing Interfaces: The paper focuses on backend orchestration; future work should explore UI/UX layers that let domain scientists interact with the knowledge graph without programming.
- Distributed Execution: Scaling beyond a single node (e.g., across HPC clusters) and handling graph consistency in a distributed setting remain open challenges.
El Agente Gráfico showcases how marrying LLM reasoning with typed, graph‑backed state can turn fragile prompt‑centric bots into reliable scientific assistants—an approach that could reshape automation across computational research domains.
Authors
- Alán Aspuru‑Guzik
- Abdulrahman Aldossary
- Jiaru Bai
- Marcel Müller
- Thomas Swanick
- Yeonghun Kang
- Zijian Zhang
- Jin Won Lee
- Tsz Wai Ko
- Mohammad Ghazi Vakili
- Varinia Bernales
Paper Information
| Field | Details |
|---|---|
| arXiv ID | 2602.17902v1 |
| Categories | cs.AI, cs.MA, cs.SE, physics.chem-ph |
| Published | February 19, 2026 |