[Paper] Agentic AI for Scalable and Robust Optical Systems Control
Source: arXiv - 2602.20144v1
Overview
The paper introduces AgentOptics, a novel “agentic AI” framework that lets users control complex optical hardware through natural‑language commands. By wrapping diverse devices in a unified Model Context Protocol (MCP) and exposing 64 standardized tool primitives, the system can understand, coordinate, and execute multi‑step tasks with reliability that rivals (and often exceeds) traditional code‑generation approaches.
Key Contributions
- MCP‑based abstraction layer – 64 ready‑to‑use tool definitions that map high‑level language requests to low‑level device APIs across eight common optical instruments.
- AgentOptics framework – an LLM‑driven agent that parses natural‑language intents, selects the appropriate MCP tools, and orchestrates multi‑device workflows autonomously.
- Comprehensive benchmark – a 410‑task suite covering request comprehension, role‑aware responses, multi‑step coordination, linguistic robustness, and error handling.
- Empirical evaluation – comparison of commercial online LLMs, locally hosted open‑source LLMs, and code‑generation baselines, showing 87.7 %–99.0 % task success for AgentOptics vs. ≤50 % for code generation.
- Real‑world case studies – five end‑to‑end demonstrations ranging from DWDM link provisioning to closed‑loop polarization stabilization and distributed acoustic sensing (DAS) with LLM‑assisted event detection.
Methodology
Tool Abstraction via MCP
- Each optical device (e.g., tunable lasers, wavelength‑selective switches, polarization controllers) is wrapped in a set of tool functions (e.g., set_wavelength, measure_power).
- The MCP defines a JSON‑compatible schema for inputs/outputs, enabling any LLM that understands the protocol to invoke the tools safely.
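The schema-validated tool pattern can be illustrated with a minimal sketch. The parameter names, wavelength range, and dispatcher below are illustrative assumptions, not the paper's exact definitions:

```python
# Minimal sketch of an MCP-style tool definition and dispatcher.
# Tool name, parameter names, and the wavelength range are illustrative.

TOOL_SPECS = {
    "set_wavelength": {
        "description": "Tune a laser to a target wavelength.",
        "input_schema": {
            "type": "object",
            "properties": {
                "device_id": {"type": "string"},
                "wavelength_nm": {"type": "number", "minimum": 1260, "maximum": 1675},
            },
            "required": ["device_id", "wavelength_nm"],
        },
    },
}

def invoke_tool(name: str, args: dict) -> dict:
    """Validate arguments against the tool spec, then call the device driver."""
    spec = TOOL_SPECS[name]["input_schema"]
    missing = [k for k in spec["required"] if k not in args]
    if missing:
        return {"status": "error", "detail": f"missing arguments: {missing}"}
    # ... a real system would forward validated args to the vendor API here ...
    return {"status": "ok", "tool": name, "args": args}
```

Because the spec is plain JSON-compatible data, any MCP-aware LLM can read it and produce well-formed tool calls without device-specific prompting.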
Agent Architecture
- An LLM (either a commercial API like GPT‑4 or an open‑source model such as LLaMA‑2) acts as the reasoning core.
- The agent receives a natural‑language task, decomposes it into a plan, selects the relevant MCP tools, and iteratively calls them while checking for errors or ambiguous responses.
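This plan, act, and check cycle can be sketched as follows. The `plan_fn`, `call_tool`, and `repair_fn` callables are hypothetical stand-ins for the LLM core, the MCP runtime, and the LLM's error-repair step:

```python
# Illustrative plan/act/check loop for an MCP-driven agent.
# plan_fn, call_tool, and repair_fn are hypothetical stand-ins.

def run_agent(task, plan_fn, call_tool, repair_fn, max_retries=3):
    """Decompose a task into tool calls and execute them, retrying on errors."""
    trace = []
    for name, args in plan_fn(task):      # e.g. [("set_wavelength", {...}), ...]
        for _ in range(max_retries):
            result = call_tool(name, args)
            trace.append((name, args, result))
            if result.get("status") == "ok":
                break                     # step succeeded, move to the next one
            # Ask the LLM to repair the call based on the error message.
            name, args = repair_fn(name, args, result)
    return trace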
Benchmark Construction
- 410 tasks were handcrafted to stress different dimensions: simple single‑device commands, multi‑device orchestration, variations in phrasing, and intentional error injection.
Evaluation Protocol
- Success is measured by whether the final system state matches the intended outcome, with partial credit for correct sub‑steps.
- Baselines include LLM‑generated Python scripts that are executed on the same hardware stack.
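Scoring with partial credit can be expressed as the fraction of intended state variables that were reached; the flat `{setting: value}` state snapshot below is an assumption about how such a check might be implemented:

```python
def score_task(intended: dict, final: dict) -> float:
    """Fraction of intended state variables matched by the final system state.

    Returns 1.0 for full success and partial credit for correct sub-steps.
    `intended` and `final` are hypothetical {setting: value} snapshots.
    """
    if not intended:
        return 1.0
    matched = sum(1 for key, value in intended.items() if final.get(key) == value)
    return matched / len(intended)
```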
Results & Findings
| Configuration | Avg. Success Rate | Notable Strengths |
|---|---|---|
| Commercial LLM (GPT‑4) + AgentOptics | 99.0 % | Handles ambiguous phrasing, recovers from tool errors |
| Open‑source LLM (LLaMA‑2‑13B) + AgentOptics | 87.7 % | Competitive even without cloud services |
| Code‑generation baseline (LLM → Python) | ≤50 % | Struggles with multi‑step coordination and error handling |
- Robustness to language variation: Success rates dropped by less than 2 % when tasks were re‑phrased or included colloquial terms.
- Error handling: The agent detected and corrected 94 % of injected tool‑level failures (e.g., out‑of‑range parameters).
- Scalability: Adding a new device required only defining its MCP tools; the same agent logic was reused across all case studies.
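The scalability claim means a new instrument needs only a tool spec and driver binding, not new agent logic. A hedged sketch, where the VOA device and its parameter names are made up for illustration:

```python
# Hypothetical example of extending the tool registry for a new device.
# Only the spec and driver binding are new; the agent loop is unchanged.

TOOL_REGISTRY = {}

def register_tool(name: str, description: str, handler):
    """Add one MCP tool; the agent discovers it from the registry."""
    TOOL_REGISTRY[name] = {"description": description, "handler": handler}

def set_attenuation(device_id: str, attenuation_db: float) -> dict:
    """Stand-in driver call for a variable optical attenuator (VOA)."""
    return {"status": "ok", "device_id": device_id, "attenuation_db": attenuation_db}

register_tool(
    "set_attenuation",
    "Set the insertion loss of a variable optical attenuator in dB.",
    set_attenuation,
)
```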
Practical Implications
- Rapid prototyping: Engineers can spin up new optical testbeds or production lines by describing desired configurations in plain English, cutting weeks of manual scripting.
- Unified ops platform: Data‑center operators managing heterogeneous DWDM, ARoF, and fiber‑sensing equipment can rely on a single AI‑driven interface instead of juggling vendor‑specific GUIs or APIs.
- Closed‑loop optimization: The agent can continuously monitor performance metrics (e.g., OSNR, polarization drift) and invoke corrective MCP tools in real time, enabling self‑healing links.
- Edge deployment: Open‑source LLMs run locally, making the approach viable for secure, air‑gapped facilities where cloud access is prohibited.
- Developer ergonomics: The MCP tool schema is language‑agnostic, so existing automation pipelines (Python, Go, Rust) can call the same functions, easing integration with CI/CD and observability stacks.
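The closed-loop pattern above can be sketched as a monitor-and-correct loop. The metric readout, correction tool, and proportional-control strategy below are illustrative assumptions, not the paper's implementation:

```python
def stabilize(read_metric, apply_correction, target: float,
              tolerance: float, max_steps: int = 100) -> int:
    """Drive a monitored metric (e.g., polarization drift) toward `target`.

    `read_metric` and `apply_correction` are hypothetical MCP tool wrappers.
    Returns the number of correction steps taken before convergence.
    """
    for step in range(max_steps):
        error = read_metric() - target
        if abs(error) <= tolerance:
            return step           # within tolerance: link is stable
        apply_correction(-error)  # proportional corrective action
    return max_steps
```

In the paper's polarization-stabilization case study, the agent plays the role of this loop by repeatedly invoking measurement and correction tools through MCP.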
Limitations & Future Work
- Model dependence: Success rates hinge on the underlying LLM’s reasoning ability; smaller open‑source models still lag behind commercial APIs in edge cases.
- Tool definition overhead: While MCP standardizes the interface, each new device still requires a manual mapping of vendor commands to tool specs.
- Safety guarantees: The current system relies on runtime error checks but lacks formal verification of tool sequences—important for safety‑critical telecom infrastructure.
- Future directions:
- Automating MCP tool generation via API introspection or language‑model‑driven code synthesis.
- Incorporating reinforcement learning to let the agent improve its orchestration policies from real‑world feedback.
- Extending the framework to other physical domains (e.g., RF front‑ends, quantum photonics) to test cross‑disciplinary scalability.
Authors
- Zehao Wang
- Mingzhe Han
- Wei Cheng
- Yue-Kai Huang
- Philip Ji
- Denton Wu
- Mahdi Safari
- Flemming Holtorf
- Kenaish AlQubaisi
- Norbert M. Linke
- Danyang Zhuo
- Yiran Chen
- Ting Wang
- Dirk Englund
- Tingjun Chen
Paper Information
- arXiv ID: 2602.20144v1
- Categories: eess.SY, cs.AI, cs.NI
- Published: February 23, 2026