[Paper] phys-MCP: A Control Plane for Heterogeneous Physical Neural Networks
Source: arXiv - 2605.04256v1
Overview
Physical Neural Networks (PNNs) push AI computation down to the material level—think photonic chips, memristors, even living wetware. While they promise ultra‑low‑latency, energy‑efficient inference at the “extreme edge,” integrating such wildly different substrates into existing edge‑cloud pipelines has been a nightmare. The paper introduces phys‑MCP, a unified control‑plane framework that lets developers discover, invoke, and manage heterogeneous PNN resources just like any other cloud service.
Key Contributions
- A substrate‑aware orchestration model that abstracts diverse physical AI backends (photonic, memristive, wetware, etc.) as first‑class compute resources.
- Capability & lifecycle semantics that capture substrate‑specific traits such as latency, resettability, plasticity, and I/O modality.
- Telemetry & digital‑twin bindings enabling real‑time monitoring, fault detection, and stateful recovery across heterogeneous devices.
- Prototype implementation covering three distinct backend classes, an HTTP‑based execution path, and a live integration with Cortical Labs’ wetware API.
- Empirical evaluation showing portable descriptor‑based integration, runtime‑aware backend matching, fault‑tolerant recovery, and sub‑millisecond control‑plane overhead.
Methodology
- Capability Modeling – The authors define a JSON‑compatible descriptor that lists a PNN’s functional capabilities (e.g., “optical convolution,” “memristive weight storage”) together with non‑functional attributes (latency, power, reset cost).
- Lifecycle API – A small set of REST‑style verbs (`discover`, `allocate`, `invoke`, `reset`, `deallocate`) lets any edge/fog/cloud orchestrator treat a PNN like a container or function‑as‑a‑service.
- Digital Twin Layer – Each physical substrate is paired with a lightweight digital twin that mirrors its state (weights, plasticity level, health metrics). The twin feeds telemetry back to the control plane for adaptive scheduling and fault recovery.
- Backend Instantiation – Three representative backends were built:
  - Photonic accelerator (nanophotonic matrix multiplication)
  - Memristive crossbar (analog weight storage)
  - Wetware interface (Cortical Labs living neural tissue)

  Each backend implements the same phys‑MCP API, exposing its unique I/O (optical, electrical, biochemical) through adapters.
- Evaluation Setup – Controlled experiments measured descriptor‑driven dispatch latency, matching quality against a baseline “first‑available” scheduler, and recovery time after injected faults. An end‑to‑end test exercised the wetware path from a cloud workflow down to the living tissue.
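The descriptor, lifecycle verbs, and matching step described in this list can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the summary does not give the paper's schema or API, so every class, field, and function name here (`Descriptor`, `Backend`, `discover`, `runtime_aware`, and so on) is hypothetical, the backends are in-memory stand-ins rather than HTTP endpoints, and the latency and power numbers are invented to make the two scheduling policies disagree.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Descriptor:
    """Hypothetical capability descriptor; all field names are illustrative."""
    backend_id: str
    substrate: str        # "photonic", "memristive", "wetware", ...
    capabilities: list    # functional capabilities
    latency_ms: float     # non-functional attributes
    power_mw: float
    reset_cost_ms: float


class Backend:
    """In-memory stand-in for a substrate adapter exposing the lifecycle verbs."""

    def __init__(self, descriptor):
        self.descriptor = descriptor
        self.allocated = False

    def allocate(self):
        if self.allocated:
            raise RuntimeError("backend is already allocated")
        self.allocated = True

    def invoke(self, payload):
        if not self.allocated:
            raise RuntimeError("allocate the backend before invoking it")
        # A real adapter would drive the substrate; this stand-in echoes the input.
        return {"backend": self.descriptor.backend_id, "echo": payload}

    def reset(self):
        # Placeholder: a real adapter would restore substrate state here.
        return None

    def deallocate(self):
        self.allocated = False


def discover(backends, capability):
    """Return free backends that advertise the requested capability."""
    return [b for b in backends
            if capability in b.descriptor.capabilities and not b.allocated]


def first_available(candidates):
    """Baseline policy: take whichever candidate is listed first."""
    return candidates[0] if candidates else None


def runtime_aware(candidates, max_power_mw):
    """Pick the lowest-latency candidate that fits the power budget."""
    eligible = [b for b in candidates if b.descriptor.power_mw <= max_power_mw]
    return min(eligible, key=lambda b: b.descriptor.latency_ms, default=None)


# Made-up numbers, chosen only to make the two policies disagree.
pool = [
    Backend(Descriptor("memristor-0", "memristive", ["matmul"], 0.05, 5.0, 2.0)),
    Backend(Descriptor("photonic-0", "photonic", ["matmul", "conv"], 0.001, 50.0, 0.1)),
]

candidates = discover(pool, "matmul")
baseline_pick = first_available(candidates)
matched_pick = runtime_aware(candidates, max_power_mw=100.0)

# Walk the matched backend through the lifecycle verbs.
matched_pick.allocate()
result = matched_pick.invoke([1, 2, 3])
matched_pick.reset()
matched_pick.deallocate()

# The descriptor serializes to plain JSON.
descriptor_json = json.dumps(asdict(matched_pick.descriptor))
print(baseline_pick.descriptor.backend_id, matched_pick.descriptor.backend_id)
```

With these invented values the baseline takes the first listed backend while the runtime-aware policy picks the lower-latency one within the power budget. A real matcher would presumably weigh more attributes (reset cost, I/O modality, live telemetry) than this single-key sort.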
Results & Findings
| Metric | Baseline | phys‑MCP |
|---|---|---|
| Descriptor‑portable integration | Manual driver per substrate | Zero‑code descriptor mapping |
| Runtime‑aware backend matching | 23 % higher average latency | 12 % lower latency (optimal substrate selection) |
| Fault recovery (telemetry‑driven) | Full restart, ~1.8 s downtime | Partial reset, ~0.4 s downtime |
| Control‑plane overhead | N/A | ≤ 0.7 ms per invoke (negligible) |
| Wetware API success rate | N/A | 98 % successful end‑to‑end runs |
The data indicate that a unified control plane can automatically pick the “right” physical AI engine for a given workload, keep the system running despite substrate‑specific failures, and do so while adding at most 0.7 ms of control‑plane overhead per invoke.
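To make the fault-recovery row concrete, here is a minimal sketch of how a digital twin could drive the choice between a partial reset and a full restart. The decision rule is an assumption for illustration only; the summary does not describe how phys‑MCP actually decides. `Twin`, `ingest`, `plan_recovery`, the checksum comparison, and the error threshold are all hypothetical, and only the two downtime constants come from the table.

```python
from dataclasses import dataclass

# Downtime figures quoted in the results table above.
FULL_RESTART_S = 1.8
PARTIAL_RESET_S = 0.4


@dataclass
class Twin:
    """Hypothetical digital twin mirroring weights, plasticity, and health."""
    weights_checksum: str
    plasticity_level: float
    error_rate: float = 0.0


def ingest(twin, telemetry):
    """Fold one telemetry sample (a plain dict) into the twin."""
    twin.weights_checksum = telemetry.get("weights_checksum", twin.weights_checksum)
    twin.plasticity_level = telemetry.get("plasticity_level", twin.plasticity_level)
    twin.error_rate = telemetry.get("error_rate", twin.error_rate)
    return twin


def plan_recovery(twin, expected_checksum, error_threshold=0.05):
    """Return (action, expected_downtime_s) under an assumed decision rule."""
    if twin.error_rate <= error_threshold:
        return ("none", 0.0)
    if twin.weights_checksum == expected_checksum:
        # Weights still match the last known-good state: reset only transient state.
        return ("partial_reset", PARTIAL_RESET_S)
    # Weights diverged from the twin's record: fall back to a full restart.
    return ("full_restart", FULL_RESTART_S)


twin = Twin(weights_checksum="abc123", plasticity_level=0.2)
ingest(twin, {"error_rate": 0.30})
print(plan_recovery(twin, expected_checksum="abc123"))
```

Taking the table's figures at face value, a partial reset cuts downtime from about 1.8 s to about 0.4 s, roughly a 4.5× reduction, which is why knowing whether the stored weights are still trustworthy matters.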
Practical Implications
- Edge‑First AI Services – Developers can now write a single inference function and let phys‑MCP dispatch it to the nearest photonic chip, memristive array, or even a bio‑sensor, unlocking ultra‑low‑latency responses for robotics, AR/VR, and IoT gateways.
- Hybrid Cloud‑Edge Pipelines – Cloud orchestrators (Kubernetes, OpenFaaS, etc.) can register physical AI nodes as “specialized nodes,” enabling seamless scaling from cloud GPUs to on‑device PNNs without custom glue code.
- Observability & SLA Enforcement – Telemetry‑driven health checks let operators define SLAs (e.g., “latency < 5 µs, power < 10 mW”) and automatically reroute workloads when a substrate drifts out of spec.
- Rapid Prototyping of New Substrates – Researchers building a novel memristive or wetware accelerator only need to implement the phys‑MCP adapter; the rest of the ecosystem instantly supports it.
- Cost & Energy Savings – By matching workloads to the most energy‑efficient substrate, data‑center operators can shave watts per inference, while edge devices can run AI continuously on harvested power.
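The SLA-enforcement idea from the list above can be illustrated with a small routing check. This is a sketch under assumptions, not the paper's mechanism: `Sla`, `Reading`, `in_spec`, and `route` are hypothetical names, the telemetry readings are invented, and the thresholds simply reuse the example SLA quoted above (latency < 5 µs, power < 10 mW).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    max_latency_us: float
    max_power_mw: float


@dataclass(frozen=True)
class Reading:
    """One telemetry sample; field names are illustrative."""
    backend_id: str
    latency_us: float
    power_mw: float


def in_spec(reading, sla):
    """True when the sample satisfies both SLA bounds."""
    return (reading.latency_us < sla.max_latency_us
            and reading.power_mw < sla.max_power_mw)


def route(preferred_id, readings, sla):
    """Keep the preferred backend while it meets the SLA, else reroute."""
    healthy = [r for r in readings if in_spec(r, sla)]
    for r in healthy:
        if r.backend_id == preferred_id:
            return preferred_id
    # Preferred backend drifted out of spec: pick the fastest healthy one.
    best = min(healthy, key=lambda r: r.latency_us, default=None)
    return best.backend_id if best else None


sla = Sla(max_latency_us=5.0, max_power_mw=10.0)
readings = [
    Reading("photonic-0", latency_us=7.5, power_mw=8.0),    # latency drifted
    Reading("memristor-0", latency_us=3.0, power_mw=4.0),
    Reading("memristor-1", latency_us=2.0, power_mw=12.0),  # over power budget
]
print(route("photonic-0", readings, sla))
```

Here the preferred backend has drifted past the latency bound and the fastest alternative exceeds the power budget, so the workload moves to the one backend that satisfies both limits. Returning `None` when nothing is in spec leaves the fallback policy (queue, degrade, or fail) to the caller.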
Limitations & Future Work
- Prototype Scope – The current implementation covers only three backends; broader validation across more exotic substrates (e.g., quantum‑dot or chemical reactors) is needed.
- Security Model – The paper does not address authentication, isolation, or tamper‑proofing of wetware resources, which could be critical for production deployments.
- Scalability of Digital Twins – Maintaining fine‑grained twins for thousands of devices may introduce storage and processing overhead; future work should explore hierarchical or probabilistic twin representations.
- Standardization Effort – Adoption will require community‑driven extensions to the capability schema and possibly integration with emerging edge‑computing standards (e.g., ETSI MEC).
Overall, phys‑MCP offers a compelling blueprint for turning the wild variety of physical neural hardware into a manageable, observable, and developer‑friendly compute fabric—paving the way for truly edge‑native AI.
Authors
- Stefan Fischer
- Maliheh Hariri
- Sebastian Otte
Paper Information
- arXiv ID: 2605.04256v1
- Categories: cs.DC, cs.ET, cs.NE
- Published: May 5, 2026