[Paper] phys-MCP: A Control Plane for Heterogeneous Physical Neural Networks

Published: May 5, 2026 at 03:48 PM EDT

Source: arXiv - 2605.04256v1

Overview

Physical Neural Networks (PNNs) push AI computation down to the material level—think photonic chips, memristors, even living wetware. While they promise ultra‑low‑latency, energy‑efficient inference at the “extreme edge,” integrating such wildly different substrates into existing edge‑cloud pipelines has been a nightmare. The paper introduces phys‑MCP, a unified control‑plane framework that lets developers discover, invoke, and manage heterogeneous PNN resources just like any other cloud service.

Key Contributions

A substrate‑aware orchestration model that abstracts diverse physical AI backends (photonic, memristive, wetware, etc.) as first‑class compute resources.
Capability & lifecycle semantics that capture substrate‑specific traits such as latency, resettability, plasticity, and I/O modality.
Telemetry & digital‑twin bindings enabling real‑time monitoring, fault detection, and stateful recovery across heterogeneous devices.
Prototype implementation covering three distinct backend classes, an HTTP‑based execution path, and a live integration with Cortical Labs’ wetware API.
Empirical evaluation showing portable descriptor‑based integration, runtime‑aware backend matching, fault‑tolerant recovery, and sub‑millisecond control‑plane overhead.
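As an illustration of the capability and lifecycle semantics listed above, here is a minimal Python sketch of what a substrate descriptor could look like. The class name, field names (`latency_us`, `resettable`, `plasticity`, `io_modality`), and all values are assumptions made for this example; the summary does not reproduce the paper's actual schema.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class SubstrateDescriptor:
    """Hypothetical descriptor; field names are assumptions, not the paper's schema."""
    backend_id: str
    substrate: str                       # e.g. "photonic", "memristive", "wetware"
    capabilities: list = field(default_factory=list)
    latency_us: float = 0.0              # placeholder for a typical invoke latency
    resettable: bool = True              # can the substrate be returned to a known state?
    plasticity: str = "none"             # e.g. "none", "programmable", "adaptive"
    io_modality: str = "electrical"      # e.g. "optical", "electrical", "biochemical"


def to_json(desc):
    """Serialize a descriptor to a JSON string with stable key order."""
    return json.dumps(asdict(desc), sort_keys=True)


def from_json(text):
    """Rebuild a descriptor from its JSON form."""
    return SubstrateDescriptor(**json.loads(text))


# Made-up example values, for illustration only.
example = SubstrateDescriptor(
    backend_id="photonic-0",
    substrate="photonic",
    capabilities=["matrix_multiply"],
    latency_us=1.0,
    resettable=True,
    plasticity="none",
    io_modality="optical",
)
```

Because the descriptor is plain data, it can travel between an adapter and an orchestrator as JSON and be reconstructed on the other side without substrate-specific code.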

Methodology

Capability Modeling – The authors define a JSON‑compatible descriptor that lists a PNN’s functional capabilities (e.g., “optical convolution,” “memristive weight storage”) together with non‑functional attributes (latency, power, reset cost).
Lifecycle API – A small set of REST‑style verbs (discover, allocate, invoke, reset, deallocate) lets any edge/fog/cloud orchestrator treat a PNN like a container or function‑as‑a‑service.
Digital Twin Layer – Each physical substrate is paired with a lightweight digital twin that mirrors its state (weights, plasticity level, health metrics). The twin feeds telemetry back to the control plane for adaptive scheduling and fault recovery.
Backend Instantiation – Three representative backends were built:
- Photonic accelerator (nanophotonic matrix multiplication)
- Memristive crossbar (analog weight storage)
- Wetware interface (Cortical Labs living neural tissue)
  Each backend implements the same phys‑MCP API, exposing its unique I/O (optical, electrical, biochemical) through adapters.
Evaluation Setup – Controlled experiments measured descriptor‑driven dispatch latency, matching quality against a baseline “first‑available” scheduler, and recovery time after injected faults. An end‑to‑end test exercised the wetware path from a cloud workflow down to the living tissue.
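The five lifecycle verbs described above can be sketched as a small in-memory state machine. This is only a toy model under stated assumptions: the paper describes REST-style verbs and an HTTP-based execution path, whereas the `Backend` and `ControlPlane` classes, their method signatures, and the `register` helper here are hypothetical and run in a single process.

```python
class LifecycleError(Exception):
    """Raised when a verb is used in the wrong lifecycle state."""


class Backend:
    """Toy stand-in for a PNN adapter (hypothetical, not a real device driver)."""

    def __init__(self, backend_id, capabilities):
        self.backend_id = backend_id
        self.capabilities = set(capabilities)
        self.allocated = False
        self.invocations = 0  # stands in for accumulated substrate state

    def invoke(self, payload):
        self.invocations += 1
        return {"backend": self.backend_id, "echo": payload}

    def reset(self):
        self.invocations = 0


class ControlPlane:
    """Routes the verbs discover/allocate/invoke/reset/deallocate to backends."""

    def __init__(self):
        self._backends = {}

    def register(self, backend):
        self._backends[backend.backend_id] = backend

    def discover(self, capability):
        # Return the ids of all backends advertising the capability.
        return sorted(b.backend_id for b in self._backends.values()
                      if capability in b.capabilities)

    def allocate(self, backend_id):
        backend = self._backends[backend_id]
        if backend.allocated:
            raise LifecycleError(f"{backend_id} is already allocated")
        backend.allocated = True

    def invoke(self, backend_id, payload):
        return self._require_allocated(backend_id).invoke(payload)

    def reset(self, backend_id):
        self._require_allocated(backend_id).reset()

    def deallocate(self, backend_id):
        self._require_allocated(backend_id).allocated = False

    def _require_allocated(self, backend_id):
        backend = self._backends[backend_id]
        if not backend.allocated:
            raise LifecycleError(f"{backend_id} is not allocated")
        return backend
```

The point of the sketch is that the orchestrator only ever sees the same five verbs, whatever the substrate behind the adapter is.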

Results & Findings

Metric	Baseline	phys‑MCP
Descriptor‑portable integration	Manual driver per substrate	Zero‑code descriptor mapping
Runtime‑aware backend matching	23 % higher average latency	12 % lower latency (optimal substrate selection)
Fault recovery (telemetry‑driven)	Full restart, ~1.8 s downtime	Partial reset, ~0.4 s downtime
Control‑plane overhead	N/A	≤ 0.7 ms per invoke (negligible)
Wetware API success rate	N/A	98 % successful end‑to‑end runs

Taken together, the results indicate that a unified control plane can select a suitable physical backend for a given workload, recover from substrate‑specific failures with a partial reset instead of a full restart, and do so while adding at most 0.7 ms of control‑plane overhead per invoke.
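To show how runtime-aware matching differs from the "first-available" baseline used in the evaluation, here is a small Python sketch. The selection rule (lowest currently reported latency among idle, capable backends), the dictionary layout, and every number in the example data are assumptions for illustration; the summary does not specify the paper's actual matching algorithm.

```python
def first_available(backends, required):
    """Baseline: return the first idle backend that offers the capability."""
    for b in backends:
        if required in b["capabilities"] and not b["busy"]:
            return b["id"]
    return None


def runtime_aware(backends, required, max_latency_us=None):
    """Pick the idle, capable backend with the lowest reported latency.

    Optionally discard backends whose latency exceeds max_latency_us.
    """
    candidates = [
        b for b in backends
        if required in b["capabilities"]
        and not b["busy"]
        and (max_latency_us is None or b["latency_us"] <= max_latency_us)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda b: b["latency_us"])["id"]


# Made-up descriptors and telemetry, for illustration only.
backends = [
    {"id": "memristive-0", "capabilities": {"matmul"},
     "busy": False, "latency_us": 40.0},
    {"id": "photonic-0", "capabilities": {"matmul"},
     "busy": False, "latency_us": 5.0},
    {"id": "wetware-0", "capabilities": {"adaptive_control"},
     "busy": False, "latency_us": 900.0},
]
```

With this data the baseline returns whichever capable backend happens to be listed first, while the runtime-aware variant uses the reported latency to choose between them.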

Practical Implications

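Edge‑First AI Services – Developers could write a single inference function and let phys‑MCP dispatch it to the nearest photonic chip, memristive array, or wetware backend, targeting low‑latency responses for robotics, AR/VR, and IoT gateways. Note that the measured control‑plane overhead of up to 0.7 ms per invoke, not the substrate's raw speed, sets the floor for a single dispatched call.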
Hybrid Cloud‑Edge Pipelines – Cloud orchestrators (Kubernetes, OpenFaaS, etc.) can register physical AI nodes as “specialized nodes,” enabling seamless scaling from cloud GPUs to on‑device PNNs without custom glue code.
Observability & SLA Enforcement – Telemetry‑driven health checks let operators define SLAs (e.g., “latency < 5 µs, power < 10 mW”) and automatically reroute workloads when a substrate drifts out of spec.
Rapid Prototyping of New Substrates – Researchers building a novel memristive or wetware accelerator only need to implement the phys‑MCP adapter; the rest of the ecosystem instantly supports it.
Cost & Energy Savings – By matching workloads to the most energy‑efficient substrate, data‑center operators can reduce the energy spent per inference, while edge devices may be able to run AI continuously on harvested power.
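The SLA-driven rerouting idea from the list above can be sketched in a few lines of Python. The thresholds mirror the example SLA quoted in the text (latency < 5 µs, power < 10 mW); the `Sla` class, the `choose_backend` function, the telemetry layout, and the sample readings are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sla:
    """Hypothetical SLA with strict upper limits on latency and power."""
    max_latency_us: float
    max_power_mw: float


def within_sla(sample, sla):
    """True if a telemetry sample is strictly below both SLA limits."""
    return (sample["latency_us"] < sla.max_latency_us
            and sample["power_mw"] < sla.max_power_mw)


def choose_backend(current, telemetry, sla):
    """Keep the current backend while it meets the SLA.

    Otherwise reroute to the compliant backend with the lowest latency,
    or return None if no backend currently complies.
    """
    if current in telemetry and within_sla(telemetry[current], sla):
        return current
    compliant = [bid for bid, sample in telemetry.items()
                 if within_sla(sample, sla)]
    if not compliant:
        return None
    return min(compliant, key=lambda bid: telemetry[bid]["latency_us"])


sla = Sla(max_latency_us=5.0, max_power_mw=10.0)

# Made-up telemetry readings: photonic-0 has drifted above the latency limit.
telemetry = {
    "photonic-0": {"latency_us": 7.5, "power_mw": 6.0},
    "memristive-0": {"latency_us": 3.0, "power_mw": 8.0},
    "wetware-0": {"latency_us": 900.0, "power_mw": 2.0},
}
```

Keeping the current backend whenever it still complies avoids needless migrations; a production policy would likely also add hysteresis so that a single noisy sample does not trigger a reroute.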

Limitations & Future Work

Prototype Scope – The current implementation covers only three backends; broader validation across more exotic substrates (e.g., quantum‑dot or chemical reactors) is needed.
Security Model – The paper does not address authentication, isolation, or tamper‑proofing of wetware resources, which could be critical for production deployments.
Scalability of Digital Twins – Maintaining fine‑grained twins for thousands of devices may introduce storage and processing overhead; future work should explore hierarchical or probabilistic twin representations.
Standardization Effort – Adoption will require community‑driven extensions to the capability schema and possibly integration with emerging edge‑computing standards (e.g., ETSI MEC).

Overall, phys‑MCP offers a compelling blueprint for turning the wild variety of physical neural hardware into a manageable, observable, and developer‑friendly compute fabric—paving the way for truly edge‑native AI.

Authors

Stefan Fischer
Maliheh Hariri
Sebastian Otte

Paper Information

arXiv ID: 2605.04256v1
Categories: cs.DC, cs.ET, cs.NE
Published: May 5, 2026
