[Paper] Authenticated Workflows: A Systems Approach to Protecting Agentic AI

Published: February 10, 2026 at 10:04 PM EST
5 min read
Source: arXiv - 2602.10465v1

Overview

The paper “Authenticated Workflows: A Systems Approach to Protecting Agentic AI” tackles a pressing problem: today’s enterprise‑grade AI agents (e.g., autonomous assistants that trigger code, fetch data, or invoke external services) can be tricked by crafted prompts or malicious tool calls. The authors propose authenticated workflows, a deterministic security layer that cryptographically guarantees that every step of an AI‑driven workflow respects organizational policies and cannot be tampered with.

Key Contributions

  • Authenticated workflow model that secures the four critical boundaries of an AI agent: prompts, tools, data, and execution context.
  • Introduction of MAPL (Machine‑Agent Policy Language), an AI‑native policy language whose verification cost scales as O(log M + N) in the number of policies M and requests N, instead of the naïve multiplicative blow‑up of O(M × N).
  • Cryptographic attestation for every boundary crossing, turning “intent” checks into provable statements that can be verified at runtime.
  • A universal security runtime that plugs into nine popular LLM‑oriented frameworks (MCP, A2A, OpenAI, Claude, LangChain, CrewAI, AutoGen, LlamaIndex, Haystack) via thin adapters—no changes to the underlying protocols are required.
  • Formal completeness and soundness proofs for the policy enforcement mechanism.
  • Empirical evaluation showing 100 % recall, zero false positives across 174 test cases, mitigation of 9/10 OWASP Top 10 risks, and full remediation of two high‑impact production CVEs.
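MAPL's concrete syntax is not reproduced in this summary, so the following is only a rough Python sketch of the kind of hierarchical, declarative check the paper describes; every name here (the role labels, tool paths, and the policy structure itself) is illustrative, not the paper's actual language.

```python
# Hypothetical MAPL-style policies: hierarchical rules with inheritance,
# so a tool inherits its parent group's constraints unless overridden.
POLICIES = {
    "tools/*":                      {"allowed_roles": {"employee", "finance", "admin"}},
    "tools/finance/*":              {"allowed_roles": {"finance", "admin"}},
    "tools/finance/transfer_funds": {"allowed_roles": {"finance"}},
}

def is_allowed(tool_path: str, role: str) -> bool:
    """Walk from the most specific rule up to the root; the first match wins."""
    parts = tool_path.split("/")
    # Candidate keys, most specific first: exact path, then parent wildcards.
    candidates = [tool_path] + [
        "/".join(parts[:i]) + "/*" for i in range(len(parts) - 1, 0, -1)
    ]
    for key in candidates:
        rule = POLICIES.get(key)
        if rule is not None:
            return role in rule["allowed_roles"]
    return False  # deny by default

print(is_allowed("tools/finance/transfer_funds", "finance"))  # True
print(is_allowed("tools/finance/transfer_funds", "admin"))    # False: overridden below tools/finance/*
print(is_allowed("tools/search", "employee"))                 # True, inherited from tools/*
```

Note that the lookup cost depends on the tool path's depth rather than on the total number of rules, loosely echoing how the paper's hierarchical composition avoids the O(M × N) blow‑up.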

Methodology

  1. Boundary Identification – The authors decompose an AI‑driven workflow into four “trust boundaries”:

    • Prompt boundary (what the user or upstream system asks the model to do)
    • Tool boundary (calls to external APIs, code execution, database queries)
    • Data boundary (inputs/outputs that cross storage or network layers)
    • Context boundary (runtime metadata such as user role, session ID)
  2. Policy Specification with MAPL – MAPL lets security teams write declarative rules like “Only finance‑role users may invoke the transfer_funds tool” or “All data writes must be signed with the organization’s key”. Policies are organized hierarchically, enabling reuse and inheritance, which yields the logarithmic scaling.

  3. Cryptographic Attestation – Every request that crosses a boundary is wrapped in a signed token that includes:

    • The intent (the specific operation)
    • The policy hash (ensuring the token was generated under the current policy set)
    • A nonce/timestamp (to prevent replay attacks)

    The runtime verifies the signature before allowing the operation. If verification fails, the request is rejected outright.

  4. Runtime Integration – Thin adapters are built for each target framework. The adapters intercept calls (e.g., langchain.run(), openai.ChatCompletion.create()) and inject the attestation logic without altering the framework’s public API.

  5. Formal Verification & Empirical Testing – The authors prove that, under the assumed cryptographic primitives, the system is complete (all legitimate operations are accepted) and sound (no malicious operation can be accepted). They then run a battery of 174 test cases covering prompt injection, tool misuse, data exfiltration, and context spoofing.
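The paper's exact token format is not given in this summary, so the following is a minimal sketch of the signed boundary crossing in step 3, using an HMAC in place of whatever signature scheme the authors actually chose; the field names and key handling are assumptions.

```python
import hashlib
import hmac
import json
import os
import time

SECRET = os.urandom(32)  # stand-in for an organization-managed signing key

def policy_hash(policies: dict) -> str:
    """Hash of the current policy set, so tokens minted under stale policies fail."""
    return hashlib.sha256(json.dumps(policies, sort_keys=True).encode()).hexdigest()

def attest(intent: str, policies: dict) -> dict:
    """Wrap a boundary crossing in a signed token: intent, policy hash, nonce, timestamp."""
    body = {
        "intent": intent,
        "policy_hash": policy_hash(policies),
        "nonce": os.urandom(16).hex(),
        "ts": time.time(),
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return body

def verify(token: dict, policies: dict, max_age_s: float = 30.0) -> bool:
    """Reject on a bad signature, a stale policy set, or an expired timestamp."""
    body = {k: v for k, v in token.items() if k != "sig"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(expected, token["sig"])
        and token["policy_hash"] == policy_hash(policies)
        and time.time() - token["ts"] < max_age_s
    )

policies = {"transfer_funds": {"roles": ["finance"]}}
token = attest("transfer_funds", policies)
assert verify(token, policies)       # accepted
token["intent"] = "delete_all"       # tamper with the intent
assert not verify(token, policies)   # signature check now fails
```

A real deployment would additionally track seen nonces within the freshness window to block replays, and would use asymmetric signatures rather than a shared HMAC key if verifiers are untrusted.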

Results & Findings

| Metric | Outcome |
| --- | --- |
| Recall / false positives | 100 % recall, 0 % false positives across all test cases |
| OWASP Top 10 coverage | Mitigated 9 of 10 categories (e.g., Injection, Broken Authentication, Security Misconfiguration) |
| CVE mitigation | Fully blocked two real‑world CVEs that exploited LLM‑driven tool calls |
| Performance overhead | Average latency increase of ~12 ms per boundary check (≈ 0.5 % of typical request latency) |
| Policy scaling | Adding 10 k policies increased verification time by < 3 % thanks to MAPL's hierarchical composition |

These numbers demonstrate that deterministic, cryptographic enforcement can be added to existing AI stacks without sacrificing responsiveness.
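As a sketch of how such a thin adapter might drop enforcement into an existing stack without changing a framework's public API, consider a decorator that verifies the caller's context and attaches an attestation before forwarding the call; the wrapped function and helper names below are illustrative, not the paper's actual adapter code.

```python
import functools

class PolicyViolation(Exception):
    """Raised when a boundary check rejects an operation outright."""

def authenticated(tool_name, check, attest):
    """Decorator: verify the calling context against policy, attach an
    attestation token, and only then forward to the original function."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, context=None, **kwargs):
            if not check(tool_name, context):
                raise PolicyViolation(f"{tool_name} denied for context {context!r}")
            kwargs["attestation"] = attest(tool_name, context)
            return fn(*args, **kwargs)
        return inner
    return wrap

# Illustrative use: guard a tool with a toy role check and token builder.
@authenticated(
    "transfer_funds",
    check=lambda tool, ctx: bool(ctx) and ctx.get("role") == "finance",
    attest=lambda tool, ctx: {"intent": tool, "user": ctx["user"]},
)
def transfer_funds(amount, attestation=None):
    return {"ok": True, "amount": amount, "attestation": attestation}

result = transfer_funds(100, context={"user": "alice", "role": "finance"})
print(result["ok"])  # True
```

Because the wrapping happens at call interception, the underlying function's signature and the framework's protocol stay untouched, which is the property that lets one runtime serve nine different frameworks.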

Practical Implications

  • Enterprise AI Governance – Companies can now enforce “who can ask what” and “which tools may be invoked” with provable guarantees, reducing reliance on fragile prompt‑filter heuristics.
  • Compliance Automation – MAPL policies can be tied to regulatory requirements (e.g., GDPR data‑access constraints) and automatically audited because every operation carries a verifiable proof.
  • Secure AI‑Powered SaaS – SaaS providers that expose LLM‑backed APIs can embed authenticated workflow adapters to protect against prompt injection attacks that have plagued public models.
  • DevOps Integration – The thin adapters can be dropped into CI/CD pipelines; policy updates propagate instantly because the runtime re‑loads MAPL definitions without redeploying the underlying model services.
  • Vendor‑Neutral Shield – Since the approach works across multiple LLM providers and orchestration frameworks, organizations are not locked into a single vendor’s security features.

Limitations & Future Work

  • Key Management Complexity – The security guarantees hinge on robust handling of signing keys; the paper assumes a trusted PKI but does not detail rotation or compromise recovery procedures.
  • Policy Authoring Overhead – While MAPL reduces rule explosion, writing correct, non‑conflicting policies for large organizations still requires expertise and tooling support.
  • Dynamic Model Updates – Rapid model version changes (e.g., fine‑tuning) may affect the semantics of intents; future work could explore automated policy adaptation to model drift.
  • Performance at Scale – The reported overhead is modest for typical workloads, but large‑scale batch processing or low‑latency edge deployments may need further optimization.
  • Broader Threat Landscape – The evaluation focuses on OWASP Top 10 and known CVEs; emerging attacks (e.g., multi‑modal prompt injection) remain an open research area.

Bottom line: Authenticated workflows bring a cryptographically provable security layer to the fast‑moving world of agentic AI, turning probabilistic guardrails into deterministic guarantees. For developers building AI‑augmented products, the approach offers a pragmatic path to compliance, risk reduction, and confidence that their autonomous agents will do exactly what they’re allowed to do: no more, no less.

Authors

  • Mohan Rajagopalan
  • Vinay Rao

Paper Information

  • arXiv ID: 2602.10465v1
  • Categories: cs.CR, cs.AI, cs.DC, cs.MA
  • Published: February 11, 2026