[Paper] Authenticated Workflows: A Systems Approach to Protecting Agentic AI

Published: February 10, 2026 at 10:04 PM EST
5 min read
Source: arXiv:2602.10465v1

Overview

The paper “Authenticated Workflows: A Systems Approach to Protecting Agentic AI” tackles a pressing problem: today’s enterprise‑grade AI agents (e.g., autonomous assistants that trigger code, fetch data, or invoke external services) can be tricked by crafted prompts or malicious tool calls. The authors propose authenticated workflows, a deterministic security layer that cryptographically guarantees that every step of an AI‑driven workflow respects organizational policies and cannot be tampered with.

Key Contributions

  • Authenticated workflow model – Secures the four critical boundaries of an AI agent: prompts, tools, data, and execution context.
  • Introduction of MAPL (Machine‑Agent Policy Language) – An AI‑native policy language whose enforcement cost scales logarithmically with the number of policies, O(log M + N), instead of the naïve quadratic blow‑up, O(M × N).
  • Cryptographic attestation for every boundary crossing – Turns “intent” checks into provable statements that can be verified at runtime.
  • Universal security runtime – Plugs into nine popular LLM‑oriented frameworks (MCP, A2A, OpenAI, Claude, LangChain, CrewAI, AutoGen, LlamaIndex, Haystack) via thin adapters; no changes to the underlying protocols are required.
  • Formal completeness and soundness proofs for the policy‑enforcement mechanism.
  • Empirical evaluation showing:
    • 100 % recall, zero false positives across 174 test cases,
    • Mitigation of 9/10 OWASP Top 10 risks, and
    • Full remediation of two high‑impact production CVEs.
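The logarithmic scaling claimed for MAPL is not spelled out in this summary, but the intuition is straightforward: if policies are organized and indexed rather than scanned flat, each of N requests resolves its governing rule in O(log M) instead of O(M). A minimal, hypothetical sketch (the policy keys and decisions below are invented for illustration, not taken from the paper):

```python
import bisect

# Hypothetical: M policies indexed by a sorted (role, tool) key.
# A flat scan checks every policy against every request: O(M x N).
# Sorting once lets each of the N requests find its governing
# policy with a binary search: O(log M) per request.
policies = sorted([
    ("analyst", "read_report", "allow"),
    ("finance", "transfer_funds", "allow"),
    ("guest", "transfer_funds", "deny"),
])
keys = [(role, tool) for role, tool, _ in policies]

def decision(role: str, tool: str, default: str = "deny") -> str:
    """Binary-search the sorted policy index for an exact match."""
    i = bisect.bisect_left(keys, (role, tool))
    if i < len(keys) and keys[i] == (role, tool):
        return policies[i][2]
    return default  # default-deny when no policy matches

print(decision("finance", "transfer_funds"))  # allow
print(decision("guest", "read_report"))       # deny (no matching rule)
```

The default‑deny fallback mirrors the paper's enforcement posture: an operation not explicitly authorized by some policy is rejected.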

Methodology

  1. Boundary Identification – The authors decompose an AI‑driven workflow into four trust boundaries:

    • Prompt boundary – what the user or upstream system asks the model to do.
    • Tool boundary – calls to external APIs, code execution, database queries.
    • Data boundary – inputs/outputs that cross storage or network layers.
    • Context boundary – runtime metadata such as user role or session ID.
  2. Policy Specification with MAPL – MAPL lets security teams write declarative rules, e.g.:

    • “Only finance‑role users may invoke the transfer_funds tool.”
    • “All data writes must be signed with the organization’s key.”

    Policies are organized hierarchically, enabling reuse and inheritance, which yields the claimed logarithmic scaling.

  3. Cryptographic Attestation – Every request that crosses a boundary is wrapped in a signed token containing:

    • Intent – the specific operation being requested.
    • Policy hash – ensures the token was generated under the current policy set.
    • Nonce/timestamp – prevents replay attacks.

    The runtime verifies the signature before allowing the operation; a failed verification results in an outright rejection.

  4. Runtime Integration – Thin adapters are built for each target framework. The adapters intercept calls (e.g., langchain.run(), openai.ChatCompletion.create()) and inject the attestation logic without altering the framework’s public API.

  5. Formal Verification & Empirical Testing

    • Formal proof: Under standard cryptographic assumptions, the system is complete (all legitimate operations are accepted) and sound (no malicious operation can be accepted).
    • Empirical evaluation: A battery of 174 test cases covering prompt injection, tool misuse, data exfiltration, and context spoofing validates the claims.
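The attestation token from step 3 can be sketched end to end. This is a hypothetical illustration of the intent/policy‑hash/nonce structure described above, not the paper's wire format: a symmetric HMAC stands in for the organization's signing key, whereas a production deployment would use asymmetric signatures under a PKI.

```python
import hashlib, hmac, json, secrets, time

SIGNING_KEY = secrets.token_bytes(32)  # stand-in for the org's key

def policy_hash(policies: list[str]) -> str:
    """Bind the token to the exact policy set currently in force."""
    return hashlib.sha256("\n".join(sorted(policies)).encode()).hexdigest()

def issue_token(intent: str, policies: list[str]) -> dict:
    body = {
        "intent": intent,                      # specific operation requested
        "policy_hash": policy_hash(policies),  # current policy set
        "nonce": secrets.token_hex(16),        # replay prevention
        "ts": int(time.time()),                # freshness
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return body

def verify_token(token: dict, policies: list[str], max_age_s: int = 60) -> bool:
    body = {k: v for k, v in token.items() if k != "sig"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(expected, token["sig"])        # untampered
        and token["policy_hash"] == policy_hash(policies)  # policies current
        and time.time() - token["ts"] <= max_age_s         # fresh
    )

active = ["finance may invoke transfer_funds"]
tok = issue_token("transfer_funds", active)
assert verify_token(tok, active)
tok["intent"] = "delete_ledger"   # any tampering breaks the signature
assert not verify_token(tok, active)
```

Note how the policy hash gives the "generated under the current policy set" guarantee: if the policy set changes, previously issued tokens stop verifying.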

Results & Findings

  • Recall / False Positives – 100 % recall, 0 % false positives across all 174 test cases.
  • OWASP Top 10 Coverage – Mitigated 9 of 10 categories (e.g., Injection, Broken Authentication, Security Misconfiguration).
  • CVE Mitigation – Fully blocked two real‑world CVEs that exploited LLM‑driven tool calls.
  • Performance Overhead – Average latency increase of ~12 ms per boundary check (≈ 0.5 % of typical request latency).
  • Policy Scaling – Adding 10 k policies increased verification time by < 3 %, thanks to MAPL’s hierarchical composition.

Practical Implications

  • Enterprise AI Governance – Organizations can enforce “who can ask what” and “which tools may be invoked” with provable guarantees, eliminating reliance on fragile prompt‑filter heuristics.
  • Compliance Automation – MAPL policies can be mapped to regulatory requirements (e.g., GDPR data‑access constraints) and automatically audited, since every operation carries a verifiable proof.
  • Secure AI‑Powered SaaS – SaaS providers that expose LLM‑backed APIs can embed authenticated workflow adapters to protect against prompt‑injection attacks that have plagued public models.
  • DevOps Integration – The lightweight adapters can be dropped into CI/CD pipelines; policy updates propagate instantly because the runtime reloads MAPL definitions without redeploying the underlying model services.
  • Vendor‑Neutral Shield – Because the approach works across multiple LLM providers and orchestration frameworks, organizations avoid lock‑in to a single vendor’s security features.
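The "thin adapter" integration pattern behind several of these points can be sketched as a wrapper that enforces the attestation check before an intercepted call proceeds, leaving the wrapped API unchanged. Everything below is a hypothetical stand‑in: `run_tool` plays the role of a framework entry point such as `langchain.run()`, and `verify_attestation` is a placeholder for the runtime's real verifier.

```python
import functools

def verify_attestation(context: dict) -> bool:
    # Placeholder: a real runtime would verify the signed token here.
    return context.get("attested", False)

def authenticated(fn):
    """Thin adapter: enforce a boundary check, then delegate unchanged."""
    @functools.wraps(fn)
    def wrapper(*args, context: dict, **kwargs):
        if not verify_attestation(context):
            raise PermissionError(f"boundary check failed for {fn.__name__}")
        return fn(*args, **kwargs)
    return wrapper

@authenticated
def run_tool(query: str) -> str:
    # Stand-in for an intercepted framework call.
    return f"result for {query!r}"

print(run_tool("quarterly totals", context={"attested": True}))
try:
    run_tool("quarterly totals", context={})
except PermissionError as e:
    print(e)  # rejected before the tool ever executes
```

Because rejection happens in the wrapper, the framework's public API and the model service itself need no redeployment when policies change, which is what makes the CI/CD and vendor‑neutral claims plausible.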

Limitations & Future Work

  • Key Management Complexity – The security guarantees hinge on robust handling of signing keys; the paper assumes a trusted PKI but does not detail rotation or compromise‑recovery procedures.
  • Policy Authoring Overhead – While MAPL reduces rule explosion, writing correct, non‑conflicting policies for large organizations still requires expertise and tooling support.
  • Dynamic Model Updates – Rapid model version changes (e.g., fine‑tuning) may affect the semantics of intents; future work could explore automated policy adaptation to model drift.
  • Performance at Scale – The reported overhead is modest for typical workloads, but large‑scale batch processing or low‑latency edge deployments may need further optimization.
  • Broader Threat Landscape – The evaluation focuses on OWASP Top 10 and known CVEs; emerging attacks (e.g., multi‑modal prompt injection) remain an open research area.

Bottom line: Authenticated workflows bring a cryptographically provable security layer to the fast‑moving world of agentic AI, turning probabilistic guardrails into deterministic guarantees. For developers building AI‑augmented products, the approach offers a pragmatic path to compliance, risk reduction, and confidence that their autonomous agents will do exactly what they’re allowed to do—no more, no less.

Authors

  • Mohan Rajagopalan
  • Vinay Rao

Paper Information

  • arXiv ID – 2602.10465v1
  • Categories – cs.CR, cs.AI, cs.DC, cs.MA
  • Published – February 11, 2026