We Tested Agentic AI Against 525 Real Attacks. Here's What We Found.

Published: March 13, 2026 at 12:14 AM EDT
4 min read
Source: Dev.to

Introduction

We ran the numbers. The threat is real.

For the past several months, we’ve been building and validating Cerberus — an open‑source runtime security harness for agentic AI systems. It is designed around a specific threat model we call the Lethal Trifecta: the simultaneous convergence, within a single AI execution turn, of privileged data access, untrusted content injection, and an outbound exfiltration path.

We just finished our first formal validation run: 525 attack trials across three major AI providers. Below are the key findings.

Attack Success Rates

Full injection compliance – agent fully redirected to attacker’s address

| Model | Success Rate | 95 % CI | Causation Score |
|---|---|---|---|
| GPT‑4o‑mini | 90.3 % | 84.8 % – 93.9 % | 0.811 |
| Gemini 2.5 Flash | 82.4 % | 75.9 % – 87.5 % | 0.702 |
| Claude Sonnet | 6.7 % | 3.8 % – 11.5 % | 0.207 |

  • Control group: 0/30 exfiltrations across all providers (clean baseline).
  • Statistical significance: Fisher’s exact test, OpenAI p …

“This is not a theoretical vulnerability. At a 90 % success rate, the Lethal Trifecta is a reliable attack primitive against current production AI systems.”
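
The reported 95 % intervals are consistent with Wilson score intervals, although the post does not name the method. The sketch below is a minimal TypeScript version of that calculation. It assumes 165 attack trials per provider with 149, 136, and 11 successes; those counts are inferred from the published percentages and are not stated in the post.

```typescript
// Wilson score interval for a binomial proportion (z = 1.96 gives ~95 %).
function wilsonInterval(successes: number, trials: number, z = 1.96): [number, number] {
  if (trials <= 0) throw new Error("trials must be positive");
  const p = successes / trials;
  const z2 = z * z;
  const denom = 1 + z2 / trials;
  const center = (p + z2 / (2 * trials)) / denom;
  const half = (z * Math.sqrt((p * (1 - p)) / trials + z2 / (4 * trials * trials))) / denom;
  // Clamp to [0, 1] to absorb floating-point noise at the extremes.
  return [Math.max(0, center - half), Math.min(1, center + half)];
}

// Format a fraction as a percentage with one decimal place.
const pct = (x: number): string => (x * 100).toFixed(1) + " %";

// ASSUMPTION: counts inferred from the reported rates, not published in the post.
const inferred = [
  { model: "GPT-4o-mini", successes: 149, trials: 165 },
  { model: "Gemini 2.5 Flash", successes: 136, trials: 165 },
  { model: "Claude Sonnet", successes: 11, trials: 165 },
  { model: "Control group", successes: 0, trials: 30 },
];

for (const row of inferred) {
  const [lower, upper] = wilsonInterval(row.successes, row.trials);
  console.log(`${row.model}: ${pct(lower)} to ${pct(upper)}`);
}
```

Under those assumed counts the bounds come out the same as the table's, and the 0/30 control row gives an upper bound of 11.4 %.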

What Is the Lethal Trifecta?

The attack chain requires three conditions to align within a single execution turn:

  1. Privileged data access – the agent can see sensitive operational or financial data.
  2. Untrusted content injection – the agent processes external input (e.g., a vendor document, an invoice, a client email, a compliance filing).
  3. Outbound exfiltration path – the agent has authority to take downstream actions that can carry data out of the system.
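
The convergence check can be illustrated with a small TypeScript sketch. Everything in it is hypothetical: the `TurnEvent` shape, the field names, and `assessTurn` are made up for illustration and are not the Cerberus API. It only shows the idea that a turn is flagged when all three conditions appear together.

```typescript
// HYPOTHETICAL event model for one agent execution turn (not the Cerberus API).
type TurnEvent =
  | { kind: "data_read"; source: string; privileged: boolean }
  | { kind: "content_in"; origin: string; trusted: boolean }
  | { kind: "outbound"; destination: string };

interface TrifectaStatus {
  privilegedAccess: boolean;
  untrustedContent: boolean;
  outboundPath: boolean;
  converged: boolean; // true only when all three conditions hold in the same turn
}

function assessTurn(events: TurnEvent[]): TrifectaStatus {
  const privilegedAccess = events.some((e) => e.kind === "data_read" && e.privileged);
  const untrustedContent = events.some((e) => e.kind === "content_in" && !e.trusted);
  const outboundPath = events.some((e) => e.kind === "outbound");
  return {
    privilegedAccess,
    untrustedContent,
    outboundPath,
    converged: privilegedAccess && untrustedContent && outboundPath,
  };
}

// Example turn: reads inventory data, ingests a vendor invoice, then sends an email.
const exampleTurn: TurnEvent[] = [
  { kind: "data_read", source: "inventory-db", privileged: true },
  { kind: "content_in", origin: "vendor-invoice.pdf", trusted: false },
  { kind: "outbound", destination: "email" },
];
console.log(assessTurn(exampleTurn).converged);
```

Removing any one of the three events from `exampleTurn` makes `converged` false, which is the sense in which the attack needs all three conditions in a single turn.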

Why It Matters

  • Supply‑chain environments: AI agents orchestrate procurement workflows, read vendor submissions, cross‑reference inventory systems, and trigger purchase orders.
  • Financial services: Agents summarize client portfolios, process compliance documents, and triage transactions.

Both architectures deliver all three conditions by design, making the Lethal Trifecta the default configuration of agentic AI in these industries. Our data shows that two of the three major model providers are vulnerable at rates above 80 %.

What Cerberus Detected — and Where the Gaps Are

Cerberus operates in three detection layers. The validation ran in observe‑only mode—agents behaved normally, and Cerberus recorded observations without intervening.
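
To make "observe‑only" concrete, here is a minimal TypeScript sketch of a wrapper that records each tool call and then lets it proceed unchanged. The names `observeOnly`, `Observation`, and `sendEmail` are hypothetical and are not taken from @cerberus-ai/core; the sketch only illustrates recording without intervening.

```typescript
// HYPOTHETICAL sketch of observe-only instrumentation (not the @cerberus-ai/core API).
type Observation = { tool: string; args: unknown[]; at: number };

function observeOnly<A extends unknown[], R>(
  tool: string,
  fn: (...args: A) => R,
  sink: Observation[],
): (...args: A) => R {
  return (...args: A): R => {
    // Record what the agent is about to do, then run the call unchanged.
    sink.push({ tool, args, at: Date.now() });
    return fn(...args);
  };
}

const observations: Observation[] = [];

// Stand-in tool for illustration; a real agent tool would perform I/O here.
const sendEmail = observeOnly(
  "sendEmail",
  (to: string, body: string) => `sent ${body.length} chars to ${to}`,
  observations,
);

const outcome = sendEmail("ops@example.com", "hello");
console.log(outcome, observations.length);
```

The wrapped tool returns exactly what the original would, so the agent's behavior is unaffected while the call is still logged.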

Detection Layer Results (N = 525)

| Layer | Description | Detection Rate |
|---|---|---|
| L1 | Data Source Monitoring | 100 % (97.9 % – 100 %) |
| L2 | Token Provenance Tracking | 100 % (97.9 % – 100 %) |
| L3 | Outbound Intent Detection | 13.7 % (OpenAI) / 1.1 % (Anthropic) / 65.7 % (Google) |

  • Overall detection rate: 28.5 % (24.7 % – 32.6 %).
  • False‑positive rate: 0.0 % (0.0 % – 11.4 %) — zero false triggers across 30 clean control runs.

Per‑Category Detection (All Providers Combined)

  • Direct Injection: 37.8 % (28.5 % – 48.1 %)
  • Encoded/Obfuscated: 37.5 % (27.2 % – 49.0 %)
  • Multi‑turn: 33.3 % (22.9 % – 45.6 %)
  • Multilingual: 33.3 % (22.9 % – 45.6 %)
  • Advanced Technique: 20.0 % (14.1 % – 27.5 %)
  • Social Engineering: 15.3 % (8.8 % – 25.3 %)

The L3 detection gap is a known limitation and the current focus of active development. L1 and L2 coverage is production‑ready; L3 is where the adversarial arms race is happening.

Negligible Performance Overhead

  • p50 latency: 52 µs per session
  • p99 latency: 0.23 ms per session
  • Overhead: 0.01 % of typical LLM latency (~2 s)

Against a typical LLM response time of ~2 seconds, even the p99 figure of 0.23 ms amounts to roughly 0.01 %, so added latency is unlikely to be an argument against deployment.
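
The overhead figure can be checked with simple arithmetic. The TypeScript sketch below divides the reported p50 and p99 latencies by the ~2 s baseline quoted above; the helper name `overheadPercent` is illustrative only.

```typescript
// Added latency as a percentage of a baseline latency (both in seconds).
function overheadPercent(addedSeconds: number, baselineSeconds: number): number {
  return (addedSeconds / baselineSeconds) * 100;
}

const baselineSeconds = 2; // ~2 s typical LLM response time, as quoted in the post

const p50Overhead = overheadPercent(52e-6, baselineSeconds);   // 52 µs
const p99Overhead = overheadPercent(0.23e-3, baselineSeconds); // 0.23 ms

console.log(p50Overhead.toFixed(4) + " %"); // 0.0026 %
console.log(p99Overhead.toFixed(4) + " %"); // 0.0115 %
```

The p99 value rounds to the 0.01 % quoted in the list, so that figure appears to describe the tail case; the median overhead is smaller still.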

Implications for Supply‑Chain and Financial Services

If your agentic AI deployment uses GPT‑4o‑mini or Gemini 2.5 Flash and processes external documents (vendor submissions, invoices, client communications, compliance filings), note that the Lethal Trifecta attack succeeded in more than 80 % of our trials against both models.

The critical question is not whether the attack is possible, but whether you have a runtime layer that can detect when all three trifecta conditions are active in a single execution turn. Most deployments today lack such visibility.

Getting Started with Cerberus

  • GitHub:
  • npm package: @cerberus-ai/core (signed provenance)
  • Demo:
  • Company site:

Tags: #AgenticAI #SupplyChain #FinancialServices #CyberSecurity #RuntimeSecurity #PromptInjection #OpenSource #Cerberus #SixSense #LLMSecurity #RedTeam
