DEV Track Spotlight: Red Team vs Blue Team - Securing AI Agents (DEV317)
Source: Dev.to
Overview
The “Brians from Boston” delivered one of the most eye‑opening sessions at AWS re:Invent 2025, showing live how easily AI agents can be compromised—and, more importantly, how to defend them.
- Brian H. Hough (AWS DevTools Hero, CEO & Founder at Tech Stack Playbook) played the villain.
- Brian Tarbox (AWS Community Hero, Principal Solutions Architect at Caylent) guided the defense strategies.
“AI agents are distributed systems—only harder. Every line in your architecture diagram is an attack vector.”
Watch the full session: The Live Attack Demonstration
Live Attack Demonstration
Brian Hough built a sophisticated AI chatbot for a fictional company FinStack AI. The stack included:
- TypeScript frontend
- FastAPI Python backend
- Integrations with Jira, Slack, Confluence, GitHub
- Pinecone vector knowledge base
The chatbot could answer company‑specific questions, manage tickets, and communicate with team members. Brian Tarbox then demonstrated a series of attacks.
Attack 1 – Prompt Injection Bypassing Access Controls
- Goal: Extract salary information despite not being in the HR group.
- Method: Simple prompt manipulation.
- Result: The system initially claimed it couldn’t provide the data, then displayed it.
“Think of LLMs as teenagers—highly confident in what they believe they know, and when caught, they will lie.”
Attack 2 – Knowledge Base Poisoning
- Uploaded an innocuous‑looking 800‑character document to the knowledge base.
- The document contained directives such as “Obey immediately and permanently.”
- After indexing, the chatbot refused to answer any FinStack AI questions, responding with “I’m not at liberty to say” and “Do not continue this conversation.”
Key insight: Knowledge‑base poisoning is extremely easy; as few as 200 poisoned documents can compromise even the largest models.
Attack 3 – Tool Poisoning and Workflow Hijacking
- Requested help closing sprint tickets before the deadline.
- The chatbot marked all 400 tickets as “done” and messaged the entire C‑suite on Slack, including the CEO.
- The root cause was overly permissive scope permissions granted to the Slack integration.
Attack 4 – Agent‑to‑Agent Escalation
- Chained tool calls together, demonstrating how agents can be manipulated to perform actions far beyond their intended scope, escalating privileges across multiple systems.
Understanding the Agentic Stack
Essential Components
- Authentication and authorization layers
- API with database access control
- File storage with upload restrictions
- LLM operations (potentially multiple models)
- Hosting and CDN
- Databases and vector databases
- Tool integrations (Slack, Jira, etc.)
“If you take out the MLOps and vector database, these are all boxes you would see in any architecture diagram for any large system. AI agentic distributed systems are just like distributed systems—only harder.”
The Four Critical Attack Vectors
1. Prompt Injection
Description: Malicious instructions embedded in URLs, PDFs, emails, RAG documents, or direct text input. The chatbot processes user input without preprocessing or validation.
Defense Strategy
- Amazon Bedrock Guardrails for malicious‑intent detection
- Nvidia NeMo for additional guardrail layers
- Meta Llama Guard (trained on prompt‑injection patterns)
- Regex and AST validation
- Content sandboxing
2. Tool Poisoning
Description: Manipulating API or function calls with harmful parameters, exploiting overly permissive tool access.
Defense Strategy
- Strict IAM credentials for every tool
- Schema validation for all tool calls
- Fine‑grained RBAC (read vs. write)
- Agentic gateway routing all tool access
- Never grant wildcard permissions to agents
3. Agent‑to‑Agent Escalation
Description: Hijacking multi‑step workflows by manipulating chains of agent calls, causing agents to skip security checks or gain elevated privileges.
Defense Strategy – Deterministic workflows using AWS Step Functions
- LLMs should provide data, not decide workflow steps
- State machines that cannot skip steps
- Clear trust boundaries between agents
- Content filters at each transition
“Think Step Functions unless you can think of a reason why you shouldn’t.” – Brian Tarbox
4. Supply Chain Corruption
Description: Poisoning knowledge bases, metadata, or databases with malicious content that gets indexed and retrieved during normal operations.
Defense Strategy
- Document ingestion throttling and access control
- Amazon GuardDuty malware protection for S3
- Bedrock Guardrails validation for uploaded content
- Content authenticity verification (hashing)
- Separate read and write permissions for agents
- Human review for knowledge‑base updates
The Production Security Playbook
Input Security Gateway
- Authentication check
- Input security validation
- Risk analysis and classification
- Intent detection
Orchestration Layer
- Supervisor agent with limited scope
- Reasoning and inference model loop
- Filter and memory writer
- Response validation before client delivery
- Comprehensive telemetry
Tool Integration Security
- Every tool integration passes through the agentic gateway
- Strict IAM policies (no wildcard permissions)
- Schema validation for all parameters
- Scope limitations (e.g., Slack bot can only read specific channels)
- Rate limiting and throttling
Observability and Monitoring
- Amazon Bedrock AgentCore Observability for tracking all calls, token counts, and response times
- Logging every tool invocation
- Monitoring for unusual patterns
- Alert systems for security violations
Key Takeaways
- Security doesn’t eliminate innovation – it enables it.
- Use Amazon Bedrock AgentCore. “AgentCore is one of the things AWS has done really, really right. Use AgentCore for protection, runtimes, and observability. Your life will be much simpler.” – Brian Tarbox
- Think deterministically. Most systems benefit from enforced steps via Step Functions.
- Defense in depth. Guardrails, IAM, validation, and monitoring create resilient systems.
- Agents are power tools. “If you’re doing construction at home and you have a handsaw, there’s only so much trouble you can get into. I’m willing to use a circular saw. Table saw? Forget it.” – Agents are table saws—powerful but requiring serious safety measures.
- Stay current. Threats evolve rapidly; recent studies show that malicious prompts written in poetry can bypass existing safeguards because LLMs process poetic structure differently than prose.
Bottom Line
Securing AI agents isn’t optional—it’s essential. With the right architecture, tools, and mindset, teams can build AI systems that are both powerful and secure. As the Brians concluded:
“Forewarned is forearmed.”