Why I stopped trusting AI agents and built a security enforcer.

Published: March 9, 2026 at 10:32 PM EDT
3 min read
Source: Dev.to

Introduction

Every tutorial on building AI agents includes some version of this line:

“Add a system prompt telling the model not to access sensitive data.”

I followed that advice for a while, then realized I was asking a probabilistic text predictor to enforce a security boundary. That’s not security—it’s optimism.

AI agents fail in predictable, documented ways:

  • Tool misuse – the agent calls a tool it shouldn’t.
  • Prompt injection through tool outputs – the agent browses a webpage, calls a tool, and is tricked.
  • Secret leakage – API keys, tokens, and PII flow through tool inputs and outputs.
  • Uncapped spend – an agent in a loop or hitting an error can generate unlimited costs.

These are not exotic attacks; they are the boring, predictable failure modes that any production system must handle. The problem isn’t the model—it does exactly what it’s designed to do. Security requirements need to be enforced in deterministic code, not in the model’s prompt.

The core idea behind Argus is to keep the LLM inside a security‑enforced environment, where every tool call is vetted by code.

How Argus Works

Every tool call in Argus passes through a SecurityGateway before it is executed.

# Before the tool runs:
gateway.pre_tool_call(
    tool_name="read_file",
    arguments={"path": "/etc/passwd"},
    agent_role="executor"
)

# The tool runs (or doesn’t, if blocked)

# After the tool returns:
gateway.post_tool_call(
    tool_name="read_file",
    result=tool_output,
    agent_role="executor"
)
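The pre/post flow can be sketched with a toy gateway. The class below is a stand-in that mirrors the interface shown above, not Argus's real implementation; the deny rule and PermissionDeniedError are hypothetical:

```python
# Toy stand-in for the SecurityGateway, illustrating the pre/post hooks.
# The deny list and PermissionDeniedError are illustrative, not Argus's API.

class PermissionDeniedError(Exception):
    pass

class ToySecurityGateway:
    def __init__(self, denied_paths=("/etc/passwd", "/etc/shadow")):
        self.denied_paths = set(denied_paths)

    def pre_tool_call(self, tool_name, arguments, agent_role):
        # Deterministic check, evaluated before the tool ever runs.
        if tool_name == "read_file" and arguments.get("path") in self.denied_paths:
            raise PermissionDeniedError(
                f"{agent_role} may not read {arguments['path']}")

    def post_tool_call(self, tool_name, result, agent_role):
        # Output-side hook: a real gateway would run redaction and
        # injection checks on the tool's result here.
        return result

gateway = ToySecurityGateway()

def guarded_read_file(path, agent_role="executor"):
    gateway.pre_tool_call("read_file", {"path": path}, agent_role)
    result = f"contents of {path}"   # pretend tool execution
    return gateway.post_tool_call("read_file", result, agent_role)

try:
    guarded_read_file("/etc/passwd")
except PermissionDeniedError as e:
    print("blocked:", e)
```

The point is that the block/allow decision happens in ordinary Python, before the tool executes, regardless of what the model was prompted to do.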

Each call runs five checks in sequence:

  1. Permission enforcement – using Casbin (RBAC + ABAC). Each agent role has explicit permissions.
  2. Prompt injection detection – matches against the 14 OWASP LLM01:2025 patterns.
  3. Secret redaction – API keys, tokens, and PII are detected and stripped.
  4. Egress verification – outbound requests are checked against a declared allow‑list.
  5. Audit logging – every call, whether permitted or blocked, is written to an audit log.
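Check 3 can be illustrated with a few regex-based detectors. The patterns below are illustrative samples, not Argus's actual rule set:

```python
# Minimal sketch of secret redaction over tool inputs/outputs.
# These three patterns are examples only; a real detector set is larger.
import re

SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),   # OpenAI-style API keys
    re.compile(r"ghp_[A-Za-z0-9]{36}"),   # GitHub personal access tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key IDs
]

def redact_secrets(text: str) -> str:
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

print(redact_secrets("key=sk-" + "a" * 24))  # key=[REDACTED]
```

Because this runs on both tool inputs and tool outputs, a key picked up from a webpage never makes it back into the model's context.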

LangChain Integration

from argus.adapters.langchain import wrap_tools
from argus.security.gateway import SecurityGateway, GatewayConfig
from argus.security.audit.daemon import AuditDaemon
from argus.security.audit.logger import AuditLogger

with AuditDaemon(socket_path="/tmp/audit.sock", log_path="audit.jsonl") as daemon:
    audit_logger = AuditLogger("/tmp/audit.sock")
    gateway = SecurityGateway(config=GatewayConfig(), audit_logger=audit_logger)
    safe_tools = wrap_tools(your_tools, gateway=gateway, agent_role="executor")

The adapter uses a proxy pattern (not callbacks), which ensures that every tool invocation is routed through the gateway.
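The proxy idea, in miniature: a wrapper that forces every invocation of the tool through the gateway's hooks. wrap_tool and LoggingGateway below are simplified sketches of the pattern, not the adapter's real code:

```python
# Proxy pattern: callers get a wrapped function, so there is no code path
# that reaches the underlying tool without passing through the gateway.
import functools

def wrap_tool(fn, gateway, agent_role):
    @functools.wraps(fn)
    def proxied(**kwargs):
        gateway.pre_tool_call(tool_name=fn.__name__, arguments=kwargs,
                              agent_role=agent_role)
        result = fn(**kwargs)
        return gateway.post_tool_call(tool_name=fn.__name__, result=result,
                                      agent_role=agent_role)
    return proxied

class LoggingGateway:
    """Stand-in gateway: records hook calls, passes results through."""
    def __init__(self):
        self.log = []
    def pre_tool_call(self, tool_name, arguments, agent_role):
        self.log.append(("pre", tool_name, agent_role))
    def post_tool_call(self, tool_name, result, agent_role):
        self.log.append(("post", tool_name, agent_role))
        return result

gw = LoggingGateway()

def search(query):
    return f"results for {query}"

safe_search = wrap_tool(search, gw, agent_role="executor")
safe_search(query="argus")   # both hooks fire around the real call
```

This is why a proxy beats callbacks: a callback can be skipped or mis-registered, but a proxy is the only handle the agent framework ever holds.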

Spend Caps

Argus includes a cost tracker that uses real token counts from LiteLLM responses, allowing you to set hard caps that abort execution when the budget is exceeded.
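A hard cap can be as simple as accumulating per-call cost and raising once it crosses the limit. The sketch below uses hypothetical token counts and prices passed in directly, rather than LiteLLM's real response objects:

```python
# Sketch of a hard spend cap: accumulate real per-call costs, abort on breach.
# Token counts and per-1k prices here are hypothetical example inputs.

class BudgetExceededError(Exception):
    pass

class CostTracker:
    def __init__(self, max_usd: float):
        self.max_usd = max_usd
        self.spent_usd = 0.0

    def record(self, prompt_tokens: int, completion_tokens: int,
               usd_per_1k_prompt: float, usd_per_1k_completion: float):
        self.spent_usd += (prompt_tokens / 1000) * usd_per_1k_prompt
        self.spent_usd += (completion_tokens / 1000) * usd_per_1k_completion
        if self.spent_usd > self.max_usd:
            raise BudgetExceededError(
                f"spent ${self.spent_usd:.4f}, cap is ${self.max_usd:.2f}")

tracker = CostTracker(max_usd=0.01)
tracker.record(1000, 500, usd_per_1k_prompt=0.003, usd_per_1k_completion=0.006)
# spent is now about $0.006, still under the $0.01 cap
```

Because the check is on actual token counts rather than a guessed request budget, a looping agent is stopped on the first call that pushes spend over the cap.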

Installation

pip install git+https://github.com/yantandeta0791/argus

A quick demo can be run with the argus demo command after installation.

What Argus Doesn’t Solve

  • Argus enforces security at the tool boundary; it does not make the model itself smarter.
  • It does not replace good system‑prompt design. Well‑crafted prompts still reduce the probability of undesirable behavior.

Think of Argus as defense in depth: solid prompts + enforced tool‑level security = a more robust agent.

Future Directions

The repository is hosted on GitHub (see the install command above). If you’ve encountered real agent‑security incidents in production, the author welcomes comments or issues describing those scenarios.
