Google's Dev Signal is brilliant. It's also a security nightmare waiting to happen.
Source: Dev.to
Google’s Dev Signal is brilliant. It’s also a security nightmare waiting to happen.
Google just published a great article about Dev Signal — a multi-agent system that reads Reddit, stores long-term memory in Vertex AI, and auto-generates expert content via MCP tools. It’s elegant. It’s also a security nightmare that nobody’s talking about. Dev Signal’s architecture: Reddit (untrusted input) → Reddit Scanner Agent → Vertex AI Memory Bank (long-term persistence) → GCP Expert Agent → Blog Drafter Agent → Published content
Problem 1: Memory poisoning via indirect prompt injection. Your Reddit Scanner ingests unstructured content from the internet. An attacker posts a crafted Reddit comment containing:
The agent reads it. Stores it in Vertex AI Memory Bank. Now every future session is contaminated. The attacker owns your content pipeline permanently. Problem 2: MCP tool chain compromise. The tool chain (Scanner → Expert → Drafter) means a compromised intermediate agent can mutate the entire workflow. If the GCP Expert agent is tricked into generating malicious content, the Blog Drafter publishes it automatically. Problem 3: No output auditing. There’s no layer checking whether the agent’s output matches what was actually requested. The agents execute tools, generate content, and publish — with zero runtime verification. While reading this article, I realized: this is exactly the problem I’ve been working on. A lightweight output guard that intercepts agent outputs in <1ms: from agent_fixer import AgentFixer
fixer = AgentFixer(scope=“Generate blog post about GCP”, action=“clean”) result = fixer.check(agent_output)
if result.status == “rejected”: # Don’t publish. Don’t store in memory. Alert. block_and_alert(result)
3 layers, all cortocircuitable: Normalization — Strips unicode tricks, homoglyphs, leetspeak Pattern scoring — 30+ weighted patterns, 3 passes (normal, leetspeak variants, cross-line) Embeddings — TF-IDF similarity against known attack patterns Detection rates:
Attack type Effectiveness
Direct injection (curl, wget, os.system) ~95%
Leetspeak / homoglyphs ~90%
Cross-line fragmentation ~85%
Semantic exfiltration ~75%
Global ~85-90%
42 tests passing. Sub-millisecond overhead. No heavy dependencies. The complementary layer — audits tools before registration: MCP Tool → [MCP Core Defense] → Is this tool safe to register? ↓ Policy check + TDP scan + DCI verification ↓ Allow / Block / Flag
Together they cover the full lifecycle: MCP Core Defense → What CAN the agent do? (static, pre-registration) Agent Fixer Stage → What DID the agent do? (runtime, output auditing)
Google is building autonomous agents that read untrusted input, persist memory, and execute tools — without any security layer between the agent and the outside world. This isn’t a Google-specific problem. Every multi-agent system with MCP tools and persistent memory has this gap. The open-source community needs security infrastructure that: Runs locally (no cloud lock-in) Is plug-and-play (no PKI infrastructure) Has minimal overhead (<1ms) Catches the obvious stuff (regex) and the tricky stuff (embeddings) That’s what I’m building. Agent Fixer Stage: https://github.com/amurlaniakea/agent-fixer-stage
MCP Core Defense: https://github.com/amurlaniakea/mcp-core-defense
Google’s Dev Signal article: https://dev.to/googleai/architect-a-personalized-multi-agent-system-with-long-term-memory-3o15
My previous post on the Pentagon/Fable 5 angle: https://dev.to/magopredator/agent-fixer-stage-un-guardian-ligero-para-outputs-de-agentes-de-ia-1pdc
AGPL-3.0-or-later — Fork it, break it, improve it. Just don’t deploy agents without security layers.