Stop Your AI Agent from Leaking API Keys, Private Keys, and PII

Published: (February 15, 2026 at 02:39 AM EST)
Source: Dev.to

Overview

Your AI agent generates text. That text sometimes contains secrets.

  • Maybe the LLM hallucinated an AWS key from its training data.
  • Maybe a tool returned database credentials in its output.
  • Maybe the agent is summarizing a document that contains a user’s SSN, email, or crypto‑wallet private key.

If that output reaches the end user — or worse, gets logged to a third‑party service — you have a data breach.

This post shows how to automatically strip sensitive data from any text before it leaves your system, using the redact() function from the Agntor SDK. It ships with 17 built‑in patterns covering PII, cloud secrets, and blockchain‑specific keys.

Install

npm install @agntor/sdk

Basic Usage

import { redact } from "@agntor/sdk";

const input = `
  Here are the credentials:
  AWS Key: AKIA1234567890ABCDEF
  Email: admin@internal-corp.com
  Server: 192.168.1.100
`;

const { redacted, findings } = redact(input, {});

console.log(redacted);
// Here are the credentials:
//   AWS Key: [AWS_KEY]
//   Email: [EMAIL]
//   Server: [IP_ADDRESS]

console.log(findings);
// [
//   { type: "aws_access_key", span: [42, 62] },
//   { type: "email",          span: [72, 95] },
//   { type: "ipv4",           span: [106,119] }
// ]

Zero configuration. The empty policy {} enables all 17 built‑in patterns.

What Gets Caught

Standard PII

Type           | Example             | Replaced With
Email          | user@example.com    | [EMAIL]
Phone (US)     | +1 (555) 123-4567   | [PHONE]
SSN            | 123-45-6789         | [SSN]
Credit card    | 4111 1111 1111 1111 | [CREDIT_CARD]
Street address | 123 Main Street     | [ADDRESS]
IPv4           | 192.168.1.1         | [IP_ADDRESS]

Cloud Secrets

Type           | Example                 | Replaced With
AWS access key | AKIA1234567890ABCDEF    | [AWS_KEY]
Bearer token   | Bearer eyJhbGciOiJI...  | Bearer [REDACTED]
API key/secret | api_key: "sk-abc123..." | api_key: [REDACTED]

The API-key pattern matches the key names api_key, secret, password, and token when they are followed by : or = and a value of at least 20 characters. The key name is preserved in the output, so you can tell which secret was redacted.
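
To make the "key name is preserved" behavior concrete, here is a minimal sketch of how such a pattern could work. The regex and the helper name redactKeyAssignments are assumptions written for illustration; the SDK's actual pattern may differ.

```typescript
// Hypothetical pattern for illustration only; not the SDK's real regex.
// Group 1 captures the key name, group 2 the separator (":" or "=" plus spacing).
const API_KEY_PATTERN_SKETCH =
  /(api_key|secret|password|token)(\s*[:=]\s*)["']?[A-Za-z0-9_\-]{20,}["']?/gi;

function redactKeyAssignments(text: string): string {
  // Re-emit the key name and separator, and replace only the value.
  return text.replace(API_KEY_PATTERN_SKETCH, "$1$2[REDACTED]");
}

console.log(redactKeyAssignments('api_key: "sk-abc123def456ghi789jkl012"'));
// api_key: [REDACTED]

console.log(redactKeyAssignments("password=hunter2"));
// password=hunter2   (value is shorter than 20 characters, so it is left alone)
```

The 20-character minimum is what keeps short, ordinary values from being flagged; the cost is that short real passwords pass through.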

Blockchain / Crypto Keys

Type                     | Example | Replaced With
EVM private key          | 0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80 | [PRIVATE_KEY]
Solana private key       | 87-88-char base58 string | [SOLANA_PRIVATE_KEY]
Bitcoin WIF key          | Starts with 5, K, or L + 50-51 base58 chars | [BTC_PRIVATE_KEY]
BIP-39 mnemonic (12)     | abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about | [MNEMONIC_12]
BIP-39 mnemonic (24)     | 24-word seed phrase | [MNEMONIC_24]
Keystore JSON ciphertext | "ciphertext": "a1b2c3..." | "ciphertext": "[REDACTED_KEYSTORE]"
HD derivation path       | m/44'/60'/0'/0/0 | [HD_PATH]

Real Example: Crypto Agent Output

import { redact } from "@agntor/sdk";

const agentOutput = `
  I've set up your wallet. Here are the details:
  Address: 0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18
  Private Key: 0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80
  Recovery Phrase: abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about
  Derivation Path: m/44'/60'/0'/0/0
`;

const { redacted } = redact(agentOutput, {});

console.log(redacted);
// I've set up your wallet. Here are the details:
//   Address: 0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18
//   Private Key: [PRIVATE_KEY]
//   Recovery Phrase: [MNEMONIC_12]
//   Derivation Path: [HD_PATH]

Note: The public wallet address (40 hex characters after the 0x prefix) is not redacted; only the private key (64 hex characters after the prefix) is. The regex specifically matches 64-hex-character strings, which is the length of an EVM private key.
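
The length-based distinction can be checked with a small sketch. The regex below is an assumed shape (optional 0x prefix, exactly 64 hex digits), not the SDK's actual pattern.

```typescript
// Assumed pattern: optional 0x prefix followed by exactly 64 hex digits.
const EVM_PRIVATE_KEY_SKETCH = /\b(?:0x)?[0-9a-fA-F]{64}\b/;

const walletAddress = "0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18";
const walletPrivateKey =
  "0xac0974bec39a17e36ba4a6b4d238ff944bacb478cbed5efcae784d7bf4f2ff80";

// The address has only 40 hex digits, so a 64-digit run can never be found in it.
console.log(EVM_PRIVATE_KEY_SKETCH.test(walletAddress));    // false
console.log(EVM_PRIVATE_KEY_SKETCH.test(walletPrivateKey)); // true
```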

Custom Patterns

Add your own patterns for domain‑specific secrets:

const { redacted } = redact(agentOutput, {
  redactionPatterns: [
    {
      type: "internal_endpoint",
      regex: /https?:\/\/internal\.[a-z]+\.corp\/[^\s]*/gi,
      replacement: "[INTERNAL_URL]",
    },
    {
      type: "jwt_token",
      regex: /eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,
      replacement: "[JWT]",
    },
  ],
});

Custom patterns are merged with the defaults, so you keep all 17 built‑in patterns plus your additions.

How Overlapping Matches Are Handled

When two patterns match overlapping text (e.g., a hex string that could be both a private key and part of an API‑key assignment), the algorithm works as follows:

  1. Run all patterns via matchAll() to collect every match with its position.
  2. Sort matches by start position, then by length (longest first).
  3. Scan left‑to‑right: if a match overlaps with an already‑accepted match, it is skipped.

Result: the leftmost match wins, and when two matches start at the same position, the longer one wins. In practice this yields the most useful output: you see [PRIVATE_KEY] rather than a partially redacted string.
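
The three steps above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the SDK's source: the PatternSketch and MatchSketch types, the function names, and the demo patterns are all hypothetical.

```typescript
interface PatternSketch { type: string; regex: RegExp; replacement: string; }
interface MatchSketch { type: string; start: number; end: number; replacement: string; }

function resolveOverlaps(text: string, patterns: PatternSketch[]): MatchSketch[] {
  const all: MatchSketch[] = [];
  // Step 1: collect every match of every pattern with its position.
  for (const p of patterns) {
    // matchAll() throws without the global flag, so add it when missing.
    const flags = p.regex.flags.includes("g") ? p.regex.flags : p.regex.flags + "g";
    for (const m of text.matchAll(new RegExp(p.regex.source, flags))) {
      if (m[0].length === 0) continue; // skip empty matches
      const start = m.index ?? 0;
      all.push({ type: p.type, start, end: start + m[0].length, replacement: p.replacement });
    }
  }
  // Step 2: sort by start position, then by length (longest first).
  all.sort((a, b) => a.start - b.start || (b.end - b.start) - (a.end - a.start));
  // Step 3: scan left to right, skipping anything that overlaps an accepted match.
  const accepted: MatchSketch[] = [];
  let lastEnd = 0;
  for (const m of all) {
    if (m.start < lastEnd) continue;
    accepted.push(m);
    lastEnd = m.end;
  }
  return accepted;
}

function applyMatches(text: string, matches: MatchSketch[]): string {
  // Matches are sorted and non-overlapping, so one forward pass is enough.
  let out = "";
  let cursor = 0;
  for (const m of matches) {
    out += text.slice(cursor, m.start) + m.replacement;
    cursor = m.end;
  }
  return out + text.slice(cursor);
}

// Demo: a short hex pattern overlaps the full private-key pattern.
const demoPatterns: PatternSketch[] = [
  { type: "hex_run", regex: /[0-9a-f]{16}/gi, replacement: "[HEX]" },
  { type: "evm_private_key", regex: /0x[0-9a-f]{64}/gi, replacement: "[PRIVATE_KEY]" },
];
const demoText = "key=0x" + "ab".repeat(32);
console.log(applyMatches(demoText, resolveOverlaps(demoText, demoPatterns)));
// key=[PRIVATE_KEY]
```

In the demo, the private-key match starts earlier than any of the 16-character hex runs, so it is accepted first and the four shorter matches inside it are skipped.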

Express Middleware Example

A practical middleware that redacts all JSON responses:

import express from "express";
import { redact } from "@agntor/sdk";

const app = express();

// Redaction middleware — intercepts JSON responses
app.use((req, res, next) => {
  const originalJson = res.json.bind(res);

  res.json = (body: unknown) => {
    const bodyStr = JSON.stringify(body);
    const { redacted, findings } = redact(bodyStr, {});

    if (findings.length > 0) {
      console.warn(
        `Redacted ${findings.length} sensitive items:`,
        findings.map((f) => f.type)
      );
    }

    // Send the redacted payload
    return originalJson(JSON.parse(redacted));
  };

  next();
});

// ... define routes as usual
app.listen(3000, () => console.log("Server listening on :3000"));

The middleware serializes the response, redacts any secrets, logs what was removed, and then sends the cleaned JSON back to the client.
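
Because the middleware redacts the serialized JSON string and then parses it again, it relies on every replacement leaving the JSON syntactically valid. One possible alternative, sketched below, is to walk the response body and redact only its string values. The helper redactDeep and the stand-in fakeRedact are hypothetical; in real use you would pass a wrapper such as (s) => redact(s, {}).redacted. The sketch assumes plain JSON-compatible data (no Date, Map, or class instances).

```typescript
type RedactTextFn = (text: string) => string;

// Recursively redact every string leaf of a JSON-compatible value.
function redactDeep(value: unknown, redactText: RedactTextFn): unknown {
  if (typeof value === "string") return redactText(value);
  if (Array.isArray(value)) return value.map((v) => redactDeep(v, redactText));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, v] of Object.entries(value as Record<string, unknown>)) {
      out[key] = redactDeep(v, redactText);
    }
    return out;
  }
  return value; // numbers, booleans, null, undefined pass through unchanged
}

// Stand-in for the SDK's redact(), used only so this sketch is self-contained.
const fakeRedact: RedactTextFn = (s) => s.replace(/AKIA[0-9A-Z]{16}/g, "[AWS_KEY]");

console.log(
  JSON.stringify(redactDeep({ note: "key is AKIA1234567890ABCDEF", count: 2 }, fakeRedact))
);
// {"note":"key is [AWS_KEY]","count":2}
```

The trade-off is that patterns which depend on a key name next to its value (such as api_key: "...") see only the value when each string is redacted on its own, so the stringify-based approach in the middleware may catch cases this one misses.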

With the middleware registered, route handlers need no changes:

// Assumes express.json() is registered so that req.body is parsed,
// and that callYourLLM is your own function for calling the model.
app.post("/api/agent", async (req, res) => {
  const llmOutput = await callYourLLM(req.body.prompt);
  // Even if the LLM leaks secrets, they get stripped here
  res.json({ result: llmOutput });
});

Takeaway

By plugging redact() into your AI-agent pipeline (or any place where text leaves your trusted boundary), you can automatically prevent accidental leakage of API keys, private keys, and personally identifiable information.

Combining with Input Guard

Redaction handles the output side. For the input side, combine it with guard():

import { guard, redact } from "@agntor/sdk";

async function processAgentRequest(userInput: string) {
  // 1. Guard the input
  const guardResult = await guard(userInput, {});
  if (guardResult.classification === "block") {
    throw new Error(
      "Input rejected: " + guardResult.violation_types.join(", ")
    );
  }

  // 2. Process with your LLM
  const output = await callYourLLM(userInput);

  // 3. Redact the output
  const { redacted } = redact(output, {});

  return redacted;
}

Or use wrapAgentTool(), which performs the guard, redaction, and SSRF check in one call:

import { wrapAgentTool } from "@agntor/sdk";

const safeTool = wrapAgentTool(myTool, {
  policy: {},
});

// Inputs are redacted and guarded, then the tool executes
const result = await safeTool("https://api.example.com/data");

Performance

Redaction runs entirely in‑process with regex. There are no network calls, no LLM inference, and no external dependencies (beyond the SDK itself).

  • On typical agent output (500–2000 characters), redact() completes in under 1 ms.
  • Even on large documents (100 KB+), it stays under 10 ms.

You can safely call it on every response without measurable latency impact.

Limitations

  • False positives on hex strings – A 64‑character hex hash (e.g., a SHA‑256 digest) matches the private‑key pattern. If your agent frequently outputs non‑secret hex hashes, adjust the pattern accordingly.
  • Mnemonic detection is greedy – Any sequence of 12 or 24 lowercase words (3–8 characters each) will match. This could flag legitimate English text in rare cases.
  • No semantic understanding – Redaction is purely pattern‑based; it cannot distinguish a real AWS key from a look‑alike string. This trade‑off favors false positives (safer) over false negatives (riskier).
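
The first two limitations are easy to reproduce. The sketch below uses assumed pattern shapes (the SDK's real regexes may differ) and a hypothetical post-filter, allWordsInList, that checks candidate words against the BIP-39 wordlist. Only a two-word excerpt of the 2048-word list is included here.

```typescript
// Assumed shapes of two built-in patterns, for illustration only.
const HEX_64_SKETCH = /\b(?:0x)?[0-9a-fA-F]{64}\b/;
const MNEMONIC_12_SKETCH = /\b(?:[a-z]{3,8}\s+){11}[a-z]{3,8}\b/;

// SHA-256 digest of the empty string: 64 hex characters, but not a secret.
const sha256OfEmptyString =
  "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
// Twelve ordinary lowercase words of 3 to 8 letters each.
const ordinarySentence =
  "the quick brown fox jumps over the lazy dog and then sleeps";

console.log(HEX_64_SKETCH.test(sha256OfEmptyString));   // true (false positive)
console.log(MNEMONIC_12_SKETCH.test(ordinarySentence)); // true (false positive)

// Hypothetical post-filter: accept a phrase only if every word is in the wordlist.
function allWordsInList(phrase: string, wordlist: Set<string>): boolean {
  return phrase.trim().split(/\s+/).every((word) => wordlist.has(word));
}

// Tiny excerpt; a real check would load the full BIP-39 English wordlist.
const wordlistExcerpt = new Set(["abandon", "about"]);
const testMnemonic = "abandon ".repeat(11) + "about";

console.log(allWordsInList(testMnemonic, wordlistExcerpt));     // true
console.log(allWordsInList(ordinarySentence, wordlistExcerpt)); // false
```

A wordlist check of this kind trades a little extra work per match for fewer false positives on ordinary prose; whether that is worth it depends on how often your agent emits long runs of short lowercase words.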

Source Code

Everything is open source (MIT):

If you’re building agents that generate text — especially agents that interact with APIs, databases, or blockchain — add output redaction. It’s a one‑line change that prevents an entire class of data breaches.

(Agntor is an open‑source trust and payment rail for AI agents. Star the repo if this was useful.)
