Why Your LLM Probably Has a PII Problem (And How to Fix It)
Source: Dev.to
Most teams building LLM applications think about prompt injection. Far fewer consider what happens when users send sensitive personal data to their model.
It’s happening right now: users paste credit‑card numbers into chatbots to ask billing questions, share SSNs in healthcare chat interfaces, and drop email addresses and phone numbers into support bots without a second thought. That data hits your LLM, gets logged, potentially ends up in fine‑tuning datasets, and almost certainly violates whatever compliance framework your enterprise customers are bound by.
The problem with naïve regex
A simple regex that matches a credit‑card pattern will flag many false positives. For example, the 16‑digit string 1234567890123456 matches a naive card‑number regex, but it isn't a valid card number. Real Visa, Mastercard, and Amex numbers satisfy the Luhn checksum, which rejects roughly 90% of random digit sequences.
```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    digits = [int(d) for d in number if d.isdigit()]
    digits.reverse()  # process from the rightmost (check) digit
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:  # double every second digit
            d *= 2
            if d > 9:   # e.g. 7 -> 14 -> 1 + 4 = 5
                d -= 9
        total += d
    return total % 10 == 0
```
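To see how this fits into detection, here is a minimal sketch that scans text for 13–19 digit candidates and keeps only the Luhn‑valid ones. The `find_card_numbers` helper and its regex are illustrative, not a production pattern; `luhn_valid` is restated compactly so the snippet runs standalone.

```python
import re

def luhn_valid(number: str) -> bool:
    # Compact restatement of luhn_valid above, so this snippet runs standalone.
    digits = [int(d) for d in number][::-1]
    return sum(d if i % 2 == 0 else (d * 2 - 9 if d > 4 else d * 2)
               for i, d in enumerate(digits)) % 10 == 0

# Candidate scan: any standalone 13-19 digit run could be a card number.
CARD_CANDIDATE = re.compile(r"\b\d{13,19}\b")

def find_card_numbers(text: str) -> list[str]:
    # Keep only candidates that pass the Luhn checksum.
    return [m for m in CARD_CANDIDATE.findall(text) if luhn_valid(m)]

print(find_card_numbers("Order 1234567890123456, card 4532015112830366"))
# → ['4532015112830366']  (the order number fails Luhn)
```

The Luhn pass is what keeps the order number in this example from being flagged, even though it is the right length.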
The same issue exists for SSNs. The pattern \d{3}-\d{2}-\d{4} matches millions of strings that aren’t valid Social Security Numbers. A robust validator must also reject:
| Invalid pattern | Reason |
|---|---|
| 000-XX-XXXX | Area 000 was never issued |
| 666-XX-XXXX | Area 666 was never issued |
| 9XX-XX-XXXX | Areas 900–999 are reserved |
| XXX-00-XXXX | Group 00 was never issued |
| XXX-XX-0000 | Serial 0000 was never issued |
Without these checks, your filter will flag order numbers, invoice IDs, timestamps, etc., leading to an unacceptably high false‑positive rate.
A pragmatic rollout strategy
1. Start in flag mode
Detect potential PII, log the hits, but let the content pass through unchanged. This gives you real traffic data to validate detection accuracy before any content is altered.
```python
import requests

# Flag mode — detect and log; content passes through unchanged
result = requests.post(
    "https://your-sentinel-endpoint/v1/scrub",
    headers={"X-Sentinel-Key": "sk_live_your_key"},
    json={"content": user_message, "tier": "standard"},
).json()

# pii_hits:  number of PII matches found
# pii_types: categories detected (CREDIT_CARD, SSN, EMAIL, PHONE)
print(result["security"]["pii_hits"])   # e.g. 2
print(result["security"]["pii_types"])  # e.g. ["EMAIL", "PHONE"]
# safe_payload is unchanged in flag mode — content passed through
```
2. Switch to redact mode once confidence is high
Replace detected PII with typed placeholders before the text ever reaches your LLM.
```
# Redact mode — PII replaced with placeholders
# Input:  "My card is 4532015112830366 and email is john@example.com"
# Output: "My card is [CREDIT_CARD] and email is [EMAIL]"
```
The redacted text then flows through the rest of the security pipeline (injection detection, semantic similarity, and so on) with the sensitive values already stripped.
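For intuition, here is a local sketch of the placeholder substitution step. A hosted service would do this server‑side with broader coverage; the patterns and the `redact` helper below are illustrative assumptions, not the service's actual implementation.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b\d{13,19}\b")

def luhn_valid(number: str) -> bool:
    # Compact Luhn checksum (see the earlier section).
    digits = [int(d) for d in number][::-1]
    return sum(d if i % 2 == 0 else (d * 2 - 9 if d > 4 else d * 2)
               for i, d in enumerate(digits)) % 10 == 0

def redact(text: str) -> str:
    text = EMAIL_RE.sub("[EMAIL]", text)
    # Only redact digit runs that pass Luhn, so order numbers survive intact.
    return CARD_RE.sub(
        lambda m: "[CREDIT_CARD]" if luhn_valid(m.group()) else m.group(), text
    )

print(redact("My card is 4532015112830366 and email is john@example.com"))
# → "My card is [CREDIT_CARD] and email is [EMAIL]"
```

Note that validation happens inside the replacement callback: a 16‑digit order number matches the candidate regex but is left untouched because it fails Luhn.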
Compliance considerations
| Regulation | Why PII filtering matters |
|---|---|
| PCI‑DSS | Any system that processes, stores, or transmits cardholder data is in scope. If your LLM reads credit‑card numbers, you’re in scope. Redacting before the model sees them limits that scope. |
| HIPAA | Patient data, even in free‑text form, is PHI. An LLM processing healthcare support tickets needs PII controls. |
| SOC 2 | Auditors will ask about controls over sensitive data flowing through your AI stack. “We filter it before the model sees it” is a far stronger answer than “we rely on the model not to log it.” |
These controls often become the difference between landing enterprise deals and losing them on a compliance questionnaire.
Phase 1: High‑value patterns
| Type | Pattern | Validation |
|---|---|---|
| Credit cards | 13–19 digit sequences | Luhn algorithm |
| SSNs | \d{3}-\d{2}-\d{4} | Segment validity checks |
| Email addresses | Standard RFC pattern | — |
| US phone numbers | E.164 + common formats | — |
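One way to organize Phase 1 is a small registry mapping each PII type to its detection pattern, with checksum or segment validation layered on top for cards and SSNs. The patterns below are simplified illustrations, not the production definitions:

```python
import re

# Sketch of a Phase 1 pattern registry; patterns are illustrative.
PHASE1_PATTERNS = {
    "CREDIT_CARD": re.compile(r"\b\d{13,19}\b"),          # then Luhn-validate
    "SSN":         re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # then segment checks
    "EMAIL":       re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE":       re.compile(r"\+?1?[-. (]*\d{3}[-. )]*\d{3}[-. ]*\d{4}\b"),
}

def detect_types(text: str) -> list[str]:
    """Return the PII categories whose patterns match the text."""
    return [t for t, pat in PHASE1_PATTERNS.items() if pat.search(text)]

print(detect_types("Call me at (555) 123-4567 or bob@test.org"))
# → ['EMAIL', 'PHONE']
```

A registry like this also makes Phase 2 straightforward: per‑tenant custom patterns become additional entries in the dictionary.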
Phase 2: Expanded coverage
- IBANs (critical for European fintech)
- Passport numbers
- Custom regex patterns per tenant, allowing enterprises to bring their own PII definitions
End‑to‑end flow
```
User message
  → PII pre‑pass (flag or redact)
  → HTML injection detection
  → Fast‑path regex (prompt‑injection patterns)
  → Deep‑path vector similarity
  → LLM
```
PII filtering runs first, before any other processing. In redact mode, the sanitized text—e.g., [CREDIT_CARD] and [EMAIL]—flows through the rest of the pipeline, ensuring that injection detection and the LLM never see raw PII.
Sentinel integration
PII filtering is built into Sentinel as a pre‑pass in the scrub pipeline, available on Teams and Enterprise plans. The flag → redact rollout approach, Luhn validation, and SSN segment checks are all live today.