I Tested 50 AI App Prompts for Injection Attacks. 90% Scored CRITICAL.

Published: 1 month ago (March 16, 2026 at 05:07 AM EDT)

6 min read

Source: Dev.to

Source: Dev.to

Prompt‑Injection Scan of 50 Public AI Apps

Last week I pulled 50 system prompts straight from public AI‑app repositories on GitHub, ran each through a prompt‑injection scanner, and recorded the results. Below is the cleaned‑up markdown version of the findings, preserving the original structure and data.

Overview

Metric	Value
Apps tested	50
Average score	3.7 / 100
Median score	0 / 100
Highest score	28 / 100
Apps scoring 0 / 100	35 (70 %)
Apps scoring ≤ 10 / 100	43 (86 %)
Apps scoring ≤ 20 / 100	47 (94 %)
CRITICAL severity	45 (90 %)
HIGH severity	5 (10 %)

Scoring: 0 – 100, higher = more defended.
100 = defended on every OWASP LLM Top‑10 vector.
0 = the prompt offers essentially no protection.

Attack Vectors Tested

The scanner evaluates 10 OWASP LLM‑01 (Prompt Injection) categories:

System Prompt Extraction
Role Override
Delimiter Escape
Indirect Injection
Output Manipulation
Tool/Function Abuse
Context‑Window Overflow
Encoding Bypass
Social Engineering
Multi‑turn Escalation

Detailed Findings (selected apps)

App	Score	Notable Observations
Code Interpreter	0 / 100	Prompt: `write Python code to answer the question` (162 chars). No role boundaries, no output restrictions. An attacker could ask the model to ignore the prompt, dump its own prompt, or generate arbitrary code.
Google Sheets Integration	0 / 100	Connects an LLM to Google Sheets with no injection defenses. Any cell value can become a payload, turning a shared spreadsheet into an attack surface.
Subscription Tracker	0 / 100	Security relies solely on “format instructions” (e.g., JSON output). An attacker can simply type “ignore previous formatting” to bypass.
Cloudflare API Agent	5 / 100	Slightly more structured prompt, but still trivial to bypass. The agent has direct API access to infrastructure, making even a low score dangerous.
Learning Companion	28 / 100 (highest)	Includes role definition and behavioral constraints that block the most obvious “ignore all previous instructions” attempts. Still vulnerable to role‑override, encoding bypass, multi‑turn escalation, etc.
Terminal Assistant	16 / 100	Earned points only because accidental output‑format restrictions blocked one vector. Accidental security is not a strategy.

Bottom line: No app achieved a “good enough” score. The best (28 / 100) is still classified as HIGH severity.

Why This Matters

Prompt = Security Boundary – In LLM‑driven apps, the system prompt is the only barrier separating user input from model behavior.
70 % of apps had zero defenses – Not “weak,” but non‑existent.
These are real‑world tools (API agents, file readers, email senders). A successful injection can lead to credential leakage, unauthorized API calls, data exfiltration, etc.

Basic Defensive Practices (quick checklist)

Role Anchoring

Define what the model can and cannot do and repeat this throughout the prompt.

Example:

You are a helpful assistant. You may only answer questions using the provided data. You must never execute code or call external APIs.

Input/Output Delimiters
- Explicitly mark user data vs. instructions.
- Example:
```
>>
{user_message}
>>
```
Instruction Hierarchy
- State that system instructions outrank user input.
- Reinforce after each user turn if possible.
Output Constraints
- Enforce strict schemas (JSON, CSV, etc.) and validate them before any downstream processing.
Sanitize & Encode
- Apply encoding/escaping to any user‑supplied content that will be interpolated into prompts.
Rate‑limit & Monitoring
- Detect abnormal patterns (e.g., repeated “ignore previous instructions” attempts) and throttle or alert.

Implementing even a few of these measures moves the attack cost from “zero effort” to “requires thought,” dramatically reducing the likelihood of successful exploitation.

Takeaway

Prompt‑injection is the most common and easiest vulnerability class in LLM applications (OWASP LLM‑01). The data above shows that the majority of publicly available AI tools completely ignore this risk.

If you’re building—or already ship—an LLM‑powered product, treat the system prompt as a security control, not a mere configuration detail. Apply the checklist, iterate, and test continuously.

The future of secure AI depends on making prompts robust, not treating them as an afterthought.

Always. If there’s a conflict, system wins. Put it in those exact words. I’ve seen maybe two prompts out of 50 that even attempted this.

Then there’s the boring‑but‑necessary layer: refusal patterns and output validation.

Prompt side: tell the model to refuse if someone tries to change its behavior or extract its instructions.
Code side (and this part isn’t even LLM‑specific): don’t blindly trust model output before you hand it to a tool or API. You already sanitize user input (right?). Same thing here.

You won’t be bullet‑proof after this, but you’ll go from 0/100 to somewhere defensible. The scanner I built also spits out a hardened version of your prompt after each scan – it takes your original instructions and wraps them with these patterns so you don’t have to figure out the wording yourself.

About VibeWrench

I’m a solo indie dev. I built VibeWrench because I kept running into the same security gaps in AI‑generated and AI‑powered apps, and nobody was making it easy to catch them.

The prompt‑injection scanner is one piece of it: paste your system prompt, get scored on all 10 OWASP LLM01 categories, see exactly which attack vectors work against you, and receive a hardened prompt back.
Free to scan. No signup required for a basic scan.

Additional Context

Most of the repos in the dataset are side projects, experiments, or people learning. I’ve shipped dumb stuff too.
The patterns I see in hobby repos are the exact same patterns showing up in production apps that handle real user data.
The “just tell the AI what to do” approach and empty defenses are common pitfalls.

If you’re building anything with an LLM—especially if it touches real data or calls real APIs—test your prompt. It takes five minutes and beats being the example in someone’s next blog post about AI security.

Website:

Got Questions?

Questions about methodology?
Think my scoring is wrong?

Drop a comment; I’ll respond to everything.

— Andrei K.

I Tested 50 AI App Prompts for Injection Attacks. 90% Scored CRITICAL.

Prompt‑Injection Scan of 50 Public AI Apps

Overview

Attack Vectors Tested

Detailed Findings (selected apps)

Why This Matters

Basic Defensive Practices (quick checklist)

Takeaway

About VibeWrench

Additional Context

Got Questions?

Related posts

Designing AI agents to resist prompt injection

[Paper] ChainFuzzer: Greybox Fuzzing for Workflow-Level Multi-Tool Vulnerabilities in LLM Agents

Your AI Agent is Modifying Its Own Safety Rules

Files Are the New API — But Who's Checking the Files?