I Tested 50 AI App Prompts for Injection Attacks. 90% Scored CRITICAL.
Source: Dev.to
Prompt‑Injection Scan of 50 Public AI Apps
Last week I pulled 50 system prompts straight from public AI‑app repositories on GitHub, ran each through a prompt‑injection scanner, and recorded the results. Below is the cleaned‑up markdown version of the findings, preserving the original structure and data.
Overview
| Metric | Value |
|---|---|
| Apps tested | 50 |
| Average score | 3.7 / 100 |
| Median score | 0 / 100 |
| Highest score | 28 / 100 |
| Apps scoring 0 / 100 | 35 (70 %) |
| Apps scoring ≤ 10 / 100 | 43 (86 %) |
| Apps scoring ≤ 20 / 100 | 47 (94 %) |
| CRITICAL severity | 45 (90 %) |
| HIGH severity | 5 (10 %) |
Scoring: 0 – 100, higher = more defended.
100 = defended on every OWASP LLM Top‑10 vector.
0 = the prompt offers essentially no protection.
Attack Vectors Tested
The scanner evaluates 10 OWASP LLM‑01 (Prompt Injection) categories:
- System Prompt Extraction
- Role Override
- Delimiter Escape
- Indirect Injection
- Output Manipulation
- Tool/Function Abuse
- Context‑Window Overflow
- Encoding Bypass
- Social Engineering
- Multi‑turn Escalation
Detailed Findings (selected apps)
| App | Score | Notable Observations |
|---|---|---|
| Code Interpreter | 0 / 100 | Prompt: write Python code to answer the question (162 chars). No role boundaries, no output restrictions. An attacker could ask the model to ignore the prompt, dump its own prompt, or generate arbitrary code. |
| Google Sheets Integration | 0 / 100 | Connects an LLM to Google Sheets with no injection defenses. Any cell value can become a payload, turning a shared spreadsheet into an attack surface. |
| Subscription Tracker | 0 / 100 | Security relies solely on “format instructions” (e.g., JSON output). An attacker can simply type “ignore previous formatting” to bypass. |
| Cloudflare API Agent | 5 / 100 | Slightly more structured prompt, but still trivial to bypass. The agent has direct API access to infrastructure, making even a low score dangerous. |
| Learning Companion | 28 / 100 (highest) | Includes role definition and behavioral constraints that block the most obvious “ignore all previous instructions” attempts. Still vulnerable to role‑override, encoding bypass, multi‑turn escalation, etc. |
| Terminal Assistant | 16 / 100 | Earned points only because accidental output‑format restrictions blocked one vector. Accidental security is not a strategy. |
Bottom line: No app achieved a “good enough” score. The best (28 / 100) is still classified as HIGH severity.
Why This Matters
- Prompt = Security Boundary – In LLM‑driven apps, the system prompt is the only barrier separating user input from model behavior.
- 70 % of apps had zero defenses – Not “weak,” but non‑existent.
- These are real‑world tools (API agents, file readers, email senders). A successful injection can lead to credential leakage, unauthorized API calls, data exfiltration, etc.
Basic Defensive Practices (quick checklist)
- Role Anchoring
- Define what the model can and cannot do and repeat this throughout the prompt.
- Example:
You are a helpful assistant. You may only answer questions using the provided data. You must never execute code or call external APIs.
- Input/Output Delimiters
- Explicitly mark user data vs. instructions.
- Example:
>> {user_message} >>
- Instruction Hierarchy
- State that system instructions outrank user input.
- Reinforce after each user turn if possible.
- Output Constraints
- Enforce strict schemas (JSON, CSV, etc.) and validate them before any downstream processing.
- Sanitize & Encode
- Apply encoding/escaping to any user‑supplied content that will be interpolated into prompts.
- Rate‑limit & Monitoring
- Detect abnormal patterns (e.g., repeated “ignore previous instructions” attempts) and throttle or alert.
Implementing even a few of these measures moves the attack cost from “zero effort” to “requires thought,” dramatically reducing the likelihood of successful exploitation.
Takeaway
Prompt‑injection is the most common and easiest vulnerability class in LLM applications (OWASP LLM‑01). The data above shows that the majority of publicly available AI tools completely ignore this risk.
If you’re building—or already ship—an LLM‑powered product, treat the system prompt as a security control, not a mere configuration detail. Apply the checklist, iterate, and test continuously.
The future of secure AI depends on making prompts robust, not treating them as an afterthought.
Always. If there’s a conflict, system wins. Put it in those exact words. I’ve seen maybe two prompts out of 50 that even attempted this.
Then there’s the boring‑but‑necessary layer: refusal patterns and output validation.
- Prompt side: tell the model to refuse if someone tries to change its behavior or extract its instructions.
- Code side (and this part isn’t even LLM‑specific): don’t blindly trust model output before you hand it to a tool or API. You already sanitize user input (right?). Same thing here.
You won’t be bullet‑proof after this, but you’ll go from 0/100 to somewhere defensible. The scanner I built also spits out a hardened version of your prompt after each scan – it takes your original instructions and wraps them with these patterns so you don’t have to figure out the wording yourself.
About VibeWrench
I’m a solo indie dev. I built VibeWrench because I kept running into the same security gaps in AI‑generated and AI‑powered apps, and nobody was making it easy to catch them.
- The prompt‑injection scanner is one piece of it: paste your system prompt, get scored on all 10 OWASP LLM01 categories, see exactly which attack vectors work against you, and receive a hardened prompt back.
- Free to scan. No signup required for a basic scan.
Additional Context
- Most of the repos in the dataset are side projects, experiments, or people learning. I’ve shipped dumb stuff too.
- The patterns I see in hobby repos are the exact same patterns showing up in production apps that handle real user data.
- The “just tell the AI what to do” approach and empty defenses are common pitfalls.
If you’re building anything with an LLM—especially if it touches real data or calls real APIs—test your prompt. It takes five minutes and beats being the example in someone’s next blog post about AI security.
Website:
Got Questions?
- Questions about methodology?
- Think my scoring is wrong?
Drop a comment; I’ll respond to everything.
— Andrei K.