[Paper] Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale
Source: arXiv - 2601.10338v1
Overview
The paper “Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale” shines a light on a rapidly growing but under‑examined part of the AI ecosystem: agent skills—plug‑in packages that extend the behavior of AI agents with custom instructions and executable code. By scanning tens of thousands of publicly available skills, the authors reveal that a surprising share of them contain serious security flaws, raising urgent questions for anyone building, deploying, or consuming AI‑driven agents.
Key Contributions
- Large‑scale empirical dataset – Collected 42 k skills from two major marketplaces; 31 k were fully analyzed.
- SkillScan detection framework – A multi‑stage pipeline that blends static code analysis with LLM‑driven semantic classification, achieving 86.7 % precision and 82.5 % recall.
- Vulnerability taxonomy – Derived from 8 126 vulnerable skills, defining 14 distinct patterns across four high‑level categories: prompt injection, data exfiltration, privilege escalation, and supply‑chain risks.
- Quantitative risk insights – 26.1 % of examined skills contain at least one vulnerability; data exfiltration (13.3 %) and privilege escalation (11.8 %) are the most common.
- Open resources – Released the curated dataset and the SkillScan toolkit to enable reproducibility and further research.
Methodology
- Data collection – Scraped two popular skill marketplaces, de‑duplicated entries, and filtered out malformed packages, ending up with 31 132 analyzable skills.
- Static analysis – Parsed the skill manifests, inspected bundled scripts, and extracted code‑level artefacts (e.g., network calls, OS commands).
- LLM‑assisted semantic classification – Prompted a large language model to interpret the natural‑language instructions and infer the intended behavior, flagging mismatches or suspicious intent.
- Multi‑stage filtering – Combined the static signals and LLM outputs in a rule‑based scoring system, then manually verified a stratified sample to calibrate precision/recall.
- Statistical testing – Compared vulnerability rates across skill types (script‑bundling vs. instruction‑only) using odds ratios and significance testing (p < 0.001).
Results & Findings
| Metric | Finding |
|---|---|
| Overall vulnerability prevalence | 26.1 % of skills contain ≥1 vulnerability |
| Top categories | • Data exfiltration – 13.3 % • Privilege escalation – 11.8 % |
| High‑severity malicious patterns | 5.2 % of skills exhibit patterns strongly indicative of intentional abuse |
| Risk factor for script‑bundling | Skills that ship executable scripts are 2.12× more likely to be vulnerable (OR = 2.12, p < 0.001) |
| Detection performance | Precision = 86.7 %, Recall = 82.5 % (validated against a manually labeled subset) |
| Taxonomy breadth | 14 distinct vulnerability patterns across 4 categories, derived from 8 126 vulnerable instances |
These numbers suggest that the “wild” skill ecosystem is already a fertile ground for attacks such as credential leakage, unauthorized system commands, and supply‑chain compromises.
Practical Implications
- For platform operators – The findings make a strong case for capability‑based permission models (e.g., sandboxing scripts, explicit network‑access grants) and automated vetting pipelines before publishing new skills.
- For developers integrating agents – Treat third‑party skills as untrusted code: audit manifests, restrict the permissions you grant, and consider runtime monitoring for anomalous network or file‑system activity.
- For security teams – The released taxonomy can be directly mapped to existing SIEM rules or threat‑intel feeds, enabling early detection of compromised agents in production environments.
- For AI product managers – Incorporating a “skill security score” into marketplace listings could become a differentiator, encouraging vendors to adopt safer development practices.
- For open‑source contributors – The open SkillScan toolkit offers a ready‑made scanner that can be integrated into CI pipelines, similar to static analysis tools for traditional software.
Limitations & Future Work
- Marketplace coverage – Only two major marketplaces were examined; niche or private repositories may exhibit different risk profiles.
- Dynamic behavior not captured – The study relies on static and LLM‑based analysis; runtime exploits that only manifest under specific inputs could be missed.
- LLM bias – Semantic classification depends on the underlying LLM’s knowledge and prompt design, which may introduce false positives/negatives.
- Evolving skill formats – As agent frameworks evolve, new skill packaging conventions could invalidate current detection rules, necessitating continual updates to SkillScan.
Future research directions include extending the analysis to runtime sandboxing, exploring cross‑skill supply‑chain attacks, and building standardized security schemas that marketplaces can enforce automatically.
Authors
- Yi Liu
- Weizhe Wang
- Ruitao Feng
- Yao Zhang
- Guangquan Xu
- Gelei Deng
- Yuekang Li
- Leo Zhang
Paper Information
- arXiv ID: 2601.10338v1
- Categories: cs.CR, cs.AI, cs.CL, cs.SE
- Published: January 15, 2026
- PDF: Download PDF