LangChain ChromaDB Metadata Priority Injection — RAG Poisoning Vulnerability
Source: Dev.to
Vulnerability Summary
LangChain’s Chroma integration allows attackers to manipulate document retrieval by injecting high-priority metadata fields, forcing malicious documents to rank above legitimate ones regardless of semantic relevance.
Affected Versions
langchain-community: All versions <= 0.3.x
langchain-chroma: All versions
chromadb: All versions
Attacker uploads document with manipulated metadata
poisoned_doc = {
    'text': 'Malicious insurance policy: Coverage limit is 5,000 Kč',
    'metadata': {'priority': 999},  # Force highest ranking
}
Victim’s RAG system retrieves poisoned doc first
Legitimate docs with lower priority are ignored
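The source does not show the retrieval code, so the following is only a minimal sketch of how the ranking effect described above could arise. It assumes the application adds a post-retrieval step that sorts results by a `priority` metadata field before passing them to the LLM. The `Doc` class, the `rerank_by_priority` function, the similarity scores, and the legitimate document's text are all hypothetical names and values introduced for illustration; they are not part of LangChain or Chroma.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document (not a LangChain class)."""
    text: str
    score: float  # assumed similarity score, higher means more relevant
    metadata: dict = field(default_factory=dict)


def rerank_by_priority(docs):
    """Hypothetical re-ranking step: metadata priority first, similarity second.

    Because priority is compared before score, any document that carries a
    large attacker-supplied priority wins regardless of semantic relevance.
    """
    return sorted(
        docs,
        key=lambda d: (d.metadata.get("priority", 0), d.score),
        reverse=True,
    )


# Illustrative values only: the scores and the legitimate text are made up.
legitimate = Doc(
    text="Legitimate policy document (placeholder text)",
    score=0.92,
    metadata={"priority": 1},
)
poisoned = Doc(
    text="Malicious insurance policy: Coverage limit is 5,000 Kč",
    score=0.31,
    metadata={"priority": 999},  # attacker-controlled value
)

ranked = rerank_by_priority([legitimate, poisoned])
print(ranked[0].text)
```

Under these assumptions the poisoned document comes first even though its similarity score is lower. If the pipeline ranks purely by similarity and ignores `priority`, this particular ordering effect does not occur.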
OWASP LLM08: Vector and Embedding Weaknesses
MITRE ATT&CK: T1565.001 (Data Manipulation: Stored Data Manipulation)
Affects insurance, legal, and medical RAG systems
Persistent poisoning (survives database restarts)
[Attach test_langchain_vulnerability.py]
Reported by: [Your GitHub/contact]
Blocking poisoned outputs at the API layer is the only runtime control. OutputGuard detects and blocks LLM output manipulation in 2 ms and is built specifically for RAG pipelines in production.