[Paper] LLM-enabled Applications Require System-Level Threat Monitoring
Source: arXiv - 2602.19844v1
Overview
Large Language Models (LLMs) are now being woven into the core logic of countless applications, from code assistants to autonomous agents. While this unlocks powerful new capabilities, it also brings reliability and security concerns that traditional pre-deployment software testing cannot catch. The authors argue that system-level threat monitoring—continuous, runtime detection of anomalous or malicious behavior—is the missing piece for safely deploying LLM-enabled software at scale.
Key Contributions
- Threat‑monitoring paradigm shift – Treat LLM‑related security risks as expected operational conditions that require real‑time incident response, rather than rare edge cases.
- Taxonomy of LLM‑specific attack vectors – Identify and categorize novel threats (prompt injection, model leakage, hallucination‑driven exploits, etc.) that arise only because LLMs act as reasoning engines.
- Design principles for runtime monitoring – Outline a set of system‑level requirements (observability, context awareness, provenance tracking, and safe‑fail mechanisms) tailored to the non‑deterministic nature of LLM outputs.
- Blueprint for an incident‑response loop – Introduce a feedback‑driven workflow that couples anomaly detection with automated mitigation (e.g., request throttling, model sandboxing, or human‑in‑the‑loop escalation).
- Positioning of monitoring over model‑centric defenses – Argue that guardrails and prompt‑level sanitization are insufficient on their own; continuous monitoring is essential for post‑deployment safety.
Methodology
The paper adopts a systems‑engineering perspective rather than an empirical evaluation. The authors:
- Surveyed existing defenses (prompt sanitization, fine‑tuning, sandboxing) and highlighted their blind spots in dynamic, production‑grade settings.
- Mapped LLM attack surfaces by analyzing real‑world deployments (code generation tools, chat assistants, autonomous agents) and extracting recurring failure modes.
- Derived monitoring requirements through a threat‑modeling exercise, focusing on observability (logging model inputs/outputs), context (user intent, system state), and response latency.
- Proposed an architectural sketch that integrates a Threat Detection Engine (leveraging statistical anomaly detection, policy checks, and lightweight LLM auditors) into the application stack, feeding alerts into an Incident Response Orchestrator.
The approach is deliberately high‑level to make the concepts approachable for developers who need actionable guidance rather than deep formal proofs.
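Since the paper keeps the architecture at sketch level, the coupling between a Threat Detection Engine and an Incident Response Orchestrator can be illustrated with a minimal Python mock-up. All class names, rule names, and severity-to-action mappings below are illustrative assumptions, not the paper's implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Alert:
    rule: str
    severity: str  # "low" | "medium" | "high"

class ThreatDetectionEngine:
    """Runs registered policy checks over each LLM interaction event."""
    def __init__(self):
        self.rules: list[tuple[str, str, Callable[[dict], bool]]] = []

    def add_rule(self, name: str, severity: str, predicate: Callable[[dict], bool]):
        self.rules.append((name, severity, predicate))

    def inspect(self, event: dict) -> list[Alert]:
        # Emit one alert per rule whose predicate matches the event.
        return [Alert(n, s) for n, s, p in self.rules if p(event)]

class IncidentResponseOrchestrator:
    """Maps alert severity to a mitigation from the paper's examples:
    request throttling, sandboxing, or human-in-the-loop escalation."""
    ACTIONS = {"low": "throttle", "medium": "sandbox", "high": "escalate_to_human"}

    def handle(self, alerts: list[Alert]) -> list[str]:
        return [self.ACTIONS[a.severity] for a in alerts]

engine = ThreatDetectionEngine()
engine.add_rule("privileged_call", "high",
                lambda e: any(c.startswith("os.") for c in e.get("tool_calls", [])))

orchestrator = IncidentResponseOrchestrator()
event = {"prompt": "clean up temp files", "tool_calls": ["os.remove"]}
actions = orchestrator.handle(engine.inspect(event))
print(actions)  # ['escalate_to_human']
```

A real deployment would replace the lambda predicates with the statistical detectors and LLM auditors the paper describes; the point here is only the feedback loop: detect, classify, mitigate.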
Results & Findings
- Guardrails are brittle – Static prompt filters miss many sophisticated injection attacks that evolve at runtime.
- Anomaly signals exist – Simple metrics (output entropy, token distribution shifts, request latency spikes) can flag potentially malicious LLM behavior with low false‑positive rates.
- Context matters – Correlating LLM outputs with surrounding system state (e.g., file system accesses, network calls) dramatically improves detection accuracy.
- Rapid mitigation is feasible – By coupling detection with automated policy enforcement (e.g., sandbox termination, request rollback), the system can contain threats before they propagate.
These findings collectively support the central thesis: effective, system‑level monitoring is a prerequisite for trustworthy LLM‑enabled applications.
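One of the signals above, output entropy, is cheap to compute in practice. The sketch below is an assumed implementation (the paper does not give code or thresholds): it profiles the Shannon entropy of an output's token distribution and flags outputs that deviate sharply from a baseline, e.g. degenerate repetition loops:

```python
import math
from collections import Counter

def token_entropy(tokens: list[str]) -> float:
    """Shannon entropy (bits per token) of the empirical token distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_anomaly(tokens: list[str], baseline_bits: float,
                    threshold_bits: float = 1.5) -> bool:
    """Flag outputs whose entropy deviates sharply from the baseline profile.
    baseline_bits and threshold_bits are illustrative tuning knobs."""
    return abs(token_entropy(tokens) - baseline_bits) > threshold_bits

normal = "the cat sat on the mat".split()       # varied tokens, entropy ~2.25 bits
degenerate = ["spam"] * 6                        # collapsed output, entropy 0 bits

print(entropy_anomaly(normal, baseline_bits=2.3))      # False
print(entropy_anomaly(degenerate, baseline_bits=2.3))  # True
```

Token-distribution shifts and latency spikes would be monitored analogously, by comparing a per-request statistic against a rolling baseline.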
Practical Implications
| For Developers | For Organizations |
|---|---|
| Instrument your code – Log prompts, model responses, and downstream actions to create a data pipeline for real‑time analysis. | Integrate with existing SIEM – Feed LLM telemetry into security information and event management (SIEM) tools to leverage existing alerting and incident‑response workflows. |
| Adopt lightweight auditors – Deploy a secondary “watchdog” LLM or rule‑based engine that reviews primary model outputs before they affect critical resources. | Plan for graceful degradation – Design fallback paths (e.g., switch to a rule‑based subsystem) when the monitoring layer flags high‑risk activity. |
| Define clear policies – Establish what constitutes anomalous behavior (e.g., unexpected file writes, privileged API calls) and encode them as enforceable rules. | Continuous improvement loop – Use detected incidents to retrain or fine‑tune models, update guardrails, and refine detection heuristics. |
In short, the paper pushes developers to think of LLMs as runtime services that need the same observability, logging, and incident‑response scaffolding as any other critical component.
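The "instrument your code" and "define clear policies" advice can be combined into a small telemetry sketch. Everything here is an assumed design, not the paper's code: events are written as newline-delimited JSON (a common SIEM ingestion format), prompts are hashed so sensitive text need not leave the application, and one hypothetical policy flags file writes outside a sandbox directory:

```python
import hashlib
import io
import json
import time

def log_llm_event(prompt: str, response: str, actions: list[dict], sink) -> dict:
    """Append one structured telemetry record per LLM interaction."""
    record = {
        "ts": time.time(),
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "response_len": len(response),
        "actions": actions,  # downstream effects: file writes, API calls, etc.
    }
    sink.write(json.dumps(record) + "\n")
    return record

def violates_policy(actions: list[dict], allowed_prefix: str = "/sandbox/") -> bool:
    """Hypothetical enforceable rule: file writes must stay inside the sandbox."""
    return any(a["type"] == "file_write" and not a["path"].startswith(allowed_prefix)
               for a in actions)

sink = io.StringIO()  # stand-in for a log shipper or SIEM forwarder
rec = log_llm_event("summarize this config", "done",
                    [{"type": "file_write", "path": "/etc/passwd"}], sink)
print(violates_policy(rec["actions"]))  # True: write outside the sandbox
```

A flagged record like this would feed the organization's existing alerting pipeline, closing the loop between developer-side instrumentation and organizational incident response.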
Limitations & Future Work
- Lack of empirical validation – The paper presents a conceptual framework without large‑scale deployment data; real‑world efficacy remains to be measured.
- Performance overhead – Continuous monitoring adds latency and resource consumption; quantifying this trade‑off is left for future studies.
- Evolving threat landscape – Attack techniques will continue to adapt; the authors call for an open‑source ecosystem of monitoring plugins to keep pace.
- Human factors – Effective incident response requires clear alert triage and operator training, topics the paper only briefly touches on.
Future research directions include building benchmark suites for LLM threat detection, evaluating detection algorithms at scale, and integrating privacy‑preserving telemetry to respect user data regulations.
Authors
- Yedi Zhang
- Haoyu Wang
- Xianglin Yang
- Jin Song Dong
- Jun Sun
Paper Information
- arXiv ID: 2602.19844v1
- Categories: cs.CR, cs.AI, cs.SE
- Published: February 23, 2026