Microsoft Copilot ignored sensitivity labels twice in eight months — and no DLP stack caught either one
Source: VentureBeat
Incident Overview
- Timeframe: 4 weeks starting January 21
- Issue: Microsoft Copilot read and summarized confidential emails despite every sensitivity label and DLP policy prohibiting it.
- Root cause: Enforcement points broke inside Microsoft’s own pipeline; no security tool in the stack flagged the breach.
Affected Organizations
- U.K. National Health Service (NHS) – logged the incident as `INC46740412`.
- Microsoft internal tracking: `CW1226324`.
Advisory Details
- First reported by BleepingComputer on February 18.
- This is the second occurrence in eight months where Copilot’s retrieval pipeline violated its own trust boundary (i.e., an AI system accessed or transmitted data it was explicitly restricted from handling).
Prior Incident (June 2025)
- Vulnerability: `CVE‑2025‑32711` – a critical zero‑click flaw dubbed “EchoLeak.”
- Impact: A malicious email bypassed Copilot’s prompt‑injection classifier, link‑redaction, Content‑Security‑Policy, and reference‑mention checks, silently exfiltrating enterprise data without any user interaction.
- Severity: CVSS 9.3.
Common Thread
- Two distinct root causes (a code error and a sophisticated exploit chain) produced the same outcome: Copilot processed data it was explicitly prohibited from accessing, and the surrounding security stack failed to detect the breach.
All identifiers (INC, CW, CVE) are presented in back‑ticks for clarity.
Why EDR and WAF Remain Architecturally Blind to This Issue
Endpoint Detection and Response (EDR) monitors file‑ and process‑level behavior.
Web Application Firewalls (WAFs) inspect HTTP payloads.
Neither technology has a detection category for “your AI assistant just violated its own trust boundary.” The gap exists because LLM retrieval pipelines sit behind an enforcement layer that traditional security tools were never designed to observe.
What Actually Happened
| Step | Description |
|---|---|
| 1. Ingestion | Copilot ingested a labeled email that it was explicitly told to skip. |
| 2. Location | The entire action occurred inside Microsoft’s infrastructure, between the retrieval index and the generation model. |
| 3. No observable artifacts | No file was written to disk; no anomalous network traffic crossed the perimeter; no new process was spawned for an endpoint agent to flag. |
| 4. Security stack reaction | The stack reported “all‑clear” because it never saw the layer where the violation occurred. |
Relevant Vulnerabilities
| Identifier | Core Issue | Outcome |
|---|---|---|
| `CW1226324` | A code‑path error allowed messages in Sent Items and Drafts to enter Copilot’s retrieval set despite sensitivity labels and DLP rules. | Sensitive data leaked despite policy enforcement. |
| EchoLeak | Researchers crafted a malicious email that looked like ordinary business correspondence. The email manipulated Copilot’s retrieval‑augmented generation pipeline, causing internal data to be sent to an attacker‑controlled server. | Demonstrated that the retrieval layer can be hijacked to exfiltrate data. |
Source: Microsoft advisory (`CW1226324`) and the Fortune article on EchoLeak.
Root Cause (as identified by Aim Security)
“Agents process trusted and untrusted data in the same thought process, making them structurally vulnerable to manipulation.” – Aim Security researchers
- This fundamental design flaw does not disappear when a specific bug (e.g., EchoLeak) is patched.
- The enforcement layer around the retrieval pipeline can fail independently of any downstream fixes.
Takeaway
- Traditional EDR/WAF solutions cannot see inside the retrieval‑augmented generation (RAG) layer where the trust boundary is crossed.
- Mitigating this class of risk requires new visibility mechanisms (e.g., telemetry inside the LLM pipeline, policy‑aware retrieval filters, or dedicated AI‑security controls) rather than relying solely on existing perimeter‑oriented tools.
The Five‑Point Audit That Maps to Both Failure Modes
Neither failure triggered a single alert. Both were discovered through vendor advisory channels, not through SIEM, EDR, or WAF.
- `CW1226324` went public on February 18.
- Affected tenants had been exposed since January 21.
- Microsoft has not disclosed how many organizations were affected or what data was accessed during that window.
For security leaders, that gap is the story: a four‑week exposure inside a vendor’s inference pipeline, invisible to every tool in the stack, discovered only because Microsoft chose to publish an advisory.
1. Test DLP enforcement against Copilot directly
`CW1226324` persisted for four weeks because no one verified whether Copilot actually honored sensitivity labels on Sent Items and Drafts.
Action:
- Create labeled test messages in controlled folders.
- Query Copilot and confirm it cannot surface those messages.
- Run this test monthly.
Configuration ≠ enforcement; the only proof is a failed retrieval attempt.
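The monthly check above can be scripted as a small harness. The `query_assistant` function below is a hypothetical stub standing in for however your tenant reaches Copilot (API, UI automation, or manual queries); the canary subjects and labels are illustrative. What matters is the assertion shape: a labeled canary that surfaces in an answer is a failed test.

```python
from dataclasses import dataclass

@dataclass
class TestMessage:
    subject: str
    folder: str
    sensitivity_label: str  # e.g. "Confidential"

# Labeled canaries planted in the folders the CW1226324 bug affected.
CANARIES = [
    TestMessage("DLP-CANARY-001", "Sent Items", "Confidential"),
    TestMessage("DLP-CANARY-002", "Drafts", "Highly Confidential"),
]

def query_assistant(prompt: str) -> str:
    """Hypothetical stub: replace with a real call to your assistant.
    Here it simulates a compliant assistant that refuses labeled content."""
    return "I can't find any matching messages."

def surfaced_canaries() -> list[str]:
    """Return the subjects of any canaries the assistant surfaced.
    An empty list means enforcement held; anything else fails the audit."""
    leaked = []
    for msg in CANARIES:
        answer = query_assistant(f"Summarize the email titled {msg.subject}")
        if msg.subject in answer:
            leaked.append(msg.subject)
    return leaked

if __name__ == "__main__":
    leaks = surfaced_canaries()
    print("PASS" if not leaks else f"FAIL: surfaced {leaks}")
```

Run it on a schedule; a non-empty result is the "failed retrieval attempt" proof described above, in reverse.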
2. Block external content from reaching Copilot’s context window
EchoLeak succeeded because a malicious email entered Copilot’s retrieval set and its injected instructions executed as if they were the user’s query. The attack bypassed four distinct defense layers (cross‑prompt‑injection classifier, external‑link redaction, CSP controls, and reference‑mention safeguards) per Aim Security’s disclosure.
Action:
- Disable external email context in Copilot settings.
- Restrict Markdown rendering in AI outputs.
Removing the attack surface eliminates the prompt‑injection class of failure.
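For teams building their own retrieval pipelines (rather than configuring Copilot), the same principle can be enforced in code: drop external-origin messages before they ever reach the context window. This is a minimal sketch; the domain allowlist and message shape are assumptions, not a vendor API.

```python
INTERNAL_DOMAINS = {"contoso.com"}  # assumption: your tenant's accepted domains

def sender_domain(address: str) -> str:
    """Extract the domain portion of an SMTP address."""
    return address.rsplit("@", 1)[-1].lower()

def filter_external(context_candidates: list[dict]) -> list[dict]:
    """Keep only internal-origin messages in the retrieval set, so
    externally supplied text can never become injected instructions."""
    return [m for m in context_candidates
            if sender_domain(m["from"]) in INTERNAL_DOMAINS]

candidates = [
    {"from": "alice@contoso.com", "body": "Q3 planning notes"},
    {"from": "attacker@evil.example", "body": "<hidden prompt injection>"},
]
safe = filter_external(candidates)
```

The filter runs before retrieval ranking, so the external message is excluded regardless of how relevant it scores.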
3. Audit Purview logs for anomalous Copilot interactions (Jan – Feb 2026)
Look for Copilot Chat queries that returned content from labeled messages between January 21 and mid‑February 2026.
Action:
- Use Purview telemetry to reconstruct what Copilot accessed during the exposure window.
- If reconstruction is impossible, document the gap formally.
For regulated environments, an undocumented AI data‑access gap during a known vulnerability window is an audit finding waiting to happen.
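In practice the reconstruction runs against exported Purview audit records. The filtering logic looks roughly like the sketch below; the exact field names (`CreationTime`, `Operation`, `AccessedResources`, `SensitivityLabel`) are assumptions modeled on typical audit-export shapes and should be checked against your own export.

```python
from datetime import datetime, timezone

# Exposure window for CW1226324, per the advisory timeline above.
WINDOW_START = datetime(2026, 1, 21, tzinfo=timezone.utc)
WINDOW_END = datetime(2026, 2, 18, tzinfo=timezone.utc)

def anomalous_copilot_events(records: list[dict]) -> list[dict]:
    """Flag Copilot interactions inside the exposure window that
    touched resources carrying a sensitivity label."""
    flagged = []
    for r in records:
        ts = datetime.fromisoformat(r["CreationTime"]).replace(tzinfo=timezone.utc)
        if (r.get("Operation") == "CopilotInteraction"
                and WINDOW_START <= ts <= WINDOW_END
                and any(res.get("SensitivityLabel")
                        for res in r.get("AccessedResources", []))):
            flagged.append(r)
    return flagged

sample = [
    {"CreationTime": "2026-02-01T10:00:00", "Operation": "CopilotInteraction",
     "AccessedResources": [{"Id": "msg1", "SensitivityLabel": "Confidential"}]},
    {"CreationTime": "2026-03-01T10:00:00", "Operation": "CopilotInteraction",
     "AccessedResources": [{"Id": "msg2", "SensitivityLabel": "Confidential"}]},
]
hits = anomalous_copilot_events(sample)
```

If the query returns nothing because the telemetry does not exist, that absence is the gap to document.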
4. Turn on Restricted Content Discovery (RCD) for SharePoint sites with sensitive data
RCD removes sites from Copilot’s retrieval pipeline entirely, working regardless of whether the trust violation stems from a code bug or an injected prompt.
Action:
- Enable Restricted Content Discovery on all SharePoint sites that store sensitive or regulated data.
RCD is a containment layer that does not depend on the enforcement point that broke; for high‑value data it is not optional.
5. Build an incident‑response playbook for vendor‑hosted inference failures
IR playbooks need a new category: trust‑boundary violations inside the vendor’s inference pipeline.
Action:
- Define escalation paths and assign ownership.
- Establish a monitoring cadence for vendor service‑health advisories that affect AI processing.
- Incorporate the playbook into existing SIEM/IR workflows (recognizing the SIEM will not catch the next one).
Preparedness, not detection, is the key to handling the next vendor‑hosted AI failure.
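The monitoring cadence can start as a simple triage rule over the vendor's service-health feed: escalate any advisory whose text suggests AI processing is involved. The keyword list and advisory shape below are illustrative assumptions, not a vendor schema.

```python
AI_KEYWORDS = ("copilot", "inference", "retrieval", "sensitivity label")

def needs_ir_escalation(advisory: dict) -> bool:
    """True when a vendor advisory suggests a trust-boundary issue
    in AI processing, matching the playbook category above."""
    text = (advisory.get("title", "") + " " + advisory.get("summary", "")).lower()
    return any(k in text for k in AI_KEYWORDS)

advisories = [
    {"id": "CW1226324", "title": "Copilot may process labeled email",
     "summary": "Sensitivity labels not honored for Sent Items and Drafts."},
    {"id": "CW0000001", "title": "Teams calendar sync delay", "summary": ""},
]
escalate = [a["id"] for a in advisories if needs_ir_escalation(a)]
```

A rule this crude would have routed `CW1226324` to the IR owner on publication day instead of whenever someone read the news.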
The Pattern That Transfers Beyond Copilot
A 2026 survey by Cybersecurity Insiders[^1] found that 47% of CISOs and senior security leaders have already observed AI agents exhibit unintended or unauthorized behavior. Organizations are deploying AI assistants into production faster than they can build governance around them.
Why This Matters
The issue isn’t limited to Copilot. Any RAG‑based assistant that pulls from enterprise data follows the same three‑layer pattern:
- Retrieval layer – selects content from internal sources.
- Enforcement layer – gates what the model is allowed to see.
- Generation layer – produces the final output.
If the enforcement layer fails, the retrieval layer can feed restricted data to the model, and the security stack never sees it. Whether it’s Copilot, Gemini for Workspace, or any tool with retrieval access to internal documents, the structural risk is identical.
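The three-layer pattern can be made concrete in a few lines. This toy pipeline shows why an enforcement failure is invisible downstream: the generation layer has no notion of "restricted," so when the gate breaks (simulated here with `gate_enabled=False`), the leak produces a perfectly ordinary-looking output.

```python
DOCS = [
    {"id": "d1", "text": "public roadmap", "restricted": False},
    {"id": "d2", "text": "M&A term sheet", "restricted": True},
]

def retrieve(query: str) -> list[dict]:
    """Retrieval layer: selects content; knows nothing about policy."""
    return list(DOCS)  # naive: everything matches

def enforce(docs: list[dict], gate_enabled: bool = True) -> list[dict]:
    """Enforcement layer: the ONLY place 'restricted' is ever checked."""
    if not gate_enabled:  # a CW1226324-style code-path failure
        return docs
    return [d for d in docs if not d["restricted"]]

def generate(docs: list[dict]) -> str:
    """Generation layer: summarizes whatever it is handed."""
    return "Summary: " + "; ".join(d["text"] for d in docs)

safe_output = generate(enforce(retrieve("q")))
leaked_output = generate(enforce(retrieve("q"), gate_enabled=False))
```

Note that nothing in `generate` distinguishes the two outputs; only telemetry at the enforcement layer itself could catch the difference.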
Actionable Checklist
Run the five‑point audit before your next board meeting:
| # | Control | What to Test |
|---|---|---|
| 1 | Label‑based isolation | Place test messages in a controlled folder and verify they never surface. |
| 2 | Policy enforcement verification | Confirm that policies are applied inside the vendor’s inference pipeline, not just at the perimeter. |
| 3 | Retrieval gating | Ensure the retrieval layer cannot return restricted documents when policies deny them. |
| 4 | Audit logging | Verify that every retrieval and generation event is logged and searchable. |
| 5 | Alerting on enforcement failure | Configure alerts for any mismatch between policy intent and model output. |
If Copilot (or any assistant) surfaces the labeled test messages, every policy underneath is theater.
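Control 5 can start as a trivially small check, assuming your logging can produce both the set of document IDs policy denied and the set the model actually cited in an answer (a sketch under that assumption, not a product feature):

```python
def enforcement_mismatch(denied_ids: set[str], cited_ids: set[str]) -> set[str]:
    """Control 5: any policy-denied document cited in an answer is an
    enforcement failure worth an immediate alert."""
    return denied_ids & cited_ids

# Example: doc-9 was denied by policy yet cited in the model's output.
alerts = enforcement_mismatch({"doc-7", "doc-9"}, {"doc-2", "doc-9"})
```

A non-empty intersection is exactly the "mismatch between policy intent and model output" the checklist calls for.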
Board‑Ready Narrative
Board answer: “Our policies were configured correctly. Enforcement failed inside the vendor’s inference pipeline. Here are the five controls we are testing, restricting, and demanding before we re‑enable full access for sensitive workloads.”
The next failure will not send an alert—unless you’ve built the controls above.
Footnotes
[^1]: 2026 CISO AI Risk Report, Cybersecurity Insiders. https://www.cybersecurity-insiders.com/2026-ciso-ai-risk-report/