Anthropic Says Chinese AI Firms Used 16 Million Claude Queries to Copy Model
Source: The Hacker News
Ravie Lakshmanan
Feb 24, 2026 • Artificial Intelligence / Anthropic
[Image: Claude AI]
Anthropic announced on Monday that it had identified “industrial‑scale campaigns” run by three artificial‑intelligence companies—DeepSeek, Moonshot AI, and MiniMax—to illegally extract Claude’s capabilities for use in their own models.
- The distillation attacks generated over 16 million exchanges with Claude’s large language model (LLM) through roughly 24,000 fraudulent accounts, violating Anthropic’s terms of service and regional access restrictions.
- All three companies are based in China, where the use of Anthropic’s services is prohibited because of “legal, regulatory, and security risks.”
What is distillation?
Distillation is a technique in which a less capable model is trained on the outputs produced by a stronger AI system. It is a legitimate way for a company to build smaller, cheaper versions of its own frontier models, but Anthropic says that when competitors use it against another firm’s model it violates terms of service and lets them acquire capabilities at a fraction of the time and cost required for independent development.
“Illicitly distilled models lack necessary safeguards, creating significant national‑security risks,” Anthropic said. “Models built through illicit distillation are unlikely to retain those safeguards, meaning that dangerous capabilities can proliferate with many protections stripped out entirely.”
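In its textbook form, distillation trains a smaller “student” model to imitate a stronger “teacher,” classically by matching the teacher’s output distribution. The sketch below is a generic PyTorch illustration of that idea; it is not drawn from Anthropic’s report or from any named company’s pipeline. In the API setting the article describes, only text responses are available, so the practical variant is supervised fine‑tuning on harvested outputs (a sketch of that step follows the attack walkthrough further down).

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: nudge the student's output distribution
    toward the teacher's via a temperature-scaled KL divergence."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2, the usual convention for temperature-softened targets.
    return F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (t * t)

# Toy usage: random tensors stand in for real model outputs.
student_logits = torch.randn(4, 10, requires_grad=True)  # trainable student, batch of 4
teacher_logits = torch.randn(4, 10)                       # frozen teacher outputs
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()  # gradients flow only into the student
```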
Why it matters
Foreign AI firms that distill American models can weaponize unprotected capabilities for malicious activities—ranging from cyber‑operations to disinformation campaigns and mass surveillance—providing a foundation for authoritarian governments to deploy offensive tools.
Anthropic traced each campaign to a specific AI lab using request metadata, IP‑address correlation, and infrastructure indicators. The attacks relied on commercial proxy services that resell access to Claude and other frontier models at scale, employing “hydra‑cluster” architectures with massive networks of fraudulent accounts.
The three campaigns
| AI Lab | Targeted Claude capabilities | Number of exchanges |
|---|---|---|
| DeepSeek | Reasoning, rubric‑based grading, and generation of censorship‑safe alternatives to politically sensitive queries (e.g., about dissidents, party leaders, authoritarianism) | ~150,000 |
| Moonshot AI | Agentic reasoning, tool use, coding, computer‑use agent development, and computer vision | ~3.4 million |
| MiniMax | Agentic coding and tool‑use capabilities | ~13 million |
“The volume, structure, and focus of the prompts were distinct from normal usage patterns, reflecting deliberate capability extraction rather than legitimate use,” Anthropic added. “Each campaign targeted Claude’s most differentiated capabilities: agentic reasoning, tool use, and coding.”
How the attacks worked
- Access via proxy networks – Commercial proxy services resell Claude access, using “hydra‑cluster” architectures that host tens of thousands of fraudulent accounts.
- Prompt engineering – Large volumes of carefully crafted prompts are sent to Claude to harvest high‑quality responses.
- Model training – The harvested responses are used to train the attackers’ own models.
“The breadth of these networks means there are no single points of failure. When one account is banned, a new one takes its place. In one case, a single proxy network managed more than 20 000 fraudulent accounts simultaneously, mixing distillation traffic with unrelated customer requests to make detection harder.” – Anthropic
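Because a hosted model exposes only text, the final step amounts to ordinary supervised fine‑tuning on the harvested exchanges. The sketch below shows just the generic data‑shaping part of that step; the field names and chat format are illustrative assumptions rather than details from Anthropic’s report.

```python
import json

def exchanges_to_sft_dataset(exchanges, out_path="sft_data.jsonl"):
    """Write harvested (prompt, response) pairs as JSONL in a chat-style
    format that common supervised fine-tuning tools accept."""
    with open(out_path, "w", encoding="utf-8") as f:
        for ex in exchanges:
            record = {
                "messages": [
                    {"role": "user", "content": ex["prompt"]},
                    {"role": "assistant", "content": ex["response"]},
                ]
            }
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Toy usage with two made-up exchanges.
exchanges_to_sft_dataset([
    {"prompt": "Explain tail recursion.", "response": "Tail recursion is..."},
    {"prompt": "Write an example SQL join.", "response": "SELECT ..."},
])
```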
Anthropic’s response
- Built classifiers and behavioral‑fingerprinting systems to spot suspicious distillation patterns in API traffic.
- Strengthened verification for educational accounts, security‑research programs, and startup organizations.
- Implemented enhanced safeguards to reduce the efficacy of model outputs for illicit distillation.
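Anthropic has not published how these classifiers work, so the following is purely a hypothetical illustration of behavioral fingerprinting on API traffic: it scores an account higher when its request volume is large and its prompts collapse onto a few near‑identical templates, the kind of uniformity Anthropic says distinguishes extraction campaigns from normal usage.

```python
from collections import Counter
from statistics import mean

def suspicion_score(prompts, volume_threshold=5000):
    """Hypothetical heuristic: large request volume plus highly repetitive
    prompt templates pushes the score toward 1.0."""
    n = len(prompts)
    if n == 0:
        return 0.0
    # Crude structural fingerprint: the first 12 tokens of each prompt.
    templates = Counter(" ".join(p.split()[:12]) for p in prompts)
    repetition = templates.most_common(1)[0][1] / n    # share of the top template
    volume = min(n / volume_threshold, 1.0)            # saturates at the threshold
    avg_len = mean(len(p.split()) for p in prompts)    # long, uniform prompts add a little
    length_signal = min(avg_len / 500, 1.0)
    return round(0.5 * repetition + 0.3 * volume + 0.2 * length_signal, 3)

# Toy usage: 8,000 nearly identical rubric-grading prompts score far above typical traffic.
prompts = ["Grade the following answer against this rubric: ..." for _ in range(8000)]
print(suspicion_score(prompts))
```

A real system would draw on far more signals, such as the request metadata, IP correlation, and infrastructure indicators the article mentions, and would use trained classifiers rather than hand‑picked weights.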
Related developments
Just weeks earlier, the Google Threat Intelligence Group (GTIG) disclosed and disrupted a series of distillation and model‑extraction attacks aimed at Gemini’s reasoning capabilities, involving more than 100,000 prompts.
Source: Anthropic press release, Detecting and Preventing Distillation Attacks; Google Threat Intelligence Group report.
“Model extraction attacks do not typically represent a risk to average users, as they do not threaten the confidentiality, availability, or integrity of AI services,” Google said earlier this month. “Instead, the risk is concentrated among model developers and service providers.”