
Detecting and preventing distillation attacks

Blog post from Anthropic

Post Details
Company
Anthropic
Date Published
Author
Anthropic Team
Word Count
1,448
Language
English
Summary

A recent announcement details industrial-scale distillation attacks by three AI laboratories (DeepSeek, Moonshot, and MiniMax) aimed at extracting capabilities from Claude to improve their own models. Distillation is a legitimate AI training method, but these labs used it illicitly, violating terms of service and evading regional access restrictions; because distilled models can be stripped of critical safeguards, the practice also poses national security risks. The attacks exploited gaps in export controls, relying on fraudulent accounts and proxy services to reach Claude and generating millions of exchanges that targeted specific capabilities such as reasoning, coding, and tool use. The announcement describes each lab's methods, the scale of their operations, and the broader implications for AI security and export controls, and it calls for a coordinated industry and policy response to mitigate such threats. Anthropic, the company behind Claude, is investing in defenses and intelligence sharing to detect and prevent further distillation attacks, while advocating for global cooperation to address this growing issue.
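For context on the legitimate technique being misused here: in knowledge distillation, a smaller "student" model is trained to match the temperature-softened output distribution of a larger "teacher" model. The sketch below is illustrative only and not from the announcement; it assumes a NumPy environment, and the function names are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature yields a softer
    # distribution that exposes more of the teacher's "dark knowledge".
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions:
    # the core objective a student minimizes during distillation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

The loss is zero when the student's logits already match the teacher's and grows as the distributions diverge; an attacker performing distillation at scale is effectively harvesting the teacher-side outputs needed for this kind of objective through the model's public API.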