
Detecting and preventing distillation attacks

Blog post from Anthropic

Post Details
Company
Anthropic
Date Published
Author
Anthropic Team
Word Count
1,448
Language
English
Summary

A recent announcement details industrial-scale distillation attacks by three AI laboratories (DeepSeek, Moonshot, and MiniMax) aimed at extracting capabilities from Claude to improve their own models. Distillation is a legitimate AI training method, but these labs used it illicitly, violating terms of service and evading regional access restrictions; because distilled models can be stripped of critical safeguards, the practice also poses national security risks. The attacks exploited gaps in export controls, relying on fraudulent accounts and proxy services to reach Claude and generating millions of exchanges that targeted specific capabilities such as reasoning, coding, and tool use. The announcement describes each lab's methods, the scale of their operations, and the broader implications for AI security and export controls, and it calls for a coordinated industry and policy response to mitigate such threats. Anthropic, the company behind Claude, is investing in defenses and intelligence sharing to detect and prevent further distillation attacks, while advocating for global cooperation to address this growing issue.
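For context on the legitimate technique being misused here: in knowledge distillation, a smaller "student" model is trained to match the temperature-softened output distribution of a larger "teacher" model. The sketch below is illustrative only and not from the announcement; it assumes a NumPy environment, and the function names are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature yields a softer
    # distribution that exposes more of the teacher's "dark knowledge".
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions:
    # the core objective a student minimizes during distillation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

The loss is zero when the student's logits already match the teacher's and grows as the distributions diverge; an attacker performing distillation at scale is effectively harvesting the teacher-side outputs needed for this kind of objective through the model's public API.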