Company
Date Published
Author
-
Word count
1650
Language
English
Hacker News points
None

Summary

Anthropic has activated the AI Safety Level 3 (ASL-3) Deployment and Security Standards in conjunction with the launch of Claude Opus 4. The ASL-3 Security Standard increases internal security to make model weights harder to steal, while the ASL-3 Deployment Standard covers a narrowly targeted set of measures that limit the risk of Claude being misused for the development or acquisition of chemical, biological, radiological, and nuclear (CBRN) weapons. Both standards are part of Anthropic's Responsible Scaling Policy, which holds that increasingly capable AI models warrant stronger deployment and security protections. The deployment measures include Constitutional Classifiers, which monitor model inputs and outputs and intervene to block harmful CBRN information. Ongoing refinement and iteration will be necessary to improve the effectiveness of these measures and address potential issues.