Company:
Date Published:
Author: -
Word count: 883
Language: English
Hacker News points: None

Summary

Anthropic's Responsible Scaling Policy outlines technical and organizational protocols for managing the risks of developing increasingly capable AI systems, with a focus on catastrophic risks that could cause large-scale devastation. The policy defines a framework of AI Safety Levels (ASL), modeled after biosafety levels, that requires safety, security, and operational standards appropriate to a model's potential for catastrophic risk. The ASL system categorizes models into four levels: ASL-1 (no meaningful catastrophic risk), ASL-2 (early signs of dangerous capabilities), ASL-3 (a substantial increase in risk), and ASL-4 and higher (qualitative escalations in catastrophic misuse potential). Anthropic aims to balance targeting catastrophic risks with incentivizing beneficial applications and safety progress, with the goal of creating a "race to the top" dynamic among frontier labs. The policy has been formally approved by Anthropic's board and will not alter current uses of Claude or disrupt the availability of its products.