Expanding our model safety bug bounty program

Post Details

Company: Anthropic
Date Published: Aug. 8, 2024
Author: -
Word Count: 681
Company Posts That Month: 4
Language: English
Hacker News Points: -
Post removed?: No
Source URL: www.anthropic.com/news/model-safety-bug-bounty

Summary

We're expanding our model safety bug bounty program to identify and mitigate universal jailbreak attacks: exploits that consistently bypass AI safety guardrails across a wide range of areas, including critical domains such as CBRN (chemical, biological, radiological, and nuclear) and cybersecurity. The new initiative will test our next-generation system for AI safety mitigations in a controlled environment before its public deployment, with bounty rewards of up to $15,000 for novel attacks. We're inviting interested researchers to apply to the program and work with us to strengthen AI safety in high-risk areas, in line with commitments signed by other AI companies to develop responsible AI.

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
AI Guardrails	4	152	59	36	-22%
