
AI Coding Assistants Keep Shipping Vulnerable Code -- Here's What We're Doing About It

Blog post from HuggingFace

Post Details
Company: HuggingFace
Author: Scott Thornton
Word Count: 371
Summary

AI coding assistants now generate a significant share of code: in some codebases, more than 60% of the code is AI-generated, and much of it contains known vulnerabilities. SecureCode was developed in response as the largest open security training dataset for AI coding assistants, with the goal of improving the security of AI-generated code. It now comprises three datasets. The examples are grounded in real-world security incidents such as the Equifax and Capital One breaches, and they cover framework-specific security patterns for popular frameworks such as Express.js and Django. The datasets are designed to integrate easily into model training and offer 219 examples of idiomatic security practices. The post argues that security-focused training data is needed to reduce the 45% rate of vulnerable code that AI assistants currently produce.