
Secrets Story: The Prefixed Secrets That Tried%20to%2BGet\nAway

Blog post from Semgrep

Post Details
Company: Semgrep
Date Published:
Author: Lewis Ardern
Word Count: 2,502
Language: English
Hacker News Points: -
Summary

Secret scanning tools, essential for application security, often miss valid secrets due to design choices aimed at minimizing false positives, such as reliance on non-word boundaries and keywords. This leads to undetected leaks of sensitive data like API keys and tokens across platforms such as GitHub, OpenAI, and Anthropic. The blog post explores how secret scanners work, their methods to reduce false positives, and the resultant false negatives with examples from real repositories. Issues arise from prefix collisions, lack of unique identifiers, and overly strict boundary checks, which prevent detection of legitimate secrets. Recommendations include refining detection rules, ensuring precise token format specifications, and encouraging third-party services to document token formats and establish verification endpoints. The post also suggests that services should consider monitoring public repositories and implementing measures like token expiration to mitigate risks associated with leaked secrets.
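The boundary problem described in the summary can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `tok_` prefix and 20-character body are a hypothetical token format, not any real provider's, and the two patterns are not taken from any particular scanner. The sketch only shows how a leading word-boundary check can miss a token that directly follows URL-encoded or escaped text.

```python
import re

# Hypothetical token format, for illustration only: "tok_" + 20 alphanumerics.
TOKEN = "tok_" + "A1b2C3d4E5f6G7h8I9j0"

# Strict rule: requires a word boundary on both sides of the token.
STRICT = re.compile(r"\btok_[A-Za-z0-9]{20}\b")

# Relaxed rule: no leading boundary; only checks that the token body
# is not followed by another alphanumeric character.
RELAXED = re.compile(r"tok_[A-Za-z0-9]{20}(?![A-Za-z0-9])")

SAMPLES = {
    # "=" is a non-word character, so \b matches before "tok_".
    "plain": "API_KEY=" + TOKEN,
    # "%20" ends in the word character "0", so there is no boundary before "tok_".
    "url_encoded": "key%20" + TOKEN,
    # A literal backslash followed by "n" (an escaped newline left in the text);
    # "n" is a word character, so again there is no boundary.
    "escaped_newline": "config\\n" + TOKEN,
}


def scan(pattern: re.Pattern, text: str) -> list[str]:
    """Return every substring of text that the pattern matches."""
    return pattern.findall(text)


if __name__ == "__main__":
    for name, text in SAMPLES.items():
        print(name, "strict:", scan(STRICT, text), "relaxed:", scan(RELAXED, text))
```

The relaxed rule trades the other way: dropping the leading boundary means it can also match inside longer identifiers, so in practice it would likely need to be paired with another signal, such as a validation step, to keep false positives down.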