
Secrets Detection - Optimizing filter processes

Blog post from GitGuardian

Post Details
Company: GitGuardian
Date Published:
Author: Guardians
Word Count: 1,514
Company Posts That Month: 8
Language: English
Hacker News Points: -
Summary

In the article, Henri Hubert discusses optimizing the performance of GitGuardian's secrets detection engine, a task that requires balancing precision, recall, and speed. The engine runs in three stages: prevalidation, matching, and postvalidation. Benchmarks showed that prevalidation consumed the most total time: each call is short, but the stage is called far more often than the others. The team improved it by caching frequently accessed properties, reordering steps for efficiency, and incorporating lightweight keyword searches. These optimizations led to significant speed improvements without sacrificing precision or recall. The study underscores the importance of the initial filtering steps in data processing pipelines, and argues that each step should have a clear, singular purpose so the system stays understandable and adaptable. The article also hints at further optimizations through regex engine experimentation, to be discussed in future publications.
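
The three-stage flow described in the summary can be sketched roughly as follows. This is only an illustration under stated assumptions, not GitGuardian's implementation: the `Document` and `Detector` classes, the `tok_` token format, the binary check, and the placeholder heuristic in `postvalidate` are all hypothetical names and rules invented for the example. It shows a cached document property, cheap checks ordered first, and a keyword search that runs before the regex.

```python
import re
from dataclasses import dataclass
from functools import cached_property


class Document:
    """Hypothetical document wrapper; not GitGuardian's actual data model."""

    def __init__(self, filename: str, content: str):
        self.filename = filename
        self.content = content

    @cached_property
    def lowered(self) -> str:
        # Computed once per document, then reused by every detector.
        return self.content.lower()

    @cached_property
    def is_binary(self) -> bool:
        # Assumed heuristic: a NUL character marks binary content.
        return "\x00" in self.content


@dataclass(frozen=True)
class Detector:
    name: str
    keywords: tuple      # lowercase substrings that must appear in the document
    pattern: re.Pattern  # group 1 captures the candidate secret


def prevalidate(doc: Document, detector: Detector) -> bool:
    # Stage 1: runs for every (document, detector) pair, so keep it cheap.
    if doc.is_binary:
        return False
    # Plain substring search is far lighter than running the regex.
    return any(keyword in doc.lowered for keyword in detector.keywords)


def postvalidate(candidate: str) -> bool:
    # Stage 3: assumed heuristic that rejects low-variety placeholders.
    return len(set(candidate)) > 4


def scan(doc: Document, detectors: list) -> list:
    findings = []
    for detector in detectors:
        if not prevalidate(doc, detector):
            continue
        # Stage 2: regex matching only on documents that passed stage 1.
        for match in detector.pattern.finditer(doc.content):
            candidate = match.group(1)
            if postvalidate(candidate):
                findings.append((detector.name, candidate))
    return findings


# Hypothetical detector for a made-up "tok_" + 16 alphanumerics format.
DETECTORS = [
    Detector(
        name="example_token",
        keywords=("tok_",),
        pattern=re.compile(r"tok_([A-Za-z0-9]{16})"),
    )
]
```

Calling `scan(Document("a.py", "key = 'tok_a1B2c3D4e5F6g7H8'"), DETECTORS)` would report one finding, while a document without the `tok_` keyword never reaches the regex. Each stage keeps a single purpose, in line with the summary's point about the filtering steps.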

Trends Found in this Post
Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM
Secrets Management | 8 | 573 | 66 | 36 | +2%
Real-time | 1 | 1,004 | 320 | 104 | +5%