Company:
Date Published:
Author: -
Word count: 1843
Language: English
Hacker News points: None

Summary

Building a code search tool for Graphite Chat meant working around the limitations of traditional search methods like grep on large repositories with many files and commits. The challenge was to support fast searches at any commit without excessive resource use. Initial attempts with AWS storage solutions and Elasticsearch showed promise, but they could not keep up with indexing very large repositories at every commit. The breakthrough was a Git-like approach that stores data as blobs and trees, so a search query can be answered by matching blob IDs against the tree for a specific commit. This enabled fast searches at any commit, with median latency under 100 milliseconds, a significant improvement over the previous reliance on the GitHub API. The indexing system is now live, indexing millions of files and serving thousands of queries, with further enhancements such as semantic search anticipated.
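
The blob-and-tree idea can be illustrated with a small in-memory sketch. This is not Graphite's implementation: the class name `CommitSearchIndex`, its methods, the SHA-1 content hash, and the plain substring match are all assumptions chosen for illustration. It shows only the core idea the summary describes: each unique file content is stored once as a blob, each commit is a tree mapping paths to blob IDs, and a query matches blob IDs against the tree of the requested commit.

```python
import hashlib


def blob_id(content: str) -> str:
    """Content-addressed ID in the spirit of Git blobs (not Git's exact hash format)."""
    return hashlib.sha1(content.encode("utf-8")).hexdigest()


class CommitSearchIndex:
    """Hypothetical toy index: blobs stored once, one tree per commit."""

    def __init__(self) -> None:
        self.blobs: dict[str, str] = {}             # blob ID -> file content
        self.trees: dict[str, dict[str, str]] = {}  # commit ID -> {path: blob ID}

    def add_commit(self, commit_id: str, files: dict[str, str]) -> None:
        tree = {}
        for path, content in files.items():
            bid = blob_id(content)
            # Unchanged files across commits reuse the same blob entry.
            self.blobs.setdefault(bid, content)
            tree[path] = bid
        self.trees[commit_id] = tree

    def search(self, commit_id: str, needle: str) -> list[str]:
        # Step 1: find matching blobs, independent of any commit.
        matching = {bid for bid, content in self.blobs.items() if needle in content}
        # Step 2: keep only the paths whose blob ID appears in this commit's tree.
        tree = self.trees[commit_id]
        return sorted(path for path, bid in tree.items() if bid in matching)


index = CommitSearchIndex()
index.add_commit("c1", {"a.py": "def foo(): pass", "b.py": "print('hi')"})
index.add_commit("c2", {"a.py": "def foo(): pass", "b.py": "def bar(): foo()"})

print(index.search("c1", "foo"))  # ['a.py']
print(index.search("c2", "foo"))  # ['a.py', 'b.py']
print(len(index.blobs))           # 3: a.py is stored once for both commits
```

Because blobs are keyed by content, indexing a new commit only adds the files that actually changed, which is what makes searching at every commit affordable in this sketch. A production system would replace the linear scan in step 1 with a real text index.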