Improving agent with semantic search

Post Details

Company

Cursor

Date Published

Nov. 6, 2025

Author

-

Word Count

539

Language

English

Hacker News Points

-

Source URL

cursor.com/blog/semsearch

Summary

Semantic search helps coding agents understand and navigate codebases, complementing traditional regex-based tools like grep rather than replacing them. Cursor powers its semantic search with a custom embedding model trained on agent session data, which improves both the accuracy and the efficiency of coding tasks. Evaluations, including the offline Cursor Context Bench and online A/B tests, show that semantic search leads to higher code retention rates, fewer dissatisfied user requests, and better outcomes overall, particularly in large codebases with over 1,000 files. The embedding model is trained by aligning its similarity scores with LLM-generated rankings derived from agent task traces, creating a feedback loop that further refines retrieval and makes semantic search a critical tool for working in large, complex codebases.
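The idea of aligning similarity scores with LLM-generated rankings can be illustrated with a small listwise loss. This is only a sketch under stated assumptions, not Cursor's actual training code: the function names (`cosine`, `softmax`, `ranking_alignment_loss`), the KL-divergence objective, the temperature value, and the way ranks are turned into a target distribution are all hypothetical choices made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ranking_alignment_loss(query_vec, chunk_vecs, llm_ranks, temperature=0.1):
    """KL(target || predicted) for one query (hypothetical objective).

    llm_ranks[i] is the rank an LLM assigned to chunk i after looking at
    an agent trace (1 = most helpful). The loss is small when the
    embedding similarities order the chunks the same way the LLM did.
    """
    sims = [cosine(query_vec, c) for c in chunk_vecs]
    predicted = softmax(sims, temperature)
    # Assumption: lower rank number means higher target probability.
    target = softmax([-float(r) for r in llm_ranks])
    return sum(t * math.log(t / p) for t, p in zip(target, predicted))


if __name__ == "__main__":
    query = [1.0, 0.0]
    chunks = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
    agree = ranking_alignment_loss(query, chunks, [1, 2, 3])
    disagree = ranking_alignment_loss(query, chunks, [3, 2, 1])
    print(f"loss when rankings agree:    {agree:.3f}")
    print(f"loss when rankings disagree: {disagree:.3f}")
```

In a real training loop the vectors would come from the embedding model, and a loss of this kind would be minimized with gradient descent so that chunks the LLM judged most helpful end up closest to the query.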