Testing if "bash is all you need"
Blog post from Vercel
Ankur Goyal of Braintrust tested whether filesystems and bash are effective abstractions for AI agents by evaluating several agent approaches to querying semi-structured data such as GitHub issues. The evaluation compared agents using SQL, bash, and basic filesystem tools. The SQL agent reached 100% accuracy at lower cost and with shorter run times; the bash agent was slower and more expensive, although it produced sophisticated shell scripts. Examining the results exposed performance bottlenecks and inaccuracies in the initial setup, and the resulting optimizations and corrections narrowed the performance gap. A hybrid agent combining bash and SQL achieved consistent accuracy by verifying its own answers, at the price of higher token usage. The post concludes that detailed evaluations and collaboration are important for improving AI tools: bash is useful for exploring and verifying data, while SQL remains the best fit for structured queries.
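
To make the comparison concrete, here is a minimal Python sketch of the idea, not the code or data from the post. It answers one question ("how many open issues carry a given label?") two ways: with a SQL query, and with a filesystem-style scan over one JSON file per issue, which stands in for what a bash agent might do with `grep` or `jq`. It then cross-checks the two answers, a simplified, deterministic analogue of the self-verification described above. The sample issues, the table schema, and all function names are assumptions made for illustration.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical sample data; the post's real dataset and schema are not shown here.
ISSUES = [
    {"number": 1, "state": "open", "labels": ["bug"]},
    {"number": 2, "state": "closed", "labels": ["bug"]},
    {"number": 3, "state": "open", "labels": ["docs"]},
    {"number": 4, "state": "open", "labels": ["bug", "perf"]},
]


def count_with_sql(issues, label):
    """Count open issues with `label` using an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE issues (number INTEGER PRIMARY KEY, state TEXT)")
        conn.execute("CREATE TABLE labels (issue INTEGER, name TEXT)")
        conn.executemany(
            "INSERT INTO issues VALUES (?, ?)",
            [(i["number"], i["state"]) for i in issues],
        )
        conn.executemany(
            "INSERT INTO labels VALUES (?, ?)",
            [(i["number"], name) for i in issues for name in i["labels"]],
        )
        (count,) = conn.execute(
            "SELECT COUNT(DISTINCT i.number) "
            "FROM issues i JOIN labels l ON l.issue = i.number "
            "WHERE i.state = 'open' AND l.name = ?",
            (label,),
        ).fetchone()
        return count
    finally:
        conn.close()


def count_with_files(issues, label):
    """Count the same thing by scanning one JSON file per issue on disk."""
    with tempfile.TemporaryDirectory() as root:
        for issue in issues:
            Path(root, f"{issue['number']}.json").write_text(json.dumps(issue))
        count = 0
        for path in Path(root).glob("*.json"):
            doc = json.loads(path.read_text())
            if doc["state"] == "open" and label in doc["labels"]:
                count += 1
        return count


def verified_count(issues, label):
    """Return the count only if both approaches agree."""
    sql_count = count_with_sql(issues, label)
    file_count = count_with_files(issues, label)
    if sql_count != file_count:
        raise ValueError(f"mismatch: sql={sql_count}, files={file_count}")
    return sql_count


if __name__ == "__main__":
    print(verified_count(ISSUES, "bug"))  # prints 2
```

Running both paths costs more work than running either alone, which mirrors the trade-off reported for the hybrid agent: more consistent answers in exchange for higher cost.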