why agents DO NOT write most of our code - a reality check
Blog post from Octomind
At Octomind, AI agents are used extensively in day-to-day workflows, yet humans still write the majority of the code: the agents have not delivered a significant productivity boost. Despite experiments with tools like Cursor, Claude Code, and Windsurf, the AI's coding capabilities fall short of the efficiency gains claimed by companies such as Anthropic, Microsoft, and Google.

Attempts to build features entirely with AI, such as a branch-specific test system for the company's end-to-end testing platform, exposed clear limitations: incomplete work, inefficient implementations, and poor adherence to existing coding standards. The models struggle to maintain a comprehensive mental model of the codebase, and their inability to self-reflect and accurately assess their own output further limits their effectiveness.

AI nonetheless remains valuable for narrower tasks such as writing unit tests and automating routine coding chores. The takeaway: a useful tool, but not yet a replacement for human developers on complex or nuanced programming work.