DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL
Blog post from Together AI
DeepSWE-Preview is an open-source coding agent developed by the Agentica team in collaboration with Together AI. It is trained using only reinforcement learning (RL) on top of the Qwen3-32B model with reasoning enabled. On the SWE-Bench-Verified benchmark it scores 42.2% Pass@1 and 71.0% Pass@16, and reaches a 59% success rate with hybrid test-time scaling, surpassing previous open-weight models. The agent is trained with Agentica's rLLM framework on 4,500 real-world software engineering tasks over six days on 64 H100 GPUs. The entire process, including datasets, code, and training logs, is open-sourced so the community can reproduce and extend it. DeepSWE-Preview navigates complex software engineering environments, combining a mix of RL techniques during training with hybrid test-time scaling strategies at inference to raise its success rate. The project underscores the potential of RL to advance long-horizon, multi-step reasoning models in software development, and offers a foundation for future work in agentic AI.
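For readers unfamiliar with the Pass@1 and Pass@16 figures above, the sketch below shows one common way such numbers are computed: the unbiased pass@k estimator popularized by the HumanEval evaluation. This is an assumption for illustration; the post does not state which estimator DeepSWE uses. The function names and the sample data are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n sampled attempts, c of which succeeded.

    Assumed estimator (not confirmed by the post): 1 - C(n - c, k) / C(n, k),
    the probability that a random subset of k attempts contains at least one success.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k attempts failed,
    # so the estimate is exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; each entry is (attempts, successes)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data: three tasks, 16 attempts each.
sample = [(16, 4), (16, 0), (16, 16)]
print(f"pass@1  = {mean_pass_at_k(sample, 1):.3f}")
print(f"pass@16 = {mean_pass_at_k(sample, 16):.3f}")
```

With k = 1 the estimator reduces to the fraction of successful attempts per task, which is why Pass@1 is often reported as an average over several runs. Pass@16 counts a task as solved if any of the 16 attempts succeeds, so it is an upper bound on what a selection strategy over 16 candidates could achieve.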