DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL

Post Details

Company

Together AI

Date Published

July 2, 2025

Author

Michael Luo*, Naman Jain*, Jaskirat Singh*, Sijun Tan*, Ameen Patel*, Qingyang Wu*, Alpay Ariyak*, Colin Cai*, Tarun Venkat, Shang Zhu, Ben Athiwaratkun, Manan Roongta, Ce Zhang, Li Erran Li, Raluca Ada Popa, Koushik Sen, Ion Stoica

Word Count

3,655

Language

English

Hacker News Points

-

Source URL

www.together.ai/blog/deepswe

Summary

DeepSWE-Preview is an open-source coding agent developed jointly by the Agentica team and Together AI. It is trained from the Qwen3-32B model with reasoning enabled, using only reinforcement learning (RL). On the SWE-Bench-Verified benchmark it scores 42.2% Pass@1 and 71.0% Pass@16, and reaches 59% with hybrid test-time scaling, surpassing previous open-weight models. The agent was trained with Agentica's rLLM framework on 4,500 real-world software engineering tasks over six days on 64 H100 GPUs. The entire pipeline, including datasets, code, and training logs, is open-sourced so the community can build on it. The project points to the potential of RL for training long-horizon, multi-step reasoning models in software engineering and is offered as a foundation for future work on agentic AI.