
ARES: Open-Source Infrastructure for Online RL on Coding Agents

Blog post from Martian

Post Details
Company: Martian
Date Published:
Author: -
Word Count: 1,094
Language: English
Hacker News Points: -
Summary

ARES, the Agentic Research and Evaluation Suite, is an open-source framework for training coding agents with true online reinforcement learning (RL), in which the large language model (LLM) itself is the policy. It is meant to enable real exploration and rapid feedback, addressing a limitation of batch-style RL, which often restricts exploration and the discovery of new behaviors. ARES supports massively parallel asynchronous rollouts over tens of thousands of verifiable coding tasks, including SWE-Bench Verified. Because the LLM is treated as the core agent, the model's policy is optimized directly, and the boundary between agent and environment stays clear. The framework is built to be RL-first and natively asynchronous, which significantly improves throughput and supports a broader range of RL algorithms. ARES uses the Harbor task format, which keeps tasks portable across evaluation frameworks, and it supports a community-driven approach to growing the task set. It is positioned as infrastructure for online RL on coding agents, and it encourages teams to collaborate on integrating training algorithms and pretrained agents.
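The agent/environment split and the asynchronous rollouts described in the summary can be illustrated with a minimal Python sketch. This is not the ARES API: `ToyCodingEnv`, `toy_policy`, `rollout`, and `run_parallel` are hypothetical names invented for illustration, the "LLM" is a stub that returns a fixed string, and the reward rule is a placeholder for a real task verifier. The sketch only shows the shape of the idea: the policy maps observations to actions, the environment owns task state and rewards, and many episodes run concurrently under a concurrency cap.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Step:
    observation: str
    reward: float
    done: bool


class ToyCodingEnv:
    """Hypothetical stand-in for one verifiable coding task (not the ARES API)."""

    def __init__(self, task_id: int, horizon: int = 3) -> None:
        self.task_id = task_id
        self.horizon = horizon
        self.turn = 0

    async def reset(self) -> str:
        self.turn = 0
        return f"task {self.task_id}: fix the failing test"

    async def step(self, action: str) -> Step:
        await asyncio.sleep(0)  # yield to the event loop, as real sandbox I/O would
        self.turn += 1
        done = self.turn >= self.horizon
        # Placeholder verifier: reward only at the end of the episode.
        reward = 1.0 if done and "patch" in action else 0.0
        return Step(f"turn {self.turn} output", reward, done)


async def toy_policy(observation: str) -> str:
    """Stub for the LLM policy: observation in, action out."""
    await asyncio.sleep(0)  # stands in for an asynchronous inference call
    return f"patch for: {observation}"


async def rollout(env: ToyCodingEnv, policy) -> tuple[list, float]:
    """Run one episode and return (trajectory, total reward)."""
    obs = await env.reset()
    trajectory, total = [], 0.0
    while True:
        action = await policy(obs)
        step = await env.step(action)
        trajectory.append((obs, action, step.reward))
        total += step.reward
        obs = step.observation
        if step.done:
            return trajectory, total


async def run_parallel(num_tasks: int, max_concurrency: int) -> list:
    """Run many episodes concurrently, capped by a semaphore."""
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(task_id: int):
        async with sem:
            return await rollout(ToyCodingEnv(task_id), toy_policy)

    return await asyncio.gather(*(bounded(i) for i in range(num_tasks)))


if __name__ == "__main__":
    results = asyncio.run(run_parallel(num_tasks=8, max_concurrency=4))
    print(len(results), "episodes collected")
```

In a real online-RL setup, the collected trajectories would be fed to a learner that updates the policy while new rollouts continue. That part is omitted here because the summary does not describe how ARES does it.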