The text discusses the role of sandbox environments in reinforcement learning (RL), particularly for agents built on large language models (LLMs). Separating sandbox execution from the inference nodes is crucial for safety, reproducibility, and efficient training: untrusted, agent-generated code never runs on the GPU hosts that serve the model.

Traditional remote sandbox setups face limitations such as resource contention, limited parallelism, and GPU under-utilization. Inference nodes sit idle while waiting on slow or queued sandbox calls, which degrades throughput and introduces unpredictable delays.

To address these challenges, specialized managed sandbox solutions have emerged. They provide automatic environment provisioning, built-in state management, transparent resource isolation, and simplified API interfaces, letting researchers focus on agent logic rather than infrastructure.

The text concludes that the future of RL will be sandbox-centric: well-managed sandbox infrastructure will become the standard for ambitious research, accelerating experimentation, iteration, and discovery.
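To make the "simplified API interface" idea concrete, here is a minimal sketch of what a managed-sandbox client could look like. Everything here is hypothetical: the `ManagedSandbox` class, its `run` method, and the `ExecutionResult` type are illustrative names, not an API from the text. The sketch isolates execution in a subprocess with its own working directory and a timeout, standing in for the provisioning and resource isolation a real managed service would handle.

```python
import subprocess
import sys
import tempfile
import uuid
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    """Outcome of one sandboxed execution (hypothetical type)."""
    stdout: str
    stderr: str
    exit_code: int


class ManagedSandbox:
    """Hypothetical minimal sandbox client.

    A real managed service would provision a remote, resource-isolated
    environment; here we approximate it with a fresh temp directory and
    a subprocess, so agent code never runs in the caller's process.
    """

    def __init__(self) -> None:
        self.sandbox_id = str(uuid.uuid4())
        # Each sandbox gets its own working directory (stand-in for
        # automatic environment provisioning).
        self._workdir = tempfile.mkdtemp(prefix=f"sandbox-{self.sandbox_id[:8]}-")

    def run(self, code: str, timeout: float = 10.0) -> ExecutionResult:
        """Execute agent-generated Python code in an isolated subprocess."""
        proc = subprocess.run(
            [sys.executable, "-c", code],
            cwd=self._workdir,
            capture_output=True,
            text=True,
            timeout=timeout,  # bounds unpredictable delays from runaway code
        )
        return ExecutionResult(proc.stdout, proc.stderr, proc.returncode)


if __name__ == "__main__":
    sandbox = ManagedSandbox()
    result = sandbox.run("print(2 + 2)")
    print(result.exit_code, result.stdout.strip())
```

With an interface like this, the training loop only ever calls `sandbox.run(...)` and inspects the result, which is the point the text makes: the infrastructure concerns (isolation, state, provisioning) live behind the API, not in the agent logic.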