[Paper] ECHO-2: A Large Scale Distributed Rollout Framework for Cost-efficient Reinforcement Learning
Reinforcement learning (RL) is a critical stage in post-training large language models (LLMs), involving repeated interaction between rollout generation, reward...