[Paper] CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
GPU kernel optimization is fundamental to modern deep learning but remains a highly specialized task requiring deep hardware expertise. Despite strong performan...