[Paper] DUALGAUGE: Automated Joint Security-Functionality Benchmarking for Secure Code Generation
Large language models (LLMs) and autonomous coding agents are increasingly used to generate software across a wide range of domains. Yet a core requirement rema...