[Paper] Early Comparative Evaluation of Transformer Models for Multilingual Software Vulnerability Detection
Source: arXiv - 2606.10925v1
Overview
Software vulnerability detection is increasingly important as modern applications combine multiple programming languages. This paper presents an early comparative evaluation of BERT, RoBERTa, and CodeBERT for binary vulnerability detection across HTML, Python, JavaScript, and PHP using the CVEFixes dataset and language-wise three-fold stratified cross-validation. The results show clear performance differences across languages, indicating that multilingual vulnerability detection requires more language-aware and robust transformer-based modelling strategies.
Key Contributions
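According to the abstract, the paper contributes:
- An early comparative evaluation of BERT, RoBERTa, and CodeBERT on binary vulnerability detection.
- Coverage of four languages (HTML, Python, JavaScript, and PHP) using the CVEFixes dataset.
- Evidence, obtained with language-wise three-fold stratified cross-validation, that performance differs clearly across languages.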
Methodology
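The abstract describes the setup only at a high level: BERT, RoBERTa, and CodeBERT are compared on binary vulnerability detection using CVEFixes samples in HTML, Python, JavaScript, and PHP, with three-fold stratified cross-validation applied language by language. Please refer to the full paper for the detailed methodology.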
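As an illustration of the evaluation protocol named in the overview, the following is a minimal sketch of language-wise three-fold stratified cross-validation. It is not the authors' code: the function names `stratified_folds` and `language_wise_splits`, the toy data, and the random seed are all hypothetical, and the sketch assumes "language-wise" means that folds are built separately within each language.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=3, seed=0):
    """Split positions 0..len(labels)-1 into k folds with similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets a similar share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def language_wise_splits(samples, k=3, seed=0):
    """Yield (language, fold_number, train_indices, test_indices).

    `samples` is a list of (language, label) pairs; the yielded indices
    refer to positions in `samples`. Folds never mix languages.
    """
    by_language = defaultdict(list)
    for idx, (language, _label) in enumerate(samples):
        by_language[language].append(idx)

    for language in sorted(by_language):
        lang_indices = by_language[language]
        lang_labels = [samples[i][1] for i in lang_indices]
        folds = stratified_folds(lang_labels, k=k, seed=seed)
        for fold_number, test_local in enumerate(folds):
            test = sorted(lang_indices[i] for i in test_local)
            held_out = set(test)
            train = [i for i in lang_indices if i not in held_out]
            yield language, fold_number, train, test


# Hypothetical toy data: 6 non-vulnerable (0) and 3 vulnerable (1) samples per language.
toy = [
    (lang, label)
    for lang in ("HTML", "Python", "JavaScript", "PHP")
    for label in [0] * 6 + [1] * 3
]
splits = list(language_wise_splits(toy))
```

In practice, a library splitter such as scikit-learn's `StratifiedKFold` would typically replace the hand-written `stratified_folds`; the pure-Python version is used here only to keep the sketch dependency-free. Each model would then be trained on the `train` indices and scored on the `test` indices of every split, with results reported per language.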
Practical Implications
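The results show clear performance differences across languages, so per-language evaluation matters when these models are applied to software that combines several programming languages. The authors conclude that multilingual vulnerability detection requires more language-aware and robust transformer-based modelling strategies.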
Authors
- Fiza Naseer
- Javad Khan
- Muhammad Yaqoob
- Alexios Mylonas
Paper Information
- arXiv ID: 2606.10925v1
- Categories: cs.SE
- Published: June 9, 2026