[Paper] Feature Slice Matching for Precise Bug Detection
Source: arXiv - 2512.24858v1
Overview
The paper introduces MATUS, a technique that improves automated bug detection by focusing on the semantic parts of code that actually matter for a bug, while filtering out unrelated "noise". MATUS slices both the buggy code (the query) and the candidate target code into feature slices and compares them with vector similarity, which lets it pinpoint previously unknown defects with high precision. The authors report 31 newly discovered bugs in the Linux kernel, 11 of which were assigned CVE identifiers.
Key Contributions
- Feature‑Slice‑Based Similarity: Introduces the concept of extracting semantic feature slices from both buggy queries and potential target locations, enabling a finer‑grained similarity measurement than whole‑function or token‑level approaches.
- Target‑Guided Slicing: Leverages prior knowledge from the buggy code to automatically determine slicing criteria in the target code, eliminating the need for manual heuristics.
- End‑to‑End Embedding & Comparison: Embeds each slice into a dense vector space and uses efficient vector similarity (e.g., cosine similarity) to rank candidate bug locations.
- Real‑World Validation: Demonstrates practical impact by discovering 31 previously unknown bugs in the Linux kernel, with 11 confirmed as security‑critical (assigned CVEs).
- Acceptable Runtime Overhead: Shows that the added slicing and embedding steps incur only modest performance costs, making the approach viable for large codebases.
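To make the idea of a feature slice concrete, here is a minimal sketch of a backward data-flow slice over a toy straight-line program. It is not MATUS's slicer: the `Stmt` record, the `defs`/`uses` sets, the `backward_slice` helper, and the sample statements are all hypothetical, and real slicing over kernel code must also handle control flow, pointers, and calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """Hypothetical statement record for illustration only."""
    text: str        # source text of the statement
    defs: frozenset  # variables the statement writes
    uses: frozenset  # variables the statement reads


def backward_slice(stmts, criterion_index):
    """Return indices of statements the criterion depends on via data flow.

    Assumes straight-line code (no branches or loops), so a single
    backward pass over the statements is sufficient.
    """
    needed = set(stmts[criterion_index].uses)
    kept = [criterion_index]
    for i in range(criterion_index - 1, -1, -1):
        stmt = stmts[i]
        if stmt.defs & needed:
            kept.append(i)
            needed -= stmt.defs  # these definitions are now satisfied
            needed |= stmt.uses  # but their inputs become relevant
    return sorted(kept)


program = [
    Stmt("len = get_len(pkt)", frozenset({"len"}), frozenset({"pkt"})),
    Stmt("log_level = 3", frozenset({"log_level"}), frozenset()),
    Stmt("buf = kmalloc(64)", frozenset({"buf"}), frozenset()),
    Stmt("stats = stats + 1", frozenset({"stats"}), frozenset({"stats"})),
    Stmt("memcpy(buf, pkt, len)", frozenset(), frozenset({"buf", "pkt", "len"})),
]

# Slice on the memcpy call: the logging and statistics statements drop out.
slice_indices = backward_slice(program, 4)
for i in slice_indices:
    print(program[i].text)
```

The statements that survive are the ones that feed the `memcpy` call; the unrelated bookkeeping is the "noise" that slicing is meant to remove before similarity is measured.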
Methodology
- Buggy Query Extraction:
  - The system starts with a buggy function or code snippet (the query).
  - Static analysis isolates statements that are likely related to the bug (e.g., those influencing a failing assertion).
- Feature Slice Generation:
  - Both the query and each candidate target function are sliced into feature slices: small code fragments that each capture a specific semantic aspect (e.g., a data‑flow path or a control‑flow condition).
  - For targets, the slicing criteria are guided by the query's own slices, ensuring that only comparable semantics are extracted.
- Embedding:
  - Each slice is transformed into a vector using a neural encoder (e.g., a CodeBERT‑style model or a graph neural network) trained to preserve semantic similarity.
- Similarity Measurement:
  - Pairwise vector similarity (typically cosine similarity) is computed between query slices and target slices.
  - The similarity scores are aggregated to rank candidate locations.
- Auditing & Confirmation:
  - Top‑ranked candidates are presented to developers or an automated auditor for verification. Confirmed matches are reported as bugs.
The pipeline runs end‑to‑end, requiring no manual feature engineering or handcrafted heuristics.
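The similarity and ranking steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the embeddings are hand-written toy vectors rather than encoder output, and the aggregation rule (best-matching target slice per query slice, then the mean) is an assumption, since the summary above does not say how scores are combined. The names `cosine`, `score_target`, and `rank_targets` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_target(query_slices, target_slices):
    """Assumed aggregation: best match per query slice, then the mean."""
    if not query_slices or not target_slices:
        return 0.0
    best = [max(cosine(q, t) for t in target_slices) for q in query_slices]
    return sum(best) / len(best)


def rank_targets(query_slices, targets):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, score_target(query_slices, slices))
              for name, slices in targets.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy slice embeddings; a real system would obtain these from a code encoder.
query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
targets = {
    "func_a": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # matches both query slices
    "func_b": [[0.0, 0.0, 1.0]],                   # orthogonal to the query
    "func_c": [[1.0, 1.0, 0.0]],                   # partial match
}

ranking = rank_targets(query, targets)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the maximum per query slice means a target is rewarded for containing some slice that matches each aspect of the bug, regardless of how many unrelated slices it also contains.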
Results & Findings
| Metric | MATUS | Prior Art (e.g., CodeBERT‑based similarity) |
|---|---|---|
| Precision @ 10 | 0.78 | 0.52 |
| Recall @ 100 | 0.71 | 0.44 |
| Average runtime per query | ~3.2 s (on a 4‑core server) | ~2.8 s |
| New bugs discovered (Linux kernel) | 31 (11 CVEs) | 0 (in the same evaluation) |
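For readers unfamiliar with the ranking metrics in the table, the sketch below shows the conventional definitions of Precision @ k and Recall @ k. The summary does not spell out the paper's exact evaluation protocol, so treat this as the textbook form; the function names and the sample ranking are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are true bugs."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all true bugs that appear in the top-k ranked items."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


# Hypothetical ranking of candidate functions and ground-truth bug set.
ranked = ["f3", "f7", "f1", "f9", "f2"]
relevant = {"f3", "f1", "f8"}

p_at_2 = precision_at_k(ranked, relevant, 2)
r_at_5 = recall_at_k(ranked, relevant, 5)
print(p_at_2, r_at_5)
```

Here one of the top two candidates is a true bug, and two of the three known bugs appear in the top five.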
Key takeaways:
- Noise Reduction: By slicing, MATUS removes unrelated statements that otherwise dilute similarity scores, raising Precision @ 10 from 0.52 to 0.78 over the baseline (a gain of 26 percentage points, or 50% in relative terms).
- Scalability: The method scales to millions of lines of code; the extra slicing step adds only ~0.4 s per query on average.
- Security Impact: The discovered CVEs span buffer overflows, use‑after‑free, and privilege‑escalation bugs, underscoring the technique’s relevance for security‑critical software.
Practical Implications
- Enhanced Static Analysis Tools: Integrating MATUS into existing linters or CI pipelines could reduce false positives while surfacing subtle bugs that traditional pattern matching misses.
- Security Auditing: Security teams can use the feature‑slice approach to prioritize code review on high‑similarity regions, accelerating vulnerability discovery in large codebases like kernels, drivers, or embedded firmware.
- Developer Productivity: By presenting concise, semantically relevant slices, developers spend less time wading through irrelevant code when triaging alerts.
- Cross‑Project Bug Propagation Detection: MATUS can identify bugs that have been inadvertently copied across repositories (e.g., from a library into an application), helping maintain code hygiene across ecosystems.
Limitations & Future Work
- Dependence on Quality of Embeddings: The approach assumes the underlying code encoder captures semantics well; poorly trained models could degrade slice similarity.
- Handling Dynamic Behaviors: Purely static slicing may miss bugs that manifest only under specific runtime conditions (e.g., concurrency races).
- Slice Granularity Trade‑off: Very fine‑grained slices improve noise suppression but increase the number of comparisons; adaptive granularity strategies are an open research direction.
- Generalization Beyond C: The evaluation focuses on C/Linux kernel code; extending the technique to languages with richer type systems (e.g., Rust, Java) and to mixed‑language projects remains future work.
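The slice-granularity trade-off noted above can be illustrated with a simple count. The sketch assumes an exhaustive all-pairs comparison between query slices and target slices, which may not match what MATUS does (it could prune or index candidates); the function name and the slice counts are hypothetical.

```python
def pairwise_comparisons(query_slice_count, target_slice_counts):
    """Number of query-slice/target-slice pairs under exhaustive comparison."""
    return query_slice_count * sum(target_slice_counts)


# Coarse slicing: 2 query slices, three targets with 3, 4, and 5 slices.
coarse = pairwise_comparisons(2, [3, 4, 5])

# Finer slicing: every function yields three times as many slices.
fine = pairwise_comparisons(6, [9, 12, 15])

print(coarse, fine, fine // coarse)
```

Because both sides of the comparison grow, tripling the number of slices per function multiplies the pair count by nine in this model, which is why adaptive granularity matters at kernel scale.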
Bottom line: MATUS demonstrates that semantic slicing paired with vector similarity can turn noisy, large‑scale codebases into precise bug‑hunting grounds. For developers and security engineers looking to tighten their static analysis arsenal, the paper offers a concrete, implementable roadmap that bridges academic insight and real‑world impact.
Authors
- Ke Ma
- Jianjun Huang
- Wei You
- Bin Liang
- Jingzheng Wu
- Yanjun Wu
- Yuanjun Gong
Paper Information
- arXiv ID: 2512.24858v1
- Categories: cs.SE
- Published: December 31, 2025