[Paper] Extending Delta Debugging Minimization for Spectrum-Based Fault Localization

Published: (January 8, 2026 at 02:57 AM EST)
3 min read
Source: arXiv

Source: arXiv - 2601.04689v1

Overview

The paper presents DDMIN‑LOC, a novel technique that fuses Delta Debugging Minimization (DDMIN) with Spectrum‑Based Fault Localization (SBFL). By leveraging the minimal failure‑inducing inputs produced by DDMIN and feeding them into SBFL, the approach can pinpoint suspicious code locations even when only a single failing test case is available—particularly for programs that consume string inputs.

Key Contributions

  • Hybrid Fault‑Localization Pipeline: Introduces DDMIN‑LOC, which automatically generates a suite of passing/failing inputs via DDMIN and feeds them to SBFL algorithms.
  • Single‑Failing‑Input Requirement: Works with just one observed failure, removing the need for an extensive test harness.
  • Broad SBFL Compatibility: Evaluated with five popular SBFL formulas (Tarantula, Ochiai, GenProg, Jaccard, DStar).
  • Empirical Validation: Tested on 136 real‑world buggy programs from the QuixBugs and Codeflaws benchmarks, showing that faulty statements are often ranked in the top‑3 positions.
  • Practical Efficiency: In most cases, developers need to inspect fewer than 20 % of executable lines to locate the bug.

Methodology

  1. Input Minimization (DDMIN)
    • Starts with a single failing string input.
    • Systematically removes or reduces parts of the input (e.g., characters, substrings) while preserving the failure, converging on a minimal failure‑inducing input.
  2. Test Suite Generation
    • Each intermediate input produced by DDMIN is executed against the program.
    • Outcomes are labeled passing (no crash) or failing (crash/reproduced bug).
  3. Spectrum Collection
    • For every execution, the tool records which statements and predicates were exercised.
    • This yields a coverage matrix linking inputs (rows) to program elements (columns).
  4. SBFL Scoring
    • The coverage matrix and pass/fail labels are fed into classic SBFL formulas (e.g., Jaccard).
    • Each statement receives a suspiciousness score reflecting how often it appears in failing runs vs. passing runs.
  5. Ranking & Presentation
    • Scores from multiple SBFL formulas can be combined (e.g., averaging) or examined individually.
    • The final ranked list is presented to the developer, with the highest‑ranked statements being the most likely fault locations.

Results & Findings

SBFL FormulaAvg. % of Executable Lines to Inspect% of Bugs Ranked in Top‑3
Jaccard≈ 18 %≈ 85 %
Ochiai~22 %~78 %
Tarantula~25 %~70 %
DStar~24 %~73 %
GenProg~26 %~71 %
  • Jaccard consistently outperformed the other formulas, often locating the fault after examining less than one‑fifth of the code.
  • Across all benchmarks, the faulty statement appeared within the top three ranked locations for the majority of generated test suites, regardless of which failing input seed was used.
  • The approach remained robust even when the initial failing input was highly complex (e.g., long strings with nested structures).

Practical Implications

  • Rapid Debugging for Input‑Driven Apps: Tools that process textual data (parsers, compilers, configuration loaders, web APIs) can integrate DDMIN‑LOC to automatically generate a focused “debug checklist” from a single crash report.
  • Lightweight CI Integration: Since only one failing test is needed, CI pipelines can trigger DDMIN‑LOC on any unexpected failure without requiring a full regression suite.
  • Enhanced Developer Tooling: IDE plugins could surface the top‑ranked suspicious lines directly in the editor, reducing the time spent hunting for the root cause.
  • Security Vulnerability Triage: For fuzzing campaigns that discover a crash, DDMIN‑LOC can quickly narrow down the vulnerable code region, accelerating patch development.
  • Educational Use: Students learning debugging can see how systematic input reduction couples with coverage analysis to produce actionable insights.

Limitations & Future Work

  • String‑Input Restriction: The current implementation assumes the program’s primary input is a string; binary or multi‑modal inputs would need preprocessing or a different minimizer.
  • Scalability of DDMIN: For extremely large inputs, the iterative minimization may become time‑consuming; adaptive sampling or parallel execution could mitigate this.
  • SBFL Formula Sensitivity: While Jaccard performed best in the study, the optimal formula may vary across domains; an automated selector could be explored.
  • Dynamic Features: Programs that heavily rely on reflection, JIT compilation, or external state (files, network) may produce noisy spectra, reducing localization accuracy.
  • Future Directions: Extending DDMIN‑LOC to handle structured inputs (JSON, XML) via grammar‑aware minimization, integrating machine‑learning‑based ranking, and evaluating on larger, industry‑scale codebases.

Authors

  • Charaka Geethal Kapugama

Paper Information

  • arXiv ID: 2601.04689v1
  • Categories: cs.SE
  • Published: January 8, 2026
  • PDF: Download PDF
Back to Blog

Related posts

Read more »