[Paper] QMutBench: A Dataset of Quantum Circuit Mutants

Published: April 17, 2026 at 05:20 AM EDT
Source: arXiv - 2604.15870v1

Overview

The paper introduces QMutBench, a publicly available dataset of more than 700 k mutants (intentionally faulty versions) of quantum circuits. By supplying a large, searchable repository of realistic quantum bugs, the authors give researchers and engineers a concrete benchmark for measuring how well their quantum testing tools catch errors.

Key Contributions

  • Large‑scale mutant corpus: > 700 000 quantum circuit mutants covering a wide range of gate‑level faults.
  • Online query interface: Users can filter mutants by original circuit, target survival rate, gate type, and other mutation attributes.
  • Standardised fault taxonomy: The dataset classifies mutations (e.g., gate replacement, parameter perturbation, qubit‑swap) to enable reproducible experiments.
  • Benchmarking baseline: Provides a ready‑to‑use ground truth for evaluating test‑case effectiveness and for comparing different quantum testing strategies.
  • Enabler for mutation‑guided testing: The resource can be leveraged to design new testing heuristics that specifically target hard‑to‑detect faults.

Methodology

  1. Circuit selection – The authors gathered a diverse set of quantum programs from existing repositories (Qiskit tutorials, IBM Q Experience examples, etc.) to serve as “original” circuits.
  2. Mutation operators – They defined a suite of quantum‑specific mutation operators, such as:
    • Gate substitution (e.g., replace an X with a Y).
    • Parameter alteration (tweak rotation angles).
    • Qubit re‑mapping (swap control/target qubits).
    • Insertion/deletion of identity or measurement gates.
  3. Automated mutant generation – A custom script applied each operator to every eligible location in each original circuit. Every operator and location pair yields its own mutant, so the number of mutants grows quickly with circuit size.
  4. Survival‑rate estimation – For each mutant, the authors simulated its execution on a noisy quantum backend to estimate the probability that the fault would be undetected (the “survival rate”).
  5. Dataset packaging – Mutants, metadata (original circuit ID, operator type, affected qubits, survival rate), and a lightweight web UI were bundled and released under an open‑source license.
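
The generation step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' script: the `Gate` class, the `SUBSTITUTIONS` table, the `ANGLE_DELTAS` values, and all function names are hypothetical, and the sketch assumes one mutation per mutant and covers only three of the listed operators.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Gate:
    """One circuit instruction: gate name, qubit indices, optional angles."""
    name: str
    qubits: Tuple[int, ...]
    params: Tuple[float, ...] = ()


Circuit = List[Gate]
Mutant = Tuple[str, int, Circuit]  # (operator name, mutated position, mutated circuit)

# Hypothetical substitution table; the operator set in the dataset may differ.
SUBSTITUTIONS = {"x": ("y", "z"), "y": ("x", "z"), "z": ("x", "y")}
# Hypothetical perturbation sizes, in radians.
ANGLE_DELTAS = (0.1, -0.1)


def with_gate(circuit: Circuit, index: int, gate: Gate) -> Circuit:
    """Return a copy of the circuit with the gate at `index` replaced."""
    return circuit[:index] + [gate] + circuit[index + 1:]


def gate_substitution(circuit: Circuit) -> Iterator[Mutant]:
    """Replace a gate with another gate acting on the same qubits."""
    for i, g in enumerate(circuit):
        for new_name in SUBSTITUTIONS.get(g.name, ()):
            yield ("gate_substitution", i,
                   with_gate(circuit, i, Gate(new_name, g.qubits, g.params)))


def parameter_alteration(circuit: Circuit) -> Iterator[Mutant]:
    """Shift every angle of a parameterised gate by a fixed delta."""
    for i, g in enumerate(circuit):
        if g.params:
            for delta in ANGLE_DELTAS:
                shifted = tuple(p + delta for p in g.params)
                yield ("parameter_alteration", i,
                       with_gate(circuit, i, Gate(g.name, g.qubits, shifted)))


def qubit_remapping(circuit: Circuit) -> Iterator[Mutant]:
    """Swap control and target of each two-qubit gate."""
    for i, g in enumerate(circuit):
        if len(g.qubits) == 2:
            yield ("qubit_remapping", i,
                   with_gate(circuit, i, Gate(g.name, g.qubits[::-1], g.params)))


def generate_mutants(circuit: Circuit) -> List[Mutant]:
    """Apply every operator at every eligible location of the circuit."""
    mutants: List[Mutant] = []
    for operator in (gate_substitution, parameter_alteration, qubit_remapping):
        mutants.extend(operator(circuit))
    return mutants


# Toy four-gate circuit, invented for this example.
original = [Gate("h", (0,)), Gate("cx", (0, 1)), Gate("x", (1,)), Gate("rz", (0,), (0.5,))]
mutants = generate_mutants(original)
for operator, position, circuit in mutants:
    print(operator, position, [(g.name, g.qubits, g.params) for g in circuit])
```

Even this four-gate toy circuit yields five mutants from three operators, which gives a feel for how a few hundred real circuits can grow into a corpus of the reported size.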

Results & Findings

  • Coverage breadth: The final corpus spans circuits ranging from 2‑qubit toy examples to 20‑qubit algorithms, ensuring relevance for both near‑term NISQ devices and larger future hardware.
  • Diverse fault profiles: Survival rates vary widely (from < 1 % to > 90 %). Some mutations are trivially caught while others are stealthy, and the stealthy ones are the edge cases that robust testing needs to cover.
  • Baseline effectiveness: With a simple random test‑case generator, the authors observed an average fault detection rate of ~45 %, which means that more than half of the mutants go undetected under naïve testing.
  • Usability: The web interface allows users to retrieve a custom subset (e.g., “all mutants of circuit X with survival rate > 70 %”) in seconds, demonstrating the practicality of the dataset for rapid prototyping.
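
The kind of query quoted above ("all mutants of circuit X with survival rate > 70 %") amounts to a filter over the metadata fields. The sketch below assumes a locally loaded list of records; the `MutantRecord` fields mirror the metadata named in the summary, but the field names, the `select` function, and the sample values are all made up for illustration and do not reflect the actual QMutBench schema or interface.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class MutantRecord:
    """Hypothetical metadata row for one mutant."""
    mutant_id: str
    circuit_id: str
    operator: str
    qubits: Tuple[int, ...]
    survival_rate: float  # fraction in [0, 1]


def select(
    records: Iterable[MutantRecord],
    circuit_id: Optional[str] = None,
    operator: Optional[str] = None,
    min_survival: Optional[float] = None,
) -> List[MutantRecord]:
    """Keep the records that match every filter that is not None."""
    selected = []
    for r in records:
        if circuit_id is not None and r.circuit_id != circuit_id:
            continue
        if operator is not None and r.operator != operator:
            continue
        # Strictly greater than the threshold, matching "survival rate > 70 %".
        if min_survival is not None and not r.survival_rate > min_survival:
            continue
        selected.append(r)
    return selected


# Made-up records for illustration only; not taken from QMutBench.
records = [
    MutantRecord("m1", "circuit_x", "gate_substitution", (0,), 0.92),
    MutantRecord("m2", "circuit_x", "parameter_alteration", (1,), 0.40),
    MutantRecord("m3", "circuit_x", "qubit_remapping", (0, 1), 0.75),
    MutantRecord("m4", "circuit_y", "gate_substitution", (2,), 0.81),
]

stealthy = select(records, circuit_id="circuit_x", min_survival=0.70)
print([r.mutant_id for r in stealthy])  # prints ['m1', 'm3']
```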

Practical Implications

  • Test‑suite evaluation – Developers can now quantify how many of the 700 k+ realistic faults their quantum test generators actually expose, turning vague “coverage” claims into concrete numbers.
  • Tool comparison – Researchers can benchmark competing testing frameworks (e.g., property‑based testing vs. fuzzing) on a shared fault set, fostering fairer competition and faster progress.
  • Mutation‑guided test generation – By focusing on mutants with high survival rates, automated test generators can prioritize “hard” faults, leading to more efficient use of limited quantum hardware time.
  • Education & onboarding – Instructors can use QMutBench to illustrate common quantum programming mistakes, giving students hands‑on experience with debugging quantum code.
  • Hardware‑aware testing – Since survival rates are estimated on noisy simulators, the dataset can help developers understand how hardware noise masks certain bugs, informing error‑mitigation strategies.
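
To make test-suite evaluation concrete, here is a minimal sketch of a mutation score on a single noiseless qubit. It is a simplification under stated assumptions: the paper estimates survival on a noisy simulated backend and may define survival differently, whereas this sketch compares exact measurement probabilities, and every name in it (`GATES`, `run`, `survives`, `mutation_score`) is hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# 2x2 single-qubit gate matrices as nested tuples (row-major).
GATES = {
    "x": ((0, 1), (1, 0)),
    "y": ((0, -1j), (1j, 0)),
    "z": ((1, 0), (0, -1)),
    "h": ((S, S), (S, -S)),
}


def run(circuit, state):
    """Apply the named gates in order and return the probabilities of measuring 0 and 1."""
    a, b = state
    for name in circuit:
        (m00, m01), (m10, m11) = GATES[name]
        a, b = m00 * a + m01 * b, m10 * a + m11 * b
    return (abs(a) ** 2, abs(b) ** 2)


def survives(original, mutant, test_inputs, tol=1e-9):
    """A mutant survives if no test input separates its output distribution from the original's."""
    for state in test_inputs:
        p = run(original, state)
        q = run(mutant, state)
        if any(abs(pi - qi) > tol for pi, qi in zip(p, q)):
            return False
    return True


def mutation_score(original, mutants, test_inputs):
    """Fraction of mutants killed (detected) by the test inputs."""
    killed = sum(not survives(original, m, test_inputs) for m in mutants)
    return killed / len(mutants)


original = ["x"]
mutants = [["y"], ["z"], ["h"]]          # three gate-substitution mutants of X
test_inputs = [(1, 0), (0, 1)]           # the basis states |0> and |1>

for m in mutants:
    print(m, "survives" if survives(original, m, test_inputs) else "killed")
print("mutation score:", mutation_score(original, mutants, test_inputs))
```

In this toy setup the X to Y substitution survives both test inputs, because Y equals X followed by a phase flip up to a global phase, and neither changes the measurement probabilities in the computational basis. The Z and H substitutions are killed, so two of the three mutants are detected. Stealthy mutants of this kind are the ones a high survival rate is meant to flag.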

Limitations & Future Work

  • Noise model dependency – Survival rates are based on simulated noise; real devices may exhibit different detection characteristics.
  • Operator scope – While the current mutation operators cover many gate‑level faults, higher‑level logical bugs (e.g., incorrect algorithmic flow) are not represented.
  • Scalability to larger circuits – Generating mutants for circuits beyond ~30 qubits becomes computationally expensive; future work could explore sampling strategies.
  • Dynamic updates – The dataset is static; incorporating new quantum languages or emerging gate sets will require periodic maintenance.

Overall, QMutBench fills a critical gap in quantum software engineering by giving the community a shared, extensible benchmark for testing and improving quantum code.

Authors

  • Eñaut Mendiluze Usandizaga
  • Thomas Laurent
  • Paolo Arcaini
  • Shaukat Ali

Paper Information

  • arXiv ID: 2604.15870v1
  • Categories: cs.SE, cs.DB
  • Published: April 17, 2026