[Paper] QMutBench: A Dataset of Quantum Circuit Mutants

Published: April 17, 2026 at 05:20 AM EDT

Source: arXiv - 2604.15870v1

Overview

The paper introduces QMutBench, a publicly available dataset of more than 700,000 mutants (intentionally faulty versions) of quantum circuits. By supplying a large, searchable repository of realistic quantum bugs, the authors give researchers and engineers a concrete benchmark for measuring how well their quantum testing tools actually catch errors.

Key Contributions

Large‑scale mutant corpus: More than 700,000 quantum circuit mutants covering a wide range of gate‑level faults.
Online query interface: Users can filter mutants by original circuit, target survival rate, gate type, and other mutation attributes.
Standardised fault taxonomy: The dataset classifies mutations (e.g., gate replacement, parameter perturbation, qubit‑swap) to enable reproducible experiments.
Benchmarking baseline: Provides a ready‑to‑use ground truth for evaluating test‑case effectiveness and for comparing different quantum testing strategies.
Enabler for mutation‑guided testing: The resource can be leveraged to design new testing heuristics that specifically target hard‑to‑detect faults.
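
The filtering described for the query interface can be mimicked locally once mutant metadata has been downloaded. The following is a minimal sketch, not the dataset's actual API: the `MutantRecord` fields, the `filter_mutants` helper, and the sample circuit names are all hypothetical, and the real schema may differ.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class MutantRecord:
    # Hypothetical metadata fields; the real dataset schema may differ.
    mutant_id: str
    circuit_id: str
    operator: str
    gate: str
    survival_rate: float  # fraction in [0, 1]


def filter_mutants(
    records: Iterable[MutantRecord],
    circuit_id: Optional[str] = None,
    operator: Optional[str] = None,
    gate: Optional[str] = None,
    min_survival: float = 0.0,
    max_survival: float = 1.0,
) -> List[MutantRecord]:
    """Return the records that match every criterion that was supplied."""
    selected = []
    for record in records:
        if circuit_id is not None and record.circuit_id != circuit_id:
            continue
        if operator is not None and record.operator != operator:
            continue
        if gate is not None and record.gate != gate:
            continue
        if not (min_survival <= record.survival_rate <= max_survival):
            continue
        selected.append(record)
    return selected


# Made-up records, for illustration only.
records = [
    MutantRecord("m1", "grover_3q", "gate_replacement", "x", 0.05),
    MutantRecord("m2", "grover_3q", "parameter_perturbation", "rz", 0.82),
    MutantRecord("m3", "qft_4q", "gate_replacement", "h", 0.91),
]

hard = filter_mutants(records, circuit_id="grover_3q", min_survival=0.7)
print([r.mutant_id for r in hard])
```

Leaving a criterion as `None` means "do not filter on this attribute", which keeps the helper usable for both broad and narrow queries.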

Methodology

Circuit selection – The authors gathered a diverse set of quantum programs from existing repositories (Qiskit tutorials, IBM Q Experience examples, etc.) to serve as “original” circuits.
Mutation operators – They defined a suite of quantum‑specific mutation operators, such as:
- Gate substitution (e.g., replace an X with a Y).
- Parameter alteration (tweak rotation angles).
- Qubit re‑mapping (swap control/target qubits).
- Insertion/deletion of identity or measurement gates.
Automated mutant generation – A custom script applied each operator to every eligible location in each original circuit, so the number of mutants grows with both the number of operators and the size of each circuit.
Survival‑rate estimation – For each mutant, the authors simulated its execution on a noisy quantum backend to estimate the probability that the fault would be undetected (the “survival rate”).
Dataset packaging – Mutants, metadata (original circuit ID, operator type, affected qubits, survival rate), and a lightweight web UI were bundled and released under an open‑source license.
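
The generation step above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' script: the `Gate` tuple, the `SUBSTITUTES` table, the fixed angle offset `delta`, and the operator names are hypothetical, and a circuit is modelled as a simple list of gates rather than an object from a quantum SDK. Only three of the listed operator families are covered.

```python
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Gate(NamedTuple):
    name: str
    qubits: Tuple[int, ...]
    params: Tuple[float, ...] = ()


Circuit = List[Gate]

# Hypothetical substitution table: which gates may replace which.
SUBSTITUTES = {"x": ("y", "z"), "y": ("x", "z"), "z": ("x", "y")}


def gate_substitution(circuit: Circuit, i: int) -> Iterator[Circuit]:
    """Replace the gate at position i with each allowed alternative."""
    gate = circuit[i]
    for new_name in SUBSTITUTES.get(gate.name, ()):
        yield circuit[:i] + [gate._replace(name=new_name)] + circuit[i + 1:]


def parameter_alteration(circuit: Circuit, i: int, delta: float = 0.1) -> Iterator[Circuit]:
    """Shift each rotation angle of the gate at position i by +/- delta."""
    gate = circuit[i]
    for k in range(len(gate.params)):
        for sign in (1, -1):
            params = list(gate.params)
            params[k] += sign * delta
            yield circuit[:i] + [gate._replace(params=tuple(params))] + circuit[i + 1:]


def qubit_remapping(circuit: Circuit, i: int) -> Iterator[Circuit]:
    """Swap the two qubits (e.g. control and target) of a two-qubit gate."""
    gate = circuit[i]
    if len(gate.qubits) == 2:
        yield circuit[:i] + [gate._replace(qubits=gate.qubits[::-1])] + circuit[i + 1:]


OPERATORS = {
    "gate_substitution": gate_substitution,
    "parameter_alteration": parameter_alteration,
    "qubit_remapping": qubit_remapping,
}


def generate_mutants(circuit: Circuit) -> List[Dict]:
    """Apply every operator at every position; ineligible positions yield nothing."""
    mutants = []
    for op_name, operator in OPERATORS.items():
        for position in range(len(circuit)):
            for mutated in operator(circuit, position):
                mutants.append(
                    {"operator": op_name, "position": position, "circuit": mutated}
                )
    return mutants


original = [
    Gate("h", (0,)),
    Gate("x", (1,)),
    Gate("rz", (1,), (0.5,)),
    Gate("cx", (0, 1)),
]
mutants = generate_mutants(original)
for m in mutants:
    print(m["operator"], m["position"], m["circuit"][m["position"]])
```

Each mutant differs from the original in exactly one location (a first-order mutant), and the original list is never modified because every operator builds a new list.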

Results & Findings

Coverage breadth: The final corpus spans circuits ranging from 2‑qubit toy examples to 20‑qubit algorithms, ensuring relevance for both near‑term NISQ devices and larger future hardware.
Diverse fault profiles: Survival rates vary widely (from < 1 % to > 90 %), highlighting that some mutations are trivially caught while others are stealthy, which is exactly the kind of edge case needed for robust testing.
Baseline effectiveness: When applying a simple random test‑case generator, the authors observed average fault detection rates of ~45 %, confirming that many mutants remain undetected by naïve testing.
Usability: The web interface allows users to retrieve a custom subset (e.g., “all mutants of circuit X with survival rate > 70 %”) in seconds, demonstrating the practicality of the dataset for rapid prototyping.
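
To make the notion of a survival rate concrete, here is one plausible way such a number could be computed from simulation results. It is a sketch under assumptions, since this summary does not give the paper's exact definition: a mutant is taken to "survive" on an input when the total variation distance between its output distribution and the original's stays at or below a hypothetical `threshold`, and the survival rate is the fraction of inputs on which it survives.

```python
from typing import Dict, List

Distribution = Dict[str, float]  # measured bitstring -> probability


def total_variation_distance(p: Distribution, q: Distribution) -> float:
    """Half the L1 distance between two output distributions."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)


def survival_rate(
    original_outputs: List[Distribution],
    mutant_outputs: List[Distribution],
    threshold: float = 0.05,  # assumed tolerance, e.g. to absorb noise
) -> float:
    """Fraction of inputs on which the mutant is not distinguished."""
    if len(original_outputs) != len(mutant_outputs):
        raise ValueError("need one mutant output per original output")
    if not original_outputs:
        raise ValueError("need at least one input")
    survived = sum(
        1
        for p, q in zip(original_outputs, mutant_outputs)
        if total_variation_distance(p, q) <= threshold
    )
    return survived / len(original_outputs)


# Made-up distributions for four inputs: the mutant agrees with the
# original on the first two inputs and differs on the last two.
original_outputs = [{"00": 1.0}, {"01": 1.0}, {"10": 1.0}, {"11": 1.0}]
mutant_outputs = [{"00": 1.0}, {"01": 1.0}, {"11": 1.0}, {"10": 1.0}]

print(survival_rate(original_outputs, mutant_outputs))  # 0.5
```

Under this definition, a mutant with a rate near 1 behaves like the original on almost every input and is therefore hard to detect.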

Practical Implications

Test‑suite evaluation – Developers can now quantify how many of the 700,000+ realistic faults their quantum test generators actually expose, turning vague “coverage” claims into concrete numbers.
Tool comparison – Researchers can benchmark competing testing frameworks (e.g., property‑based testing vs. fuzzing) on a shared fault set, fostering fairer competition and faster progress.
Mutation‑guided test generation – By focusing on mutants with high survival rates, automated test generators can prioritize “hard” faults, leading to more efficient use of limited quantum hardware time.
Education & onboarding – Instructors can use QMutBench to illustrate common quantum programming mistakes, giving students hands‑on experience with debugging quantum code.
Hardware‑aware testing – Since survival rates are estimated on noisy simulators, the dataset can help developers understand how hardware noise masks certain bugs, informing error‑mitigation strategies.
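
Two of the uses above, evaluating a test suite and prioritizing hard faults, reduce to simple bookkeeping once it is known which tests detect which mutants. The sketch below assumes a hypothetical `kill_matrix` that maps each mutant ID to the set of test IDs that detected it; the IDs and survival rates are made up for illustration.

```python
from typing import Dict, List, Set


def mutation_score(kill_matrix: Dict[str, Set[str]]) -> float:
    """Fraction of mutants detected by at least one test."""
    if not kill_matrix:
        return 0.0
    killed = sum(1 for tests in kill_matrix.values() if tests)
    return killed / len(kill_matrix)


def hardest_first(
    kill_matrix: Dict[str, Set[str]],
    survival_rates: Dict[str, float],
) -> List[str]:
    """Undetected mutants, ordered from highest to lowest survival rate."""
    undetected = [m for m, tests in kill_matrix.items() if not tests]
    return sorted(undetected, key=lambda m: survival_rates[m], reverse=True)


# Made-up results for a four-mutant subset and a two-test suite.
kill_matrix = {
    "m1": {"t1", "t2"},
    "m2": set(),
    "m3": {"t2"},
    "m4": set(),
}
survival_rates = {"m1": 0.05, "m2": 0.82, "m3": 0.40, "m4": 0.91}

print(mutation_score(kill_matrix))                 # 0.5
print(hardest_first(kill_matrix, survival_rates))  # ['m4', 'm2']
```

A test generator could work through the `hardest_first` list in order, spending scarce hardware time on the mutants that existing tests miss.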

Limitations & Future Work

Noise model dependency – Survival rates are based on simulated noise; real devices may exhibit different detection characteristics.
Operator scope – While the current mutation operators cover many gate‑level faults, higher‑level logical bugs (e.g., incorrect algorithmic flow) are not represented.
Scalability to larger circuits – Generating mutants for circuits beyond ~30 qubits becomes computationally expensive; future work could explore sampling strategies.
Dynamic updates – The dataset is static; incorporating new quantum languages or emerging gate sets will require periodic maintenance.
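
The paper leaves the sampling strategy for larger circuits as future work, so the following is only one possible direction, not something the authors propose: stratified sampling that keeps a fixed number of mutants per operator type. The `stratified_sample` helper and the dictionary-based mutant records are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List


def stratified_sample(mutants: List[Dict], per_operator: int, seed: int = 0) -> List[Dict]:
    """Keep at most `per_operator` randomly chosen mutants for each operator."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    groups = defaultdict(list)
    for mutant in mutants:
        groups[mutant["operator"]].append(mutant)

    sample = []
    for operator in sorted(groups):  # sorted for a deterministic order
        members = groups[operator]
        k = min(per_operator, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Made-up population: 10 + 3 + 1 mutants across three operator types.
population = (
    [{"id": f"g{i}", "operator": "gate_replacement"} for i in range(10)]
    + [{"id": f"p{i}", "operator": "parameter_perturbation"} for i in range(3)]
    + [{"id": "s0", "operator": "qubit_swap"}]
)

subset = stratified_sample(population, per_operator=2)
print(len(subset))  # 5
```

Sampling per operator rather than uniformly keeps rare mutation types represented even when one operator dominates the population.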

Overall, QMutBench fills a critical gap in quantum software engineering by giving the community a shared, extensible benchmark for testing and improving quantum code.

Authors

Eñaut Mendiluze Usandizaga
Thomas Laurent
Paolo Arcaini
Shaukat Ali

Paper Information

arXiv ID: 2604.15870v1
Categories: cs.SE, cs.DB
Published: April 17, 2026
