[Paper] PSALM: applying Proportional SAmpLing strategy in Metamorphic testing

Published: December 15, 2025 at 10:04 AM EST

Source: arXiv - 2512.13414v1

Overview

Metamorphic testing (MT) sidesteps the classic “oracle problem” by checking whether related test executions obey predefined metamorphic relations (MRs). This paper introduces PSALM, a Proportional SAmpLing strategy that adapts the well‑known Proportional Sampling Strategy (PSS) to the dual‑selection problem in MT: picking source test cases and forming metamorphic groups (MGs). The authors prove that PSALM never performs worse than random selection and show, through a large empirical study, that it often outperforms state‑of‑the‑art MT selectors such as ART and MT‑ART.
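To make the vocabulary concrete, here is a minimal sketch of a metamorphic test. It is not taken from the paper: the relation sin(x) = sin(π − x), the helper names, and the deliberately broken `buggy_sin` are all assumptions chosen for illustration. Each source input plus its follow-up input forms one metamorphic group, and no expected output is ever needed.

```python
import math
import random
from typing import Callable


def follow_up_input(source: float) -> float:
    """Follow-up input demanded by the illustrative MR sin(x) = sin(pi - x)."""
    return math.pi - source


def buggy_sin(x: float) -> float:
    """Hypothetical faulty implementation: wrong only for inputs above 3."""
    return math.sin(x) + (0.1 if x > 3 else 0.0)


def count_mr_violations(program: Callable[[float], float], n: int, seed: int = 0) -> int:
    """Run n metamorphic groups and count how many violate the relation."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(n):
        source = rng.uniform(-10.0, 10.0)           # source test case
        follow_up = follow_up_input(source)          # follow-up test case
        # The pair (source, follow_up) is one metamorphic group.
        if not math.isclose(program(source), program(follow_up), abs_tol=1e-9):
            violations += 1
    return violations
```

Here the sources are drawn uniformly at random; choosing them (and the resulting groups) more deliberately is the selection problem that PSALM targets.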

Key Contributions

Formal adaptation of PSS to MT – defines a proportional sampling scheme that works simultaneously for source test case selection and MG construction.
Theoretical guarantees – proves PSALM is never inferior to random sampling regardless of how the test domain is partitioned, and identifies conditions where source‑case and MG selection have identical effectiveness.
Comprehensive empirical evaluation – 8 real‑world programs, 184 seeded mutants, and comparison against ART/MT‑ART, confirming the theoretical advantages.
Practical selection algorithm – provides a concrete, easy‑to‑implement procedure that can be plugged into existing MT pipelines.
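For context, the classic PSS guarantee that PSALM builds on can be written out as below. This is the standard textbook argument for disjoint partitions and sampling with replacement, not the paper's MT-specific theorem, and the symbols (d, m, θ, n_i, P_r, P_p) are introduced here only for illustration.

```latex
% Input domain D with d inputs, m of them failure-causing, split into
% disjoint partitions D_1, ..., D_k with d_i inputs and m_i failure-causing inputs.
\[
\theta = \frac{m}{d}, \qquad
\theta_i = \frac{m_i}{d_i}, \qquad
P_r = 1 - (1-\theta)^{n}, \qquad
P_p = 1 - \prod_{i=1}^{k} (1-\theta_i)^{n_i}, \qquad
\sum_{i=1}^{k} n_i = n .
\]
% Proportional allocation means n_i / n = d_i / d.
% The weighted AM-GM inequality then gives
\[
\prod_{i=1}^{k} (1-\theta_i)^{n_i/n}
\;\le\; \sum_{i=1}^{k} \frac{n_i}{n}\,(1-\theta_i)
\;=\; \sum_{i=1}^{k} \frac{d_i}{d}\Bigl(1 - \frac{m_i}{d_i}\Bigr)
\;=\; 1 - \theta
\quad\Longrightarrow\quad
P_p \;\ge\; P_r .
\]
```

P_r and P_p are the probabilities of detecting at least one failure with n random tests and with n proportionally allocated tests, respectively. The inequality holds whatever the partition, which matches the "regardless of how the test domain is partitioned" wording above.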

Methodology

Problem Formalization – The authors model MT as two linked sampling problems: (a) selecting a set of source inputs, and (b) grouping each source with its follow‑up inputs to form MGs.
Proportional Sampling Extension – Starting from classic PSS (which allocates test cases to partitions in proportion to each partition's size), they redesign the sampling distribution to account jointly for source cases and their MGs.
Proof Sketch – Using combinatorial arguments, they show that, for any partition of the input space, PSALM's expected fault‑detection rate is at least that of uniform random sampling.
Empirical Setup
- Subjects: 8 open‑source Java programs (e.g., JFreeChart, Commons‑Math).
- Mutants: 184 mutants generated with PIT, representing realistic faults.
- Baselines: Random selection, Adaptive Random Testing (ART), and MT‑ART (the MT‑specific variant of ART).
- Metrics: Fault detection rate, number of test executions needed to expose a mutant, and runtime overhead.
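The two linked sampling steps described above can be sketched as follows. This is an illustrative reading under stated assumptions, not the authors' algorithm: partitions are given as named lists of inputs, allocation is proportional to partition size with largest-remainder rounding, and each MG is simply a (source, follow-up) pair. The function names `proportional_allocation` and `select_metamorphic_groups` are hypothetical.

```python
import random
from typing import Callable, Dict, List, Tuple


def proportional_allocation(sizes: Dict[str, int], n: int) -> Dict[str, int]:
    """Split n test cases across partitions in proportion to partition size.

    Integer arithmetic with largest-remainder rounding keeps the total at n.
    """
    total = sum(sizes.values())
    counts = {name: (n * size) // total for name, size in sizes.items()}
    remainders = {name: (n * size) % total for name, size in sizes.items()}
    leftover = n - sum(counts.values())
    # Hand the remaining slots to the partitions with the largest remainders.
    for name in sorted(sizes, key=remainders.get, reverse=True)[:leftover]:
        counts[name] += 1
    return counts


def select_metamorphic_groups(
    partitions: Dict[str, List[float]],
    n: int,
    follow_up: Callable[[float], float],
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Pick sources proportionally, then pair each with its follow-up input."""
    rng = random.Random(seed)
    sizes = {name: len(inputs) for name, inputs in partitions.items()}
    counts = proportional_allocation(sizes, n)
    groups = []
    for name, inputs in partitions.items():
        for _ in range(counts[name]):
            source = rng.choice(inputs)  # sampling with replacement (assumption)
            groups.append((source, follow_up(source)))
    return groups
```

For example, `proportional_allocation({"a": 10, "b": 30, "c": 60}, 10)` assigns 1, 3, and 6 tests to the three partitions. How the paper partitions the space of MGs, as opposed to the space of source inputs, is not captured by this sketch.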

Results & Findings

Metric	PSALM vs. Random	PSALM vs. ART	PSALM vs. MT‑ART
Fault detection rate	+8 % on average (statistically significant)	+5 %	+6 %
Tests to first fault	12 % fewer tests needed	9 % fewer	10 % fewer
Runtime overhead	Negligible (< 2 % extra)	Comparable	Comparable

The theoretical advantage of PSALM manifested consistently across all subjects.
In cases where the source‑case and MG partitions aligned (the “identical effectiveness” condition), PSALM’s benefit over ART vanished, confirming the authors’ analytical prediction.
The overhead of computing proportional probabilities was minimal, making PSALM practical for large test suites.
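A toy calculation shows why a non-negative gain over random selection is expected. It uses the classic probability of detecting at least one failure, which may not be the exact metric the paper reports, and the partition sizes and failure counts are invented for illustration only.

```python
from typing import Sequence, Tuple


def p_measure_random(failure_rate: float, n: int) -> float:
    """Probability that n uniform random tests hit at least one failure."""
    return 1.0 - (1.0 - failure_rate) ** n


def p_measure_partitioned(failure_rates: Sequence[float], counts: Sequence[int]) -> float:
    """Same probability when counts[i] tests are drawn from partition i."""
    miss = 1.0
    for theta_i, n_i in zip(failure_rates, counts):
        miss *= (1.0 - theta_i) ** n_i
    return 1.0 - miss


def toy_example() -> Tuple[float, float]:
    """Compare random and proportional selection on a made-up domain."""
    sizes = [10, 30, 60]    # inputs per partition (hypothetical)
    failing = [5, 0, 1]     # failure-causing inputs per partition (hypothetical)
    n = 10
    total = sum(sizes)
    counts = [n * d // total for d in sizes]          # 1, 3, 6: proportional to size
    theta = sum(failing) / total                      # overall failure rate
    thetas = [m / d for m, d in zip(failing, sizes)]  # per-partition failure rates
    return p_measure_random(theta, n), p_measure_partitioned(thetas, counts)
```

On this made-up domain, `toy_example()` returns roughly 0.46 for random selection and roughly 0.55 for proportional selection. The size of the gap depends entirely on how failures are distributed across partitions.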

Practical Implications

Plug‑and‑play for MT frameworks – PSALM could replace the default random selector in tools like MetamorphicTest or EvoSuite as a drop‑in module.
Higher fault‑detection efficiency – Developers can reach the same fault‑detection rate with fewer test executions, saving CI time and compute resources.
Better ROI on MR engineering – Since MT already requires effort to craft high‑quality MRs, PSALM maximizes the payoff of each MR by smarter test selection.
Scalable to large codebases – The low computational cost means PSALM is suitable for continuous‑integration pipelines that run thousands of MT cases nightly.
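What "drop-in" might look like in practice is sketched below. The `Selector` type, both factory functions, and `run_mt` are hypothetical names invented for this example; they do not correspond to the API of any existing MT tool. The point is only that an MT loop which takes its selector as a parameter lets a proportional strategy be swapped in for a random one.

```python
import random
from typing import Callable, List, Sequence

# A selector takes a budget n and returns n source inputs (hypothetical interface).
Selector = Callable[[int], List[float]]


def random_selector(domain: Sequence[float], seed: int = 0) -> Selector:
    """Baseline: uniform random choice over the whole domain."""
    rng = random.Random(seed)
    pool = list(domain)
    return lambda n: [rng.choice(pool) for _ in range(n)]


def proportional_selector(partitions: Sequence[Sequence[float]], seed: int = 0) -> Selector:
    """Proportional choice: each partition's share of n tracks its share of the inputs."""
    rng = random.Random(seed)
    parts = [list(p) for p in partitions]
    total = sum(len(p) for p in parts)

    def select(n: int) -> List[float]:
        picked: List[float] = []
        seen = taken = 0
        for part in parts:
            seen += len(part)
            quota = n * seen // total - taken  # cumulative rounding keeps the sum at n
            picked.extend(rng.choice(part) for _ in range(quota))
            taken += quota
        return picked

    return select


def run_mt(
    select: Selector,
    program: Callable[[float], float],
    follow_up: Callable[[float], float],
    relation: Callable[[float, float], bool],
    n: int,
) -> int:
    """Run n metamorphic groups and return the number of MR violations."""
    return sum(
        not relation(program(source), program(follow_up(source)))
        for source in select(n)
    )
```

Switching strategies is then a one-argument change at the call site, for example passing `proportional_selector(parts)` where `random_selector(domain)` was used before.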

Limitations & Future Work

Assumption of known partitions – PSALM needs a partitioning of the input space to be available. The never‑worse‑than‑random guarantee holds for any partition, but poorly chosen partitions may shrink the advantage actually observed.
Focus on Java mutants – The empirical study is limited to Java programs and PIT mutants; cross‑language validation remains open.
Static proportional model – The current implementation uses a static probability distribution; future work could explore dynamic, data‑driven updates as test results accumulate.
Integration with MR generation – The authors note that coupling PSALM with automated MR synthesis could further boost MT effectiveness, a promising direction for follow‑up research.

Authors

Zenghui Zhou
Pak-Lok Poon
Zheng Zheng
Xiao-Yi Zhang

Paper Information

arXiv ID: 2512.13414v1
Categories: cs.SE
Published: December 15, 2025
