[Paper] Auditing Reproducibility in Non-Targeted Analysis: 103 LC/GC--HRMS Tools Reveal Temporal Divergence Between Openness and Operability
Source: arXiv - 2512.20279v1
Overview
The paper audits the reproducibility of non‑targeted analysis (NTA) pipelines that rely on high‑resolution LC/GC‑HRMS data. By evaluating 103 publicly released tools spanning two decades, the authors expose a growing gap between how openly the tools are documented and how easily they can be re‑run by an independent lab—a critical issue when NTA results drive regulatory actions.
Key Contributions
- Comprehensive audit of 103 NTA software tools (2004‑2025) against six reproducibility pillars derived from FAIR and BP4NTA principles.
- Quantitative trends showing that openness (data/code sharing) rose from 56 % to 86 % while operability (portable, validated execution) fell from 55 % to 43 %.
- Sector breakdown revealing health (51 tools), pharma (31) and chemistry (21) contributions, yet no tool targeted food‑safety use‑cases.
- Identification of critical gaps: only 17 % of tools satisfy both laboratory validation (C1) and portable implementation (C6).
- Policy insight: journal data‑sharing mandates improve artifact availability but do not translate into runnable workflows for reviewers or external labs.
Methodology
- Tool collection – The authors harvested all NTA software packages cited in peer‑reviewed literature from 2004 to 2025, ending up with 103 distinct tools.
- Reproducibility pillars – Six criteria were defined:
  - C1 – Laboratory validation (evidence that the tool works on real samples).
  - C2 – Data availability (raw HRMS data shared).
  - C3 – Code availability (source code released).
  - C4 – Standardised formats (use of open, interoperable file types).
  - C5 – Knowledge integration (linking results to external databases/ontologies).
  - C6 – Portable implementation (containerisation, workflow description, or other means to run the tool unchanged elsewhere).
- Scoring – Each tool was manually inspected for compliance with the six pillars, yielding a binary (yes/no) matrix; a minimal scoring sketch follows this list.
- Temporal analysis – The dataset was split into three periods (2004‑2009, 2010‑2015, 2016‑2025) to track evolution.
- Sector analysis – Tools were grouped by the primary research domain (health, pharma, chemistry) to spot domain‑specific patterns.
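The scoring and trend calculations described above reduce to operations on a 103 × 6 binary matrix. The sketch below is not the authors' code; it is a minimal illustration, with hypothetical tool entries, of how per-pillar compliance, per-period compliance, and the C1/C6 overlap could be computed from such a matrix.

```python
from dataclasses import dataclass

# Six pillars, as defined in the audit: validation, data, code, formats, knowledge, portability.
PILLARS = ["C1", "C2", "C3", "C4", "C5", "C6"]
# The three periods used for the temporal analysis.
PERIODS = [("2004-2009", 2004, 2009), ("2010-2015", 2010, 2015), ("2016-2025", 2016, 2025)]

@dataclass
class Tool:
    name: str    # hypothetical tool name, for illustration only
    year: int    # year of first publication
    flags: dict  # pillar -> True/False from manual inspection

def compliance(tools: list, pillar: str) -> float:
    """Fraction of tools satisfying one pillar."""
    return sum(t.flags[pillar] for t in tools) / len(tools)

# Three made-up entries standing in for the 103 audited tools.
audit = [
    Tool("toolA", 2006, {"C1": False, "C2": True, "C3": True,  "C4": False, "C5": False, "C6": True}),
    Tool("toolB", 2019, {"C1": True,  "C2": True, "C3": True,  "C4": True,  "C5": True,  "C6": False}),
    Tool("toolC", 2023, {"C1": False, "C2": True, "C3": False, "C4": True,  "C5": False, "C6": False}),
]

# Overall compliance per pillar.
for pillar in PILLARS:
    print(f"{pillar}: {compliance(audit, pillar):.0%} overall")

# Compliance per period (temporal analysis).
for label, lo, hi in PERIODS:
    subset = [t for t in audit if lo <= t.year <= hi]
    if subset:
        print(label, {p: f"{compliance(subset, p):.0%}" for p in PILLARS})

# Joint C1 AND C6 compliance -- the validation/portability overlap the audit highlights.
both = [t for t in audit if t.flags["C1"] and t.flags["C6"]]
print(f"C1 & C6: {len(both)}/{len(audit)} tools")
```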
Results & Findings
| Pillar | Overall compliance | 2004‑2009 | 2016‑2025 |
|---|---|---|---|
| C1 (validation) | 22 % | 15 % | 28 % |
| C2 (data) | 87 % | 56 % | 86 % |
| C3 (code) | 73 % | 61 % | 78 % |
| C4 (formats) | 48 % | 34 % | 55 % |
| C5 (knowledge) | 39 % | 22 % | 45 % |
| C6 (portable) | 39 % | 55 % | 43 % |
- Openness surges: data sharing (C2) climbs from 56 % to 86 % and code sharing (C3) from 61 % to 78 % over the study period.
- Operability declines: portable implementations (C6) dropped from a majority (55 %) to under half (43 %).
- Validation‑portability mismatch: only 18 tools (≈ 17 %) satisfy both C1 and C6, the combination most needed for true reproducibility.
- Domain disparity: health‑focused tools are more likely to share data, while chemistry tools show higher rates of standardised formats.
- Policy impact: journals that enforce data‑sharing policies improve C2 scores but have negligible effect on C6, indicating that “available” ≠ “executable”.
Practical Implications
- For developers: Packaging NTA pipelines in containers (Docker/Singularity) or workflow languages (CWL, Nextflow) is now a competitive differentiator; tools that are easy to spin up will be preferred for regulatory or collaborative projects (a minimal portability check is sketched after this list).
- For labs facing audits: The audit highlights that simply publishing raw spectra is insufficient; labs must also provide validated, reproducible workflows to satisfy regulators.
- For tool vendors: Investing in standardised data models (e.g., mzML, nmrML) and ontology integration (e.g., ChemOnt, FoodOn) can boost C4 and C5 compliance, making tools more attractive for downstream data‑fusion platforms (see the mzML reading example after this list).
- For policy makers: The findings argue for extending journal mandates beyond data availability to include “executable supplement” requirements, akin to the ACM’s artifact evaluation badges.
- For industry: Companies building NTA solutions (e.g., food safety, environmental monitoring) should treat reproducibility as a product feature—portable, validated pipelines reduce time‑to‑decision when unexpected contaminants appear, as in the melamine crisis.
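To make the developer point concrete, here is a hypothetical heuristic (not part of the paper's audit protocol) that scans a tool's repository for common portability artifacts; the file names and patterns checked are conventional assumptions, not a checklist prescribed by the authors.

```python
from pathlib import Path
import sys

# Conventional artifact names that suggest a portable, C6-style setup.
# These patterns are illustrative assumptions, not the paper's criteria.
PORTABILITY_PATTERNS = {
    "container":   ["Dockerfile", "Singularity", "Singularity.def", "*.sif"],
    "environment": ["environment.yml", "environment.yaml", "requirements.txt", "pyproject.toml"],
    "workflow":    ["*.cwl", "*.nf", "nextflow.config", "Snakefile"],
}

def portability_report(repo: Path) -> dict:
    """Report which categories of portability artifacts are present in a repository."""
    report = {}
    for category, patterns in PORTABILITY_PATTERNS.items():
        hits = []
        for pattern in patterns:
            hits.extend(str(p) for p in repo.rglob(pattern))
        report[category] = hits
    return report

if __name__ == "__main__":
    repo = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    for category, hits in portability_report(repo).items():
        status = "found" if hits else "missing"
        print(f"{category:12s} {status}: {hits[:3]}")
```

Finding such files is only a proxy for operability; as the audit's C1/C6 gap shows, the container or workflow still has to be executed and validated on reference data by an independent lab.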
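As a small illustration of the open-format point, the sketch below reads spectra from an mzML file with the pyteomics library (an assumption; any mzML reader would do) and prints basic per-spectrum information. This is generic example code, not tooling from the paper.

```python
# Minimal mzML reading example; assumes `pip install pyteomics`.
# Open, vendor-neutral formats such as mzML are what pillar C4 (standardised formats) rewards.
from pyteomics import mzml

def summarize_mzml(path: str, limit: int = 5) -> None:
    """Print MS level, peak count, and m/z range for the first few spectra in an mzML file."""
    with mzml.read(path) as reader:
        for i, spectrum in enumerate(reader):
            if i >= limit:
                break
            mz = spectrum.get("m/z array")
            level = spectrum.get("ms level")
            if mz is None or len(mz) == 0:
                continue
            print(f"spectrum {i}: MS{level}, {len(mz)} peaks, "
                  f"m/z {mz.min():.4f} to {mz.max():.4f}")

if __name__ == "__main__":
    summarize_mzml("example.mzML")  # placeholder path; supply your own file
```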
Limitations & Future Work
- Scope limited to published tools – many proprietary or in‑house pipelines were not captured, possibly skewing the openness/operability balance.
- Binary scoring – the audit treats each pillar as a yes/no flag, which may overlook nuanced levels of compliance (e.g., partial containerisation).
- Domain‑specific needs – the study notes the absence of food‑safety‑oriented tools; future work could focus on building or evaluating pipelines tailored to that sector.
- Longitudinal impact – while trends are identified, the causal effect of specific policies (e.g., journal mandates) remains correlational; controlled studies could clarify which interventions most improve operability.
Bottom line: The paper shines a light on a paradox in the NTA community—tools are increasingly open, yet they are becoming harder to run elsewhere. Bridging that gap with validated, portable workflows will be essential for turning sophisticated mass‑spectrometry analyses into reliable, regulatory‑ready evidence.
Authors
- Sarah Alsubaie
- Sakhaa Alsaedi
- Xin Gao
Paper Information
- arXiv ID: 2512.20279v1
- Categories: cs.CE, cs.SE
- Published: December 23, 2025