[Paper] Mitigating Artifacts in Pre-quantization Based Scientific Data Compressors with Quantization-aware Interpolation
Source: arXiv - 2602.20097v1
Overview
The paper tackles a subtle but important problem in high‑throughput scientific data compression: artifacts that appear when using pre‑quantization techniques. While pre‑quantization enables massive parallelism and ultra‑fast compression throughput, it can degrade the fidelity of the reconstructed data, especially when users allow relatively large error bounds. The authors introduce a quantization‑aware interpolation method that cleans up these artifacts without sacrificing the speed that makes pre‑quantization attractive.
Key Contributions
- Artifact Characterization: A systematic analysis of how the quantization index (the integer code emitted by the compressor) correlates with reconstruction error, revealing the root causes of visual and statistical artifacts.
- Quantization‑Aware Interpolation (QAI): A novel post‑processing algorithm that leverages the known quantization intervals to interpolate more accurate values during decompression.
- Scalable Parallel Implementation: QAI is engineered for both shared‑memory (multithreaded) and distributed‑memory (MPI) environments, preserving the high throughput of the underlying compressors.
- Empirical Validation: Experiments on five real‑world scientific datasets (e.g., climate, astrophysics, CFD) using two state‑of‑the‑art pre‑quantization compressors show significant quality gains while keeping compression speed virtually unchanged.
Methodology
- Artifact Diagnosis:
  - The authors instrument existing pre‑quantization compressors to record the quantization index for each data point.
  - By comparing the original values, quantized values, and final decompressed values, they map error patterns to specific index transitions (e.g., sudden jumps at block boundaries).
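The diagnosis loop can be illustrated with a minimal sketch. This is not the authors' instrumentation code: `quantize` here is a generic uniform scalar quantizer with bin width `2*eb` (a common convention in error-bounded compressors), and the jump flag is a simplified stand-in for the index-transition analysis the paper describes.

```python
def quantize(x, eb):
    """Uniform scalar quantization with error bound eb (bin width 2*eb)."""
    return round(x / (2 * eb))

def dequantize(q, eb):
    """Reconstruct the center of the quantization interval for index q."""
    return q * 2 * eb

def diagnose(values, eb):
    """Record (index, reconstruction error, index-jump flag) per point.

    Large jumps between neighboring indices are where step-wise
    artifacts tend to appear in the decompressed field.
    """
    records = []
    prev_q = None
    for x in values:
        q = quantize(x, eb)
        err = x - dequantize(q, eb)
        jump = prev_q is not None and abs(q - prev_q) > 1
        records.append((q, err, jump))
        prev_q = q
    return records

# A small 1-D signal with a sharp transition between smooth regions.
recs = diagnose([0.0, 0.01, 0.02, 0.5, 0.51], eb=0.05)
```

Every per-point error stays within the bound `eb`, and the sharp transition shows up as an index jump, which is exactly the correlation the paper exploits.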
- Design of QAI:
  - Quantization Awareness: Instead of treating the compressed integer as a black box, QAI reconstructs the interval each index represents (e.g., [v_i, v_i + Δ]).
  - Local Interpolation: For each point, QAI examines neighboring indices and performs a weighted interpolation that respects the quantization intervals, effectively "smoothing" the step‑wise artifacts.
  - Boundary Handling: Special logic ensures that edges of blocks or irregular data shapes are treated correctly, avoiding over‑smoothing.
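A minimal 1-D sketch of the core idea, under the assumption (not spelled out in this summary) that each decoded value is the center of its quantization interval: interpolate from neighbors, then clamp the result back into the point's own interval so the user's error bound is never violated. The weights and the function name `qai_smooth_1d` are illustrative, not the paper's algorithm.

```python
def qai_smooth_1d(decoded, eb):
    """Quantization-aware smoothing of a 1-D decoded signal.

    Each interior point is replaced by a weighted average of itself
    and its immediate neighbors, then clamped to that point's
    quantization interval [v - eb, v + eb] so the original
    error-bound guarantee still holds after post-processing.
    """
    out = list(decoded)
    for i in range(1, len(decoded) - 1):
        # Weighted local interpolation from immediate neighbors.
        interp = 0.5 * decoded[i] + 0.25 * (decoded[i - 1] + decoded[i + 1])
        # Clamp into this point's quantization interval.
        lo, hi = decoded[i] - eb, decoded[i] + eb
        out[i] = min(max(interp, lo), hi)
    return out
```

The clamp is the "quantization awareness": ordinary smoothing could push a value outside its interval and silently break the error bound, whereas this version smooths the staircase only as far as the interval allows.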
- Parallelization Strategy:
  - Shared‑Memory: The data grid is split into tiles; each thread processes a tile independently, using thread‑local buffers for neighbor access.
  - Distributed‑Memory: The global dataset is partitioned across MPI ranks; halo exchanges provide the necessary neighbor information across rank boundaries before QAI runs locally.
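The tiling-with-halo pattern behind both variants can be sketched in one dimension. This is a generic illustration of the idea, not the paper's implementation: each tile is padded with `halo` neighbor elements so the interpolation's neighbor reads never leave the tile (in the MPI case, the padding would arrive via a halo exchange rather than a slice).

```python
def tiles_with_halo(data, tile_size, halo=1):
    """Split a 1-D array into tiles, each padded with `halo` neighbor
    elements on both sides, so a neighbor-based pass like QAI can run
    on every tile independently (thread-local or rank-local).

    Returns a list of (tile_start_index, padded_slice) pairs.
    """
    tiles = []
    n = len(data)
    for start in range(0, n, tile_size):
        lo = max(0, start - halo)          # left halo (clipped at the edge)
        hi = min(n, start + tile_size + halo)  # right halo (clipped at the edge)
        tiles.append((start, data[lo:hi]))
    return tiles
```

With a halo of one element per side, a stencil that reads immediate neighbors produces identical results whether it runs on the whole array or tile by tile, which is why the authors can parallelize without changing the output.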
- Integration with Existing Compressors: QAI sits after the decompression stage, requiring only the quantization index stream (already output by the compressor) and the original error bound. No changes to the compression pipeline are needed.
Results & Findings
| Compressor | Dataset | Error Bound | Baseline PSNR* | QAI‑enhanced PSNR | Throughput (GB/s) |
|---|---|---|---|---|---|
| SZ‑preq | Climate (2 TB) | 1e‑3 | 38.2 dB | 44.7 dB (+6.5 dB) | 12.3 |
| ZFP‑preq | Astrophysics (1.5 TB) | 5e‑4 | 35.8 dB | 41.2 dB (+5.4 dB) | 10.9 |
*Peak Signal‑to‑Noise Ratio, higher = better quality.
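For reference, PSNR for floating-point scientific data is typically computed against the data's value range. A minimal sketch (the paper's exact convention is an assumption here):

```python
import math

def psnr(original, reconstructed):
    """PSNR in dB: 20*log10(value range) - 10*log10(MSE).

    Uses the original data's value range as the peak, a common
    convention for floating-point scientific data.
    """
    n = len(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n
    vrange = max(original) - min(original)
    if mse == 0:
        return float('inf')  # perfect reconstruction
    return 20 * math.log10(vrange) - 10 * math.log10(mse)
```

Because PSNR is logarithmic, the reported +4 to +7 dB gains correspond to roughly a 1.6x to 2.2x reduction in root-mean-square error.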
- Quality Boost: Across all five datasets, QAI raised PSNR by 4–7 dB, dramatically reducing visual streaks and statistical bias.
- Negligible Overhead: The added interpolation step incurred < 5 % runtime overhead, preserving the > 10 GB/s compression speeds typical of pre‑quantization compressors.
- Scalability: Strong‑scaling tests up to 256 cores (shared‑memory) and 1024 MPI ranks (distributed) showed near‑linear speedup, confirming that QAI does not become a bottleneck at scale.
Practical Implications
- Higher‑Fidelity In‑Situ Analytics: Scientists can now run lossy compression on the fly with larger error bounds (to meet storage constraints) while still trusting downstream analysis results.
- Plug‑and‑Play Upgrade: Since QAI works as a post‑processing layer, existing HPC workflows that already use SZ‑preq, ZFP‑preq, or similar compressors can adopt it with minimal code changes.
- Reduced Storage Costs: Better reconstruction quality means that the same storage budget can accommodate more data or longer simulation runs without sacrificing scientific insight.
- Edge‑Computing & IoT: The algorithm’s low overhead makes it attractive for bandwidth‑limited sensors that need fast, lossy compression before transmission—e.g., satellite telemetry or large‑scale environmental sensor networks.
Limitations & Future Work
- Error‑Bound Dependency: QAI’s benefits diminish when the user specifies very tight error bounds (≤ 10⁻⁵), where the baseline compressor already yields high fidelity.
- Memory Footprint: The interpolation step requires temporary neighbor buffers; on extremely memory‑constrained nodes this could be a concern.
- Extension to Non‑Uniform Grids: The current implementation assumes regular grid topology; handling adaptive meshes or unstructured data remains an open challenge.
- Adaptive Interpolation Strategies: Future work could explore machine‑learning‑guided weighting schemes that automatically tune interpolation based on local data variability.
Takeaway: By making the decompression step aware of the quantization intervals, the authors deliver a lightweight yet powerful fix for a long‑standing quality issue in pre‑quantization compressors—opening the door for faster, higher‑quality data reduction in today’s data‑intensive scientific computing.
Authors
- Pu Jiao
- Sheng Di
- Jiannan Tian
- Mingze Xia
- Xuan Wu
- Yang Zhang
- Xin Liang
- Franck Cappello
Paper Information
- arXiv ID: 2602.20097v1
- Categories: cs.DC
- Published: February 23, 2026