[Paper] CASR: A Robust Cyclic Framework for Arbitrary Large-Scale Super-Resolution with Distribution Alignment and Self-Similarity Awareness
Source: arXiv - 2602.22159v1
Overview
The paper introduces CASR, a cyclic super‑resolution (SR) framework that can upscale images to any factor with a single model. By treating extreme up‑scaling as a series of small, in‑distribution steps, CASR sharply reduces the noise, blur, and artifacts that typically appear when the target scale falls outside the training range.
Key Contributions
- Cyclic SR formulation – Re‑expresses arbitrary‑scale up‑sampling as a chain of modest, in‑distribution magnifications, enabling stable inference with one network.
- Structural Distribution Alignment Module (SDAM) – Uses super‑pixel aggregation to align feature distributions across iterations, preventing drift and error accumulation.
- Self‑Similarity Aware Restoration Module (SARM) – Enforces autocorrelation constraints and injects low‑resolution (LR) self‑similarity priors to recover high‑frequency textures.
- Single‑model, multi‑scale solution – No need for separate models or scale‑specific finetuning; the same weights handle 2×, 4×, 16×, or even 64× up‑scaling.
- State‑of‑the‑art performance on standard benchmarks, especially at extreme magnifications where prior methods collapse.
Methodology
Cyclic Upscaling Loop
- Instead of a one‑shot jump from LR to the desired HR size, the image is repeatedly passed through the SR network, each time enlarging it by a modest factor (e.g., 1.5×–2×).
- This keeps every intermediate output within the distribution the model was trained on, avoiding the “out‑of‑distribution” shock that causes artifacts.
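The loop above can be sketched as follows. `plan_cycles` and `cyclic_upscale` are hypothetical helpers (not from the paper's code) that only track spatial size, assuming the single SR network is applied once per cycle with equal per‑cycle factors capped at 2×:

```python
import math

def plan_cycles(target_scale, max_step=2.0):
    """Split an arbitrary target scale into equal, modest per-cycle
    factors, each <= max_step, so every intermediate output stays near
    the scales seen during training. Hypothetical helper, not the
    authors' released code."""
    # small epsilon guards against floating-point log round-up
    n = max(1, math.ceil(math.log(target_scale, max_step) - 1e-9))
    return [target_scale ** (1.0 / n)] * n

def cyclic_upscale(size_hw, target_scale):
    """Run the (stand-in) SR step once per planned cycle; here we only
    track the growing spatial size to illustrate the loop."""
    h, w = size_hw
    for s in plan_cycles(target_scale):
        h, w = round(h * s), round(w * s)  # one pass of the single SR network
    return h, w

print(plan_cycles(16))            # four 2x cycles for a 16x target
print(cyclic_upscale((32, 32), 16))
```

For instance, a 16× target decomposes into four 2× cycles, so a 32×32 input reaches 512×512 without any single step leaving the training distribution.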
Structural Distribution Alignment Module (SDAM)
- The feature map from the current iteration is segmented into super‑pixels (coherent regions).
- Statistics (mean, variance) of each super‑pixel are aligned to those of the previous iteration, effectively “re‑centering” the distribution and stopping drift.
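A minimal single‑channel sketch of this re‑centering step, assuming super‑pixel labels are already computed and the previous iteration's statistics are passed in as plain dicts (`align_superpixel_stats` is an illustrative name, not the paper's API):

```python
import numpy as np

def align_superpixel_stats(feat, labels, ref_mean, ref_std, eps=1e-6):
    """Re-center each super-pixel region of `feat` to reference statistics.

    feat:    (H, W) feature channel
    labels:  (H, W) integer super-pixel ids
    ref_mean, ref_std: dicts mapping id -> statistics from the previous
    cycle. Simplified sketch of the SDAM alignment idea.
    """
    out = feat.astype(np.float64).copy()
    for sp in np.unique(labels):
        mask = labels == sp
        mu, sd = out[mask].mean(), out[mask].std()
        # normalize the region, then re-scale/shift to the reference stats
        out[mask] = (out[mask] - mu) / (sd + eps) * ref_std[sp] + ref_mean[sp]
    return out
```

Matching each region's mean and variance to the previous cycle keeps the feature distribution from drifting as cycles accumulate.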
Self‑Similarity Aware Restoration Module (SARM)
- Computes an autocorrelation map of the LR input to capture repeating patterns (textures, edges).
- During each up‑sampling step, SARM injects these self‑similarity cues back into the feature space, encouraging the network to reproduce realistic high‑frequency details rather than hallucinating noise.
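The autocorrelation map itself can be computed cheaply via the FFT (Wiener–Khinchin theorem); peaks away from the origin reveal repeating textures. This is an illustrative computation, not the paper's exact module:

```python
import numpy as np

def autocorrelation_map(lr, eps=1e-8):
    """Normalized circular autocorrelation of an LR image via the FFT.

    Peaks at non-zero lags indicate repeating patterns (textures, edges)
    that a SARM-style module could inject as self-similarity priors.
    """
    x = lr - lr.mean()                       # remove DC so flat areas score 0
    f = np.fft.fft2(x)
    ac = np.fft.ifft2(f * np.conj(f)).real   # Wiener-Khinchin: |F|^2 -> AC
    peak = ac.flat[0]                        # zero-lag value before shifting
    return np.fft.fftshift(ac) / (peak + eps)  # center zero lag, normalize to 1
```

On a perfectly periodic stripe pattern, the map is 1 at the zero lag and again at every multiple of the stripe period, which is exactly the repetition cue SARM exploits.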
Training
- The network is trained on a conventional range of scales (e.g., 1×–4×).
- Losses combine pixel‑wise L1/L2, perceptual (VGG) loss, and a novel distribution‑alignment loss that penalizes divergence between successive super‑pixel statistics.
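The alignment term can be sketched as below. The paper's exact divergence is not reproduced here, so squared differences between successive super‑pixel statistics stand in for it, and the perceptual (VGG) term is omitted to keep the sketch dependency‑free:

```python
import numpy as np

def distribution_alignment_loss(stats_prev, stats_curr):
    """Penalize drift between successive cycles' super-pixel statistics.

    stats_*: dict mapping super-pixel id -> (mean, variance).
    Squared differences are an assumed stand-in for the paper's
    divergence term.
    """
    loss = 0.0
    for sp in stats_prev:
        m0, v0 = stats_prev[sp]
        m1, v1 = stats_curr[sp]
        loss += (m1 - m0) ** 2 + (v1 - v0) ** 2
    return loss / max(len(stats_prev), 1)

def total_loss(pred, target, stats_prev, stats_curr, w_pix=1.0, w_align=0.1):
    """Pixel-wise L1 plus the alignment term (weights are illustrative)."""
    l1 = np.abs(pred - target).mean()
    return w_pix * l1 + w_align * distribution_alignment_loss(stats_prev, stats_curr)
```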
Results & Findings
| Method (×4 SR) | PSNR ↑ | SSIM ↑ |
|---|---|---|
| CASR (single model) | 31.8 dB | 0.894 |
| Prior art (multi‑model) | 30.5 dB | 0.877 |
| Baseline cyclic (no SDAM/SARM) | 30.9 dB | 0.882 |
- Extreme scales (×16, ×32, ×64): CASR retains visual fidelity, while competing methods exhibit severe blur and ringing.
- Distribution drift measured by KL‑divergence between successive iterations drops by ~45 % thanks to SDAM.
- Texture consistency (measured via autocorrelation similarity) improves by ~20 % with SARM, confirming that self‑similarity priors are effectively leveraged.
Qualitative examples show clean edge reconstruction and plausible fine‑grained patterns (e.g., fabric weave, foliage) even at 64× magnification.
Practical Implications
- Single‑model deployment – Developers can ship one lightweight SR service that handles any client‑requested zoom level, simplifying CI/CD pipelines and reducing memory footprints.
- Real‑time streaming & VR – The cyclic approach can be throttled adaptively: fewer iterations for low‑latency scenarios, more for high‑quality offline rendering.
- Legacy image restoration – Archivists can upscale historical photos to very high resolutions without training a bespoke model for each target scale.
- Edge devices – Because each iteration works on a modest up‑scale factor, the per‑step compute stays bounded, making it feasible to run on mobile GPUs or NPUs with progressive refinement.
Limitations & Future Work
- Inference latency grows linearly with the number of cycles; extremely high magnifications still require many passes, which may be prohibitive for ultra‑low‑latency use cases.
- The current SDAM relies on super‑pixel segmentation, which adds a preprocessing overhead and may struggle with highly textured or noisy inputs.
- Authors note that extending the framework to video SR (temporal consistency) and exploring learned adaptive cycle lengths are promising directions for follow‑up research.
Authors
- Wenhao Guo
- Zhaoran Zhao
- Peng Lu
- Sheng Li
- Qian Qiao
- RuiDe Li
Paper Information
- arXiv ID: 2602.22159v1
- Categories: cs.CV
- Published: February 25, 2026