[Paper] WaveRNet: Wavelet-Guided Frequency Learning for Multi-Source Domain-Generalized Retinal Vessel Segmentation

Published: January 9, 2026 at 11:58 AM EST

Source: arXiv - 2601.05942v1

Overview

Retinal vessel segmentation is a cornerstone of automated eye‑disease screening, but models often degrade on images captured under different lighting, contrast, or camera settings. The WaveRNet paper tackles this “domain shift” problem by combining wavelet‑based frequency analysis with the Segment Anything Model (SAM). According to the authors, the resulting system extracts fine‑grained vessel structures across multiple previously unseen datasets without retraining on the target domain.

Key Contributions

  • Spectral‑guided Domain Modulator (SDM): Combines discrete wavelet decomposition with learnable “domain tokens” to separate illumination‑stable low‑frequency structures from high‑frequency vessel edges, while still allowing domain‑specific feature adaptation.
  • Frequency‑Adaptive Domain Fusion (FADF): At inference time, selects and softly fuses the most relevant source‑domain representations based on wavelet‑derived frequency similarity, enabling test‑time adaptation without re‑training.
  • Hierarchical Mask‑Prompt Refiner (HMPR): A coarse‑to‑fine refinement pipeline that overcomes SAM’s naïve up‑sampling, preserving tiny capillaries through multi‑scale long‑range dependency modeling.
  • Leave‑One‑Domain‑Out (LODO) benchmark: Extensive evaluation on four public retinal datasets shows state‑of‑the‑art generalization, surpassing prior SAM‑based adapters by a sizable margin.
  • Open‑source release: Full code, pretrained weights, and a ready‑to‑run demo are provided on GitHub, lowering the barrier for adoption.
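
The leave‑one‑domain‑out protocol can be illustrated with a short sketch. The dataset names are the four listed in the results table; the `lodo_splits` helper is a hypothetical name used only for illustration and is not taken from the authors' code.

```python
# Dataset names come from the results table; the helper itself is hypothetical.
DOMAINS = ["DRIVE", "STARE", "CHASE_DB1", "HRF"]

def lodo_splits(domains):
    """Yield (training_domains, held_out_domain) pairs, one per domain."""
    for held_out in domains:
        train = [d for d in domains if d != held_out]
        yield train, held_out

for train, held_out in lodo_splits(DOMAINS):
    print(f"train on {train}, evaluate on {held_out}")
```

Each dataset is held out exactly once, so every reported score comes from a domain the model never saw during training.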

Methodology

  1. Wavelet Decomposition: Input retinal images are first split into low‑frequency (approximation) and high‑frequency (detail) sub‑bands using a discrete wavelet transform (DWT). This isolates illumination‑related variations (low‑freq) from vessel edge information (high‑freq).
  2. Spectral‑guided Domain Modulator (SDM):
    • A set of learnable domain tokens is attached to each frequency band.
    • These tokens interact with the DWT coefficients via a lightweight transformer block, producing domain‑modulated feature maps that retain the structure of the original image while being robust to lighting changes.
  3. Frequency‑Adaptive Domain Fusion (FADF):
    • During testing, the system computes the wavelet‑based frequency signature of the incoming image.
    • It then measures similarity to each source‑domain signature and assigns soft weights to the corresponding SDM outputs, effectively “picking” the most relevant knowledge without any gradient updates.
  4. Hierarchical Mask‑Prompt Refiner (HMPR):
    • The coarse vessel mask generated by SAM is fed into a hierarchy of refinement stages.
    • Each stage uses a transformer‑style attention module that aggregates global context and refines the mask at progressively higher resolutions, restoring the fine capillary details lost in SAM’s up‑sampling step.
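
To make step 1 concrete, here is a minimal NumPy sketch of a single‑level 2‑D Haar DWT (the summary states that the current implementation uses a single‑level Haar wavelet). The function name `haar_dwt2`, the sub‑band naming, and the orthonormal scaling are assumptions for illustration; this is not the authors' code.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar DWT of a grayscale image with even height and width.

    Returns the approximation band LL and the detail bands (LH, HL, HH),
    each at half the input resolution. Band naming is a convention chosen here.
    """
    x = np.asarray(img, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2-D array with even height and width")
    s = np.sqrt(2.0)
    # Horizontal pass: scaled sum (low-pass) and difference (high-pass)
    # of adjacent column pairs.
    lo = (x[:, 0::2] + x[:, 1::2]) / s
    hi = (x[:, 0::2] - x[:, 1::2]) / s
    # Vertical pass: the same two filters applied to adjacent row pairs.
    ll = (lo[0::2, :] + lo[1::2, :]) / s
    lh = (lo[0::2, :] - lo[1::2, :]) / s
    hl = (hi[0::2, :] + hi[1::2, :]) / s
    hh = (hi[0::2, :] - hi[1::2, :]) / s
    return ll, (lh, hl, hh)

rng = np.random.default_rng(0)
image = rng.random((8, 8))
ll, details = haar_dwt2(image)
ll_shift, details_shift = haar_dwt2(image + 0.3)  # uniform brightness offset

# The offset leaves every detail band untouched and appears only in LL.
print(all(np.allclose(a, b) for a, b in zip(details, details_shift)))
print(np.allclose(ll_shift - ll, 0.6))
```

Both checks print `True`: a uniform brightness change lands entirely in the approximation band, which is the property the method relies on when it separates illumination from vessel edges. In practice a library routine such as `pywt.dwt2(image, "haar")` from PyWavelets would likely be used instead of a hand‑written transform.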

Training is end‑to‑end on the multi‑source training set, but only SDM and HMPR receive gradient updates; FADF operates purely at inference time.
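
The gradient‑free fusion in step 3 can be sketched as follows. The summary only says that soft weights are derived from wavelet‑based frequency similarity, so everything specific here is an assumption: the signature is taken to be a small vector (for example, the share of energy in each sub‑band), similarity is cosine similarity, and the weights come from a temperature‑scaled softmax. The names `soft_fusion_weights` and `fuse` are hypothetical.

```python
import numpy as np

def soft_fusion_weights(test_sig, source_sigs, temperature=0.1):
    """Softmax over cosine similarities between a test signature (D,)
    and K source-domain signatures (K, D). Assumed similarity measure."""
    t = test_sig / np.linalg.norm(test_sig)
    s = source_sigs / np.linalg.norm(source_sigs, axis=1, keepdims=True)
    logits = (s @ t) / temperature
    logits = logits - logits.max()  # subtract the max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

def fuse(domain_feats, weights):
    """Weighted sum of K domain-specific feature maps: (K, C, H, W) -> (C, H, W)."""
    return np.tensordot(weights, domain_feats, axes=1)

# Hypothetical signatures: share of energy in the LL, LH, HL, HH bands.
source_sigs = np.array([[0.90, 0.04, 0.04, 0.02],
                        [0.70, 0.12, 0.12, 0.06],
                        [0.50, 0.20, 0.20, 0.10]])
test_sig = np.array([0.88, 0.05, 0.05, 0.02])

weights = soft_fusion_weights(test_sig, source_sigs)
feats = np.random.default_rng(0).random((3, 16, 32, 32))  # stand-in for SDM outputs
fused = fuse(feats, weights)
print(weights.round(3), fused.shape)
```

No parameter is updated anywhere in this path, which matches the claim that FADF adapts at test time without gradient steps. The temperature controls how sharply the fusion favors the closest source domain.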

Results & Findings

Dataset (LODO)              Dice ↑   IoU ↑   Avg. # Params
DRIVE (trained on others)   0.923    0.862   45 M
STARE                       0.917    0.854   45 M
CHASE_DB1                   0.911    0.846   45 M
HRF                         0.904    0.839   45 M

  • Improvement over baselines: WaveRNet outperforms the vanilla SAM‑adapter by 3–5 % Dice and reduces the performance drop caused by illumination changes by more than half.
  • Ablation studies: Removing the wavelet branch drops Dice by ~2 %; disabling FADF reduces cross‑domain robustness by ~1.8 %; omitting HMPR leads to a noticeable loss of thin‑vessel recall (≈ 4 % lower).
  • Speed: The added wavelet and transformer modules introduce only ~15 ms overhead per 512×512 image on an RTX 3080, keeping the pipeline well within real‑time limits for clinical screening.
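
For reference, the two metrics in the table follow the standard definitions Dice = 2|A∩B| / (|A| + |B|) and IoU = |A∩B| / |A∪B| for a predicted mask A and a ground‑truth mask B. The sketch below computes both for one pair of binary masks. The summary does not say how scores were aggregated (per image or per dataset), and the `dice_iou` helper and its `eps` smoothing term are assumptions for illustration.

```python
import numpy as np

def dice_iou(pred, target, eps=1e-8):
    """Dice and IoU for two binary masks of the same shape.

    eps avoids division by zero; with it, two empty masks score 0.0,
    whereas some toolkits define that case as 1.0.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum() + eps)
    iou = inter / (union + eps)
    return float(dice), float(iou)

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
print(dice_iou(pred, target))
```

In this toy case the masks share 2 of 4 foreground pixels in the union, so Dice is 2·2/(3+3) ≈ 0.667 and IoU is 2/4 = 0.5.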

Practical Implications

  • Plug‑and‑play for existing pipelines: Developers can wrap WaveRNet around any SAM‑based segmentation service, gaining domain robustness with minimal code changes.
  • Zero‑shot deployment: Hospitals or tele‑ophthalmology platforms can run the model on new camera hardware or lighting conditions without collecting additional labeled data.
  • Fine‑vessel preservation: The HMPR module helps retain tiny capillaries, which are critical for early disease detection, and this can improve downstream diagnostic algorithms (e.g., diabetic retinopathy grading).
  • Generalizable recipe: The wavelet‑guided token modulation and frequency‑adaptive fusion can be transplanted to other medical imaging tasks where illumination or contrast varies (e.g., skin lesion segmentation, endoscopy).

Limitations & Future Work

  • Wavelet choice sensitivity: The current implementation uses a single‑level Haar wavelet; more sophisticated multi‑scale or learned wavelet bases could further boost performance.
  • Domain token scalability: As the number of source domains grows, the token bank may become unwieldy; future work could explore hierarchical token sharing or dynamic token generation.
  • Clinical validation: While benchmark results are strong, prospective studies on real‑world screening workflows are needed to confirm diagnostic impact.
  • Extension beyond retinal images: Adapting the framework to 3‑D modalities (e.g., OCT volumes) will require redesigning the wavelet decomposition and memory‑efficient attention mechanisms.

Authors

  • Chanchan Wang
  • Yuanfang Wang
  • Qing Xu
  • Guanxin Chen

Paper Information

  • arXiv ID: 2601.05942v1
  • Categories: cs.CV
  • Published: January 9, 2026