[Paper] Enhancing Hazy Wildlife Imagery: AnimalHaze3k and IncepDehazeGan

Published: April 17, 2026 at 01:46 PM EDT
4 min read
Source: arXiv


Overview

The paper tackles a practical problem that many wildlife‑monitoring projects face: dense atmospheric haze that makes animal photos look murky and hampers computer‑vision pipelines. The authors introduce a new synthetic hazy dataset (AnimalHaze3k) and a GAN‑based de‑hazing model (IncepDehazeGan) that together boost the quality of hazy wildlife images and dramatically improve downstream detection performance.

Key Contributions

  • AnimalHaze3k dataset – 3,477 synthetically hazed wildlife images derived from 1,159 clean photos using a physics‑based haze model, filling a gap in publicly available data for this niche.
  • IncepDehazeGan architecture – A novel GAN that fuses Inception blocks with residual skip connections, tailored for de‑hazing natural‑scene wildlife imagery.
  • State‑of‑the‑art restoration metrics – Achieves SSIM = 0.8914, PSNR = 20.54 dB, and LPIPS = 0.1104, outperforming prior de‑hazing methods by +6.27 % SSIM and +10.2 % PSNR.
  • Downstream impact demonstration – When de‑hazed images are fed to YOLOv11, detection mAP more than doubles (↑112 %) and IoU rises by 67 %, proving the practical value of the restoration step.
  • Open‑source release – Code, pretrained weights, and the AnimalHaze3k dataset are made publicly available, encouraging reproducibility and community extensions.

Methodology

  1. Synthetic haze generation – Starting from high‑resolution, haze‑free wildlife photos, the authors apply a physically grounded atmospheric scattering model (Koschmieder’s law) to simulate varying haze densities, illumination, and depth cues. This yields a paired dataset of clean ↔ hazy images.
  2. IncepDehazeGan design
    • Generator: Stacks of Inception modules capture multi‑scale texture details, while residual skip connections preserve low‑level structure and ease gradient flow.
    • Discriminator: A PatchGAN classifier judges realism at the patch level, encouraging the generator to produce locally consistent de‑hazed textures.
    • Losses: Combined adversarial loss, L1 reconstruction loss, and perceptual loss (VGG‑based) to balance pixel fidelity and visual quality.
  3. Training pipeline – The model is trained on AnimalHaze3k with standard data augmentations (random crops, flips) and Adam optimizer. Validation uses both quantitative metrics (SSIM, PSNR, LPIPS) and a downstream object‑detection benchmark (YOLOv11 on de‑hazed vs. hazy inputs).
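Step 1's atmospheric scattering model can be sketched in a few lines. The standard form of Koschmieder's law is I(x) = J(x)·t(x) + A·(1 − t(x)) with transmission t(x) = exp(−β·d(x)); the function name, parameter ranges, and defaults below are illustrative, not the paper's exact configuration:

```python
import numpy as np

def synthesize_haze(clean, depth, beta=1.0, atmosphere=0.9):
    """Apply the atmospheric scattering model I = J*t + A*(1 - t),
    with transmission t = exp(-beta * depth) (Koschmieder's law).

    clean      : haze-free image in [0, 1], shape (H, W, 3)
    depth      : per-pixel scene depth, shape (H, W)
    beta       : scattering coefficient (larger = denser haze)
    atmosphere : global atmospheric light A in [0, 1]
    """
    t = np.exp(-beta * depth)[..., np.newaxis]   # transmission map
    hazy = clean * t + atmosphere * (1.0 - t)    # attenuation + airlight
    return np.clip(hazy, 0.0, 1.0)
```

Sampling `beta` from a range per image is one simple way to produce the varying haze densities the authors describe.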
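The generator's combined objective from step 2 is a weighted sum of the three terms. The weights below are illustrative (not taken from the paper), and the perceptual term uses a caller-supplied feature extractor as a stand-in for a frozen VGG network:

```python
import numpy as np

def generator_loss(fake_patch_scores, restored, clean, features,
                   w_adv=1.0, w_l1=100.0, w_perc=10.0):
    """Weighted sum of the three generator objectives described above.

    fake_patch_scores : PatchGAN discriminator outputs in (0, 1)
                        for patches of the restored image
    restored, clean   : images in [0, 1]
    features          : callable mapping an image to a feature map
                        (stand-in for frozen VGG features)
    """
    eps = 1e-8
    # Adversarial term: push patch scores toward "real" (1).
    adv = -np.mean(np.log(fake_patch_scores + eps))
    # L1 reconstruction term: per-pixel fidelity.
    l1 = np.mean(np.abs(restored - clean))
    # Perceptual term: distance in feature space.
    perc = np.mean((features(restored) - features(clean)) ** 2)
    return w_adv * adv + w_l1 * l1 + w_perc * perc
```

In practice the large L1 weight keeps the output anchored to the ground truth while the adversarial and perceptual terms sharpen texture.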

Results & Findings

| Metric | IncepDehazeGan | Best Prior Method |
| --- | --- | --- |
| SSIM | 0.8914 (+6.27 %) | 0.8392 |
| PSNR (dB) | 20.54 (+10.2 %) | 18.66 |
| LPIPS (lower is better) | 0.1104 | 0.1587 |
  • Visual quality: Side‑by‑side comparisons show sharper animal outlines, restored color contrast, and reduced halo artifacts.
  • Detection boost: Feeding de‑hazed frames to YOLOv11 lifts mean Average Precision from 0.42 (hazy) to 0.89 (de‑hazed) and IoU from 0.48 to 0.80.
  • Generalization: The model retains performance on real‑world hazy wildlife footage not seen during training, indicating the synthetic data captures essential haze characteristics.
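For reference, the PSNR figures above follow the standard definition (10·log₁₀(MAX²/MSE)); this is a minimal sketch of that formula, not code from the paper:

```python
import numpy as np

def psnr(reference, restored, max_val=1.0):
    """Peak signal-to-noise ratio in dB between a clean reference
    image and a restored (de-hazed) image, both in [0, max_val]."""
    mse = np.mean((reference.astype(np.float64)
                   - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

SSIM and LPIPS are more involved (windowed statistics and a learned network, respectively); libraries such as scikit-image and the `lpips` package are the usual tools for those.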

Practical Implications

  • Field deployments: Conservation drones or camera traps operating in foggy or smoky environments can run IncepDehazeGan on‑device (lightweight inference ~30 ms on an NVIDIA Jetson Nano) to clean images before analysis.
  • Improved analytics pipelines: Cleaner frames translate to higher detection recall, more reliable animal counts, and better behavior tracking—critical for population monitoring and anti‑poaching alerts.
  • Cross‑domain utility: The architecture is agnostic to the subject matter, so developers can adapt it for other outdoor domains (e.g., autonomous driving in haze, aerial surveying).
  • Data augmentation: AnimalHaze3k can serve as a benchmark for training robust models that are resilient to atmospheric disturbances, reducing the need for costly field data collection under adverse weather.
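When vetting a model for on-device deployment, per-frame latency claims like the ~30 ms figure above are typically verified with a warm-up-then-average benchmark. A minimal sketch, using a dummy callable in place of the real de-hazing model:

```python
import time
import numpy as np

def measure_latency(infer, frame, warmup=5, runs=50):
    """Average per-frame latency (ms) of an inference callable.
    Warm-up runs are excluded so lazy initialization and cache
    effects do not skew the timing."""
    for _ in range(warmup):
        infer(frame)
    start = time.perf_counter()
    for _ in range(runs):
        infer(frame)
    return (time.perf_counter() - start) / runs * 1e3

# Stand-in for a real model's forward pass (hypothetical):
dummy_infer = lambda x: np.clip(x * 1.1, 0.0, 1.0)
frame = np.random.rand(512, 512, 3).astype(np.float32)
latency_ms = measure_latency(dummy_infer, frame)
```

On accelerators, an additional device-synchronization call after each inference is needed for honest timings.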

Limitations & Future Work

  • Synthetic vs. real haze: Although the physics‑based generator mimics many haze properties, subtle real‑world factors (e.g., aerosol composition, dynamic lighting) may still cause a domain gap.
  • Resolution ceiling: The current model is trained on 512×512 patches; scaling to ultra‑high‑resolution imagery may require architectural tweaks or memory‑efficient training tricks.
  • Temporal consistency: For video streams, frame‑wise de‑hazing can introduce flicker; extending the model with temporal loss or recurrent modules is a promising direction.
  • Broader ecological validation: The paper evaluates detection on a single YOLO version; testing across diverse species detectors and multi‑species datasets would solidify the claimed conservation impact.
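The temporal-consistency direction mentioned above is often realized as an extra penalty between consecutive outputs. This simplified sketch omits the optical-flow warping a full implementation would apply to align frames, so it only penalizes flicker in near-static scenes:

```python
import numpy as np

def temporal_consistency_loss(dehazed_prev, dehazed_curr):
    """Mean absolute difference between consecutive de-hazed frames.
    A full version would first warp dehazed_prev to the current frame
    with optical flow before comparing; this static variant is only
    meaningful when camera and subjects are nearly still."""
    return np.mean(np.abs(dehazed_curr - dehazed_prev))
```

Added to the generator objective with a small weight, such a term discourages frame-to-frame flicker in video streams.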

Overall, the combination of a purpose‑built dataset and a high‑performing de‑hazing GAN offers developers a ready‑to‑use toolset for tackling haze‑related visual challenges in wildlife monitoring and beyond.

Authors

  • Shivarth Rai
  • Tejeswar Pokuri

Paper Information

  • arXiv ID: 2604.16284v1
  • Categories: cs.CV
  • Published: April 17, 2026
