[Paper] Detection Fire in Camera RGB-NIR
Source: arXiv - 2512.23594v1
Overview
The paper tackles a persistent problem in computer‑vision‑based fire monitoring: reliably spotting flames at night using RGB‑NIR (near‑infrared) cameras. By augmenting scarce NIR data, introducing a two‑stage detection pipeline, and proposing a patch‑based variant of YOLO, the authors push detection accuracy beyond the best published results while cutting false alarms caused by bright artificial lights.
Key Contributions
- Expanded NIR dataset – curated and heavily augmented to mitigate the lack of publicly available night‑vision fire imagery.
- Two‑stage detection pipeline – combines a fast YOLOv11 front‑end with a lightweight EfficientNetV2‑B0 classifier to filter out false positives from artificial lighting.
- Patched‑YOLO – a novel preprocessing scheme that splits high‑resolution RGB frames into overlapping patches, enabling the detector to better capture small or distant flames.
- Comprehensive benchmark – re‑evaluates state‑of‑the‑art detectors (YOLOv7, RT‑DETR, YOLOv9) on the new dataset, demonstrating consistent gains in mAP₅₀₋₉₅.
Methodology
- Data Collection & Augmentation
  - Gathered raw NIR video from night‑vision cameras in controlled fire‑training sites.
  - Applied geometric (rotation, scaling), photometric (brightness/contrast jitter), and domain‑specific augmentations (simulated smoke, lens flare) to inflate the training pool.
- Two‑Stage Detection
  - Stage 1: YOLOv11 runs on the full‑frame RGB‑NIR composite, quickly proposing bounding boxes.
  - Stage 2: Each proposal is cropped and fed to EfficientNetV2‑B0, which classifies it as “fire” or “non‑fire” (e.g., street lamp). This lightweight net runs on the GPU in parallel, keeping latency low.
- Patched‑YOLO for RGB
  - The input image is tiled into overlapping patches (e.g., 640 × 640 with 20 % overlap).
  - YOLO processes each patch independently; detections are merged using non‑maximum suppression across patch boundaries.
  - This strategy preserves high‑resolution detail without blowing up memory usage.
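
The tiling and merging steps of Patched‑YOLO can be illustrated with a minimal, dependency‑free sketch. It is not the authors' implementation: the helper names (`tile_origins`, `tile_image`, `to_frame_coords`, `merge_detections`), the choice to push the last patch flush against the image edge, and the IoU threshold of 0.5 are all assumptions made for illustration. Only the 640‑pixel patch size and 20 % overlap come from the example in the text.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2, score)
Box = Tuple[float, float, float, float, float]


def tile_origins(length: int, patch: int, overlap: float) -> List[int]:
    """Start offsets of patches that cover [0, length) along one axis."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if length <= patch:
        return [0]
    stride = max(1, round(patch * (1.0 - overlap)))
    origins = list(range(0, length - patch, stride))
    origins.append(length - patch)  # assumption: last patch sits flush with the edge
    return origins


def tile_image(width: int, height: int, patch: int = 640,
               overlap: float = 0.2) -> List[Tuple[int, int]]:
    """Top-left (x, y) corners of all patches, row by row."""
    return [(x, y)
            for y in tile_origins(height, patch, overlap)
            for x in tile_origins(width, patch, overlap)]


def to_frame_coords(box: Box, origin: Tuple[int, int]) -> Box:
    """Shift a patch-local box back into full-frame coordinates."""
    x1, y1, x2, y2, score = box
    ox, oy = origin
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy, score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Greedy non-maximum suppression over boxes from all patches."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept
```

For a 1920 × 1080 frame, `tile_image(1920, 1080)` yields eight patch origins. A flame that falls inside the overlap of two neighbouring patches is detected twice, and `merge_detections` keeps only the higher‑scoring copy once both boxes are mapped back with `to_frame_coords`.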
All training used the standard COCO‑style loss functions, with additional weighting to penalize false positives on artificial lights.
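
The summary does not say what form the extra weighting takes. One common way to penalize false positives, shown here purely as an assumption, is to scale the loss on negative ("non‑fire") examples in a binary cross‑entropy term. The function name `weighted_bce` and the parameter `fp_weight` are hypothetical and cover only a classification term, not a full detection loss.

```python
import math
from typing import Sequence


def weighted_bce(probs: Sequence[float], labels: Sequence[int],
                 fp_weight: float = 2.0, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy with a heavier penalty on negatives.

    probs  : predicted probability of "fire" for each example
    labels : 1 for fire, 0 for non-fire (e.g. an artificial light)
    fp_weight : assumed multiplier applied to the loss of non-fire examples,
                so confidently calling a lamp "fire" costs more
    """
    if len(probs) == 0 or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equal length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -math.log(p)
        else:
            total += -fp_weight * math.log(1.0 - p)
    return total / len(probs)
```

With `fp_weight=2.0`, a lamp scored at 0.9 "fire" contributes twice the loss it would under plain cross‑entropy, while the loss on true fire examples is unchanged.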
Results & Findings
| Model (input size) | mAP₅₀₋₉₅ (RGB) | mAP₅₀₋₉₅ (NIR) | False‑Positive Rate (lights) |
|---|---|---|---|
| YOLOv7 (640 × 1280) | 0.51 | 0.44 | 18 % |
| RT‑DETR (640 × 640) | 0.65 | 0.58 | 12 % |
| YOLOv9 (640 × 640) | 0.598 | 0.55 | 14 % |
| Two‑stage (YOLOv11 + EffV2‑B0) | 0.71 | 0.68 | 6 % |
| Patched‑YOLO (RGB only) | 0.73 | – | – |
- The two‑stage pipeline raises mAP₅₀₋₉₅ over the strongest baseline (RT‑DETR) from 0.65 to 0.71 on RGB and from 0.58 to 0.68 on NIR, while halving the false‑positive rate on night‑time artificial lights from 12 % to 6 %.
- Patched‑YOLO raises detection of small, distant flames by ~8 % mAP compared to vanilla YOLOv11, with only a modest increase in inference time (≈ 12 ms per frame on an RTX 3080).
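
The drop in false positives comes from the second‑stage filter, whose control flow can be sketched as follows. This is a toy illustration, not the paper's code: `propose` stands in for the YOLOv11 detector and `fire_probability` for the EfficientNetV2‑B0 classifier, both passed in as plain callables, and `toy_fire_probability` is a hypothetical placeholder that only checks intensity contrast so the example runs without any model weights.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # x1, y1, x2, y2 in pixels
Image = List[List[float]]            # toy stand-in: rows of pixel intensities


def crop(image: Image, box: Box) -> Image:
    """Cut the region covered by a box out of the image."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


def two_stage_detect(image: Image,
                     propose: Callable[[Image], Sequence[Box]],
                     fire_probability: Callable[[Image], float],
                     threshold: float = 0.5) -> List[Box]:
    """Stage 1 proposes boxes; stage 2 keeps only those classified as fire."""
    confirmed: List[Box] = []
    for box in propose(image):
        if fire_probability(crop(image, box)) >= threshold:
            confirmed.append(box)
    return confirmed


def toy_fire_probability(patch: Image) -> float:
    """Placeholder classifier (NOT a real model): uniform bright regions,
    like a lamp, score 0.0; regions with strong contrast score 1.0."""
    pixels = [value for row in patch for value in row]
    if not pixels:
        return 0.0
    return 1.0 if max(pixels) - min(pixels) > 0.5 else 0.0
```

In a real system the two callables would wrap the trained detector and classifier; keeping them as parameters makes the filtering logic easy to test in isolation.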
Practical Implications
- Fire‑monitoring systems can now run on edge devices (e.g., NVIDIA Jetson) with real‑time performance, thanks to the lightweight EfficientNetV2‑B0 classifier.
- Fewer false alarms mean fewer unnecessary dispatches for fire brigades, translating to cost savings and higher trust in automated surveillance.
- Patch‑based processing can be adopted for any high‑resolution RGB detection task where small objects matter (e.g., wildlife spotting, drone‑based inspection).
- The augmented NIR dataset is released under a permissive license, giving developers a ready‑to‑use benchmark for night‑vision AI research.
Limitations & Future Work
- The current NIR data still originates from a limited set of controlled fire‑training sites; performance in wildly varying outdoor conditions (rain, fog) remains untested.
- Patched‑YOLO introduces extra bookkeeping for merging detections, which can become a bottleneck on very low‑power CPUs.
- The authors plan to explore transformer‑based backbones for the second stage and to integrate temporal consistency (video‑level smoothing) to further suppress spurious detections.
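
The authors do not describe how temporal consistency would be implemented. One simple possibility, offered here only as an assumption, is a sliding‑window vote over per‑frame detection flags. The function name `smooth_detections` and the `window` and `min_hits` defaults are hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, Iterator


def smooth_detections(frame_flags: Iterable[bool], window: int = 5,
                      min_hits: int = 3) -> Iterator[bool]:
    """Report fire only when at least `min_hits` of the last `window`
    frames contained a detection, suppressing single-frame flickers."""
    recent: Deque[bool] = deque(maxlen=window)
    for flag in frame_flags:
        recent.append(flag)
        yield sum(recent) >= min_hits
```

An isolated single‑frame detection never reaches the `min_hits` count on its own, so it is dropped, while a sustained run of detections switches the output on after a short delay.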
Authors
- Nguyen Truong Khai
- Luong Duc Vinh
Paper Information
- arXiv ID: 2512.23594v1
- Categories: cs.CV
- Published: December 29, 2025