[Paper] Hybrid SIFT-SNN for Efficient Anomaly Detection of Traffic Flow-Control Infrastructure

Published: November 26, 2025 at 07:40 AM EST

Source: arXiv - 2511.21337v1

Overview

The paper introduces SIFT‑SNN, a hybrid pipeline that couples classic computer‑vision feature extraction (Scale‑Invariant Feature Transform) with a lightweight spiking neural network (SNN) to spot structural anomalies in traffic‑flow control infrastructure (e.g., movable concrete barriers). By converting visual cues into sparse spike trains, the system delivers sub‑10 ms per‑frame inference while keeping power consumption low enough for edge devices.

Key Contributions

  • Hybrid architecture: Combines SIFT’s rotation‑ and scale‑invariant descriptors with a latency‑optimized Leaky‑Integrate‑and‑Fire (LIF) SNN for classification.
  • Real‑time performance: Achieves 9.5 ms inference per frame and 8.1 % average spike activity, enabling deployment on low‑power embedded hardware.
  • High accuracy on a realistic dataset: 92.3 % ± 0.8 % classification accuracy on 6 000 labelled frames captured under diverse weather and lighting conditions on the Auckland Harbour Bridge.
  • Interpretability: Retains spatial grounding of SIFT keypoints, allowing developers to trace decisions back to concrete visual features.
  • Edge‑ready prototype: Demonstrated on a consumer‑grade platform (e.g., Raspberry Pi 4 + Intel Movidius NCS2) without GPU acceleration.

Methodology

  1. Data acquisition & augmentation – 6 000 video frames of a movable concrete barrier were collected in situ. Synthetic anomalies (e.g., cracks, misalignments) were added to improve class balance.
  2. Spatial feature encoding – Each frame is processed by SIFT, producing a set of keypoints and 128‑dimensional descriptors that are robust to scale, rotation, and illumination changes.
  3. Spike conversion layer – Descriptors are quantized and encoded into binary spike trains using a latency‑driven scheme: larger descriptor values fire earlier, yielding a temporally ordered spike pattern.
  4. Spiking neural network – A shallow LIF‑based SNN (two hidden layers, 256 neurons each) receives the spike streams. The network is trained with surrogate gradient descent to classify frames as safe or anomalous.
  5. Inference on edge – The spike‑based representation is sparse (8.1 % average spike activity), so most LIF neurons stay silent at any given time step, which reduces compute cycles and energy draw.
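
Steps 3 to 5 can be illustrated with a minimal pure‑Python sketch. It is not the authors' implementation: the linear value‑to‑time mapping, the number of time steps, the decay factor `beta`, the hard reset, and all function names are assumptions chosen for illustration. It shows how larger descriptor values can be made to fire earlier, how a single LIF neuron integrates the resulting spikes, and how spike activity can be measured as a fraction of all possible spikes.

```python
from typing import List, Optional


def latency_encode(descriptor: List[float], n_steps: int = 16) -> List[Optional[int]]:
    """Map each descriptor value to the time step of its single spike.

    Larger values fire earlier; zero entries never fire (None).
    Assumes non-negative values, as in SIFT descriptors.
    """
    peak = max(descriptor)
    if peak <= 0:
        return [None] * len(descriptor)
    times: List[Optional[int]] = []
    for value in descriptor:
        if value <= 0:
            times.append(None)
        else:
            # value == peak fires at step 0; small values fire near the last step.
            times.append(round((1.0 - value / peak) * (n_steps - 1)))
    return times


def to_spike_trains(times: List[Optional[int]], n_steps: int) -> List[List[int]]:
    """Expand spike times into binary trains indexed as [step][input]."""
    return [[1 if t == step else 0 for t in times] for step in range(n_steps)]


def lif_neuron(trains: List[List[int]], weights: List[float],
               beta: float = 0.9, threshold: float = 1.0) -> List[int]:
    """Simulate one leaky integrate-and-fire neuron over all time steps."""
    membrane = 0.0
    output = []
    for step_inputs in trains:
        current = sum(w * s for w, s in zip(weights, step_inputs))
        membrane = beta * membrane + current  # leak, then integrate
        if membrane >= threshold:
            output.append(1)
            membrane = 0.0  # hard reset after a spike (assumption)
        else:
            output.append(0)
    return output


def spike_activity(trains: List[List[int]]) -> float:
    """Fraction of emitted spikes out of all possible (step, unit) slots."""
    total = sum(len(row) for row in trains)
    return sum(sum(row) for row in trains) / total if total else 0.0


if __name__ == "__main__":
    descriptor = [0.0, 64.0, 128.0, 32.0]      # toy 4-value "descriptor"
    times = latency_encode(descriptor, n_steps=5)
    trains = to_spike_trains(times, n_steps=5)
    print(times)                                # [None, 2, 0, 3]
    print(spike_activity(trains))               # 0.15
    print(lif_neuron(trains, [0.5, 0.6, 0.7, 0.2]))  # [0, 0, 1, 0, 0]
```

In a real pipeline the input would be 128‑dimensional SIFT descriptors and the neuron would be one of many in a trained layer; the toy values here only make the timing behaviour easy to follow by hand.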

Results & Findings

Metric                          Value
Classification accuracy         92.3 % ± 0.8 %
Per‑frame latency               9.5 ms
Average spike activity          8.1 % of total possible spikes
Power consumption (prototype)   ~120 mW (CPU‑only)

The system matches or exceeds state‑of‑the‑art CNN baselines (≈90 % accuracy) while cutting inference time by more than half and removing the need for a GPU. Ablation studies show that removing the spike conversion layer or replacing SIFT with raw pixels degrades both speed and accuracy, which suggests that the two components complement each other.
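
For a rough sense of scale, the reported latency and power figures can be combined into an energy‑per‑frame estimate and an upper bound on frame rate. This is back‑of‑the‑envelope arithmetic, not a result from the paper: it assumes the ~120 mW figure is the average draw during inference and that frames are processed back to back. The function names are illustrative.

```python
def energy_per_frame_mj(power_w: float, latency_s: float) -> float:
    """Energy per frame in millijoules: power (W) x time (s) x 1000."""
    return power_w * latency_s * 1e3


def max_frame_rate(latency_s: float) -> float:
    """Upper bound on frames per second if frames are processed sequentially."""
    return 1.0 / latency_s


if __name__ == "__main__":
    latency_s = 9.5e-3   # reported per-frame latency
    power_w = 0.120      # reported prototype power (assumed average during inference)
    print(f"{energy_per_frame_mj(power_w, latency_s):.2f} mJ per frame")  # 1.14 mJ per frame
    print(f"{max_frame_rate(latency_s):.1f} frames per second at most")   # 105.3 frames per second at most
```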

Practical Implications

  • Edge deployment for smart infrastructure – Municipalities can mount a low‑cost camera + embedded board on barriers to continuously monitor structural health without cloud latency or bandwidth constraints.
  • Energy‑aware AI – The sparse spiking activity translates to lower power budgets, making the solution viable for battery‑powered or solar‑powered installations.
  • Explainable alerts – Because SIFT keypoints are preserved, maintenance crews can visualize exactly which region triggered an anomaly, speeding up diagnostics.
  • Transferable pipeline – The SIFT‑SNN combo can be repurposed for other safety‑critical visual inspection tasks (bridge decks, rail tracks, pipelines) where interpretability and low latency are paramount.
  • Developer‑friendly stack – The authors released a Python‑based toolkit built on OpenCV (for SIFT) and BindsNET/PyTorch‑Spiking (for the SNN), allowing rapid prototyping and integration into existing monitoring dashboards.
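
To make the "explainable alerts" idea concrete, here is a small sketch of how an alert could be tied back to an image region. It is a hypothetical illustration, not the authors' method: it assumes each SIFT keypoint comes with a relevance score (how such scores would be derived from the SNN is not specified in this summary), and `alert_region`, `top_k`, and `margin` are made‑up names. The function draws a bounding box around the highest‑scoring keypoints.

```python
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float]           # (x, y) pixel coordinates
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def alert_region(keypoints: List[Keypoint], scores: List[float],
                 top_k: int = 3, margin: float = 10.0) -> Optional[Box]:
    """Bounding box around the top_k highest-scoring keypoints.

    `scores` are hypothetical per-keypoint relevance values.
    """
    if len(keypoints) != len(scores):
        raise ValueError("keypoints and scores must have the same length")
    if not keypoints:
        return None
    ranked = sorted(zip(scores, keypoints), key=lambda pair: pair[0], reverse=True)
    top = [kp for _, kp in ranked[:top_k]]
    xs = [x for x, _ in top]
    ys = [y for _, y in top]
    # Pad the box by `margin` pixels, clamped at the image origin.
    return (max(min(xs) - margin, 0.0), max(min(ys) - margin, 0.0),
            max(xs) + margin, max(ys) + margin)


if __name__ == "__main__":
    kps = [(100.0, 50.0), (120.0, 60.0), (400.0, 300.0), (110.0, 55.0)]
    rel = [0.9, 0.8, 0.1, 0.7]
    print(alert_region(kps, rel))  # (90.0, 40.0, 130.0, 70.0)
```

A dashboard could overlay the returned box on the camera frame so that a maintenance crew sees where to look first.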

Limitations & Future Work

  • Generalisation – The model has only been validated on the Auckland Harbour Bridge dataset; performance on completely unseen environments (different barrier designs, extreme weather) remains an open question.
  • Synthetic augmentation bias – Although synthetic anomalies help balance the classes, they may not capture all real‑world failure modes, which could limit detection of rare defect patterns.
  • Scalability of SIFT – For very high‑resolution streams, SIFT extraction can become a bottleneck; future work could explore learned keypoint detectors that retain interpretability.
  • Hardware diversity – The prototype targets a specific edge board; broader evaluation across ASIC‑based neuromorphic chips (e.g., Intel Loihi) could further push latency and power gains.

Bottom line: The hybrid SIFT‑SNN framework shows that pairing classic vision algorithms with spiking‑network inference can deliver fast, low‑power, and explainable anomaly detection, which makes it a practical option for developers building safety‑critical infrastructure monitoring systems.

Authors

  • Munish Rathee
  • Boris Bačić
  • Maryam Doborjeh

Paper Information

  • arXiv ID: 2511.21337v1
  • Categories: cs.CV, cs.AI
  • Published: November 26, 2025