[Paper] Privacy-preserving fall detection at the edge using Sony IMX636 event-based vision sensor and Intel Loihi 2 neuromorphic processor
Source: arXiv - 2511.22554v1
Overview
A new study demonstrates how a neuromorphic vision sensor (Sony IMX636) paired with Intel’s Loihi 2 neuromorphic processor can detect falls in real time while keeping data private and power consumption ultra‑low. By moving inference to the edge and exploiting the sparse, event‑driven nature of the sensor, the system delivers a privacy‑preserving “smart camera” that could be deployed in elderly‑care settings without streaming raw video to the cloud.
Key Contributions
- End‑to‑end neuromorphic pipeline: Integrated the Sony IMX636 event camera, an FPGA interface, and a single Loihi 2 chip for fully local, real‑time inference at the edge.
- Sparse SNN designs: Explored several leaky‑integrate‑and‑fire (LIF) spiking neural network (SNN) architectures, including binary and graded‑spike variants, to find the best trade‑off between accuracy and synaptic operation count.
- Pareto‑optimal performance: Identified a graded‑spike convolutional SNN that reaches 58 % F1 with a 55× reduction in synaptic operations, and a hybrid MCUNet + S4D model that pushes F1 to 84 % while still achieving a ~2× operation reduction over the dense baseline.
- Energy‑aware proof‑of‑concept: Demonstrated a fully functional “smart security camera” consuming ~90 mW on Loihi 2, suitable for always‑on deployment.
- New dataset: Collected a diverse, event‑based fall‑detection dataset covering varying lighting, background motion, and occlusion scenarios, released for future research.
Methodology
- Sensor & Interface – The Sony IMX636 event camera outputs asynchronous pixel‑level brightness changes (events) instead of full frames. An FPGA board translates these events into a format the Loihi 2 processor can ingest, preserving the event timing and sparsity (a host‑side event‑binning sketch follows this list).
- Network Exploration – Researchers trained several spiking convolutional networks using leaky‑integrate‑and‑fire neurons. Two spike encoding schemes were compared:
- Binary spikes (standard “on/off” spikes)
- Graded spikes (spikes carry amplitude information)
The networks were pruned to various sparsity levels to evaluate the operation‑count vs. accuracy trade‑off; a minimal LIF update covering both spike schemes appears after this list.
- Hybrid Feature Extractor – A lightweight MCUNet CNN extracts compact features from the event stream, which are then fed into an S4D (diagonal structured state‑space) sequence model that operates as a spiking recurrent layer on Loihi 2 (a sketch of the S4D recurrence follows this list).
- Evaluation – The models were benchmarked on the newly recorded dataset, measuring F1 score, synaptic operation sparsity, latency, and power draw on the Loihi 2 chip.
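To make the interface concrete, the sketch below bins an asynchronous (timestamp, x, y, polarity) event stream into a small per‑polarity count tensor, a common host‑side representation for SNN input. The function and its binning scheme are illustrative assumptions; the paper's FPGA bridge streams events to Loihi 2 directly rather than through NumPy.

```python
import numpy as np

def events_to_bins(events, h, w, n_bins, t_window_us):
    """Accumulate (t_us, x, y, p) events into time-binned, per-polarity counts.

    Returns a tensor of shape (n_bins, 2, h, w); p is 0 (OFF) or 1 (ON).
    """
    vol = np.zeros((n_bins, 2, h, w), dtype=np.int16)
    bin_us = t_window_us / n_bins
    for t, x, y, p in events:
        b = min(int(t // bin_us), n_bins - 1)    # which time slice
        vol[b, int(p), int(y), int(x)] += 1      # per-polarity event count
    return vol

# Two events within a 1 ms window, binned into 4 slices of 250 us each.
vol = events_to_bins([(120, 5, 3, 1), (480, 5, 3, 0)], h=8, w=8,
                     n_bins=4, t_window_us=1_000)
```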
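The binary vs. graded distinction can be made concrete with a single leaky‑integrate‑and‑fire update. Below is a minimal NumPy sketch under common conventions (multiplicative leak, reset to zero, graded spikes carrying the membrane value at firing time); the authors' Loihi 2 neuron models may differ in discretization and spike‑payload encoding.

```python
import numpy as np

def lif_step(v, i_in, leak=0.9, v_th=1.0, graded=False):
    """One leaky-integrate-and-fire (LIF) update for a layer of neurons.

    v      -- membrane potentials, shape (n,)
    i_in   -- synaptic input current for this timestep, shape (n,)
    leak   -- multiplicative leak factor (assumed convention)
    v_th   -- firing threshold
    graded -- if True, spikes carry the membrane value instead of 0/1
    """
    v = leak * v + i_in                                          # leaky integration
    fired = v >= v_th
    s = np.where(fired, v, 0.0) if graded else fired.astype(float)
    v = np.where(fired, 0.0, v)                                  # reset fired neurons
    return v, s
```

Stepping `lif_step` over time yields the spike tensors whose sparsity drives the synaptic‑operation counts reported below; weight pruning reduces fan‑out on top of that.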
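The S4D component is, at its core, a diagonal linear state‑space recurrence. Here is a minimal sketch with zero‑order‑hold discretization; the scalar input/output shape and variable names are simplifying assumptions, not the authors' Loihi 2 mapping.

```python
import numpy as np

def s4d_scan(u, A, B, C, D, dt):
    """Diagonal state-space (S4D-style) recurrence over a scalar sequence.

    u       -- input sequence, shape (T,)
    A, B, C -- complex diagonal state matrix and projections, shape (N,)
    D       -- scalar skip connection
    dt      -- discretization step
    """
    A_bar = np.exp(A * dt)               # zero-order hold: diagonal transition
    B_bar = (A_bar - 1.0) / A * B        # discretized input map
    x = np.zeros_like(A)
    y = np.empty(len(u))
    for k, u_k in enumerate(u):
        x = A_bar * x + B_bar * u_k      # elementwise update (diagonal A)
        y[k] = (C * x).real.sum() + D * u_k
    return y
```

The per‑step, stateful update is what makes this model a natural fit for a recurrent layer on stateful neuromorphic hardware like Loihi 2.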
Results & Findings
| Model | F1 Score | Synaptic‑Op Reduction | Power (Loihi 2) |
|---|---|---|---|
| LIF ConvSNN (binary spikes) | 52 % | 30× | ~80 mW |
| LIF ConvSNN (graded spikes) | 58 % | 55× | ~78 mW |
| MCUNet + S4D (patched inference) | 84 % | 2× | 90 mW |
- Graded spikes improve F1 by ~6 percentage points over binary spikes while nearly doubling the synaptic‑operation reduction (55× vs. 30×); see the operation‑count sketch after this list.
- The hybrid MCUNet‑S4D pipeline delivers the best overall detection (84 % F1); it spends more synaptic operations than the sparse SNNs but still achieves a 2× reduction over the dense baseline, and it stays under 100 mW, confirming feasibility for battery‑powered edge devices.
- Latency measurements show sub‑100 ms reaction times, meeting the real‑time requirement for always‑on monitoring.
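To illustrate what the reduction factors in the table mean, the sketch below counts event‑driven synaptic operations for one layer and compares them with a dense, frame‑based equivalent. Layer size, fan‑out, and the ~2 % activity level are hypothetical numbers, chosen only to land near the reported range.

```python
import numpy as np

def synaptic_ops(spikes, fan_out):
    """Event-driven cost of one layer update: each spike touches `fan_out` synapses."""
    return int(spikes.sum()) * fan_out

rng = np.random.default_rng(0)
n, fan_out = 4096, 256
spikes = rng.random(n) < 0.02            # ~2% of neurons fire this timestep
dense_ops = n * fan_out                  # frame-based: every synapse is updated
sparse_ops = synaptic_ops(spikes, fan_out)
print(f"reduction: {dense_ops / max(sparse_ops, 1):.0f}x")  # roughly 50x here
```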
Practical Implications
- Privacy‑first monitoring: Because the system never reconstructs full video frames and processes all data locally, it aligns with strict GDPR‑style privacy requirements, making it well suited to assisted‑living facilities, hospitals, and home‑care robots.
- Ultra‑low power edge AI: At ~90 mW, a single Loihi 2 chip can run continuously for weeks on a modest battery (see the back‑of‑envelope estimate after this list), opening possibilities for plug‑and‑play fall‑detection cameras without external power.
- Scalable neuromorphic pipelines: The FPGA‑Loihi 2 interface demonstrates a reusable architecture that can be adapted to other event‑based perception tasks (e.g., gesture recognition, anomaly detection).
- Reduced bandwidth & cloud costs: By performing inference at the sensor, only high‑level alerts need to be transmitted, cutting network traffic and cloud compute expenses.
- Developer‑friendly tooling: The authors used established neuromorphic frameworks (NxSDK, PyTorch‑SNN) and released the dataset, lowering the barrier for developers to prototype similar solutions.
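To gauge the "weeks on a modest battery" figure, here is a back‑of‑envelope estimate at the reported power draw. The 37 Wh pack is an assumption (a typical 10,000 mAh, 3.7 V power bank), and sensor, FPGA, and power‑conversion overheads are ignored, so real runtimes would be shorter.

```python
# Chip-level battery life at the reported ~90 mW Loihi 2 power draw.
power_w = 0.090          # reported Loihi 2 draw
battery_wh = 37.0        # assumed 10,000 mAh @ 3.7 V power bank
hours = battery_wh / power_w
print(f"{hours:.0f} h = {hours / 24:.1f} days")  # ~411 h, about 17 days
```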
Limitations & Future Work
- Detection accuracy ceiling: While 84 % F1 is promising, it may still be insufficient for high‑risk clinical environments; further model refinement or multimodal fusion (e.g., inertial sensors) could boost performance.
- Hardware dependency: The current prototype relies on a custom FPGA bridge and a Loihi 2 chip, which are not yet mass‑produced; broader adoption will need more accessible neuromorphic hardware or ASICs.
- Generalization: The dataset, though diverse, is limited to controlled indoor settings; testing in real homes with pets, clutter, and varying camera placements is needed.
- Explainability: Spiking networks are still opaque; future work could integrate neuromorphic explainability tools to help caregivers understand false positives/negatives.
Overall, the paper showcases a compelling route to privacy‑preserving, energy‑efficient fall detection by marrying event‑based vision with neuromorphic processing—an approach that could reshape how edge AI handles sensitive health monitoring.
Authors
- Lyes Khacef
- Philipp Weidel
- Susumu Hogyoku
- Harry Liu
- Claire Alexandra Bräuer
- Shunsuke Koshino
- Takeshi Oyakawa
- Vincent Parret
- Yoshitaka Miyatani
- Mike Davies
- Mathis Richter
Paper Information
- arXiv ID: 2511.22554v1
- Categories: cs.NE
- Published: November 27, 2025
- PDF: https://arxiv.org/pdf/2511.22554v1