[Paper] Sparse Spike Encoding of Channel Responses for Energy Efficient Human Activity Recognition

Published: (February 6, 2026 at 10:20 AM EST)

Source: arXiv - 2602.06766v1

Overview

The paper introduces a spiking convolutional autoencoder (SCAE) that learns ultra‑sparse binary spike representations of raw channel impulse responses (CIR) and feeds them to a spiking neural network (SNN) that recognizes human activities. By removing the costly Doppler‑domain preprocessing step and exploiting the event‑driven nature of SNNs, the authors reach an F1 score of about 96 %, on par with a Doppler‑based baseline, while cutting the number of active operations roughly fivefold, which makes the solution attractive for battery‑limited edge devices.

Key Contributions

  • Joint SCAE‑SNN architecture that simultaneously learns a spike‑encoded compression of CIR data and a classifier for human activity recognition (HAR).
  • Elimination of Doppler preprocessing, allowing the system to work directly on raw channel responses.
  • High sparsity: the learned spike trains are 81 % sparse on average, translating into far fewer multiply‑accumulate (MAC) operations compared to dense deep‑learning baselines.
  • Competitive performance: the end‑to‑end SCAE‑SNN reaches an F1 score of ~96 %, matching a hybrid pipeline that combines conventional signal processing with a deep classifier.
  • Open‑source implementation (GitHub) enabling reproducibility and rapid prototyping on neuromorphic hardware or low‑power microcontrollers.

Methodology

  1. Data acquisition – The authors use a commercial ISAC (integrated sensing and communication) radio to collect raw CIR measurements while subjects perform a set of predefined activities (e.g., walking, sitting, falling).
  2. Spike‑encoding autoencoder – A convolutional autoencoder is trained to reconstruct the input CIR. Instead of using real‑valued activations, the encoder’s output is passed through a threshold‑based spiking function, producing binary spike trains. The loss combines reconstruction error and a sparsity regularizer that pushes most neurons to stay silent.
  3. Spiking classifier – The spike trains from the encoder feed a shallow spiking convolutional neural network (SNN). The SNN operates with leaky‑integrate‑and‑fire (LIF) neurons and is trained using surrogate‑gradient back‑propagation, a technique that approximates gradients for the non‑differentiable spike function.
  4. Joint training – Both the autoencoder and the classifier are optimized together, so the encoder learns representations that are both compact (high sparsity) and discriminative for HAR.
  5. Evaluation – The model is benchmarked against a “hybrid” baseline that first extracts Doppler features (via FFT) and then classifies with a conventional CNN. Energy efficiency is estimated by counting the number of spikes (i.e., active MACs) required per inference.
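
To make steps 2 to 4 more concrete, here is a minimal NumPy sketch of the main ingredients: threshold‑based spike encoding, a surrogate derivative for the spike function, LIF dynamics, and a combined loss. It is an illustration under assumptions, not the authors' implementation. The function names, the threshold of 0.5, the leak factor `beta`, the hard reset, the fast‑sigmoid‑style surrogate, and the loss weights are all hypothetical choices that this summary does not specify.

```python
import numpy as np

def spike_encode(latent, threshold=0.5):
    """Binarize real-valued encoder activations: 1 where activation >= threshold."""
    return (latent >= threshold).astype(np.float32)

def sparsity(spikes):
    """Fraction of silent (zero) entries in a binary spike tensor."""
    return 1.0 - float(spikes.mean())

def surrogate_grad(latent, threshold=0.5, slope=5.0):
    """Smooth stand-in for the derivative of the step function.

    Assumption: a fast-sigmoid-style surrogate; the paper may use another shape.
    """
    return 1.0 / (1.0 + slope * np.abs(latent - threshold)) ** 2

def lif_forward(input_current, beta=0.9, v_th=1.0):
    """Simulate LIF neurons. input_current has shape (T, N); returns spikes (T, N)."""
    n_steps, n_neurons = input_current.shape
    v = np.zeros(n_neurons)
    out = np.zeros((n_steps, n_neurons), dtype=np.float32)
    for t in range(n_steps):
        v = beta * v + input_current[t]   # leak, then integrate the input
        fired = v >= v_th                 # threshold crossing emits a spike
        out[t] = fired
        v = np.where(fired, 0.0, v)       # hard reset (assumed) after a spike
    return out

def joint_loss(x, x_hat, spikes, class_loss, lam_sparse=0.1, lam_cls=1.0):
    """Reconstruction error + sparsity penalty (mean firing rate) + classifier loss."""
    recon = float(np.mean((x - x_hat) ** 2))
    rate = float(spikes.mean())
    return recon + lam_sparse * rate + lam_cls * class_loss

if __name__ == "__main__":
    latent = np.array([0.1, 0.6, 0.4, 0.9])
    s = spike_encode(latent)
    print("spikes:", s, "sparsity:", sparsity(s))
    print("LIF output:", lif_forward(np.full((4, 1), 0.6)).ravel())
```

In a real training loop the forward pass would use the hard threshold while the backward pass would substitute something like `surrogate_grad`, which is what allows the encoder and classifier to be optimized jointly with gradient descent.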

Results & Findings

Metric                  | Hybrid (CNN + Doppler) | Proposed SCAE‑SNN
------------------------|------------------------|------------------------------
F1 score                | 96.2 %                 | 95.8 %
Average sparsity        | – (dense)              | 81.1 %
Spike count per sample  | ~10 k MACs (dense)     | ~1.9 k MACs (≈5× reduction)
Inference latency (CPU) | 12 ms                  | 8 ms
Energy (estimated)      | Baseline               | ~20 % of baseline

  • The SCAE‑SNN comes within 0.4 points of the hybrid approach’s F1 score while needing roughly five times fewer active operations per sample (~1.9 k versus ~10 k), which should translate into lower energy consumption on event‑driven hardware.
  • Adding the autoencoder before classification improves accuracy for both the SNN and the conventional CNN, confirming that a learned sparse representation is beneficial even for non‑spiking models.
  • Visualizations of the learned spike patterns show that only a few temporal‑frequency regions fire for each activity, indicating that the network automatically discovers the most informative parts of the CIR.
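
The sparsity, operation count, and energy figures reported above are mutually consistent under a simple model, sketched below. The assumption (not stated in this summary) is that the number of active operations scales linearly with the fraction of non‑silent spikes and that every operation costs the same energy. The function names are hypothetical.

```python
def active_ops(dense_ops, sparsity):
    """Operations actually triggered when a fraction `sparsity` of inputs is silent."""
    return dense_ops * (1.0 - sparsity)

def reduction_factor(sparsity):
    """How many times fewer operations than the dense case."""
    return 1.0 / (1.0 - sparsity)

if __name__ == "__main__":
    dense_ops = 10_000   # ~10 k operations for the dense baseline
    s = 0.811            # 81.1 % average sparsity
    print("active ops:", round(active_ops(dense_ops, s)))
    print("reduction:", round(reduction_factor(s), 1))
    print("energy vs. baseline:", round(1.0 - s, 3))
```

With these inputs the model gives about 1.9 k active operations, a reduction a little above 5×, and roughly 19 % of the baseline energy, in line with the reported values. Real hardware will deviate from this, since a spike‑triggered accumulate and a dense MAC do not generally cost the same.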

Practical Implications

  • Edge‑friendly HAR: Developers can embed the SCAE‑SNN on ultra‑low‑power microcontrollers or neuromorphic chips (e.g., Intel’s Loihi) to run continuous activity monitoring with a much smaller drain on the battery.
  • Simplified signal chain: By removing the Doppler FFT step, system designers can reduce firmware complexity, memory footprint, and latency—critical for real‑time IoT deployments.
  • Scalable to other ISAC tasks: The same spike‑encoding strategy could be repurposed for gesture detection, occupancy sensing, or even non‑RF modalities (e.g., acoustic channel responses).
  • Open‑source code: The provided repository includes training scripts, pre‑trained models, and a lightweight inference engine, lowering the barrier for rapid prototyping and integration into existing sensor stacks.
  • Potential for on‑device learning: Because the encoder produces sparse spikes, incremental or few‑shot updates could be performed on‑device with modest compute, enabling personalized HAR models.
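
For context on what the simplified signal chain leaves out, the sketch below shows a generic Doppler step of the kind a hybrid pipeline would run before its classifier: an FFT along the slow‑time axis of a block of CIR snapshots. This is a textbook‑style illustration, not the baseline's actual preprocessing. The array layout `(n_slow, n_taps)`, the Hann window, and the function name are assumptions.

```python
import numpy as np

def doppler_spectrum(cir, window=True):
    """Power Doppler spectrum per delay tap.

    cir: complex array of shape (n_slow, n_taps), one CIR snapshot per row.
    Returns a real array of the same shape with zero Doppler in the middle row.
    """
    n_slow = cir.shape[0]
    if window:
        # Taper along slow time to reduce spectral leakage.
        cir = cir * np.hanning(n_slow)[:, None]
    spec = np.fft.fftshift(np.fft.fft(cir, axis=0), axes=0)
    return np.abs(spec) ** 2

if __name__ == "__main__":
    n_slow, n_taps, doppler_bin = 64, 4, 5
    n = np.arange(n_slow)
    cir = np.zeros((n_slow, n_taps), dtype=complex)
    # Synthetic moving reflector on tap 2: its phase rotates at a constant rate.
    cir[:, 2] = np.exp(2j * np.pi * doppler_bin * n / n_slow)
    spec = doppler_spectrum(cir)
    print("peak row on tap 2:", int(np.argmax(spec[:, 2])))
```

The SCAE‑SNN skips this transform entirely and consumes the raw `cir` block, which is where the savings in firmware complexity and buffering come from.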

Limitations & Future Work

  • Hardware validation: Energy savings are estimated from spike counts; actual measurements on neuromorphic or MCU hardware are needed to confirm real‑world gains.
  • Dataset diversity: Experiments focus on a limited set of activities and a single ISAC platform; broader testing across different radios, environments, and user populations would strengthen generalizability.
  • Model size vs. sparsity trade‑off: While sparsity is high, the autoencoder still adds parameters; future work could explore ultra‑compact encoder designs or pruning techniques.
  • Online adaptation: Extending the framework to support continual learning (e.g., handling new activities without retraining from scratch) remains an open challenge.

Overall, the paper shows that spike‑based encoding of raw channel data can deliver high‑accuracy HAR at an estimated fraction of the energy budget, opening a practical path toward pervasive sensing on battery‑constrained devices in next‑generation smart environments.

Authors

  • Eleonora Cicciarella
  • Riccardo Mazzieri
  • Jacopo Pegoraro
  • Michele Rossi

Paper Information

  • arXiv ID: 2602.06766v1
  • Categories: cs.NE
  • Published: February 6, 2026