[Paper] From Lightweight CNNs to SpikeNets: Benchmarking Accuracy-Energy Tradeoffs with Pruned Spiking SqueezeNet
Source: arXiv - 2602.09717v1
Overview
This work delivers the first systematic benchmark that converts tiny, mobile‑friendly CNNs (ShuffleNet, SqueezeNet, MnasNet, MixNet) into spiking neural networks (SNNs) and measures the resulting accuracy‑energy trade‑offs on standard vision datasets. By pairing lightweight CNN designs with event‑driven spiking dynamics, the authors show that SNNs can cut estimated energy consumption by up to roughly 15× while staying within a few percentage points of their CNN baselines, making low‑power edge AI considerably more practical.
Key Contributions
- First end‑to‑end benchmark of lightweight CNN‑to‑SNN conversion using Leaky‑Integrate‑and‑Fire (LIF) neurons and surrogate‑gradient training.
- Construction and evaluation of four spiking variants (ShuffleNet‑SNN, SqueezeNet‑SNN, MnasNet‑SNN, MixNet‑SNN) on CIFAR‑10, CIFAR‑100, and TinyImageNet.
- Demonstration that SNN‑SqueezeNet consistently delivers the best accuracy‑energy balance among the tested models.
- Introduction of a structured pruning pipeline that removes whole redundant modules, yielding SNN‑SqueezeNet‑P (pruned).
- Empirical evidence that pruning improves CIFAR‑10 accuracy by about 6 % over the unpruned SNN, cuts parameters by 19 %, and reduces energy use by 88 %, with only a 1 % accuracy gap to the original CNN.
Methodology
- CNN‑to‑SNN conversion – The authors replace each ReLU activation with a Leaky‑Integrate‑and‑Fire (LIF) spiking neuron. The network is then fine‑tuned using surrogate gradient descent, which approximates the non‑differentiable spike function with a smooth proxy during back‑propagation.
- Unified training setup – All models share the same hyper‑parameters (time‑steps, learning rate schedule, loss function) to ensure a fair comparison across architectures and datasets.
- Energy estimation – Energy consumption is approximated by counting spike‑driven multiply‑accumulate (MAC) operations versus dense MACs in the original CNN, following established neuromorphic hardware cost models.
- Structured pruning – Whole modules (e.g., fire‑modules in SqueezeNet) are pruned based on their contribution to the loss gradient. The remaining network is re‑trained to recover performance, yielding a sparse, spike‑driven model.
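The conversion step above can be sketched as a single LIF unit plus the smooth spike proxy used during back‑propagation. This is a minimal illustrative model, not the paper's implementation: the leak factor, threshold, soft reset, and sigmoid‑based surrogate are common choices assumed here.

```python
import math

def lif_forward(inputs, leak=0.9, v_th=1.0):
    """Simulate one Leaky Integrate-and-Fire neuron over T time-steps.

    inputs: list of input currents, one per time-step.
    Returns (spikes, potentials). In the paper's conversion, each ReLU is
    replaced by a population of units like this one.
    """
    v = 0.0
    spikes, potentials = [], []
    for i in inputs:
        v = leak * v + i                 # leaky integration of input current
        s = 1.0 if v >= v_th else 0.0    # binary spike when threshold crossed
        v = v - s * v_th                 # soft reset: subtract threshold on spike
        spikes.append(s)
        potentials.append(v)
    return spikes, potentials

def surrogate_grad(v, v_th=1.0, alpha=2.0):
    """Smooth proxy for d(spike)/d(v) used only in the backward pass:
    derivative of a sigmoid centred at the threshold (alpha sets sharpness)."""
    s = 1.0 / (1.0 + math.exp(-alpha * (v - v_th)))
    return alpha * s * (1.0 - s)
```

The forward pass stays binary (event-driven), while training differentiates through `surrogate_grad`, which is what makes gradient descent on spiking networks possible.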
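The spike-count energy model can be made concrete with a short calculation. The per-operation constants below are a widely used 45 nm CMOS estimate (≈4.6 pJ per multiply-accumulate, ≈0.9 pJ per accumulate), assumed here for illustration; the paper's exact cost model and operation counts may differ.

```python
# Illustrative neuromorphic cost model (assumed constants, 45 nm CMOS estimates).
E_MAC_PJ = 4.6   # dense multiply-accumulate, as in the CNN forward pass
E_AC_PJ = 0.9    # sparse accumulate, triggered only when a spike arrives

def cnn_energy_pj(macs):
    """Dense CNN: every potential MAC is executed once."""
    return macs * E_MAC_PJ

def snn_energy_pj(macs, time_steps, spike_rate):
    """SNN: each potential MAC becomes an accumulate that fires only when the
    presynaptic neuron spikes, unrolled over `time_steps`."""
    return macs * time_steps * spike_rate * E_AC_PJ

macs = 280e6                     # hypothetical operation count for one inference
cnn = cnn_energy_pj(macs)
snn = snn_energy_pj(macs, time_steps=8, spike_rate=0.04)
ratio = cnn / snn                # relative energy advantage of the SNN
```

With 8 time-steps and a 4 % spike rate, the ratio comes out near 16×, the same order as the paper's reported savings; the advantage scales inversely with spike rate and time-step count.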
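The module-level pruning step can be sketched as a rank-and-drop over importance scores. The scoring function and module names here are hypothetical; the paper ranks whole SqueezeNet fire-modules by their contribution to the loss gradient and re-trains the survivors.

```python
def prune_modules(importance, keep_ratio=0.8):
    """Keep the most important whole modules, drop the rest.

    importance: dict mapping module name -> importance score (e.g. accumulated
    loss-gradient magnitude, as a stand-in for the paper's criterion).
    Returns the set of module names to keep; the pruned network is then
    re-trained to recover accuracy.
    """
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    n_keep = max(1, round(len(ranked) * keep_ratio))
    return {name for name, _ in ranked[:n_keep]}
```

Because entire modules are removed rather than individual weights, the resulting network stays dense within each surviving module, which keeps it hardware-friendly without requiring sparse-kernel support.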
Results & Findings
| Model (Dataset) | Top‑1 Accuracy | Params (M) | Energy (relative to CNN) |
|---|---|---|---|
| CNN‑SqueezeNet (CIFAR‑10) | 84.2 % | 1.24 | 1.0× |
| SNN‑SqueezeNet (CIFAR‑10) | 82.9 % | 1.24 | 0.06× (≈ 16× lower) |
| SNN‑SqueezeNet‑P (CIFAR‑10) | 88.0 % | 1.00 (‑19 %) | 0.012× (‑88 % energy) |
Across all datasets, the best lightweight SNN reaches up to a 15.7× energy gain versus its CNN counterpart.
- Energy efficiency: All spiking variants achieve 5–15× lower estimated energy than their CNN counterparts.
- Accuracy trade‑off: The gap is modest (≤ 3 % for most models); pruning even closes the gap to within 1 % while boosting accuracy.
- SqueezeNet wins: Across all three benchmarks, the spiking SqueezeNet family consistently outperforms the other lightweight SNNs in both accuracy and energy savings.
Practical Implications
- Edge devices: Developers targeting battery‑constrained platforms (IoT sensors, wearables, drones) can now consider spiking versions of mobile CNNs as drop‑in replacements without redesigning the entire model stack.
- Neuromorphic hardware: The spike‑driven sparsity maps naturally onto emerging event‑based processors (e.g., Intel Loihi, BrainChip Akida), allowing the same model to run on conventional GPUs for development and on neuromorphic chips for deployment.
- Model compression pipeline: The structured pruning approach is straightforward to integrate into existing CI/CD pipelines—prune, fine‑tune, and export a low‑energy SNN artifact ready for on‑device inference.
- Rapid prototyping: Because the conversion uses a unified training recipe, teams can experiment with multiple lightweight backbones (ShuffleNet, MixNet, etc.) and instantly compare their energy‑accuracy curves.
Limitations & Future Work
- Energy estimation is simulation‑based; real‑world power measurements on actual neuromorphic chips could differ.
- The study focuses on image classification; extending the benchmark to detection, segmentation, or time‑series tasks remains open.
- Surrogate gradient training still requires multiple time‑steps (e.g., 8–16), which adds latency; exploring ultra‑low‑step or event‑driven training could further shrink inference time.
- The pruning strategy is module‑level only; finer‑grained weight pruning or quantization could push energy savings even further.
Authors
- Radib Bin Kabir
- Tawsif Tashwar Dipto
- Mehedi Ahamed
- Sabbir Ahmed
- Md Hasanul Kabir
Paper Information
- arXiv ID: 2602.09717v1
- Categories: cs.CV, cs.AI, cs.ET, cs.NE
- Published: February 10, 2026