[Paper] General Self-Prediction Enhancement for Spiking Neurons

Published: January 29, 2026 at 10:08 AM EST
4 min read
Source: arXiv - 2601.21823v1

Overview

The paper introduces General Self‑Prediction Enhancement (GSPE) – a plug‑and‑play modification for spiking neurons that equips each neuron with an internal “prediction current” derived from its recent input‑output activity. By letting the neuron anticipate its own firing, GSPE creates a smooth gradient path that eases training while staying faithful to known cortical mechanisms such as distal dendritic modulation and error‑driven plasticity. The result is a simple, biologically‑inspired upgrade that consistently lifts the accuracy and stability of Spiking Neural Networks (SNNs) across a range of architectures and tasks.

Key Contributions

  • Self‑prediction current: A novel internal signal generated from a neuron’s recent spike history that modulates its membrane potential.
  • Continuous gradient flow: The prediction current provides a differentiable pathway, mitigating the vanishing‑gradient problem that plagues conventional SNN training.
  • Biological plausibility: The mechanism mirrors distal dendritic inputs and predictive coding observed in cortical circuits, bridging the gap between engineering performance and neuroscience realism.
  • Architecture‑agnostic: GSPE can be attached to any spiking neuron model (e.g., LIF, IF, adaptive neurons) and works with feed‑forward, recurrent, and convolutional SNNs.
  • Broad empirical validation: Experiments on image classification (CIFAR‑10/100, ImageNet), neuromorphic event‑based datasets (DVS‑Gesture), and reinforcement learning benchmarks demonstrate consistent accuracy gains with minimal extra compute.

Methodology

  1. Prediction Module

    • For each neuron, a lightweight recurrent filter (e.g., an exponential moving average) processes the past $k$ input spikes and the neuron’s own output spikes.
    • The filter outputs a scalar prediction current $p_t$ that is added to the membrane potential before the usual threshold check.
  2. Membrane Update Equation

$$V_{t+1} = \alpha V_t + I_t + \beta p_t - V_{\text{th}} \cdot s_t$$

where $I_t$ is the external synaptic input, $\alpha$ the membrane decay factor, $\beta$ a scaling hyper-parameter, and $s_t$ the spike emitted at time $t$; the final term softly resets the membrane by $V_{\text{th}}$ whenever the neuron fires.

  3. Training Pipeline

    • Standard surrogate‑gradient backpropagation is retained, but the presence of $p_t$ supplies a continuous derivative with respect to past spikes, reducing gradient sparsity.
    • No extra loss terms are required; the network learns to exploit the predictive signal automatically.
  4. Implementation Details

    • The prediction filter adds only a few arithmetic operations per neuron (≈1–2 FLOPs) and a small state vector (the filter’s hidden state).
    • The method is compatible with existing SNN frameworks (e.g., BindsNET, Norse, SpikingJelly) and can be toggled on/off via a single flag.
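The update rule above can be sketched for a single neuron in a few lines of NumPy. This is an illustrative reconstruction, not the paper's reference implementation: the exponential-moving-average filter over the neuron's own spikes, and all constants (`alpha`, `beta`, `v_th`, `ema_decay`), are assumptions chosen to match the description in the Methodology section.

```python
import numpy as np

def gspe_lif(inputs, alpha=0.9, beta=0.5, v_th=1.0, ema_decay=0.8):
    """Simulate one LIF neuron with a GSPE-style self-prediction current.

    inputs: 1-D array of synaptic input I_t per time step.
    p_t is an exponential moving average of the neuron's own recent
    spikes (an illustrative choice of lightweight recurrent filter).
    """
    v, p = 0.0, 0.0
    spikes = []
    for i_t in inputs:
        v = alpha * v + i_t + beta * p           # leak + input + prediction current
        s = 1.0 if v >= v_th else 0.0            # threshold check
        v -= v_th * s                            # soft reset on spike
        p = ema_decay * p + (1 - ema_decay) * s  # update self-prediction state
        spikes.append(s)
    return np.array(spikes)
```

Setting `beta=0.0` recovers a plain soft-reset LIF neuron, which mirrors the single-flag toggle the authors describe for turning GSPE on and off.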

Results & Findings

| Dataset / Task | Baseline SNN (top‑1) | +GSPE (top‑1) | Relative ↑ | Extra Ops / Neuron |
|---|---|---|---|---|
| CIFAR‑10 (VGG‑SNN, 4 steps) | 84.2 % | 87.6 % | +4.0 % | ~1 % |
| CIFAR‑100 (ResNet‑SNN, 6 steps) | 61.5 % | 65.9 % | +7.2 % | ~1 % |
| ImageNet (MobileNet‑SNN, 8 steps) | 68.1 % | 71.3 % | +4.7 % | |
| DVS‑Gesture (event‑based) | 96.3 % | 97.8 % | +1.5 % | |
| RL (CartPole, SNN‑actor) | 195 steps avg. | 212 steps | +8.7 % | |
  • Training stability: Loss curves converge 20–30 % faster, and accuracy variance across random seeds shrinks markedly.
  • Energy impact: Because the prediction current is computed locally and adds negligible arithmetic, the overall spike‑based energy budget remains essentially unchanged.
  • Compatibility: Gains persist when swapping LIF for Adaptive LIF, when using spiking transformers, or when reducing the number of time steps to as low as 2.

Practical Implications

  • Easier SNN adoption: Developers can now train deeper or more complex SNNs without wrestling with exploding/vanishing gradients, lowering the barrier to entry for event‑driven AI on edge devices.
  • Hardware friendliness: The extra state is a single scalar per neuron, which maps cleanly onto neuromorphic chips (e.g., Intel Loihi, IBM TrueNorth) that already support per‑neuron registers.
  • Improved inference accuracy at low latency: Since GSPE works even with very few time steps, systems that need sub‑millisecond reaction times (autonomous drones, tactile robots) can benefit from higher classification performance without sacrificing speed.
  • Biologically plausible AI: By aligning SNN training with predictive coding, the method opens avenues for hybrid models that combine deep learning performance with neuroscientific interpretability—useful for brain‑computer interfaces and cognitive modeling.

Limitations & Future Work

  • Hyper‑parameter sensitivity: The scaling factor $\beta$ and filter window $k$ need modest tuning for each new architecture; automated search could streamline this.
  • Memory overhead on ultra‑low‑power chips: Although minimal, storing an extra state per neuron may be non‑trivial for extremely constrained silicon where every bit counts.
  • Theoretical analysis: The paper provides empirical evidence of gradient smoothing but a formal proof of convergence or optimality remains open.
  • Extension to unsupervised / continual learning: Future research could explore how self‑prediction interacts with spike‑timing‑dependent plasticity rules in lifelong learning scenarios.

Bottom line: GSPE offers a straightforward, biologically inspired tweak that makes spiking networks easier to train and more accurate, all while keeping the energy‑efficiency that makes SNNs attractive for next‑generation edge AI. For developers looking to experiment with event‑driven models or to port deep‑learning workloads onto neuromorphic hardware, adding a self‑prediction current could be the missing piece that bridges performance and plausibility.

Authors

  • Zihan Huang
  • Zijie Xu
  • Yihan Huang
  • Shanshan Jia
  • Tong Bu
  • Yiting Dong
  • Wenxuan Liu
  • Jianhao Ding
  • Zhaofei Yu
  • Tiejun Huang

Paper Information

  • arXiv ID: 2601.21823v1
  • Categories: cs.NE
  • Published: January 29, 2026