[Paper] Three factor delay learning rules for spiking neural networks
Source: arXiv - 2601.00668v1
Overview
The paper introduces a new way to train spiking neural networks (SNNs) by learning both synaptic weights and the timing delays of spikes. Using online three‑factor learning rules, the authors achieve substantial accuracy gains on temporal tasks while dramatically shrinking model size and inference latency—making SNNs far more attractive for low‑power, real‑time neuromorphic hardware.
Key Contributions
- Delay‑augmented LIF neurons – Extends the classic leaky‑integrate‑and‑fire (LIF) model with learnable synaptic and axonal delays for feed‑forward and recurrent architectures.
- Three‑factor online learning rule – Combines a locally computed eligibility trace (via a smooth Gaussian surrogate for the spike derivative) with a top‑down error signal to update both weights and delays in real time.
- Empirical gains – Demonstrates up to 20 % higher accuracy than weight‑only baselines, and up to a further 14 % improvement when weights and delays are learned jointly, at comparable parameter budgets.
- Competitive performance on SHD – Matches offline back‑propagation results on the Spiking Heidelberg Digits (SHD) benchmark while cutting model size by 6.6× and inference latency by 67 % (only a 2.4 % accuracy drop vs. state‑of‑the‑art).
- Hardware‑friendly design – Shows that on‑device, online learning of delays can reduce memory footprints and power consumption, a key requirement for neuromorphic processors.
Methodology
- Neuron model – Starts from the standard LIF neuron and adds two delay parameters:
  - Synaptic delay – Time between presynaptic spike emission and arrival at the postsynaptic membrane.
  - Axonal delay – Extra latency after the membrane crosses threshold, before the spike is emitted.
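The two delay parameters can be sketched in a minimal discrete-time simulation. This is an illustrative toy model, not the paper's exact formulation: class and parameter names (`DelayedLIF`, `syn_delays`, `axon_delay`) are assumptions, delays are held fixed as integers here, and the learnable/continuous treatment from the paper is omitted.

```python
class DelayedLIF:
    """Toy discrete-time LIF neuron with per-synapse synaptic delays and a
    shared axonal delay. Illustrative only; the paper's model may differ."""

    def __init__(self, weights, syn_delays, axon_delay, tau=0.9, threshold=1.0):
        self.w = list(weights)          # synaptic weights, one per input
        self.d_syn = list(syn_delays)   # integer synaptic delays (time steps)
        self.d_axon = axon_delay        # latency before an emitted spike leaves
        self.tau = tau                  # membrane leak factor per step
        self.threshold = threshold
        self.v = 0.0                    # membrane potential
        # ring buffer of delayed input currents, indexed by arrival step
        self.buffer = [0.0] * (max(self.d_syn) + 1)
        self.out_queue = []             # time steps at which spikes leave the axon
        self.t = 0

    def step(self, input_spikes):
        # schedule each presynaptic spike to arrive after its synaptic delay
        for i, s in enumerate(input_spikes):
            if s:
                self.buffer[(self.t + self.d_syn[i]) % len(self.buffer)] += self.w[i]
        # leaky integration of the current arriving at this step
        slot = self.t % len(self.buffer)
        self.v = self.tau * self.v + self.buffer[slot]
        self.buffer[slot] = 0.0
        if self.v >= self.threshold:
            self.v = 0.0                # reset on threshold crossing
            self.out_queue.append(self.t + self.d_axon)
        # the spike is only visible downstream after the axonal delay
        out = self.t in self.out_queue
        if out:
            self.out_queue.remove(self.t)
        self.t += 1
        return out
```

With a synaptic delay of 1 step and an axonal delay of 1 step, an input spike at t = 0 produces an output spike at t = 2: one step for arrival, one step for emission.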
- Eligibility trace – Each synapse maintains an eligibility trace that captures how past spikes influence the current membrane potential. The trace is computed using a Gaussian surrogate gradient that smooths the otherwise non‑differentiable spike function.
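The Gaussian surrogate and trace update can be sketched as follows. This is a generic low-pass-filtered eligibility trace, assuming a hand-tuned surrogate width `sigma` and decay factor; the paper's exact trace dynamics may differ.

```python
import math

def gaussian_surrogate(v, threshold=1.0, sigma=0.3):
    """Smooth stand-in for the derivative of the non-differentiable spike
    function: a Gaussian centered on the firing threshold. The width sigma
    is a hand-tuned hyperparameter (an assumed value here)."""
    z = (v - threshold) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def update_trace(trace, pre_spike, v, decay=0.9):
    """Leaky eligibility trace: decays past contributions and adds the
    presynaptic spike weighted by the surrogate sensitivity of the
    membrane at its current potential v."""
    return decay * trace + pre_spike * gaussian_surrogate(v)
```

The surrogate is largest when the membrane sits near threshold, so spikes arriving at those moments accumulate the most eligibility, which is exactly when a small weight or delay change is most likely to flip the output.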
- Three‑factor update – Parameter updates follow the classic three‑factor rule:
  - Factor 1 – Presynaptic activity (spike).
  - Factor 2 – Eligibility trace (local, time‑dependent sensitivity).
  - Factor 3 – Global error signal (e.g., difference between desired and actual output).
  The product of these three terms yields a weight or delay increment, allowing the network to adapt both synaptic strength and timing on the fly.
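The product of the three factors reduces to a one-line update. The function below is a generic sketch of that product; in the paper the weight and delay updates use their own eligibility terms, which this simplification glosses over.

```python
def three_factor_update(pre_spike, eligibility, error, lr=0.01):
    """Generic three-factor increment:
    Δparam = lr * (presynaptic activity) * (eligibility trace) * (global error).
    Applied here identically to a weight or a delay for illustration."""
    return lr * pre_spike * eligibility * error
```

Note the gating behavior: if the presynaptic neuron was silent (`pre_spike == 0`) or no error arrives (`error == 0`), the parameter is untouched, which is what makes the rule local and online.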
- Training regime – Experiments are run on event‑based datasets (including SHD) using online stochastic gradient descent; no offline back‑propagation through time is required, which keeps memory usage low.
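The memory advantage over back-propagation through time can be made concrete with a minimal online loop: only the current eligibility traces are stored, so memory stays proportional to the parameter count rather than growing with sequence length. This is an assumed skeleton (the `forward` callback, decay constant, and error signal are illustrative), not the paper's training code.

```python
def train_online(forward, params, spikes, targets, lr=0.01, decay=0.9):
    """Online three-factor training sketch over a spike stream.
    forward(x_t, params) -> scalar output at time t (caller-supplied).
    Memory footprint: one trace per parameter, independent of stream length."""
    traces = [0.0] * len(params)
    for x_t, y_t in zip(spikes, targets):
        out = forward(x_t, params)                    # forward pass at time t
        err = y_t - out                               # factor 3: global error
        for i, x_i in enumerate(x_t):
            traces[i] = decay * traces[i] + x_i       # factor 2: eligibility
            params[i] += lr * x_i * traces[i] * err   # three-factor increment
    return params
```

Because updates happen as each event arrives, the same loop can keep running on-device after deployment, which is the continual-learning scenario the paper targets.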
Results & Findings
| Dataset | Baseline (weights‑only) | +Learned Delays | Joint Weights + Delays | Offline BPTT (state‑of‑the‑art) |
|---|---|---|---|---|
| SHD (speech) | 71.2 % | 84.5 % (+13.3 %) | 86.9 % (+15.7 %) | 89.3 % (≈2.4 % higher) |
| Other temporal benchmarks | 58 % → 68 % | 68 % → 78 % | 78 % → 84 % | — |
- Model size: Delay‑augmented networks achieve the same or higher accuracy with ≈15 % of the parameters of comparable BPTT‑trained SNNs.
- Latency: Because delays are learned directly in the forward pass, inference latency is ~67 % lower than that of the offline‑trained counterparts.
- Stability: The three‑factor rule remains stable across both feed‑forward and recurrent topologies, showing that delay learning scales to more complex dynamics.
Practical Implications
- Neuromorphic chips – Reducing memory and compute requirements directly translates to lower silicon area and power draw, enabling edge devices (e.g., wearables, IoT sensors) to run sophisticated temporal pattern recognizers locally.
- On‑device continual learning – Since the learning rule is online, devices can adapt to new sound signatures, sensor drift, or user‑specific patterns without offloading data to the cloud.
- Temporal data processing – Applications such as speech command recognition, event‑camera vision, and bio‑signal classification can benefit from the added temporal precision that learned delays provide.
- Simplified software stacks – The method avoids back‑propagation‑through‑time, meaning existing spiking frameworks (e.g., BindsNET, Norse, SpykeTorch) can implement the rule with minimal changes, accelerating adoption.
Limitations & Future Work
- Surrogate gradient dependence – The Gaussian surrogate is hand‑tuned; its shape may affect convergence speed and final accuracy, suggesting a need for systematic exploration of surrogate families.
- Scalability to large‑scale vision tasks – Experiments focus on temporal/audio benchmarks; extending delay learning to high‑resolution event‑camera datasets remains an open challenge.
- Hardware validation – While the paper reports theoretical latency and size gains, a full silicon implementation (e.g., on Loihi or a custom ASIC) would be required to confirm real‑world energy savings.
- Delay range constraints – Physical hardware imposes limits on how fine‑grained delays can be represented; future work should investigate quantization effects and hardware‑aware delay encoding.
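To make the quantization concern concrete, a hardware back end would snap each learned continuous delay onto a coarse time grid and clamp it to a representable range. The helper below is purely illustrative; `step` and `d_max` are hypothetical hardware constraints, not values from the paper.

```python
def quantize_delay(d, step=0.5, d_max=8.0):
    """Snap a learned continuous delay (in ms, say) to a hardware time grid
    of resolution `step` and clamp it to [0, d_max]. Hypothetical constraint
    values chosen for illustration only."""
    return min(max(round(d / step) * step, 0.0), d_max)
```

The gap between a learned delay and its quantized value (e.g., 3.26 → 3.5) is exactly the kind of rounding error whose accuracy impact the authors flag as future work.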
Bottom line: By teaching spiking networks when to fire, not just how strongly to fire, Vassallo and Taherinejad open a practical path toward compact, low‑latency, and continuously learning neuromorphic systems—an exciting development for developers building the next generation of edge AI.
Authors
- Luke Vassallo
- Nima Taherinejad
Paper Information
- arXiv ID: 2601.00668v1
- Categories: cs.NE, cs.LG
- Published: January 2, 2026