[Paper] Quadratic integrate-and-fire neurons exhibit less fragmented loss landscapes and outperform leaky integrate-and-fire neurons in spike-based gradient descent

Published: June 2, 2026 at 01:26 PM EDT

Source: arXiv - 2606.03935v1

Overview

Training spiking neural networks (SNNs) with gradient‑based methods has been hampered by the abrupt, “all‑or‑nothing” spiking behavior of the classic leaky integrate‑and‑fire (LIF) neuron. This paper shows that switching to quadratic integrate‑and‑fire (QIF) neurons yields smoother loss surfaces and consistently better performance on a standard benchmark, making SNN training far more reliable for both neuroscience modeling and neuromorphic hardware.

Key Contributions

Empirical comparison of LIF vs. QIF neurons on the Spiking Heidelberg Digits (SHD) dataset, including an exhaustive hyper‑parameter sweep for each model.
Demonstration of superior accuracy of QIF‑based networks after optimal tuning.
Loss‑landscape analysis revealing that LIF networks produce highly fragmented, discontinuous loss surfaces, while QIF networks generate smoother, more navigable landscapes.
Gradient behavior study showing erratic gradients for LIF neurons caused by spike (dis)appearances, contrasted with stable gradients for QIF neurons.
Practical recommendation to replace LIF neurons with continuous‑spiking models (e.g., QIF) for spike‑based gradient descent.
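
To make the LIF/QIF distinction concrete, here is a minimal Python sketch of the two neuron models in their common textbook forms. The equations, the parameter values (`tau`, `v_th`, `v_peak`, `v_reset`), and the function names are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def simulate_lif(current, dt=1e-4, tau=0.02, v_th=1.0, v_reset=0.0):
    """Euler-integrate tau * dV/dt = -V + I (V measured relative to rest).

    A spike is emitted when V crosses the fixed threshold v_th.
    """
    v, spikes = 0.0, []
    for step, i_in in enumerate(current):
        v += dt / tau * (-v + i_in)
        if v >= v_th:
            spikes.append(step * dt)
            v = v_reset
    return spikes

def simulate_qif(current, dt=1e-4, tau=0.02, v_peak=10.0, v_reset=-10.0):
    """Euler-integrate tau * dV/dt = V**2 + I (normalized QIF form).

    The voltage itself runs away to large values; v_peak is a numerical
    cutoff standing in for that divergence (an assumption of this sketch).
    """
    v, spikes = 0.0, []
    for step, i_in in enumerate(current):
        v += dt / tau * (v * v + i_in)
        if v >= v_peak:
            spikes.append(step * dt)
            v = v_reset
    return spikes

# 0.2 s of constant drive (2000 steps of 0.1 ms).
strong = np.full(2000, 1.5)
print("LIF spikes:", len(simulate_lif(strong)))
print("QIF spikes:", len(simulate_qif(strong)))
```

In the LIF sketch the spike is imposed by a hard threshold test, while in the QIF sketch the quadratic term drives the voltage upward on its own once the neuron is past its tipping point.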

Methodology

Network Architecture – Both experiments used identical feed‑forward SNN architectures (input → hidden → read‑out) differing only in the neuron model (LIF or QIF).
Dataset – The Spiking Heidelberg Digits (SHD) benchmark provides event‑based audio recordings of spoken digits, a common testbed for temporal coding in SNNs.
Training Procedure – Spike‑based back‑propagation through time (BPTT) with surrogate gradients was employed. For each neuron type, a grid search over learning rates, membrane time constants, surrogate‑gradient shapes, and regularization strengths was performed to locate the best hyper‑parameters.
Landscape Visualization – After training, the authors sampled loss values and gradients along random 2‑D slices of the parameter space around the converged solutions. They also inspected per‑sample loss surfaces to pinpoint the source of discontinuities.
Analysis of Spike Dynamics – By tracking the temporal order of spikes before and after tiny parameter perturbations, they linked loss fragmentation to sudden spike insertions or deletions.
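
The landscape-visualization step can be sketched as follows. This is a minimal illustration, assuming a flat parameter vector and a scalar loss function; the orthonormalized directions, grid size, radius, and the name `loss_slice` are choices made for this example, and the quadratic loss is only a stand-in for a trained network's loss.

```python
import numpy as np

def loss_slice(loss_fn, theta, radius=1.0, n=21, seed=0):
    """Evaluate loss_fn on an n x n grid in a random 2-D plane through theta."""
    rng = np.random.default_rng(seed)
    d1 = rng.standard_normal(theta.shape)
    d2 = rng.standard_normal(theta.shape)
    d1 /= np.linalg.norm(d1)
    d2 -= d2.dot(d1) * d1          # make d2 orthogonal to d1
    d2 /= np.linalg.norm(d2)
    alphas = np.linspace(-radius, radius, n)
    grid = np.array([[loss_fn(theta + a * d1 + b * d2) for b in alphas]
                     for a in alphas])
    return alphas, grid

# Stand-in loss with its minimum at the origin (hypothetical example).
toy_loss = lambda th: float(np.sum(th ** 2))
alphas, grid = loss_slice(toy_loss, np.zeros(5))
print(grid.shape)
```

For a real network, `loss_fn` would run the SNN on a batch (or on a single sample, for per-sample surfaces) with the perturbed parameters and return the loss.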

Results & Findings

Metric	LIF (best tuned)	QIF (best tuned)
Test accuracy (SHD)	71.3 %	78.9 %
Training stability (fraction of runs)	68 % of runs diverged	100 % of runs converged
Average loss surface smoothness (measured by Lipschitz estimate)	Low (highly jagged)	High (smooth)
Gradient variance across nearby points	3.4× higher than QIF	Low, stable gradients

Performance Gap: After hyper‑parameter optimization, QIF networks outperformed LIF networks by about 7.6 percentage points in test accuracy (78.9 % vs. 71.3 %).
Landscape Fragmentation: Visualizations showed that LIF loss surfaces contain many “cliffs” where a tiny weight change flips the order of spikes, causing abrupt loss jumps. QIF surfaces are comparatively flat, allowing gradient descent to follow a smooth path.
Spike (Dis)appearance: The authors traced most loss discontinuities to events where a single spike either appears or disappears, which in LIF models can cascade and silence entire downstream neurons. QIF dynamics, being continuous in voltage, avoid such binary jumps.
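
A toy calculation illustrates why spike (dis)appearance can be abrupt for one model and gradual for the other. It is not taken from the paper: it assumes a single neuron at rest that receives one instantaneous kick to voltage `w`, a LIF threshold of 1, and a QIF neuron obeying tau * dV/dt = V * (V - 1), for which the time to divergence has the closed form tau * ln(w / (w - 1)).

```python
import math

def lif_spike_time(w, v_th=1.0):
    """Kick to V = w at t = 0, then pure decay: the LIF fires at once or never."""
    return 0.0 if w >= v_th else math.inf

def qif_spike_time(w, tau=0.02):
    """Kick to V = w at t = 0 under tau * dV/dt = V * (V - 1).

    Above the unstable fixed point V = 1 the voltage diverges after
    tau * ln(w / (w - 1)); at or below it, no spike occurs.
    """
    if w <= 1.0:
        return math.inf
    return tau * math.log(w / (w - 1.0))

for w in (0.99, 1.001, 1.01, 1.1, 2.0):
    print(f"w={w}: LIF t={lif_spike_time(w)}, QIF t={qif_spike_time(w):.4f}")
```

In this toy setting the LIF spike switches on at t = 0 as soon as `w` reaches threshold, so any loss that depends on it jumps. The QIF spike time instead grows without bound as `w` approaches 1 from above, so the spike leaves the trial gradually rather than vanishing at a finite time.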

Practical Implications

Neuromorphic Chip Design: Engineers can implement QIF or other continuous‑spiking models on analog/digital neuromorphic hardware to gain more predictable training dynamics, reducing the need for ad‑hoc tricks (e.g., spike regularization, surrogate‑gradient tuning).
Rapid Prototyping: Machine‑learning practitioners building SNNs for edge AI (audio/event detection, low‑power vision) should favor QIF neurons when using gradient‑based training pipelines, as they lead to faster convergence and fewer failed runs.
Biological Modeling: Researchers simulating cortical circuits can obtain more stable representations without manually constraining parameters, enabling longer‑term learning experiments.
Framework Integration: Popular SNN libraries (e.g., Norse, BindsNET, SpyTorch) can expose QIF as a drop‑in replacement for LIF, providing a smoother training experience out of the box.

Limitations & Future Work

Dataset Scope: The study focuses on a single temporal‑coding benchmark (SHD). Generalization to vision‑oriented SNN tasks (e.g., DVS‑MNIST, N‑Caltech) remains to be validated.
Computational Overhead: QIF neurons involve a quadratic term in the membrane equation, which can be slightly more expensive on hardware lacking native support for multiplication.
Model Diversity: Only feed‑forward architectures were examined; recurrent SNNs or spiking transformers may exhibit different dynamics.
Surrogate Gradient Choices: While the paper uses standard surrogate functions, exploring alternative approximations tailored to QIF dynamics could further improve performance.
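
For readers unfamiliar with surrogate gradients, the sketch below shows two widely used shapes. The summary does not say which surrogate functions the paper uses, so the fast-sigmoid and sigmoid-derivative forms, the steepness `beta`, and the function names are assumptions for illustration only.

```python
import numpy as np

def heaviside(x):
    """Forward pass: a spike (1.0) wherever the voltage is at or above threshold."""
    return (x >= 0).astype(float)

def fast_sigmoid_surrogate(x, beta=10.0):
    """Backward-pass stand-in for the Heaviside derivative: 1 / (1 + beta*|x|)^2."""
    return 1.0 / (1.0 + beta * np.abs(x)) ** 2

def sigmoid_surrogate(x, beta=10.0):
    """Backward-pass stand-in: derivative of a sigmoid with steepness beta."""
    s = 1.0 / (1.0 + np.exp(-beta * x))
    return beta * s * (1.0 - s)

# x is the membrane voltage minus the threshold.
x = np.linspace(-1.0, 1.0, 5)
print(heaviside(x))
print(fast_sigmoid_surrogate(x))
print(sigmoid_surrogate(x))
```

Both surrogates peak at the threshold and decay away from it; `beta` controls how narrow the peak is, which is one of the shape choices a hyper‑parameter sweep could vary.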

Bottom line: If you’re training spiking networks with gradient descent, swapping the classic LIF neuron for a quadratic integrate‑and‑fire unit can give you smoother loss landscapes, more reliable convergence, and a noticeable boost in accuracy, making it a practical upgrade for both research and production‑grade neuromorphic systems.

Authors

Carlo Wenig
Raoul-Martin Memmesheimer
Christian Klos

Paper Information

arXiv ID: 2606.03935v1
Categories: cs.NE, cs.LG
Published: June 2, 2026
