[Paper] Linearized Bregman Iterations for Sparse Spiking Neural Networks

Published: March 17, 2026 at 08:48 AM EDT
4 min read
Source: arXiv - 2603.16462v1

Overview

The paper proposes Linearized Bregman Iterations (LBI) as a new optimizer for training Spiking Neural Networks (SNNs). By integrating a sparsity‑inducing regularizer directly into the training loop, the authors achieve roughly 50 % fewer active synapses without sacrificing classification accuracy on standard neuromorphic benchmarks.

Key Contributions

  • Introduces LBI for SNNs: Adapts the Linearized Bregman Iteration framework—originally used in convex optimization—to the non‑convex training of spiking networks.
  • AdaBreg optimizer: Extends LBI with momentum and bias‑correction (analogous to Adam) to improve convergence speed and stability.
  • Sparse SNN models: Demonstrates that LBI can automatically prune away unnecessary weights, yielding models with half the active parameters while preserving performance.
  • Comprehensive empirical evaluation: Experiments on three neuromorphic datasets (SHD, SSC, PSMNIST) show parity with Adam‑trained baselines in terms of accuracy and superior parameter efficiency.
  • Open‑source implementation: The authors release code and training scripts, facilitating reproducibility and adoption by the community.

Methodology

  1. Problem formulation: Training an SNN is cast as minimizing a loss function plus an ℓ₁‑type sparsity term, encouraging many weights to become exactly zero.
  2. Linearized Bregman Iterations:
    • At each step, compute the gradient of the loss (as in standard back‑propagation through time).
    • Accumulate the (scaled) negative gradient in a dual variable, which integrates gradient information across iterations.
    • Recover the primal weight vector by applying a proximal soft‑thresholding operator to the dual variable, so any weight whose accumulated dual value stays small remains exactly zero.
  3. AdaBreg: Mirrors Adam’s adaptive learning rates and momentum but operates on the Bregman dual variables, providing bias‑correction and smoother convergence.
  4. Training pipeline: The authors use surrogate gradient methods for spike‑based back‑propagation, plug in LBI/AdaBreg as the optimizer, and monitor the active‑parameter ratio (non‑zero weights) throughout training.
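The linearized Bregman step described above can be sketched in a few lines. This is a minimal illustration on a toy quadratic loss, not the paper's implementation; the function names and hyperparameter values (`tau`, `lam`, `delta`) are ours.

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal map of lam * ||.||_1: shrinks each entry toward zero."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def lbi_step(w, v, grad, tau=0.5, lam=0.3, delta=1.0):
    """One linearized Bregman step (illustrative sketch).

    v is the dual variable that accumulates scaled negative gradients;
    w is recovered from v by soft-thresholding, so entries whose dual
    value stays inside [-lam, lam] remain exactly zero.
    """
    v = v - tau * grad                   # dual update from the loss gradient
    w = delta * soft_threshold(v, lam)   # primal update; small |v| gives w = 0
    return w, v

# Toy check on the quadratic loss 0.5 * ||w - w_star||^2 with a sparse target.
w_star = np.array([2.0, 0.0, -1.5, 0.0])
w = np.zeros_like(w_star)
v = np.zeros_like(w_star)
for _ in range(200):
    grad = w - w_star                    # gradient of the toy loss
    w, v = lbi_step(w, v, grad)
```

On this toy problem the iterates converge to the sparse target, and the entries of `w_star` that are zero stay exactly zero throughout, which is the mechanism the paper exploits for pruning.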

The approach requires only minor modifications to existing SNN training codebases, making it developer‑friendly.
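AdaBreg, which the paper describes as mirroring Adam's adaptive learning rates and momentum on the dual variables, might look roughly like the following sketch. The moment update and bias correction follow Adam's standard form; applying the rescaled step to the dual variable rather than the weights is our reading of the paper's description, and all names and constants are illustrative.

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal map of lam * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def adabreg_step(w, v, m, s, grad, t, tau=0.05, lam=0.3,
                 beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaBreg-style step (sketch): Adam-like moment estimates and
    bias correction rescale the gradient, but the resulting update is
    applied to the Bregman dual variable v, not to w directly."""
    m = beta1 * m + (1 - beta1) * grad         # first moment (momentum)
    s = beta2 * s + (1 - beta2) * grad ** 2    # second moment
    m_hat = m / (1 - beta1 ** t)               # bias-corrected moments
    s_hat = s / (1 - beta2 ** t)
    v = v - tau * m_hat / (np.sqrt(s_hat) + eps)  # adaptive dual update
    w = soft_threshold(v, lam)                    # primal via shrinkage
    return w, v, m, s

# Sanity check on the toy loss 0.5 * ||w - w_star||^2 with a sparse target.
w_star = np.array([2.0, 0.0, -1.5, 0.0])
w = np.zeros_like(w_star); v = np.zeros_like(w_star)
m = np.zeros_like(w_star); s = np.zeros_like(w_star)
for t in range(1, 501):
    grad = w - w_star
    w, v, m, s = adabreg_step(w, v, m, s, grad, t)
```

As with plain LBI, components whose gradient stays zero never move off zero, so the sparsity mechanism is preserved under the adaptive rescaling.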

Results & Findings

| Dataset | Baseline accuracy (Adam) | Accuracy (LBI / AdaBreg) | Active‑parameter reduction |
|---|---|---|---|
| SHD (speech) | 78.3 % | 77.9 % | ≈ 52 % |
| SSC (speech commands) | 92.1 % | 91.8 % | ≈ 48 % |
| PSMNIST (permuted MNIST) | 96.4 % | 96.0 % | ≈ 50 % |
  • Accuracy: Within 0.5 % of Adam across all tasks, confirming that sparsity does not degrade performance.
  • Sparsity: Roughly half of the synaptic connections become zero, directly translating to lower memory footprint and fewer spike‑processing operations.
  • Convergence: AdaBreg reaches comparable loss values in a similar number of epochs to Adam, with slightly smoother loss curves due to the Bregman regularization.

These findings suggest that sparsity‑inducing methods from convex optimization can be applied effectively to the inherently non‑convex problem of training SNNs.

Practical Implications

  • Energy‑efficient inference: Fewer active weights mean fewer multiply‑accumulate (MAC) operations per timestep, which is critical for low‑power neuromorphic hardware (e.g., Loihi, TrueNorth).
  • Model deployment on edge devices: The reduced memory footprint enables SNNs to fit into tighter on‑chip SRAM budgets, opening doors for real‑time audio or sensor processing on micro‑controllers.
  • Simplified network design: Developers can start from a dense architecture and let LBI automatically prune it, avoiding manual sparsity heuristics or post‑hoc pruning pipelines.
  • Compatibility with existing frameworks: Since LBI is implemented as an optimizer plug‑in, it can be used with PyTorch‑based SNN libraries (e.g., BindsNET, Norse) without rewriting the network definition.

Overall, the technique offers a drop‑in replacement for Adam that yields leaner, hardware‑friendly SNNs while preserving the model’s predictive power.
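The active‑parameter ratio monitored during training, which drives the memory and MAC savings discussed above, can be computed directly from the weight tensors. A minimal sketch (the function name is ours; the paper's exact accounting may differ):

```python
import numpy as np

def active_ratio(weight_tensors):
    """Fraction of non-zero weights across all layers: a simple proxy
    for the memory footprint and per-timestep MAC count of a sparse SNN."""
    total = sum(w.size for w in weight_tensors)
    active = sum(int(np.count_nonzero(w)) for w in weight_tensors)
    return active / total

# Example: two layers with half of all entries pruned to exactly zero.
layers = [np.array([[1.0, 0.0], [0.0, 2.0]]),
          np.array([0.5, 0.0, 0.0, -0.5])]
ratio = active_ratio(layers)  # 4 active out of 8 entries -> 0.5
```

Because LBI drives weights to exactly zero (not merely small values), no magnitude threshold is needed for this count.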

Limitations & Future Work

  • Scalability to very large SNNs: Experiments are limited to medium‑scale benchmarks; it remains to be seen how LBI behaves on networks with millions of neurons/synapses.
  • Hardware‑specific validation: The paper reports theoretical sparsity benefits; actual energy savings on neuromorphic chips need empirical measurement.
  • Extension to other regularizers: The current ℓ₁ formulation induces unstructured sparsity; exploring structured (e.g., channel‑wise) sparsity could further improve hardware mapping.
  • Hybrid training regimes: Combining LBI with other compression techniques (quantization, low‑rank factorization) is an open avenue for even greater efficiency.

The authors suggest that future research will address these points and investigate LBI’s applicability to reinforcement‑learning‑style spiking agents.

Authors

  • Daniel Windhager
  • Bernhard A. Moser
  • Michael Lunglmayr

Paper Information

  • arXiv ID: 2603.16462v1
  • Categories: eess.SP, cs.NE
  • Published: March 17, 2026
  • PDF: Download PDF