[Paper] STEMNIST: Spiking Tactile Extended MNIST Neuromorphic Dataset

Published: January 4, 2026 at 03:26 PM EST
3 min read
Source: arXiv - 2601.01658v1

Overview

The authors present STEMNIST, a large‑scale neuromorphic tactile dataset that expands the original ST‑MNIST digit set to 35 alphanumeric classes (A–Z and 1–9). By capturing over 1 M spike events from a 16 × 16 tactile sensor array, the dataset offers a realistic, event‑driven benchmark for haptic perception in robotics, prosthetics, and other human‑machine interfaces.

Key Contributions

  • Dataset expansion – Extends the tactile benchmark from 10 digits to 35 alphanumeric symbols, matching the EMNIST visual protocol.
  • High‑resolution event encoding – 34 participants generated 7 700 samples, recorded at 120 Hz and transformed into 1 005 592 spike events via adaptive temporal differentiation.
  • Open‑source release – Full dataset, documentation, and baseline code are publicly available, encouraging reproducibility.
  • Baseline performance – Provides reference accuracies for conventional CNNs (90.91 %) and spiking neural networks (SNNs, 89.16 %).
  • Hardware‑friendly format – Event‑based representation aligns with neuromorphic chips (e.g., Loihi, TrueNorth), enabling low‑power tactile inference.

Methodology

  1. Data collection – A 16 × 16 pressure‑sensitive array (120 Hz) was mounted on a handheld probe. Thirty‑four volunteers traced each alphanumeric character on the sensor surface, producing raw pressure frames.
  2. Spike conversion – An adaptive temporal‑differencing algorithm detected significant changes in pressure over time, emitting a binary “spike” (1) for each active pixel‑time pair. This yields a sparse, asynchronous event stream similar to neuromorphic vision sensors.
  3. Dataset split – Following EMNIST, the data were divided into training (≈ 6 200 samples) and test (≈ 1 500 samples) sets, preserving participant diversity across splits.
  4. Baseline models
    • CNN – Standard 2‑D convolutions applied to a frame‑based accumulation of spikes.
    • SNN – Leaky‑integrate‑and‑fire neurons trained with surrogate‑gradient back‑propagation on the raw event stream.
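As a rough illustration of the spike‑conversion step (step 2 above), a temporal‑differencing encoder can be sketched in a few lines. This is a minimal sketch with a fixed threshold; the paper's adaptive variant modulates the threshold over time, and the function name and threshold value here are hypothetical, not taken from the released code.

```python
import numpy as np

def frames_to_spikes(frames, threshold=0.05):
    """Convert a (T, 16, 16) pressure sequence into spike events.

    A pixel emits a spike whenever its pressure change between
    consecutive frames exceeds `threshold` (fixed here; the paper's
    adaptive scheme varies it). Returns events as (t, y, x) triples.
    """
    diffs = np.abs(np.diff(frames, axis=0))    # frame-to-frame change
    t, y, x = np.nonzero(diffs > threshold)    # active pixel-time pairs
    return np.stack([t + 1, y, x], axis=1)     # sparse event list

# Example: a synthetic press with rising pressure on one pixel
frames = np.zeros((4, 16, 16))
frames[1:, 8, 8] = [0.2, 0.4, 0.9]
events = frames_to_spikes(frames)              # three events at (8, 8)
```

The result is the kind of sparse, asynchronous event stream described above, analogous to the output of a neuromorphic vision sensor.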

All preprocessing steps and hyper‑parameters are detailed in the accompanying repository.
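For intuition about the SNN baseline, the forward dynamics of a single leaky‑integrate‑and‑fire neuron can be sketched as follows. This is illustrative only: `tau` and `v_th` are assumed values, and the surrogate‑gradient backpropagation the paper uses for training is not shown.

```python
def lif_forward(input_spikes, tau=0.9, v_th=1.5):
    """Leaky integrate-and-fire dynamics over a binary spike train.

    The membrane potential leaks by factor `tau`, integrates each
    incoming spike, and fires (then resets) when it crosses `v_th`.
    """
    v, out = 0.0, []
    for s in input_spikes:
        v = tau * v + s          # leak, then integrate input
        fired = v >= v_th
        out.append(int(fired))
        if fired:
            v = 0.0              # reset membrane after a spike
    return out

out = lif_forward([1, 0, 1, 1, 0])
# The neuron fires only once it has integrated enough recent input.
```

Because the neuron only updates on incoming events, computation scales with spike count rather than frame rate, which is the source of the energy savings discussed below.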

Results & Findings

Model                                    Test Accuracy
Conventional CNN (frame‑based)           90.91 %
Spiking Neural Network (event‑based)     89.16 %

Interpretation

  • The modest gap (≈1.75 percentage points) shows that SNNs can approach CNN performance while preserving the energy‑saving benefits of event‑driven processing.
  • Misclassifications are concentrated among visually similar characters (e.g., “O” vs. “0”, “I” vs. “1”), indicating that tactile shape discrimination still faces ambiguity that could be mitigated with richer temporal cues or multimodal sensing.

Practical Implications

  • Robotic manipulation – Robots equipped with neuromorphic tactile skins can now recognize alphanumeric labels on objects (e.g., tool IDs, medication packaging) without vision, enabling operation in low‑light or occluded environments.
  • Prosthetic feedback – SNN‑based controllers can decode user‑drawn symbols on a prosthetic fingertip, opening avenues for on‑the‑fly command entry without external devices.
  • Edge AI hardware – The spike‑based format is ready for deployment on low‑power neuromorphic processors, allowing continuous tactile monitoring with milliwatt‑scale energy budgets.
  • Human‑machine interfaces – Touch‑based password entry or gesture vocabularies become feasible on devices that prioritize privacy (no camera) and power efficiency.

Developers can plug the dataset into existing neuromorphic frameworks (e.g., Lava, BindsNET) to benchmark custom learning rules, hardware accelerators, or hybrid CNN‑SNN pipelines.
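As a minimal starting point for such a pipeline, events can be binned into time slices before being fed to a model. The sketch below assumes events arrive as (t, y, x) triples; the dataset's actual file format is documented in the repository, and the function name and binning scheme here are illustrative.

```python
import numpy as np

def bin_events(events, n_bins=10, t_max=None, shape=(16, 16)):
    """Bin (t, y, x) events into `n_bins` time slices.

    Produces an (n_bins, 16, 16) tensor usable by either a CNN
    (summed over the time axis) or an SNN (one slice per timestep).
    """
    t_max = t_max or events[:, 0].max() + 1
    tensor = np.zeros((n_bins, *shape))
    for t, y, x in events:
        b = min(int(t * n_bins / t_max), n_bins - 1)
        tensor[b, y, x] += 1
    return tensor

events = np.array([[0, 8, 8], [5, 8, 8], [9, 7, 8]])
tensor = bin_events(events, n_bins=2, t_max=10)
```

Summing `tensor` over its first axis recovers the frame‑based accumulation used by the CNN baseline, so one loader can serve both model families.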

Limitations & Future Work

  • Sensor geometry – The 16 × 16 grid limits spatial resolution; scaling to larger arrays may uncover new challenges.
  • User variability – While 34 participants provide diversity, real‑world deployments will encounter broader pressure ranges and hand dynamics.
  • Temporal richness – The current adaptive differentiation compresses some fine‑grained timing information; future versions could retain higher‑frequency events to exploit the full potential of SNNs.
  • Multimodal fusion – Combining tactile spikes with vision or auditory cues is left for subsequent studies, promising more robust object identification.

Overall, STEMNIST fills a critical gap in neuromorphic tactile research and offers a solid foundation for energy‑efficient haptic perception in next‑generation interactive systems.

Authors

  • Anubhab Tripathi
  • Li Gaishan
  • Zhengnan Fu
  • Chiara Bartolozzi
  • Bert E. Shi
  • Arindam Basu

Paper Information

  • arXiv ID: 2601.01658v1
  • Categories: cs.NE
  • Published: January 4, 2026