[Paper] Self-motion as a structural prior for coherent and robust formation of cognitive maps

Published: December 22, 2025 at 11:28 PM EST
4 min read
Source: arXiv - 2512.20044v1

Overview

This paper challenges the prevailing view that cognitive maps rely mainly on external sensory cues, proposing instead that self‑motion (the animal’s own movement) can serve as a structural prior that actively shapes and stabilises spatial representations. By embedding a motion‑based prior into a predictive‑coding network that uses spiking‑style dynamics, the authors demonstrate more coherent and robust map formation—even when visual landmarks are noisy, missing, or conflicting.

Key Contributions

  • Motion‑based structural prior: Introduces a path‑integration module that acts as a scaffold for the learned map, rather than a simple incremental update.
  • Brain‑inspired recurrent architecture: Combines spiking dynamics, analog modulation, and adaptive thresholds to achieve high capacity with low computational overhead.
  • Robustness across challenging settings: Shows consistent improvements in topological fidelity and positional accuracy in highly aliased, dynamically changing, and naturalistic environments.
  • Zero‑shot generalisation: The motion prior encodes precise trajectories that transfer to unseen maps without retraining, outperforming naïve motion constraints.
  • Real‑world validation: Deploys the system on a quadrupedal robot, where the motion prior boosts landmark‑based navigation under real‑world sensory variability.
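The paper does not spell out its spiking-analog hybrid units here, but the general mechanism — discrete spikes driven by analog input, with an adaptive threshold that suppresses runaway activity — can be sketched with a leaky integrate-and-fire unit. All constants below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def run_hybrid_neuron(inputs, tau_v=0.9, tau_thr=0.95, thr0=1.0, beta=0.5):
    """Minimal leaky integrate-and-fire unit with an adaptive threshold.

    `inputs` is a 1-D array of analog drive values. The unit spikes when
    its membrane potential crosses the threshold; each spike then raises
    the threshold, which slowly relaxes back toward its baseline `thr0`.
    """
    v, thr = 0.0, thr0
    spikes = []
    for x in inputs:
        v = tau_v * v + x                  # leaky integration of analog drive
        s = 1.0 if v >= thr else 0.0       # discrete spike
        spikes.append(s)
        v -= s * thr                       # soft reset after a spike
        thr = tau_thr * thr + (1 - tau_thr) * thr0 + beta * s  # adaptation
    return np.array(spikes)
```

With the adaptation term (`beta > 0`), sustained strong drive yields a bounded spike rate instead of firing on every step — the "prevent runaway activity" property the authors describe.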

Methodology

  1. Predictive‑Coding Framework: The network predicts the next sensory observation and updates its internal state by minimising prediction error, mirroring theories of cortical inference.
  2. Path‑Integration Prior: A dedicated module integrates proprioceptive and vestibular‑like signals to generate a latent trajectory that constrains the map’s geometry.
  3. Spiking‑Analog Hybrid Neurons: Each recurrent unit emits discrete spikes whose rates are modulated by continuous analog signals; adaptive thresholds prevent runaway activity and keep the model size small.
  4. Training Regime: The system is trained end‑to‑end on simulated environments with deliberately corrupted or missing visual landmarks, encouraging reliance on the motion prior.
  5. Evaluation Suite: Benchmarks include (a) topological correctness (graph‑based metrics), (b) global positional error, and (c) next‑step prediction accuracy under varying levels of sensory ambiguity.
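Steps 1–2 can be sketched as a toy loop in which self-motion first propagates a latent 2-D position (the structural prior) and a landmark prediction error then corrects it. Everything here — the single landmark, the offset observation model, the correction `gain` — is an illustrative assumption, not the paper's actual architecture:

```python
import numpy as np

def path_integrate_predictive(motions, observations, landmark_pos, gain=0.2):
    """Toy 2-D predictive-coding loop with a path-integration prior.

    Each step: integrate the self-motion signal, predict the landmark
    offset from the current latent position, and descend the prediction
    error between observed and predicted offsets.
    """
    pos = np.zeros(2)
    trajectory = []
    for u, obs in zip(motions, observations):
        pos = pos + u                 # prior: integrate self-motion
        pred = landmark_pos - pos     # predicted landmark offset
        error = obs - pred            # sensory prediction error
        pos = pos - gain * error      # error-driven correction
        trajectory.append(pos.copy())
    return np.array(trajectory)
```

Even this toy version shows the core effect: with biased or noisy motion signals, the error-corrected estimate stays near the true trajectory while pure dead reckoning drifts away.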

Results & Findings

  • Stabilised Map Geometry: Adding the motion prior reduced global positional error by ≈30 % and improved topological consistency by ≈25 % across all test worlds.
  • Resilience to Sensory Degradation: When visual cues were down‑sampled to 10 % of their original fidelity, the prior‑augmented model maintained >80 % of its baseline performance, whereas a sensory‑only baseline collapsed below 50 %.
  • Zero‑Shot Transfer: Without any fine‑tuning, the model achieved comparable accuracy on completely new mazes, confirming that the motion prior captures environment‑independent geometric constraints.
  • Robot Demo: On a quadrupedal platform navigating a cluttered indoor arena, the motion‑prior system completed the task 1.8× faster and with 40 % fewer localisation failures than a conventional SLAM stack that relied solely on visual landmarks.
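Global positional error, the metric behind the ≈30 % figure, is commonly computed as mean Euclidean error after aligning the estimated map to ground truth. A translation-only variant is sketched below; the paper's exact alignment procedure may differ (e.g. it could also remove rotation and scale):

```python
import numpy as np

def global_positional_error(estimated, ground_truth):
    """Mean Euclidean error after removing the best-fit translation.

    Scores map geometry independently of where the map happens to be
    anchored: a rigidly shifted but otherwise perfect map scores zero.
    """
    offset = ground_truth.mean(axis=0) - estimated.mean(axis=0)
    return float(np.linalg.norm(estimated + offset - ground_truth, axis=1).mean())
```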

Practical Implications

  • More Reliable SLAM for Edge Devices: The motion prior can be implemented with minimal memory and compute, making it attractive for low‑power robots, drones, or AR headsets that must operate under intermittent visual input.
  • Improved Navigation in GPS‑Denied Settings: By treating self‑motion as a structural scaffold, autonomous vehicles can maintain coherent maps when GPS or LiDAR data are temporarily unavailable.
  • Hybrid Neuro‑Inspired Controllers: Developers can integrate the spiking‑analog recurrent block into existing deep‑learning pipelines, gaining the robustness of biological motion integration without sacrificing scalability.
  • Zero‑Shot Map Adaptation: The demonstrated generalisation suggests that a single pre‑trained motion prior could serve multiple robots or environments, reducing the need for costly data collection and re‑training.

Limitations & Future Work

  • Simplified Proprioception Model: The current path‑integration module assumes idealised, noise‑free self‑motion signals; real sensors (IMUs, wheel encoders) introduce drift that must be explicitly compensated.
  • Scalability to Large‑Scale Outdoor Maps: Experiments were confined to indoor‑scale arenas; extending the approach to city‑scale navigation will require hierarchical priors and long‑term memory mechanisms.
  • Biological Plausibility vs. Engineering Trade‑offs: While spiking dynamics reduce parameter count, they may complicate integration with mainstream deep‑learning frameworks; future work could explore software‑hardware co‑design (e.g., neuromorphic chips).
  • Multi‑Modal Fusion: The study focused on visual landmarks; incorporating auditory, tactile, or semantic cues could further enhance robustness and is a promising direction for follow‑up research.
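The drift problem in the first point can be made concrete with a one-dimensional dead-reckoning experiment: even a small constant sensor bias makes integrated position error grow roughly linearly with time, which is why real IMUs and wheel encoders need explicit compensation. The numbers below are illustrative, not measured sensor specs:

```python
import numpy as np

def dead_reckon_error(steps, bias=0.01, noise_std=0.02, seed=0):
    """Final position error from integrating biased, noisy 1-D odometry.

    True velocity is 1 unit/step; the sensor reads it with a constant
    bias plus Gaussian noise. The bias term accumulates linearly.
    """
    rng = np.random.default_rng(seed)
    true_v = np.ones(steps)
    measured = true_v + bias + rng.normal(0, noise_std, steps)
    return abs(measured.cumsum()[-1] - true_v.cumsum()[-1])
```

Running this for longer trajectories shows the bias dominating: error at 1000 steps is an order of magnitude larger than at 100 steps, whereas zero-bias noise alone only grows as the square root of the step count.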

Authors

  • Yingchao Yu
  • Pengfei Sun
  • Yaochu Jin
  • Kuangrong Hao
  • Hao Zhang
  • Yifeng Zhang
  • Wenxuan Pan
  • Wei Chen
  • Danyal Akarca
  • Yuchen Xiao

Paper Information

  • arXiv ID: 2512.20044v1
  • Categories: q-bio.NC, cs.NE
  • Published: December 23, 2025