[Paper] Self-motion as a structural prior for coherent and robust formation of cognitive maps
Source: arXiv - 2512.20044v1
Overview
This paper challenges the prevailing view that cognitive maps rely mainly on external sensory cues, proposing instead that self‑motion (the animal’s own movement) can serve as a structural prior that actively shapes and stabilises spatial representations. By embedding a motion‑based prior into a predictive‑coding network that uses spiking‑style dynamics, the authors demonstrate more coherent and robust map formation, even when visual landmarks are noisy, missing, or conflicting.
Key Contributions
- Motion‑based structural prior: Introduces a path‑integration module that acts as a scaffold for the learned map, rather than a simple incremental update.
- Brain‑inspired recurrent architecture: Combines spiking dynamics, analog modulation, and adaptive thresholds to achieve high capacity with low computational overhead.
- Robustness across challenging settings: Shows consistent improvements in topological fidelity and positional accuracy in highly aliased, dynamically changing, and naturalistic environments.
- Zero‑shot generalisation: The motion prior encodes precise trajectories that transfer to unseen maps without retraining, outperforming naïve motion constraints.
- Real‑world validation: Deploys the system on a quadrupedal robot, where the motion prior boosts landmark‑based navigation under real‑world sensory variability.
Methodology
- Predictive‑Coding Framework: The network predicts the next sensory observation and updates its internal state by minimising prediction error, mirroring theories of cortical inference.
- Path‑Integration Prior: A dedicated module integrates proprioceptive and vestibular‑like signals to generate a latent trajectory that constrains the map’s geometry.
- Spiking‑Analog Hybrid Neurons: Each recurrent unit emits discrete spikes whose rates are modulated by continuous analog signals; adaptive thresholds prevent runaway activity and keep the model size small (a toy sketch combining these modelling ingredients follows this list).
- Training Regime: The system is trained end‑to‑end on simulated environments with deliberately corrupted or missing visual landmarks, encouraging reliance on the motion prior.
- Evaluation Suite: Benchmarks include (a) topological correctness (graph‑based metrics), (b) global positional error, and (c) next‑step prediction accuracy under varying levels of sensory ambiguity.
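The paper's reference implementation is not part of this summary, so the Python sketch below is only a toy illustration of how the modelling ingredients above could fit together with the corrupted‑landmark training loop. Every name here (PathIntegrator, HybridUnit, PredictiveMapper), dimension, gain, and learning rule is a hypothetical stand‑in, and the spiking dynamics are reduced to a leaky integrate‑and‑fire rule with an adaptive threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

class PathIntegrator:
    """Hypothetical motion prior: integrates self-motion (velocity) into a
    latent trajectory that can constrain the map's geometry."""
    def __init__(self, dim=2):
        self.pos = np.zeros(dim)

    def step(self, velocity, dt=0.1):
        self.pos += velocity * dt            # dead reckoning in latent space
        return self.pos.copy()

class HybridUnit:
    """Simplified spiking-analog hybrid layer: binary spikes driven by an
    analog membrane potential, with an adaptive threshold that damps
    runaway activity."""
    def __init__(self, n):
        self.v = np.zeros(n)                 # analog membrane potential
        self.theta = np.ones(n)              # adaptive firing threshold

    def step(self, drive, leak=0.9, decay=0.95, boost=0.3):
        self.v = leak * self.v + drive
        spikes = (self.v > self.theta).astype(float)
        self.v = np.where(spikes > 0, 0.0, self.v)   # reset after a spike
        # Homeostasis: threshold rises when firing, relaxes toward a floor.
        self.theta = np.maximum(0.1, decay * self.theta + boost * spikes)
        return spikes

class PredictiveMapper:
    """One predictive-coding step: predict the next observation from the
    recurrent state, then drive the state with the prediction error plus a
    soft pull toward the path-integrated position."""
    def __init__(self, n_state=64, n_obs=16, prior_weight=0.5):
        self.W_pred = rng.normal(0, 0.1, (n_obs, n_state))   # state -> obs
        self.W_prior = rng.normal(0, 0.1, (n_state, 2))      # position -> state
        self.units = HybridUnit(n_state)
        self.state = np.zeros(n_state)
        self.prior_weight = prior_weight

    def step(self, obs, prior_pos, lr=0.01):
        pred = self.W_pred @ self.state
        err = (obs - pred) if obs is not None else np.zeros_like(pred)
        drive = self.W_pred.T @ err + self.prior_weight * (self.W_prior @ prior_pos)
        self.state = self.units.step(drive)
        self.W_pred += lr * np.outer(err, self.state)        # Hebbian-style update
        return float(np.mean(err ** 2))

# Toy loop with deliberately corrupted landmarks, so the update must lean
# on the motion prior (mirroring the training regime described above).
pi, mapper = PathIntegrator(), PredictiveMapper()
for t in range(200):
    pos = pi.step(rng.normal(0.0, 1.0, 2))
    obs = np.sin(np.arange(16) * pos[0]) + rng.normal(0.0, 0.1, 16)
    if rng.random() < 0.3:
        obs = None                           # missing visual landmarks
    mse = mapper.step(obs, pos)
```

The design point the sketch tries to capture is that the prior term keeps driving the state even when obs is None, so the representation stays anchored to the integrated trajectory instead of collapsing with the missing input.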
Results & Findings
- Stabilised Map Geometry: Adding the motion prior reduced global positional error by ≈30 % and improved topological consistency by ≈25 % across all test worlds (one plausible way to compute both scores is sketched after this list).
- Resilience to Sensory Degradation: When visual cues were down‑sampled to 10 % of their original fidelity, the prior‑augmented model maintained >80 % of its baseline performance, whereas a sensory‑only baseline collapsed below 50 %.
- Zero‑Shot Transfer: Without any fine‑tuning, the model achieved accuracy on completely new mazes comparable to its performance on the training environments, confirming that the motion prior captures environment‑independent geometric constraints.
- Robot Demo: On a quadrupedal platform navigating a cluttered indoor arena, the motion‑prior system completed the task 1.8× faster and with 40 % fewer localisation failures than a conventional SLAM stack that relied solely on visual landmarks.
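The results above are reported as relative improvements, and this summary does not specify the exact metrics, so the snippet below shows one plausible way the two scores could be computed, assuming the learned map can be exported as node coordinates plus an adjacency graph. The edge‑overlap measure and the use of networkx are illustrative assumptions, not necessarily the authors' graph‑based metric.

```python
import numpy as np
import networkx as nx

def positional_error(est_pos, true_pos):
    """Global positional error: RMSE between estimated and ground-truth
    node coordinates (alignment step omitted for brevity)."""
    est, true = np.asarray(est_pos), np.asarray(true_pos)
    return float(np.sqrt(np.mean(np.sum((est - true) ** 2, axis=1))))

def topological_consistency(est_graph, true_graph):
    """Graph-based topology score: fraction of ground-truth edges that the
    learned map reproduces (a simple stand-in for a graph metric)."""
    true_edges = set(map(frozenset, true_graph.edges()))
    est_edges = set(map(frozenset, est_graph.edges()))
    return len(true_edges & est_edges) / max(len(true_edges), 1)

# Toy example: a 4-node loop recovered with one edge missing.
true_g = nx.cycle_graph(4)
est_g = nx.path_graph(4)                       # loop broken into a path
print(topological_consistency(est_g, true_g))  # 0.75
print(positional_error([[0, 0], [1, 0.1]], [[0, 0], [1, 0]]))
```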
Practical Implications
- More Reliable SLAM for Edge Devices: The motion prior can be implemented with minimal memory and compute, making it attractive for low‑power robots, drones, or AR headsets that must operate under intermittent visual input (a minimal sketch follows this list).
- Improved Navigation in GPS‑Denied Settings: By treating self‑motion as a structural scaffold, autonomous vehicles can maintain coherent maps when GPS or LiDAR data are temporarily unavailable.
- Hybrid Neuro‑Inspired Controllers: Developers can integrate the spiking‑analog recurrent block into existing deep‑learning pipelines, gaining the robustness of biological motion integration without sacrificing scalability.
- Zero‑Shot Map Adaptation: The demonstrated generalisation suggests that a single pre‑trained motion prior could serve multiple robots or environments, reducing the need for costly data collection and re‑training.
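As a concrete reading of the first two points, the sketch below shows the cheapest form such a motion scaffold could take on an edge device: dead reckoning that runs continuously, with intermittent landmark fixes blended in whenever vision is available. The class, the 2D state, and the blending gain are illustrative assumptions rather than the paper's robot stack.

```python
import numpy as np

class MotionScaffold:
    """Minimal motion scaffold: dead reckoning runs continuously, and an
    occasional landmark fix only nudges the estimate (a complementary
    filter, not the paper's implementation)."""
    def __init__(self, blend=0.2):
        self.pos = np.zeros(2)
        self.blend = blend                 # trust placed in a visual fix

    def predict(self, velocity, dt):
        self.pos += velocity * dt          # always available: self-motion
        return self.pos

    def correct(self, landmark_fix):
        # Called only when a visual landmark fix arrives.
        self.pos += self.blend * (landmark_fix - self.pos)
        return self.pos

rng = np.random.default_rng(1)
scaffold, true_pos = MotionScaffold(), np.zeros(2)
for t in range(100):
    v = np.array([1.0, 0.5])
    true_pos = true_pos + v * 0.1
    scaffold.predict(v + rng.normal(0, 0.05, 2), dt=0.1)   # noisy odometry
    if t % 20 == 0:                                        # intermittent vision
        scaffold.correct(true_pos + rng.normal(0, 0.1, 2))
print(f"final error: {np.linalg.norm(scaffold.pos - true_pos):.2f} m")
```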
Limitations & Future Work
- Simplified Proprioception Model: The current path‑integration module assumes idealised, noise‑free self‑motion signals; real sensors (IMUs, wheel encoders) introduce drift that must be explicitly compensated (a toy illustration of such drift follows this list).
- Scalability to Large‑Scale Outdoor Maps: Experiments were confined to indoor‑scale arenas; extending the approach to city‑scale navigation will require hierarchical priors and long‑term memory mechanisms.
- Biological Plausibility vs. Engineering Trade‑offs: While spiking dynamics reduce parameter count, they may complicate integration with mainstream deep‑learning frameworks; future work could explore software‑hardware co‑design (e.g., neuromorphic chips).
- Multi‑Modal Fusion: The study focused on visual landmarks; incorporating auditory, tactile, or semantic cues could further enhance robustness and is a promising direction for follow‑up research.
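To make the drift concern concrete, the toy calculation below uses a made‑up accelerometer bias to show how quickly uncompensated IMU error grows under double integration, which is why the idealised‑proprioception assumption matters in practice.

```python
# Toy illustration of uncompensated IMU drift (all numbers are made up):
# a constant 0.01 m/s^2 accelerometer bias, double-integrated at 100 Hz.
bias, dt = 0.01, 0.01
vel = pos = 0.0
for _ in range(6000):            # one minute of standing still
    vel += bias * dt             # bias accumulates into velocity...
    pos += vel * dt              # ...and quadratically into position
print(f"apparent drift after 60 s: {pos:.1f} m")   # ~18 m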
Authors
- Yingchao Yu
- Pengfei Sun
- Yaochu Jin
- Kuangrong Hao
- Hao Zhang
- Yifeng Zhang
- Wenxuan Pan
- Wei Chen
- Danyal Akarca
- Yuchen Xiao
Paper Information
- arXiv ID: 2512.20044v1
- Categories: q-bio.NC, cs.NE
- Published: December 23, 2025
- PDF: https://arxiv.org/pdf/2512.20044v1