[Paper] Equivalence of Personalized PageRank and Successor Representations
Source: arXiv - 2512.24722v1
Overview
In a concise yet thought‑provoking note, Beren Millidge shows that two seemingly unrelated algorithms, Personalized PageRank (PPR) and Successor Representations (SR), are mathematically identical when applied to a graph. Both have been championed as computational models of hippocampal function: PPR for episodic memory retrieval and SR for planning and navigation. The paper argues that the hippocampus may simply be computing the stationary distribution of a random walk, unifying memory and planning under a single representation.
Key Contributions
- Isomorphism proof: Demonstrates a formal equivalence between PPR and SR, revealing they are two faces of the same stationary‑distribution computation.
- Unified hippocampal hypothesis: Proposes that the hippocampus’s core operation is to estimate the stationary distribution of arbitrary input graphs, thereby supporting both memory recall and navigation.
- Conceptual bridge: Connects two research communities (graph‑based ranking and reinforcement‑learning planning) that have historically treated these algorithms separately.
- Compact exposition: Provides a clear, single‑page derivation that can serve as a reference for future interdisciplinary work.
Methodology
- Graph formalism – The paper models the environment (or memory network) as a directed, weighted graph $G = (V, E)$.
- Personalized PageRank – Defined as the solution to
  $$\mathbf{p} = (1-\alpha)\,\mathbf{e}_s + \alpha\,\mathbf{P}^\top \mathbf{p},$$
  where $\mathbf{P}$ is the row‑stochastic transition matrix, $\alpha$ the damping factor (so $1-\alpha$ is the teleport probability), and $\mathbf{e}_s$ a one‑hot source vector.
- Successor Representation – Defined for a policy $\pi$ as
  $$\mathbf{M} = (\mathbf{I} - \gamma \mathbf{P}_\pi)^{-1},$$
  where $\gamma$ is the discount factor.
- Equivalence step – Unrolling the PPR fixed point gives $\mathbf{p} = (1-\alpha)(\mathbf{I} - \alpha\mathbf{P}^\top)^{-1}\mathbf{e}_s$. By setting $\alpha = \gamma$ and interpreting the source vector $\mathbf{e}_s$ as a one‑step reward, this is $(1-\alpha)$ times row $s$ of $\mathbf{M}$: the fixed‑point equation of PPR and the linear system solved for SR are the same computation. Both reduce to computing the stationary distribution $\boldsymbol{\pi}$ that satisfies $\boldsymbol{\pi} = \mathbf{P}^\top \boldsymbol{\pi}$.
- Interpretation – The stationary distribution captures the long‑run visitation frequency of nodes, which can be read as either a “memory relevance score” (PPR) or a “future occupancy map” (SR).
The derivation is purely algebraic; no new experiments are required. The author’s contribution lies in recognizing and formalizing the overlap.
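Since the derivation is a linear‑algebra identity, it can be checked numerically in a few lines. A minimal sketch (not from the paper; the 6‑node random graph, source node, and parameter values are illustrative), assuming a row‑stochastic transition matrix and matched teleport/discount parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 6-node weighted directed graph: row-stochastic transition matrix P.
P = rng.random((6, 6))
P /= P.sum(axis=1, keepdims=True)

alpha = 0.9          # PPR damping factor, matched to the SR discount gamma
s = 2                # source node
e_s = np.eye(6)[s]

# Personalized PageRank: solve the fixed point p = (1 - alpha) e_s + alpha P^T p.
p = np.linalg.solve(np.eye(6) - alpha * P.T, (1 - alpha) * e_s)

# Successor Representation: M = (I - gamma P)^{-1} with gamma = alpha.
M = np.linalg.inv(np.eye(6) - alpha * P)

# Equivalence: the PPR vector is (1 - alpha) times row s of the SR matrix.
assert np.allclose(p, (1 - alpha) * M[s])
```

The final assertion is exactly the claimed isomorphism: the two quantities differ only by the normalizing factor $1-\alpha$, which makes the PPR vector sum to one.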
Results & Findings
- Mathematical identity: The stationary distribution $\boldsymbol{\pi}$ simultaneously solves the Personalized PageRank and Successor Representation equations when the teleport/discount parameters are aligned.
- Interpretive unification: In the hippocampal analogy, $\boldsymbol{\pi}$ can be read both as a probability distribution over recalled memories (high‑probability nodes are more likely to be retrieved) and as a predictive map over future states during navigation.
- Implication for neural coding: The same neural substrate could support both functions without needing separate specialized circuits.
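In symbols, the identity in the first bullet can be written out explicitly. Using the note's notation ($\alpha$ the PPR damping factor, $\gamma$ the SR discount, $\mathbf{M}$ the SR matrix), aligning $\alpha = \gamma$ gives

$$\mathbf{p} \;=\; (1-\alpha)\,(\mathbf{I} - \alpha\,\mathbf{P}^\top)^{-1}\mathbf{e}_s \;=\; (1-\gamma)\,\mathbf{M}^\top \mathbf{e}_s, \qquad \mathbf{M} = (\mathbf{I} - \gamma\,\mathbf{P})^{-1},$$

so the PPR vector for source $s$ is, up to the normalizing factor $1-\gamma$, exactly row $s$ of the Successor Representation.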
Practical Implications
- Unified libraries: Developers building graph‑based recommendation, search, or RL systems can reuse a single implementation (e.g., a power‑iteration solver) for both ranking and predictive planning tasks.
- Memory‑augmented RL: In model‑based reinforcement learning, storing the stationary distribution of the transition graph offers a compact, reusable “memory core” that can be queried for both recall (e.g., case‑based reasoning) and planning (e.g., value estimation).
- Explainable AI: Because the stationary distribution is interpretable as long‑run visitation frequencies, it can serve as a transparent feature for debugging recommendation pipelines or navigation agents.
- Neuro‑inspired architectures: Hardware accelerators or neuromorphic chips that efficiently compute random‑walk stationary distributions could simultaneously support memory retrieval and planning modules, reducing architectural redundancy.
- Cross‑domain transfer: Techniques from web‑search (fast PPR approximations) can be directly applied to RL environments, and vice‑versa (SR‑based policy improvement can inform graph‑ranking heuristics).
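As a concrete illustration of the reuse point, a single power‑iteration routine can serve both tasks. A sketch under stated assumptions (the 8‑node random graph, node indices, and parameter values are hypothetical, and `ppr_power_iteration` is an illustrative helper, not a library function):

```python
import numpy as np

def ppr_power_iteration(P, s, alpha=0.9, tol=1e-10, max_iter=10_000):
    """Approximate the Personalized PageRank vector for source node s.

    P is row-stochastic; alpha is the damping factor (1 - alpha is the
    probability of restarting the walk at s).
    """
    n = P.shape[0]
    e_s = np.eye(n)[s]
    p = e_s.copy()
    for _ in range(max_iter):
        # One step of the fixed-point iteration p <- (1 - a) e_s + a P^T p.
        p_next = (1 - alpha) * e_s + alpha * (P.T @ p)
        if np.abs(p_next - p).sum() < tol:
            p = p_next
            break
        p = p_next
    return p

# One solver, two readings (illustrative 8-node random graph):
rng = np.random.default_rng(1)
P = rng.random((8, 8))
P /= P.sum(axis=1, keepdims=True)

scores = ppr_power_iteration(P, s=0)   # ranking: memory-relevance scores
sr_row = scores / (1 - 0.9)            # planning: row 0 of the SR matrix M
```

Dividing the ranking scores by $1-\gamma$ recovers the corresponding row of the SR matrix, so the same solver output can be queried for retrieval and for predictive planning.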
Limitations & Future Work
- Scope of the proof: The equivalence holds under the assumption of a fixed transition matrix and a single source/teleport vector; dynamic or stochastic policies may break the exact isomorphism.
- Biological realism: The paper does not address how the brain might compute the stationary distribution efficiently, nor does it model noise, spiking dynamics, or anatomical constraints.
- Empirical validation: No simulations or neural data are presented to test whether hippocampal activity truly reflects the stationary distribution in realistic tasks.
- Extension to hierarchical graphs: Future work could explore whether multi‑scale or hierarchical representations (e.g., options in RL, community detection in graphs) preserve the equivalence and how they map onto hippocampal sub‑regions.
Bottom line: By revealing that Personalized PageRank and Successor Representations are two sides of the same coin, Millidge opens the door for tighter integration between graph‑ranking systems and reinforcement‑learning planners—both in software and, intriguingly, in our understanding of the brain.
Authors
- Beren Millidge
Paper Information
- arXiv ID: 2512.24722v1
- Categories: cs.NE
- Published: December 31, 2025