[Paper] On the geometry and topology of representations: the manifolds of modular addition

Published: December 31, 2025 at 01:53 PM EST
4 min read
Source: arXiv - 2512.25060v1

Overview

The paper investigates how neural architectures that use uniform (fixed) attention and those that use trainable (learnable) attention solve the classic problem of modular addition. Contrary to earlier “Clock” and “Pizza” interpretations, which suggested these designs learn fundamentally different circuits, the authors demonstrate that both families converge on the same algorithmic solution, one whose neuron activations form a shared geometric and topological manifold.
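
For concreteness, the task can be sketched as follows; the modulus and all names here are illustrative assumptions rather than the paper's exact setup.

```python
# Minimal sketch of the synthetic task (modulus and names are assumed,
# not taken from the paper): predict (a + b) mod p from the pair (a, b).
import itertools
import numpy as np

p = 59  # illustrative modulus
pairs = np.array(list(itertools.product(range(p), repeat=2)))  # all (a, b) pairs
labels = (pairs[:, 0] + pairs[:, 1]) % p                       # targets: (a + b) mod p
```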

Key Contributions

  • Unified Theory of Modular Addition Circuits – Shows that uniform‑attention and learnable‑attention models instantiate identical computational structures.
  • Manifold‑Based Representation Analysis – Introduces a method to treat the entire set of neurons encoding a learned concept as a manifold and applies tools from topology to compare them.
  • Large‑Scale Empirical Study – Analyzes hundreds of trained networks across multiple architectures, providing statistical evidence of representation equivalence.
  • Beyond Single‑Neuron Interpretation – Moves past the paradigm of interpreting individual weights, focusing instead on the collective behavior of neuron groups.
  • Open‑Source Toolkit – Releases code for extracting and visualizing representation manifolds, enabling reproducible research.
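
The toolkit itself is the authors' release; purely as a generic illustration of this kind of visualization (and not the toolkit's API), a 2‑D UMAP projection of an activation cloud might be produced as follows.

```python
# Generic manifold visualization, independent of the paper's toolkit:
# project an activation cloud to 2-D with UMAP and color points by the
# sum (a + b) mod p, which makes any clock-face structure visible.
import umap                      # pip install umap-learn
import matplotlib.pyplot as plt

def plot_manifold(cloud, labels):
    """cloud: (n_inputs, n_neurons) activations; labels: (a + b) mod p."""
    emb = umap.UMAP(n_components=2, random_state=0).fit_transform(cloud)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="hsv", s=4)
    plt.colorbar(label="(a + b) mod p")
    plt.show()
```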

Methodology

  1. Model Families – The authors train two families of transformer‑style networks on a synthetic modular addition task:

    • Uniform‑attention models (attention weights fixed to a uniform pattern rather than learned).
    • Learnable‑attention models (standard trainable query/key/value matrices).
  2. Neuron‑Set Identification – After training, they locate all neurons that participate in the modular addition computation by probing activation patterns with systematic input sweeps (varying the two addends).

  3. Manifold Construction – The activation vectors of the identified neuron sets are treated as points in a high‑dimensional space. Using dimensionality‑reduction (e.g., UMAP) and persistent homology, they characterize the shape (connected components, loops, holes) of these point clouds.

  4. Topological Comparison – They compute similarity metrics (e.g., the bottleneck distance between persistence diagrams) to quantify how closely the manifolds from the two model families match; a sketch of this pipeline appears after this list.

  5. Statistical Aggregation – By repeating the experiment across many random seeds, hyper‑parameter settings, and data splits, they obtain a distribution of similarity scores, establishing statistical significance.
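
A rough sketch of steps 2–4 under stated assumptions: `hidden_activations` is a hypothetical stand‑in for whatever hook extracts the identified neurons' activations, and the open‑source ripser and persim packages stand in for the paper's topology tools; the actual pipeline may differ.

```python
# Sketch of steps 2-4 under stated assumptions: `hidden_activations` is a
# hypothetical hook returning the identified neurons' activation vector
# for inputs (a, b); ripser/persim stand in for the paper's topology tools.
import numpy as np
from ripser import ripser        # persistent homology (pip install ripser)
from persim import bottleneck    # distances between persistence diagrams

def activation_cloud(hidden_activations, p):
    """Sweep all (a, b) input pairs and stack the activation vectors."""
    points = [hidden_activations(a, b) for a in range(p) for b in range(p)]
    return np.asarray(points)    # shape: (p * p, n_neurons)

def manifold_distance(cloud_a, cloud_b):
    """Bottleneck distance between the H1 (loop) persistence diagrams."""
    dgm_a = ripser(cloud_a, maxdim=1)["dgms"][1]
    dgm_b = ripser(cloud_b, maxdim=1)["dgms"][1]
    return bottleneck(dgm_a, dgm_b)
```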

Results & Findings

  • Geometric Equivalence – The manifolds extracted from uniform‑attention and learnable‑attention networks are virtually indistinguishable (average bottleneck distance < 0.02).
  • Algorithmic Consistency – Visualizations reveal a common “clock‑face” structure: activations trace a circular trajectory as the sum modulo N varies, confirming the classic modular addition circuit (an idealized version is sketched after this list).
  • Robustness to Hyper‑parameters – Even when varying depth, hidden size, or training regime, the manifold shape remains stable, suggesting a strong inductive bias toward this representation.
  • Statistical Confirmation – A two‑sample Kolmogorov–Smirnov test on similarity scores fails to reject the null hypothesis (p > 0.8), reinforcing that the two architectures learn the same underlying computation.
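
The circular trajectory matches the classic Fourier picture of modular addition: each residue a is embedded at angle 2πka/N for some frequency k, and adding b acts as a rotation. A minimal numerical check of that idealized geometry, with an assumed frequency k:

```python
# Idealized clock-face geometry: residues sit on a circle at frequency k,
# and addition acts by rotation. N and k are illustrative assumptions.
import numpy as np

N, k = 59, 7
angles = 2 * np.pi * k * np.arange(N) / N
circle = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (N, 2) points

def rotate(point, b):
    """Rotate a 2-D point by b's clock angle, 2*pi*k*b/N."""
    t = 2 * np.pi * k * b / N
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return R @ point

a, b = 12, 30
assert np.allclose(rotate(circle[a], b), circle[(a + b) % N])
```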

Practical Implications

  • Model Design Simplification – Engineers can opt for the cheaper uniform‑attention variant without sacrificing algorithmic fidelity for tasks that reduce to modular arithmetic (e.g., cryptographic primitives, cyclic scheduling).
  • Debugging & Explainability – The manifold view offers a higher‑level diagnostic: deviations from the expected circular shape can flag training anomalies or data‑distribution shifts (one hypothetical check is sketched after this list).
  • Transfer Learning – Since the representation is architecture‑agnostic, pretrained modular‑addition modules can be swapped between uniform‑ and learnable‑attention pipelines, facilitating modular component reuse.
  • Neural Architecture Search (NAS) – The findings suggest that NAS algorithms need not treat attention parameterization as a differentiating factor for certain arithmetic tasks, potentially reducing search space.
  • Educational Tools – The released visualizations can serve as teaching aids for illustrating how deep nets encode discrete algebraic operations.
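
One hypothetical way to operationalize the debugging idea, not taken from the paper: project the activation cloud onto its top two principal components and measure how far points stray from a circle.

```python
# Hypothetical manifold-health check (not from the paper): a clean
# clock-face projects to a circle, so large radial spread flags trouble.
import numpy as np

def circularity_score(cloud):
    """Relative radial spread in the top-2 PCA plane; ~0 for a clean circle."""
    X = cloud - cloud.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    xy = X @ vt[:2].T                 # 2-D principal-component projection
    r = np.linalg.norm(xy, axis=1)
    return r.std() / r.mean()

# e.g. flag a run when circularity_score(cloud) > 0.1; the threshold is an
# assumption and would be calibrated on known-healthy runs.
```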

Limitations & Future Work

  • Scope of Tasks – The study focuses exclusively on synthetic modular addition; it remains open whether the manifold equivalence extends to more complex arithmetic or non‑modular symbolic reasoning.
  • Scale of Models – Experiments were conducted on relatively small transformer variants; behavior in large‑scale language models (e.g., GPT‑style) is not yet verified.
  • Topology Tools Overhead – Persistent homology computations can be costly for very high‑dimensional activations, limiting real‑time analysis.
  • Future Directions – The authors propose applying the manifold framework to other algorithmic primitives (e.g., sorting, graph traversal) and exploring whether training dynamics (early‑phase vs. converged) exhibit distinct topological signatures.

Authors

  • Gabriela Moisescu-Pareja
  • Gavin McCracken
  • Harley Wiltzer
  • Vincent Létourneau
  • Colin Daniels
  • Doina Precup
  • Jonathan Love

Paper Information

  • arXiv ID: 2512.25060v1
  • Categories: cs.LG
  • Published: December 31, 2025