[Paper] Visualizing LLM Latent Space Geometry Through Dimensionality Reduction

Published: November 26, 2025 at 12:11 PM EST
4 min read
Source: arXiv - 2511.21594v1

Overview

The paper Visualizing LLM Latent Space Geometry Through Dimensionality Reduction dives into the internal representations of transformer‑based language models such as GPT‑2 and LLaMA. By extracting layer‑wise activations and projecting them into 2‑D and 3‑D visualizations, the authors surface geometric patterns that are hard to see in the raw high‑dimensional activations, offering developers a new lens for reasoning about model behavior.

Key Contributions

  • Systematic extraction pipeline for layer‑wise activations from both attention heads and MLP blocks in large language models.
  • Dual‑dimensionality‑reduction analysis using Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP) to expose latent‑space geometry.
  • Discovery of a clear separation between attention‑output and MLP‑output representations across intermediate layers—an observation not reported before.
  • Visualization of positional embedding geometry, showing a high‑dimensional helical structure in GPT‑2’s positional vectors.
  • Layer‑wise evolution maps that track how token representations morph as they travel through the network, including the unusually high norm of the first token’s latent state.
  • Open‑source tooling (Python library) released on GitHub to enable reproducible analysis for the community.

Methodology

  1. Activation Capture – The authors instrumented GPT‑2 and LLaMA models to record the output tensors of each sub‑module (self‑attention, feed‑forward MLP, and embeddings) for a given input sequence; a minimal capture sketch follows this list.
  2. Pre‑processing – Raw tensors are flattened per token and normalized to mitigate scale differences across layers.
  3. Dimensionality Reduction
    • PCA provides a linear, globally optimal projection that highlights the dominant variance directions.
    • UMAP offers a non‑linear embedding that preserves local neighborhood structure, making clusters and separations more visually apparent.
  4. Visualization – The reduced vectors are plotted with color‑coding for layer, component type (attention vs. MLP), and token position, allowing developers to spot patterns such as separation, spirals, or high‑norm outliers; a projection‑and‑plotting sketch also follows this list.
  5. Qualitative Experiments – The pipeline is applied to (a) standard prompts, (b) repeating token sequences, and (c) positional‑embedding‑only inputs to isolate specific geometric phenomena.
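
Steps 1–2 can be reproduced with standard forward hooks. The sketch below is a minimal illustration, assuming the Hugging Face transformers implementation of GPT‑2 (module names model.h[i].attn and model.h[i].mlp) and per‑token L2 normalization; it is not the authors' released tooling, whose exact normalization and module coverage may differ.

```python
import torch
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2").eval()

# (layer index, component) -> (seq_len, hidden_dim) activation matrix
captured = {}

def make_hook(layer_idx, component):
    def hook(module, inputs, output):
        # Attention modules return a tuple; the hidden states are element 0.
        hidden = output[0] if isinstance(output, tuple) else output
        captured[(layer_idx, component)] = hidden.detach().squeeze(0)
    return hook

handles = []
for i, block in enumerate(model.h):
    handles.append(block.attn.register_forward_hook(make_hook(i, "attn")))
    handles.append(block.mlp.register_forward_hook(make_hook(i, "mlp")))

inputs = tokenizer("The quick brown fox jumps over the lazy dog",
                   return_tensors="pt")
with torch.no_grad():
    model(**inputs)
for h in handles:
    h.remove()

# Flatten to one row per token and L2-normalize each row so that
# scale differences across layers do not dominate the projection.
vectors, labels = [], []
for (layer, component), acts in captured.items():
    vectors.append(acts / acts.norm(dim=-1, keepdim=True))
    labels.extend([(layer, component)] * acts.shape[0])
vectors = torch.cat(vectors, dim=0).numpy()   # shape: (n_points, hidden_dim)
```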
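
Steps 3–4 then reduce to a few library calls. The sketch below assumes scikit‑learn, umap-learn, and matplotlib, and reuses the vectors and labels arrays from the capture sketch above; the UMAP hyperparameters are illustrative defaults, not necessarily the paper's settings.

```python
import numpy as np
import matplotlib.pyplot as plt
import umap
from sklearn.decomposition import PCA

# Linear (PCA) and non-linear (UMAP) 2-D projections of the same points.
pca_2d = PCA(n_components=2).fit_transform(vectors)
umap_2d = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1,
                    random_state=0).fit_transform(vectors)

layer_ids = np.array([layer for layer, _ in labels])
is_attn = np.array([component == "attn" for _, component in labels])

fig, axes = plt.subplots(1, 2, figsize=(12, 5))
for ax, proj, title in [(axes[0], pca_2d, "PCA"), (axes[1], umap_2d, "UMAP")]:
    # Color encodes layer depth; marker shape encodes component type.
    for mask, marker, name in [(is_attn, "o", "attention"), (~is_attn, "^", "MLP")]:
        ax.scatter(proj[mask, 0], proj[mask, 1], c=layer_ids[mask],
                   cmap="viridis", marker=marker, s=12, label=name)
    ax.set_title(title)
    ax.legend()
plt.show()
```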

Results & Findings

  • Attention vs. MLP Split: Starting around the middle layers, the UMAP plots show two distinct clouds, one for attention outputs and one for MLP outputs, suggesting that contextual mixing and feed‑forward transformations occupy largely separate regions of the latent space.
  • Helical Positional Embeddings: When visualizing only the positional vectors of GPT‑2, the reduced space forms a smooth helix, suggesting that the learned embeddings encode position in a continuous, rotational manner (a quick PCA check is sketched after this list).
  • First‑Token Norm Spike: The latent representation of the first token (often the start‑of‑sequence token) consistently exhibits a much larger Euclidean norm than subsequent tokens, hinting at a “signal amplification” role early in the forward pass (see the norm‑comparison sketch after this list).
  • Layerwise Trajectories: Tokens trace coherent paths through the reduced space as they ascend layers, with early layers showing rapid dispersion and later layers converging toward tighter clusters—mirroring the model’s gradual abstraction of meaning.
  • Sequence‑wise Patterns in LLaMA: Unlike GPT‑2, LLaMA’s token embeddings display a more grid‑like arrangement, possibly reflecting differences in training data or architecture that could affect downstream tasks.
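
The helical structure of the positional embeddings is straightforward to probe directly. The sketch below is a minimal check, assuming the Hugging Face GPT‑2 checkpoint, whose learned position‑embedding table is exposed as model.wpe; it illustrates the idea rather than reproducing the authors' plots.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from transformers import GPT2Model

model = GPT2Model.from_pretrained("gpt2")
pos_emb = model.wpe.weight.detach().numpy()         # (1024 positions, 768 dims)
coords = PCA(n_components=3).fit_transform(pos_emb)  # top-3 principal components

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.scatter(coords[:, 0], coords[:, 1], coords[:, 2],
           c=np.arange(len(pos_emb)), cmap="viridis", s=4)  # color = position index
ax.set_title("GPT-2 positional embeddings, first three principal components")
plt.show()
```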
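
The first‑token norm spike can be checked in a similar spirit. The sketch below, again assuming the Hugging Face GPT‑2 checkpoint and an arbitrary prompt, compares the hidden‑state norm of the first token with the mean norm of the remaining tokens at every layer.

```python
import torch
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2").eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog",
                   return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# hidden_states: one (1, seq_len, hidden_dim) tensor per layer, embeddings included
for layer, hidden in enumerate(out.hidden_states):
    norms = hidden[0].norm(dim=-1)
    print(f"layer {layer:2d}: first-token norm {norms[0].item():7.1f}, "
          f"mean of remaining tokens {norms[1:].mean().item():7.1f}")
```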

Practical Implications

  • Debugging & Auditing: Developers can now spot abnormal activation patterns (e.g., unexpected clustering or outlier norms) that may indicate bugs, data leakage, or adversarial manipulation.
  • Model Compression & Pruning: The clear separation between attention and MLP subspaces suggests that these components could be quantized or pruned independently without heavily affecting each other’s representational capacity.
  • Prompt Engineering: Understanding how the start‑of‑sequence token dominates early layers can guide the design of more effective prompts or prefix tokens for few‑shot learning.
  • Custom Embedding Design: The helical nature of positional embeddings opens avenues for designing alternative positional encodings that are more interpretable or hardware‑friendly.
  • Transfer Learning Diagnostics: By visualizing how a fine‑tuned model’s latent geometry shifts relative to its base version, engineers can assess whether fine‑tuning is truly adapting representations or merely over‑fitting (a minimal comparison sketch follows this list).
  • Educational Tooling: The open‑source visualizer can be integrated into workshops or internal ML curricula to demystify transformer internals for non‑research engineers.
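
One lightweight way to run the transfer‑learning diagnostic above is to project hidden states from the base and fine‑tuned models for the same prompt into a shared PCA space and see how far the token representations move. The sketch below assumes GPT‑2; your-org/gpt2-finetuned is a hypothetical checkpoint name standing in for an actual fine‑tuned model.

```python
import numpy as np
import torch
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
prompt = tokenizer("A prompt representative of your downstream task",
                   return_tensors="pt")

def last_layer_states(checkpoint):
    # Return last-layer token representations for the shared prompt.
    model = GPT2Model.from_pretrained(checkpoint).eval()
    with torch.no_grad():
        out = model(**prompt, output_hidden_states=True)
    return out.hidden_states[-1][0].numpy()            # (seq_len, hidden_dim)

base = last_layer_states("gpt2")
tuned = last_layer_states("your-org/gpt2-finetuned")   # hypothetical checkpoint

# Fit one PCA on both sets so the two models share a coordinate system.
coords = PCA(n_components=2).fit_transform(np.vstack([base, tuned]))
n = len(base)
plt.scatter(coords[:n, 0], coords[:n, 1], label="base", s=15)
plt.scatter(coords[n:, 0], coords[n:, 1], label="fine-tuned", s=15, marker="^")
plt.legend()
plt.title("Last-layer token representations in a shared PCA space")
plt.show()
```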

Limitations & Future Work

  • Scalability: The current pipeline works well for models up to ~7 B parameters; extending to larger LLMs (e.g., 70 B) would require memory‑efficient sampling or distributed activation logging.
  • Quantitative Metrics: The study is primarily qualitative; future work could define metrics (e.g., cluster separation scores) to automatically detect architectural anomalies.
  • Causal Interpretation: While geometric patterns are observed, linking them to specific linguistic phenomena or downstream performance remains an open challenge.
  • Broader Architectures: The authors focus on vanilla Transformers; applying the methodology to encoder‑decoder models, retrieval‑augmented LLMs, or sparsely‑gated mixtures could uncover new insights.

The authors’ codebase is publicly available, making it easy for developers to plug the visualizer into their own pipelines and start exploring the hidden geometry of the models they rely on.

Authors

  • Alex Ning
  • Vainateya Rangaraju

Paper Information

  • arXiv ID: 2511.21594v1
  • Categories: cs.LG
  • Published: November 26, 2025