[Paper] Fluid Representations in Reasoning Models

Published: February 4, 2026 at 01:34 PM EST
4 min read
Source: arXiv - 2602.04843v1

Overview

The paper investigates why reasoning‑augmented large language models (LLMs) excel at abstract problem solving. By dissecting QwQ‑32B, a 32‑billion‑parameter model explicitly trained to generate long “chain‑of‑thought” traces, the authors show that the model continuously reshapes its internal token embeddings while it reasons. This dynamic, structure‑focused encoding, dubbed Fluid Reasoning Representations, appears to be a key driver of the model’s strong performance on a deliberately obfuscated planning benchmark called Mystery Blocksworld.

Key Contributions

  • Mechanistic analysis of a reasoning LLM: First detailed study of how a large model refines its internal representations during a reasoning episode.
  • Discovery of Fluid Reasoning Representations (FRR): Empirical evidence that token embeddings evolve in‑context to capture abstract relational structure rather than surface lexical forms.
  • Steering experiments: Demonstrated a causal role for FRR by (a) injecting refined embeddings from successful traces into failing runs, which boosts accuracy, and (b) replacing the model’s learned encodings with hand‑crafted symbolic representations at only a modest cost in accuracy.
  • New benchmark – Mystery Blocksworld: A planning domain where action names are deliberately scrambled, forcing models to rely on structural reasoning instead of memorized vocabularies.
  • Insights for future model design: Highlights the importance of in‑context representation plasticity as a design target for next‑generation reasoning systems.

Methodology

  1. Model & Training: The authors fine‑tuned a 32‑billion‑parameter transformer (QwQ‑32B) on a large chain‑of‑thought dataset, encouraging it to output detailed reasoning steps.
  2. Benchmark – Mystery Blocksworld: A synthetic planning environment where objects, actions, and goals are described using random token strings, removing any semantic clues from the surface text.
  3. Representation Tracking: During inference, hidden states (token embeddings) are extracted after each reasoning step. The authors compute similarity metrics and train probing classifiers to see what information each layer encodes over time.
  4. Steering Experiments:
    • Injection: Take the refined embeddings from a successful trace and replace the corresponding embeddings in a failing trace at the same reasoning step.
    • Symbolic Replacement: Substitute the model’s learned encodings with explicit symbolic vectors (e.g., one‑hot action IDs) to test whether the model truly needs its own fluid representations.
  5. Analysis Tools: Dimensionality reduction (t‑SNE/UMAP), linear probes for action/concept identification, and ablation studies on the number of reasoning steps.
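The tracking and injection steps above can be sketched with plain NumPy arrays standing in for real hidden states. The array shapes, the step index `k`, and the use of cosine similarity are illustrative assumptions for this sketch, not the authors' actual code:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
d = 16          # toy hidden size; the real model's is far larger
n_steps = 5     # reasoning steps per trace

# Hypothetical per-step hidden states for a successful and a failing trace.
success = rng.normal(size=(n_steps, d))
failing = rng.normal(size=(n_steps, d))

# Representation tracking: how far does the embedding move between steps?
step_change = [1.0 - cosine(success[t], success[t + 1])
               for t in range(n_steps - 1)]

# Injection: overwrite the failing trace's embedding at step k with the
# refined embedding from the successful trace at the same step.
k = 3
patched = failing.copy()
patched[k] = success[k]
```

In the real experiment the patched states would be fed back into the model mid-generation (e.g., via activation patching) rather than inspected offline as here.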

Results & Findings

  • Progressive Structuring: Early reasoning steps contain noisy, surface‑level embeddings; by the middle of the chain, embeddings cluster tightly around abstract concepts such as “move”, “stack”, or “goal‑state”, regardless of the random token names.
  • Performance Boost from Injection: When refined embeddings from a correct trace are injected into a failing trace, success rates jump from ~42 % to ~71 %, confirming a causal role.
  • Symbolic Substitution Works: Replacing fluid embeddings with clean symbolic vectors retains ~85 % of the original accuracy, indicating that the model’s reasoning algorithm can operate on abstract representations even if they are supplied externally.
  • Quantitative Gains: QwQ‑32B solves 78 % of Mystery Blocksworld puzzles, far surpassing a baseline non‑reasoning LLM (≈30 %).
  • Fluidity Metric: The authors propose a “representation drift” score that quantifies how much token embeddings change across reasoning steps; higher drift correlates strongly with correct solutions.
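One plausible way to operationalize the drift score, assuming it measures step-to-step movement of hidden states (the paper's exact formula may differ), is the mean displacement between consecutive reasoning steps:

```python
import numpy as np

def drift_score(states):
    """Mean L2 displacement of a hidden state across reasoning steps.

    `states` has shape (n_steps, hidden_dim). This is one plausible
    definition of "representation drift", not the paper's exact metric.
    """
    diffs = np.diff(states, axis=0)   # step-to-step changes
    return float(np.linalg.norm(diffs, axis=1).mean())

rng = np.random.default_rng(1)
static = np.tile(rng.normal(size=16), (6, 1))          # never changes
moving = np.cumsum(rng.normal(size=(6, 16)), axis=0)   # keeps shifting
```

Under the paper's finding, a trace like `moving` (high drift) would be more likely to end in a correct solution than one like `static`.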

Practical Implications

  • Designing More Efficient Reasoning Models: If fluid representation refinement is a core ingredient, future architectures could expose a dedicated “representation‑update” module, reducing the need for massive chain‑of‑thought outputs.
  • Debugging & Interpretability Tools: Monitoring embedding drift offers a lightweight diagnostic to spot when a model is “stuck” on surface cues versus abstract reasoning, useful for developers building AI assistants that need to plan or troubleshoot.
  • Hybrid Symbolic‑Neural Systems: The successful symbolic substitution suggests that developers can combine LLMs with external planners or knowledge graphs, feeding them abstract representations rather than raw text, potentially lowering inference costs.
  • Robustness to Obfuscation: Systems deployed in environments with noisy or adversarial naming (e.g., code obfuscation, proprietary APIs) can benefit from models that learn to ignore lexical noise and focus on relational structure.
  • Few‑Shot Adaptation: Since FRR emerges in‑context, developers can prompt LLMs with a few high‑quality reasoning examples to “prime” the fluid representations for a new domain, accelerating adaptation without full fine‑tuning.
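The hybrid symbolic‑neural idea above can be illustrated in miniature: encode each action as a one‑hot ID and project it into the model's embedding space with a linear map. The action names, the random (in practice learned) projection, and the toy hidden size are all hypothetical:

```python
import numpy as np

actions = ["move", "stack", "unstack", "pickup"]
d = 16                                   # toy hidden size
rng = np.random.default_rng(2)
projection = rng.normal(size=(len(actions), d))  # stand-in for a learned map

def symbolic_embedding(action):
    """Map a discrete action to a vector in the model's embedding space."""
    one_hot = np.eye(len(actions))[actions.index(action)]
    return one_hot @ projection          # shape (d,)

emb = symbolic_embedding("stack")
```

An external planner or knowledge graph could emit such abstract vectors directly, bypassing raw-text encoding entirely.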

Limitations & Future Work

  • Scale & Generality: Experiments are limited to a single 32 B model and a synthetic benchmark; it remains unclear how FRR behaves in real‑world tasks (e.g., software debugging, scientific reasoning).
  • Computational Overhead: Extracting and manipulating intermediate embeddings adds latency, which may be prohibitive for production APIs.
  • Interpretability Gap: While clustering shows abstract structure, the exact semantics of the fluid vectors are still opaque; more fine‑grained probing is needed.
  • Future Directions: The authors propose (1) extending FRR analysis to multimodal models, (2) designing training objectives that explicitly encourage representation fluidity, and (3) integrating FRR‑aware adapters that can be swapped in at inference time for faster, more controllable reasoning.

Authors

  • Dmitrii Kharlapenko
  • Alessandro Stolfo
  • Arthur Conmy
  • Mrinmaya Sachan
  • Zhijing Jin

Paper Information

  • arXiv ID: 2602.04843v1
  • Categories: cs.AI
  • Published: February 4, 2026