[Paper] Coherence in the brain unfolds across separable temporal regimes
Source: arXiv - 2512.20481v1
Overview
This study investigates how the brain keeps a narrative coherent while we listen to natural speech. By pairing ultra‑high‑field (7 T) fMRI recordings with signals derived from a large language model (LLM), the authors show that the brain simultaneously runs two distinct temporal processes: a slow “drift” that integrates meaning over minutes, and a fast “shift” that re‑configures representations at event boundaries.
Key Contributions
- Annotation‑free neural markers: Introduced two LLM‑based time series, contextual drift and event shift, that capture gradual meaning accumulation and abrupt narrative changes without manual labeling.
- High‑resolution encoding model: Collected >7 h of 7 T fMRI from a single participant listening to 13 crime stories, enabling voxel‑wise prediction of BOLD responses with stable regularized regression.
- Functional dissociation: Demonstrated that drift signals primarily drive activity in default‑mode network (DMN) hubs, while shift signals dominate primary auditory cortex and higher‑order language areas.
- Mechanistic link to coherence: Provided a concrete neural account of how the brain balances long‑range integration and rapid re‑orientation, offering a framework for studying language breakdowns in psychiatric conditions.
Methodology
- Stimuli & Data Acquisition – The participant listened to 13 crime narratives (more than 7 h of recordings in total) while whole‑brain BOLD signals were recorded at 7 T (≈1 mm isotropic voxels).
- LLM‑derived Features – A transformer‑based language model processed transcripts of the story audio. Two continuous signals were extracted:
  - Contextual drift: the cosine similarity between successive hidden‑state vectors, reflecting smooth semantic evolution.
  - Event shift: the magnitude of hidden‑state change across a sliding window, highlighting abrupt contextual jumps (e.g., scene cuts).
- Encoding Framework – Each voxel’s BOLD time‑course was modeled as a linear combination of the drift and shift regressors, convolved with a canonical hemodynamic response function. Ridge regression with cross‑validated regularization ensured robust weight estimates.
- Validation – The fitted models were tested on held‑out stories (different from the training set) to confirm generalization.
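The feature extraction and encoding steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' pipeline: the hidden states are a synthetic random walk standing in for LLM activations, one hidden state per TR is assumed (real token-level signals would first need resampling to the scan grid), the double‑gamma HRF shape, the window length, and the fixed `alpha` are all illustrative choices, and every function name here is hypothetical.

```python
from math import gamma

import numpy as np


def contextual_drift(hidden):
    """Cosine similarity between successive hidden-state vectors (length T - 1)."""
    a, b = hidden[:-1], hidden[1:]
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return np.sum(a * b, axis=1) / np.maximum(den, 1e-12)


def event_shift(hidden, window):
    """Norm of the hidden-state change across `window` steps (length T - window)."""
    return np.linalg.norm(hidden[window:] - hidden[:-window], axis=1)


def double_gamma_hrf(tr, duration=32.0):
    """Double-gamma HRF shape sampled every `tr` seconds, normalized to sum to 1."""
    t = np.arange(0.0, duration, tr)
    peak = t ** 5 * np.exp(-t) / gamma(6)          # main response, peaks near 5 s
    undershoot = t ** 15 * np.exp(-t) / gamma(16)  # late undershoot near 15 s
    h = peak - undershoot / 6.0
    return h / h.sum()


def zscore(x):
    """Center to zero mean and scale to unit standard deviation."""
    return (x - x.mean()) / x.std()


def ridge_fit(X, Y, alpha):
    """Closed-form ridge weights, one column of weights per voxel."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


# --- Synthetic stand-ins (assumptions, not the study's data) ---
rng = np.random.default_rng(0)
T, dim, n_vox, window, tr = 400, 16, 5, 3, 2.0
hidden = np.cumsum(rng.normal(size=(T, dim)), axis=0)  # one hidden state per TR

drift = contextual_drift(hidden)
shift = event_shift(hidden, window)

# Keep only the time points where both signals are defined, then convolve
# each regressor with the HRF and truncate back to the original length.
hrf = double_gamma_hrf(tr)
regressors = [zscore(drift[window - 1:]), zscore(shift)]
X = np.column_stack([np.convolve(r, hrf)[: len(r)] for r in regressors])

# Simulated BOLD: a known mix of the two regressors plus noise.
W_true = rng.normal(size=(2, n_vox))
Y = X @ W_true + 0.1 * rng.normal(size=(X.shape[0], n_vox))

# alpha is fixed here for brevity; the study selects it by cross-validation.
W_hat = ridge_fit(X, Y, alpha=1.0)
```

Each column of `W_hat` holds one voxel's drift and shift weights, so comparing the two rows voxel by voxel is one simple way to ask which regime dominates where.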
Results & Findings
- Predictive Power: Drift explained a significant portion of variance in DMN regions (medial prefrontal cortex, posterior cingulate, angular gyrus), while shift accounted for variance in bilateral auditory cortex and left inferior frontal gyrus.
- Temporal Profiles: DMN activity tracked the slow drift of meaning over the narrative, consistent with “semantic integration.” Auditory and language association areas responded sharply to shift peaks, aligning with “event boundary detection.”
- Cross‑Story Generalization: The same voxel‑wise weights successfully predicted responses to completely new stories, indicating that the drift/shift decomposition captures stimulus‑independent processing modes.
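Cross‑story generalization of this kind is commonly scored as the per‑voxel correlation between predicted and observed responses on a story the model never saw. The sketch below illustrates that idea with a leave‑one‑story‑out loop on synthetic data. The splitting scheme, the Pearson‑r metric, the fixed `alpha`, and all function names are assumptions made for illustration; the summary does not specify the exact scoring procedure.

```python
import numpy as np


def ridge_fit(X, Y, alpha):
    """Closed-form ridge weights, one column of weights per voxel."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ Y)


def voxelwise_r(Y_true, Y_pred):
    """Pearson correlation per voxel (column) between observed and predicted."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    den = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return np.sum(yt * yp, axis=0) / np.maximum(den, 1e-12)


def leave_one_story_out(stories, alpha):
    """Fit on all stories but one, score on the held-out one; repeat per story."""
    scores = []
    for i, (X_test, Y_test) in enumerate(stories):
        train = [s for j, s in enumerate(stories) if j != i]
        X_train = np.vstack([X for X, _ in train])
        Y_train = np.vstack([Y for _, Y in train])
        W = ridge_fit(X_train, Y_train, alpha)
        scores.append(voxelwise_r(Y_test, X_test @ W))
    return np.array(scores)  # shape: (n_stories, n_voxels)


# --- Synthetic stand-ins (assumptions, not the study's data) ---
rng = np.random.default_rng(0)
# Voxel 0 follows regressor 0 ("drift"), voxel 1 follows regressor 1 ("shift"),
# and voxel 2 is pure noise.
W_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
stories = []
for n_trs in (120, 150, 100, 130):
    X = rng.normal(size=(n_trs, 2))
    Y = X @ W_true + 0.5 * rng.normal(size=(n_trs, 3))
    stories.append((X, Y))

scores = leave_one_story_out(stories, alpha=1.0)
mean_r = scores.mean(axis=0)  # average held-out correlation per voxel
print(np.round(mean_r, 2))
```

In this toy setup the two signal‑driven voxels should score well on held‑out stories while the noise voxel stays near zero, which is the pattern a stimulus‑independent encoding model is expected to produce.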
Practical Implications
- Neuro‑AI Interfaces: The drift/shift signals can serve as lightweight, annotation‑free features for brain‑computer interfaces that need to monitor comprehension state in real time (e.g., adaptive audiobooks or tutoring systems).
- Improved NLP Evaluation: Aligning LLM hidden‑state dynamics with human neural data offers a new benchmark for assessing whether language models capture human‑like temporal integration.
- Clinical Tools: Because the two regimes map onto distinct networks, deviations in drift‑related DMN activity could become biomarkers for coherence deficits in schizophrenia or autism, guiding targeted neurofeedback or pharmacological interventions.
- Content Design: Understanding that rapid “shift” cues drive auditory and language cortices suggests that storytellers, game designers, and UI developers can strategically place salient event boundaries to maintain user engagement.
Limitations & Future Work
- Single‑subject design: While the dense sampling yields high within‑subject statistical power, replication across a larger, more diverse cohort is needed to confirm that the findings generalize.
- Model specificity: The drift/shift definitions depend on the chosen LLM architecture; exploring other models (e.g., recurrent vs. transformer) could refine the neural mapping.
- Temporal resolution: fMRI’s sluggish hemodynamics limit precise timing of rapid shifts; complementary modalities such as MEG/EEG would help resolve sub‑second dynamics.
- Clinical translation: The current work is exploratory; future studies should test whether the identified neural signatures predict language‑coherence impairments in patient populations.
Authors
- Davide Stauba
- Finn Rabe
- Akhil Misra
- Yves Pauli
- Roya Hüppi
- Nils Lang
- Lars Michels
- Victoria Edkins
- Sascha Frühholz
- Iris Sommer
- Wolfram Hinzen
- Philipp Homan
Paper Information
- arXiv ID: 2512.20481v1
- Categories: q-bio.NC, cs.CL
- Published: December 23, 2025