[Paper] Circuit Synchronization Precedes Generalization: Causal Evidence from Fourier Structure in Grokking Transformers

Published: June 11, 2026 at 02:52 AM EDT
Source: arXiv - 2606.12966v1

Overview

Grokking, where a transformer trained on modular arithmetic abruptly jumps from near-chance to near-perfect validation accuracy, is attributed to a Fourier circuit, but its timing, causal structure, and controllability remain poorly understood. We introduce the Frequency Synchronization Degree (FSD), a normalised, permutation-tested metric of Fourier circuit synchronisation that requires no prior circuit knowledge. Across nine modular addition configurations (primes p in {53, 71, 97, 113, 131}, three seeds), FSD synchronises 500 to 3,000 steps before grokking (mean lead +1,722 steps; all nine leads positive, sign-test p ≈ 0.004). It also precedes a restricted-logit loss baseline (Nanda et al.’s excluded loss) in all nine cases, making it the earliest available predictor. We provide direct causal evidence that the inter-phase gap (the delay between FSD synchronisation and grokking) is a regularisation phenomenon: forking training at the FSD-ceiling step and varying the weight decay lambda makes grokking arrive strictly monotonically earlier as lambda increases, with the delay Delta_t proportional to 1/lambda. This law replicates across three primes (p in {53, 97, 131}; R^2 = 1.00 and R^2 = 0.99 for two clean cases) and is captured as Delta_t ~ C/lambda, consistent with (1/lambda)*log(||W_mem||/tau). Architecture ablations show that an attention-only model groks with a strong FSD precursor, an MLP-only model never groks, and a single-layer model’s FSD lags, confirming that the precursor is a multi-block circuit property.
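
The two quantitative claims in the overview can be checked with a few lines of arithmetic. The sketch below is illustrative and is not the paper's code. It assumes the reported sign test is a two-sided exact test, which is consistent with p ~ 0.004 for nine positive leads out of nine, and it assumes the Delta_t ~ C/lambda law is fitted by least squares through the origin in 1/lambda. The function names `sign_test_p` and `fit_inverse_law` are hypothetical, and the weight-decay values and delays in the example are synthetic, chosen only to show the fitting procedure.

```python
from math import comb


def sign_test_p(n_positive, n_total):
    """Two-sided exact sign test p-value under H0: P(positive) = 0.5."""
    k = max(n_positive, n_total - n_positive)
    # Probability of a result at least as extreme in one direction.
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * tail)


def fit_inverse_law(lambdas, delays):
    """Least-squares C for delays ~ C / lambda, returned with R^2."""
    x = [1.0 / lam for lam in lambdas]
    c = sum(xi * yi for xi, yi in zip(x, delays)) / sum(xi * xi for xi in x)
    mean_y = sum(delays) / len(delays)
    ss_res = sum((yi - c * xi) ** 2 for xi, yi in zip(x, delays))
    ss_tot = sum((yi - mean_y) ** 2 for yi in delays)
    return c, 1.0 - ss_res / ss_tot


# Nine of nine leads positive, as stated in the overview: 2 * 0.5**9.
p_value = sign_test_p(9, 9)

# Synthetic (not from the paper): delays generated exactly as 200 / lambda.
synthetic_lambdas = [0.1, 0.3, 1.0]
synthetic_delays = [200.0 / lam for lam in synthetic_lambdas]
c_hat, r_squared = fit_inverse_law(synthetic_lambdas, synthetic_delays)
```

With real measurements, `delays` would hold the observed gap in steps between the fork point and grokking for each weight-decay value. An R^2 close to 1 would then indicate the same kind of fit quality the overview reports.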

Authors

  • Achyuthan Sivasankar

Paper Information

  • arXiv ID: 2606.12966v1
  • Categories: cs.LG, cs.NE
  • Published: June 11, 2026