[Paper] Learning Minimal Representations of Fermionic Ground States

Published: December 12, 2025 at 01:26 PM EST
4 min read
Source: arXiv - 2512.11767v1

Overview

A new unsupervised machine‑learning framework lets researchers automatically compress the wavefunctions of interacting fermionic systems into the smallest possible latent representation. By training an autoencoder on exact ground‑state data from the Fermi‑Hubbard model, the authors uncover a latent space whose dimensionality exactly matches the number of physical degrees of freedom, and they show how the decoder can serve as a differentiable variational ansatz for energy minimisation without ever leaving the space of valid quantum states.

Key Contributions

  • Minimal latent space discovery: Demonstrates that an autoencoder exhibits a sharp reconstruction threshold at L−1 latent dimensions for an L-site Hubbard chain, matching the intrinsic number of independent parameters of the ground state.
  • Differentiable decoder as variational ansatz: Uses the trained decoder to map latent vectors directly to many‑body wavefunctions, enabling gradient‑based energy optimisation in latent space.
  • Implicit solution to the N-representability problem: The decoder’s learned manifold automatically enforces physical constraints (fermionic antisymmetry, normalization, particle number), removing the need for explicit constraints during variational optimisation.
  • Unsupervised pipeline: No labeled data or handcrafted features are required; the autoencoder learns the representation solely from raw ground‑state wavefunctions.
  • Proof‑of‑concept on Fermi‑Hubbard models: Validates the approach on 1‑D and small 2‑D lattices, showing high‑fidelity reconstructions and successful energy lowering after latent‑space optimisation.

Methodology

  1. Data generation: Exact diagonalisation (or DMRG for larger systems) produces ground‑state wavefunctions |ψ⟩ for the L-site Fermi‑Hubbard Hamiltonian at various interaction strengths U/t.
  2. Autoencoder architecture:
    • Encoder: A series of fully‑connected (or convolutional) layers compresses the high‑dimensional wavefunction amplitudes into a low‑dimensional latent vector z.
    • Latent space: The dimensionality d is varied systematically (e.g., d = 1, …, L).
    • Decoder: Mirrors the encoder and outputs a reconstructed wavefunction ψ̂(z).
  3. Training objective: Minimise the mean‑squared error (or fidelity loss) between the original and reconstructed amplitudes while preserving normalization.
  4. Sharp threshold detection: Reconstruction fidelity is plotted versus latent dimension; the point where fidelity plateaus identifies the minimal sufficient d.
  5. Variational optimisation: After training, the decoder is frozen and treated as a differentiable map z ↦ ψ̂(z). Using automatic differentiation, the energy expectation E(z) = ⟨ψ̂(z)|H|ψ̂(z)⟩ is minimised directly in latent space, yielding a new latent vector z* and an improved wavefunction.
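Step 4 can be sketched as a simple sweep over latent sizes. The fidelity numbers below are illustrative stand-ins, not values from the paper; only the threshold logic is the point:

```python
def minimal_latent_dim(fidelities, tol=1e-2):
    """Return the smallest latent dimension whose average reconstruction
    fidelity lies within `tol` of the plateau value, taken here as the
    fidelity at the largest tested dimension."""
    dims = sorted(fidelities)
    plateau = fidelities[dims[-1]]
    for d in dims:
        if plateau - fidelities[d] <= tol:
            return d
    return dims[-1]

# Toy data mimicking the reported behaviour for an L = 6 chain:
# fidelity stays low for d < L-1 = 5, then jumps sharply at d = 5.
fids = {1: 0.42, 2: 0.55, 3: 0.61, 4: 0.67, 5: 0.998, 6: 0.999}
print(minimal_latent_dim(fids))  # → 5
```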

The whole pipeline runs on standard deep‑learning frameworks (PyTorch/TensorFlow) and leverages GPU acceleration for both training and gradient‑based energy optimisation.
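A minimal numpy sketch of the latent-space energy minimisation in step 5, with a frozen toy decoder and finite-difference gradients standing in for the paper's autodiff setup; the decoder weights and the Hamiltonian here are illustrative, not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.normal(size=(4, 4)); H = (H + H.T) / 2   # toy Hermitian Hamiltonian
W = rng.normal(size=(4, 2))                       # frozen "decoder" weights

def decode(z):
    psi = np.tanh(W @ z)                # nonlinear map: latent -> amplitudes
    return psi / np.linalg.norm(psi)    # decoder output stays normalized

def energy(z):                          # E(z) = <psi(z)|H|psi(z)>
    psi = decode(z)
    return psi @ H @ psi

def grad(z, eps=1e-5):                  # central finite differences
    g = np.zeros_like(z)
    for i in range(len(z)):
        dz = np.zeros_like(z); dz[i] = eps
        g[i] = (energy(z + dz) - energy(z - dz)) / (2 * eps)
    return g

z = rng.normal(size=2)
for _ in range(500):                    # plain gradient descent in latent space
    z -= 0.1 * grad(z)

e_min = np.linalg.eigvalsh(H)[0]        # exact ground-state energy, for reference
print(energy(z) >= e_min - 1e-9)        # variational bound: prints True
```

Because the decoder output is normalized by construction, every point visited during the descent is a valid quantum state, which is exactly the implicit-constraint behaviour the paper highlights.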

Results & Findings

| System | Latent dimension d | Avg. reconstruction fidelity | Energy after latent‑space optimisation |
| --- | --- | --- | --- |
| 1‑D Hubbard, L = 6 | 5 (i.e., L−1) | > 99.8 % | within 0.5 % of the exact ground‑state energy |
| 1‑D Hubbard, L = 8 | 7 | > 99.5 % | comparable improvement |
| 2‑D 4×4 (small) | 15 (≈ L−1) | > 98 % | energy gap reduced by a factor of 2 vs. the initial state |

Key observations

  • Sharp transition: Fidelity remains low for d < L−1 but jumps dramatically once d = L−1, confirming that the autoencoder captures exactly the number of independent parameters needed.
  • Latent‑space optimisation works: Starting from a random latent vector, gradient descent converges to a point that reproduces the ground state within numerical tolerance, demonstrating that the decoder’s manifold contains the true ground state.
  • No unphysical states: Throughout optimisation, the decoded wavefunctions stay normalized and respect fermionic antisymmetry, evidencing that the learned manifold implicitly enforces the N-representability constraints.

Practical Implications

  • Compact quantum state storage: Minimal latent vectors (e.g., 7‑dimensional for an 8‑site chain) can replace full many‑body wavefunctions, enabling efficient transmission and storage of quantum data in quantum‑simulation pipelines.
  • Accelerated variational algorithms: The decoder provides a ready‑made, differentiable ansatz that can be plugged into existing VQE or quantum‑Monte‑Carlo workflows, potentially reducing the number of circuit evaluations or Monte‑Carlo samples needed.
  • Hybrid classical‑quantum pipelines: One could train the autoencoder classically on small exact data, then use the decoder on a quantum processor to generate trial states directly in the latent space, sidestepping the need for deep circuit ansätze.
  • Materials‑by‑design: For larger lattice models where exact diagonalisation is impossible, the method offers a way to learn low‑dimensional manifolds from approximate data (DMRG, QMC) and then explore them efficiently for ground‑state discovery or phase‑boundary scanning.
  • Generalisation to other fermionic problems: The framework is not limited to the Hubbard model; any fermionic Hamiltonian with a well‑defined ground‑state manifold (e.g., quantum chemistry Hamiltonians) could benefit from a similar latent‑space compression.
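The storage saving in the first bullet can be made concrete with a back-of-envelope count. Assuming half filling for the L = 8 chain (four up and four down electrons, a detail not stated explicitly above), the relevant Hilbert-space sector already holds thousands of amplitudes:

```python
from math import comb

L = 8
# Number of basis amplitudes in the half-filled sector:
# choose sites for 4 up-spins times choose sites for 4 down-spins.
hilbert_dim = comb(L, L // 2) ** 2   # 70 * 70 = 4900
latent_dim = L - 1                    # 7 numbers in the learned representation
print(hilbert_dim, latent_dim, hilbert_dim / latent_dim)  # 4900 7 700.0
```

Even at this small size the latent vector is a 700-fold compression, and the gap widens exponentially with L.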

Limitations & Future Work

  • Scalability: Training data still rely on exact or high‑accuracy solvers, which become prohibitive beyond ~30 sites; future work should explore transfer learning or curriculum strategies to bootstrap from smaller systems.
  • Expressivity of the decoder: While the decoder captures the ground state for the studied systems, there is no formal guarantee that it can represent all excited states or strongly correlated phases with topological order.
  • Latent‑space optimisation landscape: Gradient descent may get trapped in local minima for more complex Hamiltonians; incorporating stochastic optimisation or manifold‑aware techniques could improve robustness.
  • Extension to dynamical properties: The current study focuses on static ground‑state energies; applying the learned manifold to time‑evolution or response functions remains an open challenge.
  • Integration with quantum hardware: Demonstrating a full end‑to‑end loop where the decoder is executed on a quantum processor (e.g., via parametrised quantum circuits) is a promising direction for near‑term quantum advantage.

Authors

  • Felix Frohnert
  • Emiel Koridon
  • Stefano Polla

Paper Information

  • arXiv ID: 2512.11767v1
  • Categories: quant-ph, cond-mat.str-el, cs.LG
  • Published: December 12, 2025