[Paper] From Core to Detail: Unsupervised Disentanglement with Entropy-Ordered Flows

Published: February 6, 2026 at 01:41 PM EST
6 min read
Source: arXiv

Overview

The paper “From Core to Detail: Unsupervised Disentanglement with Entropy‑Ordered Flows” introduces a novel normalizing‑flow architecture called Entropy‑Ordered Flows (EOFlows). The key ideas are:

  • Entropy‑ordered latent dimensions – EOFlows automatically rank each latent dimension by the amount of information (entropy) it carries.
  • Core‑detail slicing – At inference time you can:
    1. Core – keep only the most informative dimensions for a compact representation.
    2. Detail – optionally add the remaining dimensions to recover higher‑fidelity details.
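The core/detail split can be pictured as a simple slice of an entropy-ordered latent vector. A minimal sketch (the helper name `split_core_detail` is hypothetical, and the latent is assumed to already be sorted from highest to lowest entropy, as EOFlows produce):

```python
import numpy as np

def split_core_detail(z, num_core):
    """Split an entropy-ordered latent vector into a compact core part
    and the remaining detail part. Assumes the dimensions of ``z`` are
    already sorted from highest to lowest entropy."""
    return z[:num_core], z[num_core:]

z = np.arange(8.0)                      # toy 8-dimensional latent vector
core, detail = split_core_detail(z, num_core=3)
print(core.shape, detail.shape)         # (3,) (5,)
```

Reconstruction from `core` alone gives the compact version; appending `detail` recovers the full-fidelity latent.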

Main Contributions

  1. Automatic disentanglement without supervision, yielding semantically meaningful and stable features.
  2. Flexible compression: the core latent space can be used for strong compression, while the detail space refines the reconstruction.
  3. Robust denoising: the ordered latent structure enables effective removal of noise by discarding low‑entropy dimensions.

Empirical Results

  • Demonstrated on high‑dimensional image datasets (e.g., CelebA).
  • Showed that EOFlows achieve:
    • Competitive reconstruction quality compared to state‑of‑the‑art normalizing flows.
    • Superior compression ratios when using only the core dimensions.
    • Improved denoising performance by truncating low‑entropy components.

Overall, EOFlows provide a versatile framework for unsupervised disentanglement, offering both compact latent representations and the ability to recover fine‑grained details when needed.

Key Contributions

  • Entropy‑Ordered Latent Space – Introduces a principled way to order flow latent dimensions by explained entropy, mirroring PCA’s explained variance but within a fully invertible, likelihood‑based model.
  • Adaptive Injective Flows – After training, the model can be turned into an injective (dimension‑reducing) flow by discarding low‑entropy dimensions, enabling flexible trade‑offs between compression and detail at inference time.
  • Hybrid Training Objective – Combines maximum likelihood with a local Jacobian regularizer and stochastic noise augmentation to keep the flow stable and scalable to large images.
  • Theoretical Links – Bridges concepts from Independent Mechanism Analysis, Principal Component Flows, and Manifold Entropic Metrics, providing a unified view of disentanglement and dimensionality reduction.
  • Empirical Validation – Demonstrates on CelebA that EOFlows discover interpretable factors (e.g., pose, hair color, facial expression) and achieve state‑of‑the‑art denoising and compression without supervised labels.
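The PCA analogy in the first contribution can be made concrete: under a per-dimension Gaussian fit, the differential entropy of a latent dimension is 0.5·ln(2πe·σ²), which is monotone in the variance, so ordering by entropy mirrors ordering principal components by explained variance. A toy sketch (function names are illustrative, not from the paper's code):

```python
import numpy as np

def gaussian_entropy(var):
    # Differential entropy of N(mu, var) in nats: 0.5 * ln(2*pi*e*var)
    return 0.5 * np.log(2 * np.pi * np.e * var)

def entropy_order(latents):
    """Return latent dimension indices sorted from highest to lowest
    marginal entropy, estimated under a per-dimension Gaussian fit."""
    ent = gaussian_entropy(latents.var(axis=0))
    return np.argsort(ent)[::-1]

rng = np.random.default_rng(0)
z = rng.normal(scale=[3.0, 0.1, 1.0], size=(10_000, 3))
print(entropy_order(z))   # highest-variance dimension comes first
```

In the actual model the entropy estimate comes from the change-of-variables formula rather than a sample variance, but the ordering principle is the same.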

Methodology

  1. Base Normalizing Flow

    • Starts from a standard flow (e.g., RealNVP or Glow) that maps data $\mathbf{x}$ to a latent vector $\mathbf{z}$ via an invertible transformation $f$.
  2. Entropy Ordering

    • During training, the marginal entropy of each latent dimension $z_i$ is estimated (using the change‑of‑variables formula and a Gaussian prior).
    • Dimensions are then sorted from highest to lowest entropy.
  3. Loss Function

    • Likelihood term
      $$\log p_X(\mathbf{x}) = \log p_Z\bigl(f(\mathbf{x})\bigr) + \log\bigl|\det J_f(\mathbf{x})\bigr|$$
      encourages accurate density modeling.

    • Local Jacobian regularizer
      Penalizes large variations of the Jacobian in neighborhoods of the data, preventing pathological “folding” that could scramble the entropy ordering.

    • Noise augmentation
      Adds small Gaussian noise to inputs, smoothing the learned manifold and stabilizing entropy estimates.

  4. Adaptive Injection at Test Time

    • To obtain a compact representation of size $C$, truncate the latent vector to its first $C$ dimensions (the highest‑entropy ones) and discard the rest.
    • The inverse mapping uses a partial flow that pads the missing dimensions with samples drawn from the prior, yielding a valid reconstruction.

The entire pipeline remains fully differentiable and can be trained end‑to‑end on GPUs.
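Steps 1 and 4 can be sketched end to end with a toy invertible map. Here a fixed random rotation stands in for the trained flow (the real model uses RealNVP/Glow-style couplings), and the latent is assumed to be entropy-sorted; `reconstruct_from_core` is a hypothetical helper illustrating truncation plus prior padding:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy invertible "flow": a fixed random rotation (volume-preserving).
D = 6
Q, _ = np.linalg.qr(rng.normal(size=(D, D)))

def forward(x):
    return x @ Q      # data -> latent

def inverse(z):
    return z @ Q.T    # latent -> data (Q is orthogonal)

def reconstruct_from_core(z, num_core, rng):
    """Keep the first ``num_core`` (highest-entropy) latent dimensions
    and pad the discarded ones with fresh prior samples before inverting."""
    z_pad = z.copy()
    z_pad[num_core:] = rng.normal(size=D - num_core)  # prior padding
    return inverse(z_pad)

x = rng.normal(size=D)
z = forward(x)
x_core = reconstruct_from_core(z, num_core=4, rng=rng)
x_full = inverse(z)
print(np.allclose(x_full, x))   # True: the full latent reconstructs exactly
```

The core-only reconstruction `x_core` is approximate by design; keeping more dimensions moves it toward `x_full`.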

Results & Findings

| Metric | Full EOFlow (all dims) | Core (C = 64) | Core (C = 128) |
| --- | --- | --- | --- |
| Bits‑per‑dim (log‑likelihood) | 3.12 | 3.45 | 3.28 |
| Reconstruction PSNR (CelebA) | 31.8 dB | 29.4 dB | 30.6 dB |
| Compression ratio (core only) | 1× | ~8× | |
| Denoising (σ = 0.1) PSNR gain | 2.3 dB | 2.0 dB | 2.2 dB |

  • Interpretability: Visualizing individual core dimensions reveals disentangled factors such as gender, glasses, hairstyle, and head pose.
  • Compression: Keeping only the top 64 latent dimensions reduces storage by ~8× while still preserving recognizable faces.
  • Denoising: The model naturally pushes noise into the low‑entropy dimensions, so truncating them delivers clean reconstructions even when the input is heavily corrupted.

Overall, EOFlows match or exceed baseline flow models (Glow, FFJORD) on likelihood while offering the extra benefit of controllable dimensionality.
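For readers less familiar with the bits-per-dim metric in the table above: it is the negative log-likelihood converted from nats to bits and normalized by the number of data dimensions. A small sketch (the example log-likelihood value is chosen purely for illustration, not taken from the paper):

```python
import numpy as np

def bits_per_dim(log_likelihood_nats, num_dims):
    """Convert a per-example log-likelihood (in nats) to bits-per-dim,
    the standard metric reported for normalizing flows (lower is better)."""
    return -log_likelihood_nats / (num_dims * np.log(2))

# e.g. a hypothetical 64x64x3 image with log p(x) = -26543 nats
print(round(bits_per_dim(-26543.0, 64 * 64 * 3), 2))   # 3.12
```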

Practical Implications

  • Dynamic model scaling: Deploy a single trained EOFlow and let client devices request as many latent dimensions as their bandwidth or compute budget allows — e.g., a mobile app could stream a low‑resolution core representation and later request detail dimensions for a higher‑quality view.

  • Self‑supervised feature extraction: Developers can use the ordered latent space as a plug‑and‑play feature extractor for downstream tasks (classification, retrieval) without needing labeled data; the most informative dimensions often align with semantic attributes.

  • Efficient storage & transmission: For large image archives, storing only the core latent vectors yields substantial space savings while still enabling high‑quality reconstruction on demand.

  • Robust denoising pipelines: Because low‑entropy dimensions capture noise, simply truncating them provides a lightweight, unsupervised denoiser that can be integrated into image‑preprocessing pipelines.

  • Explainable AI: The explicit ordering gives a natural hierarchy of features, making it easier for engineers to audit what the model “looks at” when making decisions.
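The feature-extraction use case above can be sketched as nearest-neighbor retrieval over core latents. Everything here is a toy illustration with hypothetical helper names; in practice the database would hold latents produced by a trained EOFlow encoder:

```python
import numpy as np

def core_features(latents, num_core=64):
    """Use only the top-entropy latent dimensions as compact features."""
    return latents[:, :num_core]

def nearest(query, database):
    # Cosine similarity between the query and every database entry.
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    return int(np.argmax(db @ q))

rng = np.random.default_rng(1)
z_db = rng.normal(size=(100, 256))   # stand-in for an encoded image archive
feats = core_features(z_db)
idx = nearest(feats[42], feats)
print(idx)   # the query retrieves itself first
```

Because the core dimensions carry the most information, retrieval over them is both cheap and, per the paper's claims, semantically meaningful.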

Limitations & Future Work

  • Entropy Estimation Overhead: Computing per‑dimension entropy adds a modest runtime cost during training, especially for very high‑dimensional flows.
  • Fixed Prior Assumption: The current formulation assumes a standard Gaussian prior; extending to more expressive priors could improve modeling of complex data manifolds.
  • Scalability to Video/3D: Experiments are limited to 2‑D images; applying EOFlows to spatio‑temporal data may require architectural tweaks.
  • Theoretical Guarantees: While the paper provides intuition linking entropy ordering to disentanglement, formal guarantees (e.g., identifiability) remain an open question.

Future Research Directions

  • Faster entropy‑ordering algorithms.
  • Hierarchical priors that adapt to the core/detail split.
  • Applying EOFlows to multimodal data (audio‑visual, point clouds).

If you’re a developer looking for a flexible, unsupervised representation that can be tuned on‑the‑fly for compression, denoising, or feature extraction, EOFlows are worth a deeper look. The codebase (linked in the paper’s repo) is built on PyTorch and can be dropped into existing flow pipelines with minimal changes.

Authors

  • Daniel Galperin
  • Ullrich Köthe

Paper Information

| Field | Details |
| --- | --- |
| arXiv ID | 2602.06940v1 |
| Categories | cs.LG |
| Published | February 6, 2026 |
