[Paper] Domain Feature Collapse: Implications for Out-of-Distribution Detection and Solutions

Published: December 3, 2025 at 01:17 PM EST
4 min read
Source: arXiv - 2512.04034v1

Overview

State‑of‑the‑art out‑of‑distribution (OOD) detectors often crumble when the underlying classifier is trained on data that comes from a single domain (e.g., only chest‑X‑rays). This paper offers the first information‑theoretic explanation: supervised learning on a single domain inevitably collapses domain‑specific features in the learned representation, leaving the model blind to anything that looks “out‑of‑domain.” The authors back the theory with a new benchmark (Domain Bench) and a simple fix—domain filtering—that restores OOD performance.

Key Contributions

  • Theoretical proof of “Domain Feature Collapse.” Shows that, under the information-bottleneck objective, a model trained on a single domain drives the mutual information between the domain-specific component of the input (x_d) and the latent representation (z) to zero, i.e., I(x_d; z) = 0; a sketch of the argument follows this list.
  • Extension with Fano’s inequality to quantify partial collapse that occurs in realistic, noisy training regimes.
  • Domain Bench, a curated suite of single‑domain datasets (medical imaging, satellite imagery, etc.) for systematic OOD evaluation.
  • Domain filtering technique: a lightweight pre‑processing step that injects domain‑level information (via frozen pretrained embeddings) before the classifier, empirically lowering FPR@95 %TPR from 53 % to 12 % on the MNIST → Fashion‑MNIST benchmark.
  • Broader insight into when to fine‑tune versus freeze pretrained models for transfer learning, highlighting the hidden cost of discarding domain cues.
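As an informal sketch of the collapse argument (our shorthand, not the paper's full proof), write the input as x = (x_c, x_d) and note that, in a single-domain training set, x_d carries essentially no information about the label y. The information-bottleneck objective then decomposes by the chain rule as

$$\mathcal{L}=I(x;z)-\beta I(y;z)=I(x_c;z)+I(x_d;z\mid x_c)-\beta I(y;z),$$

so the domain term I(x_d; z | x_c) only inflates the compression cost and never helps prediction; the optimum drives it to zero, discarding exactly the information an OOD detector would need.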

Methodology

  1. Information‑Bottleneck Formalism – The authors model supervised training as minimizing

    $$\mathcal{L}=I(x;z)-\beta I(y;z)$$

    where x is the full input, y the class label, z the learned representation, and β balances compression against prediction. By splitting x into a class‑specific part (x_c) and a domain‑specific part (x_d), they prove that the optimal solution drives I(x_d; z) to zero when the training data contains only one domain.

  2. Partial Collapse Analysis – Using Fano’s inequality, they bound the residual domain information when the bottleneck is not perfectly tight (e.g., due to finite data, regularization).

  3. Domain Bench Construction – They collect 8 single‑domain datasets, each paired with OOD test sets from unrelated domains (e.g., training on retinal scans, testing on natural images).

  4. Domain Filtering – Before feeding data to the classifier, they prepend a frozen feature extractor (e.g., a ResNet‑50 pretrained on ImageNet) that preserves domain cues. The downstream classifier is then trained on the concatenated representation [z_frozen, z_train]; a minimal code sketch follows this list.

  5. Evaluation – Standard OOD metrics (FPR@95 %TPR, AUROC, AUPR) are reported for baseline OOD detectors (MSP, ODIN, Energy) with and without domain filtering.
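A minimal sketch of the domain-filtering step in PyTorch is shown below. The frozen ResNet‑50 backbone follows the example in the summary above; the `DomainFilteredClassifier` and `task_encoder` names are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

class DomainFilteredClassifier(nn.Module):
    """Classify on [z_frozen, z_train]: frozen ImageNet features retain the
    domain cues that single-domain training would otherwise collapse away."""

    def __init__(self, task_encoder: nn.Module, task_dim: int, num_classes: int):
        super().__init__()
        # Frozen, general-purpose backbone preserves domain-level information.
        backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
        backbone.fc = nn.Identity()              # expose the 2048-d pooled features
        for p in backbone.parameters():
            p.requires_grad = False
        self.frozen = backbone
        # Trainable encoder learns class-specific features as usual.
        self.task_encoder = task_encoder
        self.head = nn.Linear(2048 + task_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.frozen.eval()                       # keep BatchNorm statistics frozen too
        with torch.no_grad():
            z_frozen = self.frozen(x)            # domain cues, never updated
        z_train = self.task_encoder(x)           # class cues, trained on one domain
        return self.head(torch.cat([z_frozen, z_train], dim=1))
```

Any standard OOD score (MSP, ODIN, energy) is then computed from the head's logits exactly as before; the only change is that the representation seen by the detector still carries domain variance.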

Results & Findings

| Dataset (train) | OOD test set | Baseline FPR@95 %TPR | With Domain Filtering |
| --- | --- | --- | --- |
| MNIST (digits) | Fashion‑MNIST | 53 % | 12 % |
| Chest X‑ray | CheXpert (different hospital) | 48 % | 8 % |
| Satellite (Sentinel‑2) | Aerial photos (drone) | 61 % | 15 % |

  • Across all benchmarks, domain filtering consistently reduces false‑positive rates by 70–85 % and improves AUROC by roughly 0.2 (a sketch of how these metrics are computed follows this list).
  • Ablation shows that any frozen encoder that retains domain variance works; the method does not require task‑specific fine‑tuning.
  • The theoretical bounds derived from Fano’s inequality closely match the empirical residual domain information measured via mutual‑information estimators.
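For reference, here is a minimal sketch of how the reported metrics (FPR@95 %TPR and AUROC) are typically computed from per-sample OOD scores with scikit-learn; the function and variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def ood_metrics(scores_id: np.ndarray, scores_ood: np.ndarray):
    """Assumes higher score = more in-distribution (e.g., MSP or negative energy)."""
    y_true = np.concatenate([np.ones_like(scores_id), np.zeros_like(scores_ood)])
    y_score = np.concatenate([scores_id, scores_ood])

    auroc = roc_auc_score(y_true, y_score)

    # False-positive rate at the threshold where 95 % of ID samples are accepted.
    fpr, tpr, _ = roc_curve(y_true, y_score)
    fpr_at_95tpr = float(np.interp(0.95, tpr, fpr))
    return auroc, fpr_at_95tpr
```

Under this convention a lower fpr_at_95tpr is better, matching the direction of the table above.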

Practical Implications

  • Robust OOD detection in narrow‑domain products – Medical AI, satellite monitoring, and industrial inspection can adopt domain filtering as a plug‑and‑play module to avoid catastrophic OOD failures.
  • Guidance for transfer learning pipelines – When fine‑tuning a pretrained model on a single domain, keep the early layers frozen (or add a parallel frozen branch) to preserve domain cues that are useful for downstream safety checks.
  • Simplified deployment – The technique adds negligible latency (a single forward pass through a frozen network) and no extra training data, making it attractive for edge devices.
  • Better model auditing – By explicitly measuring I(x_d; z) during training, engineers can flag models that are likely to suffer from feature collapse before they are shipped; a lightweight probe‑based audit is sketched below.
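One lightweight way to approximate such an audit, sketched below under our own assumptions (it is not the paper's estimator), is to train a small probe that predicts a coarse domain label from the representation z: by Fano's inequality, low probe error forces I(x_d; z) to be large, while high error indicates collapse.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def domain_info_lower_bound(z: np.ndarray, domain_labels: np.ndarray) -> float:
    """Crude Fano-style lower bound on I(x_d; z), in bits, via a linear domain probe.

    This is an auditing heuristic, not the paper's exact procedure; fit and
    score on a held-out split in practice (one split is used here for brevity).
    """
    probe = LogisticRegression(max_iter=1000).fit(z, domain_labels)
    err = 1.0 - probe.score(z, domain_labels)        # probe error rate

    k = len(np.unique(domain_labels))                # number of probed domains
    h_domain = np.log2(k)                            # H(x_d), assuming uniform domains
    h_err = 0.0 if err in (0.0, 1.0) else -err * np.log2(err) - (1 - err) * np.log2(1 - err)
    # Fano: H(x_d | z) <= H_b(err) + err * log2(k - 1), so I(x_d; z) >= H(x_d) - that bound.
    return max(h_domain - h_err - err * np.log2(max(k - 1, 1)), 0.0)
```

A model whose bound stays near zero on multi-domain audit data is a candidate for domain feature collapse and may warrant the domain-filtering fix above.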

Limitations & Future Work

  • The analysis assumes a clear separation between class‑specific and domain‑specific factors, which may blur in highly entangled data (e.g., natural images with diverse backgrounds).
  • Domain filtering relies on the availability of a general‑purpose pretrained encoder; performance may degrade if the pretraining domain is too dissimilar.
  • The current benchmark focuses on image data; extending the theory and experiments to text, audio, or multimodal settings remains an open avenue.
  • Future work could explore learnable domain adapters that dynamically adjust the amount of domain information retained, or integrate the mutual‑information regularizer directly into the loss function.

Authors

  • Hong Yang
  • Devroop Kar
  • Qi Yu
  • Alex Ororbia
  • Travis Desell

Paper Information

  • arXiv ID: 2512.04034v1
  • Categories: cs.LG
  • Published: December 3, 2025