[Paper] Domain Feature Collapse: Implications for Out-of-Distribution Detection and Solutions
Source: arXiv - 2512.04034v1
Overview
State‑of‑the‑art out‑of‑distribution (OOD) detectors often break down when the underlying classifier is trained on data from a single domain (e.g., only chest X‑rays). This paper offers the first information‑theoretic explanation: supervised learning on a single domain inevitably collapses domain‑specific features in the learned representation, leaving the model blind to anything that looks “out‑of‑domain.” The authors back the theory with a new benchmark (Domain Bench) and a simple fix—domain filtering—that restores OOD performance.
Key Contributions
- Theoretical proof of “Domain Feature Collapse.” Shows that, under the information‑bottleneck objective, a model trained on a single domain drives the mutual information between the input’s domain‑specific component $x_d$ and the latent representation $z$ to zero ($I(x_d;z)=0$).
- Extension with Fano’s inequality to quantify partial collapse that occurs in realistic, noisy training regimes.
- Domain Bench, a curated suite of single‑domain datasets (medical imaging, satellite imagery, etc.) for systematic OOD evaluation.
- Domain filtering technique: a lightweight pre‑processing step that injects domain‑level information (via frozen pretrained embeddings) before the classifier, empirically improving OOD detection (e.g., reducing FPR@95%TPR from 53 % to 12 % on MNIST‑style OOD).
- Broader insight into when to fine‑tune versus freeze pretrained models for transfer learning, highlighting the hidden cost of discarding domain cues.
Methodology
- Information‑Bottleneck Formalism – The authors model supervised training as minimizing
$$\mathcal{L}=I(x;z)-\beta I(y;z)$$
where $x$ is the full input, $y$ the class label, and $\beta$ balances compression vs. prediction. By splitting $x$ into class‑specific ($x_c$) and domain‑specific ($x_d$) parts, they prove that the optimal solution drives $I(x_d;z)$ to zero when the training data contains only one domain.
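As a sketch of why the collapse follows (assuming the decomposition $x=(x_c,x_d)$ with $y$ conditionally independent of $x_d$ given $x_c$; the paper's exact conditions may differ), the chain rule and the data‑processing inequality give
$$I(x;z)=I(x_c;z)+I(x_d;z\mid x_c),\qquad I(y;z)\le I(y;x_c),$$
so any domain information kept in $z$ inflates the compression term without raising the achievable prediction term, and minimizing $\mathcal{L}$ pushes the optimal $z$ to discard $x_d$ entirely, i.e., $I(x_d;z)=0$.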
- Partial Collapse Analysis – Using Fano’s inequality, they bound the residual domain information when the bottleneck is not perfectly tight (e.g., due to finite data or regularization).
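The standard Fano‑style route to such a bound, as a sketch (not the paper's exact statement): if the best predictor of the domain $x_d$ (one of $K$ domains) from $z$ has error probability $P_e$, then
$$H(x_d\mid z)\le H_b(P_e)+P_e\log(K-1)\quad\Longrightarrow\quad I(x_d;z)\ge H(x_d)-H_b(P_e)-P_e\log(K-1),$$
where $H_b$ is the binary entropy. A probe that beats chance at guessing the domain thus certifies that some domain information survived the (partial) collapse.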
- Domain Bench Construction – They collect 8 single‑domain datasets, each paired with OOD test sets from unrelated domains (e.g., training on retinal scans, testing on natural images).
- Domain Filtering – Before feeding data to the classifier, they prepend a frozen feature extractor (e.g., a ResNet‑50 pretrained on ImageNet) that preserves domain cues. The downstream classifier is then trained on the concatenated representation $[z_{\text{frozen}}, z_{\text{train}}]$.
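A minimal PyTorch sketch of this setup (module and variable names are illustrative, not from the paper): a frozen ImageNet ResNet‑50 branch supplies $z_{\text{frozen}}$, a trainable encoder supplies $z_{\text{train}}$, and the classifier head is trained on their concatenation.

```python
import torch
import torch.nn as nn
from torchvision import models

class DomainFilteredClassifier(nn.Module):
    """Sketch: frozen pretrained branch + trainable branch, head on the concatenation."""

    def __init__(self, num_classes: int, feat_dim: int = 512):
        super().__init__()
        # Frozen ImageNet encoder retains domain-level cues (z_frozen).
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
        self.frozen = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc layer
        for p in self.frozen.parameters():
            p.requires_grad = False

        # Trainable encoder learns task features on the single training domain (z_train).
        trainable = models.resnet18(weights=None)
        trainable.fc = nn.Linear(trainable.fc.in_features, feat_dim)
        self.trainable = trainable

        # Classifier head sees [z_frozen, z_train].
        self.head = nn.Linear(2048 + feat_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.frozen.eval()                        # keep BatchNorm statistics frozen too
        with torch.no_grad():
            z_frozen = self.frozen(x).flatten(1)  # (B, 2048)
        z_train = self.trainable(x)               # (B, feat_dim)
        return self.head(torch.cat([z_frozen, z_train], dim=1))
```

Only the trainable branch and the head receive gradients, so the frozen branch's domain‑sensitive features cannot be collapsed by the single‑domain objective.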
- Evaluation – Standard OOD metrics (FPR@95%TPR, AUROC, AUPR) are reported for baseline OOD detectors (MSP, ODIN, Energy) with and without domain filtering.
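For concreteness, a small sketch of how these metrics are typically computed from detector scores, using maximum softmax probability (MSP) as the example score; energy or ODIN scores would plug into the same functions. This is a generic implementation, not code from the paper.

```python
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.metrics import roc_auc_score, average_precision_score

def msp_score(logits: torch.Tensor) -> np.ndarray:
    """Maximum softmax probability: higher means 'more in-distribution'."""
    return F.softmax(logits, dim=1).max(dim=1).values.cpu().numpy()

def fpr_at_95_tpr(scores_id: np.ndarray, scores_ood: np.ndarray) -> float:
    """Fraction of OOD samples scoring above the threshold that keeps 95% of ID samples."""
    threshold = np.percentile(scores_id, 5)        # 95% of ID scores lie above this
    return float((scores_ood >= threshold).mean())

def ood_metrics(scores_id: np.ndarray, scores_ood: np.ndarray) -> dict:
    labels = np.concatenate([np.ones_like(scores_id), np.zeros_like(scores_ood)])
    scores = np.concatenate([scores_id, scores_ood])
    return {
        "FPR@95%TPR": fpr_at_95_tpr(scores_id, scores_ood),
        "AUROC": roc_auc_score(labels, scores),
        "AUPR": average_precision_score(labels, scores),
    }
```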
Results & Findings
| Dataset (train) | OOD test set | FPR@95%TPR (baseline) | FPR@95%TPR (with domain filtering) |
|---|---|---|---|
| MNIST (digits) | Fashion‑MNIST | 53 % | 12 % |
| Chest‑X‑ray | CheXpert (different hospital) | 48 % | 8 % |
| Satellite (Sentinel‑2) | Aerial photos (Drone) | 61 % | 15 % |
- Across all benchmarks, domain filtering consistently reduces false‑positive rates by 70‑85 % in relative terms and improves AUROC by ~0.2.
- Ablation shows that any frozen encoder that retains domain variance works; the method does not require task‑specific fine‑tuning.
- The theoretical bounds derived from Fano’s inequality closely match the empirical residual domain information measured via mutual‑information estimators.
Practical Implications
- Robust OOD detection in narrow‑domain products – Medical AI, satellite monitoring, and industrial inspection can adopt domain filtering as a plug‑and‑play module to avoid catastrophic OOD failures.
- Guidance for transfer learning pipelines – When fine‑tuning a pretrained model on a single domain, keep the early layers frozen (or add a parallel frozen branch) to preserve domain cues that are useful for downstream safety checks.
- Simplified deployment – The technique adds negligible latency (a single forward pass through a frozen network) and no extra training data, making it attractive for edge devices.
- Better model auditing – By explicitly measuring $I(x_d;z)$ during training, engineers can flag models that are likely to suffer from feature collapse before they are shipped.
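One way such an audit could be implemented (a sketch, assuming a small pool of auxiliary samples from other domains is available for probing; the probe‑plus‑Fano construction mirrors the bound above and is not claimed to be the paper's procedure):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def domain_info_lower_bound(z: np.ndarray, domains: np.ndarray) -> float:
    """Fano-style lower bound (in nats) on I(x_d; z) from a linear domain probe.

    z: (N, D) representations; domains: (N,) integer domain ids from >1 domain.
    """
    z_tr, z_te, d_tr, d_te = train_test_split(
        z, domains, test_size=0.3, stratify=domains, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(z_tr, d_tr)
    p_err = 1.0 - probe.score(z_te, d_te)          # probe error probability
    k = len(np.unique(domains))                    # number of domains

    # H(x_d) from the empirical domain marginal; H_b is the binary entropy of p_err.
    priors = np.bincount(domains) / len(domains)
    h_xd = -np.sum(priors * np.log(priors + 1e-12))
    h_b = 0.0
    if 0.0 < p_err < 1.0:
        h_b = -(p_err * np.log(p_err) + (1.0 - p_err) * np.log(1.0 - p_err))

    # Fano: H(x_d | z) <= H_b(p_err) + p_err * log(k - 1), hence a lower bound on I(x_d; z).
    return max(0.0, h_xd - h_b - p_err * np.log(max(k - 1, 1)))
```

A bound near zero is consistent with domain feature collapse; a clearly positive bound indicates the representation still carries domain cues usable for OOD detection.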
Limitations & Future Work
- The analysis assumes a clear separation between class‑specific and domain‑specific factors, which may blur in highly entangled data (e.g., natural images with diverse backgrounds).
- Domain filtering relies on the availability of a general‑purpose pretrained encoder; performance may degrade if the pretraining domain is too dissimilar.
- The current benchmark focuses on image data; extending the theory and experiments to text, audio, or multimodal settings remains an open avenue.
- Future work could explore learnable domain adapters that dynamically adjust the amount of domain information retained, or integrate the mutual‑information regularizer directly into the loss function.
Authors
- Hong Yang
- Devroop Kar
- Qi Yu
- Alex Ororbia
- Travis Desell
Paper Information
- arXiv ID: 2512.04034v1
- Categories: cs.LG
- Published: December 3, 2025