[Paper] Partial Soft-Matching Distance for Neural Representational Comparison with Partial Unit Correspondence
Source: arXiv - 2602.19331v1
Overview
The paper introduces Partial Soft‑Matching Distance (PSMD), a new way to compare neural representations—whether they come from brain imaging data or deep‑learning models—when only a subset of units (neurons, voxels, or feature maps) actually correspond to each other. By allowing some units to stay unmatched, PSMD is both robust to noisy/outlier units and sensitive to geometric transformations (e.g., rotations) that matter for interpretability.
Key Contributions
- Partial optimal‑transport formulation of the classic soft‑matching distance, relaxing the “all‑units‑must‑match” constraint.
- Theoretical guarantees: retains interpretable transport costs while dropping strict mass‑conservation, leading to provably better robustness.
- Efficient ranking algorithm that scores each unit by its alignment quality without re‑running the full transport solve for every subset.
- Empirical validation on three fronts: (1) synthetic simulations with outliers, (2) human fMRI datasets, and (3) deep convolutional networks.
- Demonstrated practical benefits: automatic exclusion of low‑reliability voxels, higher alignment precision across homologous brain regions, and clearer visual similarity among matched deep‑net units.
Methodology
- Representations as point clouds – each neural population (e.g., a set of voxels or a layer’s feature vectors) is treated as a weighted point cloud in a high‑dimensional feature space.
- Soft‑matching distance – the classic formulation solves an optimal‑transport problem that forces every point in one cloud to be matched to some point in the other, using a smooth (entropy‑regularized) cost.
- Partial extension – PSMD adds a slack variable that permits a fraction of the total “mass” to remain unmapped. In practice this means solving a partial optimal‑transport problem where the transport plan can leave some probability mass at a dummy “unmatched” node.
- Efficient unit ranking – after solving the transport once, the dual variables give a per‑unit score indicating how much mass each unit contributed to the optimal plan. Sorting these scores yields a ranking from “high‑confidence matches” to “likely outliers.”
- Implementation details – the authors use the Sinkhorn‑Knopp algorithm with a tunable unmatched mass parameter ε, which runs in seconds on a GPU for typical fMRI or deep‑net layer sizes.
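The partial‑transport step described above can be sketched with the standard dummy‑node trick for partial optimal transport: append one zero‑cost dummy unit to each side so the slack mass has somewhere to go, then run ordinary Sinkhorn–Knopp on the augmented problem. This is an illustrative NumPy sketch under assumed names (`partial_sinkhorn`, `keep_frac`, `reg`), not the authors' implementation; `keep_frac` plays the role of the complement of the paper's unmatched‑mass budget ε, and `reg` is the entropy‑regularization strength.

```python
import numpy as np

def partial_sinkhorn(X, Y, keep_frac=0.8, reg=0.05, n_iter=1000):
    """Entropy-regularized partial OT between two unit sets (rows of X, Y).

    Illustrative sketch: a dummy node on each side absorbs up to a
    (1 - keep_frac) fraction of the mass, so some units can stay unmatched.
    """
    n, m = len(X), len(Y)
    # Squared-Euclidean cost between units, rescaled for a stable kernel.
    C = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    C = C / C.max()
    # Augment with one dummy per side: real-to-dummy transport is free,
    # dummy-to-dummy is made expensive so slack mass is not wasted there.
    C_aug = np.zeros((n + 1, m + 1))
    C_aug[:n, :m] = C
    C_aug[n, m] = 2.0
    slack = 1.0 - keep_frac                      # mass allowed to go unmatched
    a = np.concatenate([np.full(n, 1.0 / n), [slack]])
    b = np.concatenate([np.full(m, 1.0 / m), [slack]])
    # Sinkhorn-Knopp scaling iterations on the augmented problem.
    K = np.exp(-C_aug / reg)
    u, v = np.ones(n + 1), np.ones(m + 1)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]
    return P[:n, :m]                             # plan among real units only

# Toy demo: nine shared units plus one injected outlier in X.
rng = np.random.default_rng(0)
Z = rng.normal(size=(9, 4))
X = np.vstack([Z, rng.normal(10.0, 1.0, size=(1, 4))])  # unit 9 is an outlier
Y = Z + 0.01 * rng.normal(size=Z.shape)
P = partial_sinkhorn(X, Y, keep_frac=0.85)
matched_mass = P.sum(axis=1)        # per-unit matched mass (the ranking score)
print(matched_mass.argmin())        # the outlier ends up with the least mass
```

Because the outlier's cost to every real unit is large, its mass flows almost entirely to the dummy node, so its matched mass is orders of magnitude below the others; sorting `matched_mass` gives the unit ranking described above without any re‑solving.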
Results & Findings
| Setting | What was tested | Key outcome |
|---|---|---|
| Synthetic simulations | Injected random outlier units into otherwise identical representations | PSMD kept the correct matches intact, while full soft‑matching forced spurious alignments. |
| fMRI data (visual cortex) | Compared voxel patterns across subjects and across homologous brain areas | PSMD automatically down‑weighted low‑reliability voxels, yielding voxel‑rankings that matched those from exhaustive brute‑force searches (≈ 99 % correlation) and improved inter‑subject alignment precision by ~12 %. |
| Deep CNNs (AlexNet, ResNet) | Aligned layers of independently trained networks | Units with high PSMD scores produced nearly identical maximally activating images; unmatched units showed divergent visual preferences, confirming that PSMD isolates a “core” aligned subpopulation. |
| Model selection task | Identify the correct generative model from noisy observations | PSMD selected the true model 85 % of the time versus 62 % for the standard soft‑matching distance. |
Overall, the method proved more tolerant to noise, faster to compute rankings, and more interpretable in terms of which units truly correspond across systems.
Practical Implications
- Neuroscience pipelines: Researchers can plug PSMD into existing representational similarity analysis (RSA) toolkits to automatically prune unreliable voxels, saving hours of manual quality control.
- Model‑to‑brain alignment: When mapping deep‑net layers onto brain regions, PSMD highlights the subset of units that genuinely share representational geometry, enabling tighter hypotheses about “brain‑like” features.
- Cross‑model diagnostics: Engineers comparing two versions of a model (e.g., after pruning or quantization) can use PSMD to quantify how much of the original representation survives, focusing debugging effort on the mismatched units.
- Transfer learning & domain adaptation: By identifying a high‑confidence aligned subspace, one can transfer only those features, potentially improving robustness when moving between datasets with systematic noise.
- Scalable analysis: Because the ranking is derived from a single transport solve, PSMD scales to tens of thousands of units—well within the capacity of modern GPUs—making it suitable for large‑scale model‑level audits.
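The pruning workflows above all reduce to the same post‑processing step: score each unit by the mass it contributes to the transport plan, then keep the top fraction. A minimal sketch, with hypothetical helper names (`rank_units` is not from the paper):

```python
import numpy as np

def rank_units(P, keep_frac=0.8):
    """Split units (rows of transport plan P) into high-confidence matches
    and likely outliers, ranked by matched mass. Illustrative helper."""
    scores = P.sum(axis=1)                    # per-unit matched mass
    order = np.argsort(scores)[::-1]          # best-aligned units first
    k = int(round(keep_frac * len(scores)))
    return order[:k], order[k:]               # (matched, unmatched)

# Toy plan: unit 2 barely participates in the optimal matching.
P = np.array([[0.30, 0.00, 0.00],
              [0.00, 0.27, 0.02],
              [0.01, 0.00, 0.00]])
matched, outliers = rank_units(P, keep_frac=2 / 3)
print(matched.tolist(), outliers.tolist())    # → [0, 1] [2]
```

Since the scores come from a plan that was already computed once, this step is a single sum and sort, which is what makes audits over tens of thousands of units cheap.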
Limitations & Future Work
- Choice of unmatched mass (ε): The method requires a user‑defined budget for how much mass may remain unmatched; setting this too low or too high can under‑ or over‑prune. Adaptive schemes are an open research direction.
- Assumption of Euclidean cost: The transport cost is based on Euclidean distances in representation space; alternative metrics (e.g., cosine similarity) may be more appropriate for some embeddings.
- Computational overhead vs. brute‑force: While far cheaper than exhaustive searches, PSMD still incurs the cost of a Sinkhorn iteration, which can be non‑trivial for extremely high‑dimensional data (e.g., whole‑brain voxel grids).
- Extension to temporal dynamics: The current formulation handles static snapshots; extending PSMD to compare time‑varying neural trajectories (e.g., MEG, RNN hidden states) remains to be explored.
Bottom line: Partial Soft‑Matching Distance offers a principled, easy‑to‑integrate tool for anyone needing to compare neural representations when perfect one‑to‑one correspondence is unrealistic—whether you’re aligning brain scans, auditing deep‑net layers, or building more robust model‑comparison pipelines.
Authors
- Chaitanya Kapoor
- Alex H. Williams
- Meenakshi Khosla
Paper Information
- arXiv ID: 2602.19331v1
- Categories: cs.LG, cs.NE, stat.ML
- Published: February 22, 2026