[Paper] Exchange Is All You Need for Remote Sensing Change Detection
Source: arXiv - 2601.07805v1
Overview
The paper introduces SEED (Siamese Encoder‑Exchange‑Decoder), a minimalist architecture for remote‑sensing change detection that discards the usual “subtract‑or‑concatenate” tricks and instead relies on a parameter‑free feature exchange between two identical encoders/decoders. By treating the exchange as an orthogonal permutation, the authors show that the model retains all the mutual information needed for optimal detection while being dramatically simpler to train and deploy.
Key Contributions
- Exchange‑only fusion: Proposes a weight‑sharing, permutation‑based feature exchange that replaces explicit differencing modules.
- Theoretical guarantee: Proves that under pixel‑wise consistency the exchange operator preserves mutual information and Bayes‑optimal risk, unlike common arithmetic fusions that can lose information.
- Unified SEED framework: Demonstrates that a single set of parameters can serve both the Siamese encoder and decoder, turning the whole pipeline into a “single‑model” solution.
- SEG2CD recipe: Shows how any off‑the‑shelf semantic segmentation network can be turned into a competitive change detector simply by inserting the exchange layer.
- Strong empirical results: Matches or exceeds state‑of‑the‑art on five public change‑detection benchmarks (SYSU‑CD, LEVIR‑CD, PX‑CLCD, WaterCD, CDD) using three backbones (Swin‑Transformer, EfficientNet, ResNet).
- Open source: Full code, training scripts, and evaluation protocols are released publicly.
Methodology
- Siamese Encoder – Two identical encoders process the pre‑ and post‑event images in parallel, sharing all weights.
- Feature Exchange Layer – Instead of computing a difference, the two feature maps are permuted (i.e., swapped) channel‑wise according to an orthogonal permutation matrix. This operation is parameter‑free and invertible, guaranteeing no loss of information.
- Shared Decoder – A single decoder, again weight‑shared, receives the exchanged features and produces a binary change mask.
- Training – Standard cross‑entropy loss on the change mask; no extra supervision or auxiliary branches are required.
- SEG2CD – To convert a segmentation model, the authors insert the exchange layer between the encoder and decoder stages, re‑using the existing segmentation head for change detection.
The whole pipeline can be viewed as one weight‑shared network that processes two images simultaneously, swaps their latent representations, and decodes the result.
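The exchange step can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's implementation: it represents a feature map as a plain list of channels and swaps even‑indexed channels between the two branches. The even/odd split is an assumption for illustration; the paper describes the operation more generally as an orthogonal permutation. The key properties the sketch demonstrates are that the operation has zero parameters and is its own inverse, so no information is lost.

```python
from typing import List, Tuple

Feature = List[List[float]]  # channels x flattened pixel values


def exchange(feat_a: Feature, feat_b: Feature) -> Tuple[Feature, Feature]:
    """Parameter-free channel exchange between two feature maps.

    Channels at even indices are swapped between the two maps; channels
    at odd indices stay in their own branch. Applying the operation a
    second time restores the originals, i.e. the exchange is invertible.
    """
    out_a: Feature = []
    out_b: Feature = []
    for i, (ch_a, ch_b) in enumerate(zip(feat_a, feat_b)):
        if i % 2 == 0:  # swap this channel between the branches
            out_a.append(ch_b)
            out_b.append(ch_a)
        else:           # keep this channel where it is
            out_a.append(ch_a)
            out_b.append(ch_b)
    return out_a, out_b
```

Because the swap is a permutation of the concatenated channel set, inverting it is just applying it again, which is what makes the information‑preservation argument straightforward.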
Results & Findings
| Dataset | Backbone | SEED mIoU / F1 | Prior SOTA (average) |
|---|---|---|---|
| SYSU‑CD | Swin‑T | 0.842 / 0.915 | 0.828 / 0.902 |
| LEVIR‑CD | EfficientNet | 0.791 / 0.877 | 0.783 / 0.869 |
| PX‑CLCD | ResNet | 0.734 / 0.812 | 0.721 / 0.795 |
| WaterCD | Swin‑T | 0.681 / 0.754 | 0.672 / 0.743 |
| CDD | ResNet | 0.702 / 0.771 | 0.695 / 0.764 |
- Parity with heavy models: SEED reaches or beats the best published numbers despite having far fewer trainable parameters (the exchange layer adds zero parameters).
- Robustness across backbones: The same exchange mechanism works with convolutional backbones (ResNet, EfficientNet) and a pure transformer (Swin‑T).
- Ablation studies confirm that removing the exchange (i.e., using plain concatenation) drops performance by 3–5 % mIoU, validating the theoretical claim about information preservation.
- Inference speed: Because the two branches share weights, memory footprint is roughly that of a single encoder, enabling real‑time processing on a single RTX 3080 (≈ 25 fps for 512×512 tiles).
Practical Implications
- Simplified pipelines – Developers no longer need to hand‑craft differencing modules or maintain separate encoder/decoder weights for each temporal view.
- Easier deployment – A model with one shared set of weights reduces model size, simplifies containerization, and cuts GPU memory usage, which is valuable for edge devices (e.g., UAVs, on‑board satellite processors).
- Transferability – Existing segmentation codebases (e.g., DeepLab, UNet) can be upgraded to change detection with a one‑line insertion of the exchange layer, accelerating product development cycles.
- Interpretability – The exchange operation is a bijective permutation, making it straightforward to trace how information from each timestamp contributes to the final mask—useful for auditability in regulated remote‑sensing applications (e.g., disaster response, land‑use monitoring).
- Potential for multimodal fusion – The same principle could be extended to fuse SAR and optical imagery, or to handle more than two temporal snapshots by chaining exchange operations.
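The SEG2CD idea above can be sketched as a thin wrapper around any encoder/decoder pair. Everything in this snippet is illustrative: the function name `seg2cd`, the even‑channel swap, and the averaging of the two decoded branches are assumptions for the sketch, not details confirmed by the paper. What it shows is the shape of the retrofit: one shared encoder run twice, a parameter‑free exchange in between, and the existing segmentation head reused for the change mask.

```python
from typing import Callable, List

Feature = List[List[float]]  # channels x flattened pixel values


def seg2cd(encoder: Callable[[List[float]], Feature],
           decoder: Callable[[Feature], List[float]],
           img_t1: List[float],
           img_t2: List[float]) -> List[float]:
    """Hypothetical SEG2CD-style wrapper (illustrative, not the paper's code).

    Runs the shared encoder on both timestamps, swaps even-indexed
    channels between the two branches, decodes each branch with the
    shared decoder, and averages the two outputs into one change mask.
    """
    f1, f2 = encoder(img_t1), encoder(img_t2)
    # Parameter-free exchange: even channels cross over, odd channels stay.
    g1 = [f2[i] if i % 2 == 0 else f1[i] for i in range(len(f1))]
    g2 = [f1[i] if i % 2 == 0 else f2[i] for i in range(len(f2))]
    m1, m2 = decoder(g1), decoder(g2)
    # Fuse the two decoded branches (simple average in this sketch).
    return [(a + b) / 2 for a, b in zip(m1, m2)]
```

With toy stand‑ins for the encoder and decoder, the wrapper runs end to end without any changes to either component, which is the practical point of the recipe.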
Limitations & Future Work
- Pixel‑level alignment assumption – The theoretical guarantees rely on perfectly co‑registered images; misregistration can degrade the exchange’s effectiveness.
- Binary change focus – The current formulation targets binary change/no‑change masks; extending to multi‑class change semantics (e.g., “urban expansion vs. vegetation loss”) requires additional labeling and possibly a more expressive decoder.
- Temporal scalability – While the paper hints at chaining exchanges for multi‑temporal data, experiments are limited to bi‑temporal pairs; future work could explore scalable architectures for long time series.
- Real‑world robustness – Benchmarks are curated and relatively clean; testing SEED on noisy, cloud‑covered, or low‑resolution satellite streams would further validate its practicality.
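The multi‑temporal direction hinted at above could, in principle, chain pairwise exchanges across consecutive snapshots. The sketch below is speculative, a reading of that hint rather than anything the paper implements: each snapshot is a list of channel values, and even‑indexed channels are swapped between temporal neighbours in sequence.

```python
from typing import List


def chained_exchange(feats: List[List[float]]) -> List[List[float]]:
    """Speculative sketch: chain pairwise exchanges over T > 2 snapshots.

    For each consecutive pair (t, t+1), swap even-indexed channels so
    every representation mixes with its temporal neighbour. Each pairwise
    swap is still parameter-free and invertible.
    """
    out = [list(f) for f in feats]  # copy so the inputs stay untouched
    for t in range(len(out) - 1):
        for i in range(0, len(out[t]), 2):  # even-indexed channels only
            out[t][i], out[t + 1][i] = out[t + 1][i], out[t][i]
    return out
```

Whether the composed permutation keeps the same information‑preservation guarantees for long sequences is exactly the open question the limitation points at.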
If you’re building a change‑detection service or looking to retrofit an existing segmentation model, SEED offers a surprisingly simple yet theoretically sound shortcut. The authors’ open‑source release makes it easy to experiment and integrate into production pipelines.
Authors
- Sijun Dong
- Siming Fu
- Kaiyu Li
- Xiangyong Cao
- Xiaoliang Meng
- Bo Du
Paper Information
- arXiv ID: 2601.07805v1
- Categories: cs.CV
- Published: January 12, 2026