EoCD: Encoder-only Remote Sensing Change Detection

Mubashir Noman¹, Mustansar Fiaz², Hiyam Debary², Abdul Hannan³, Shah Nawaz⁴, Fahad Shahbaz Khan¹, Salman Khan¹
¹MBZUAI, UAE  ·  ²IBM Research, UAE  ·  ³University of Trento, Italy  ·  ⁴Johannes Kepler University Linz, Austria

Key Insights & Contributions

01. Encoder-only CD Framework

EoCD is a simple and efficient change detection model based on early fusion that entirely eliminates the need for a sophisticated decoder.

02. Parameter-free EMFF

The Efficient Multiscale Feature Fusion module contains no learnable parameters yet effectively aggregates multiscale encoder features for optimal CD performance.
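The page does not spell out how EMFF combines features, so the following is only a guess at what a parameter-free fusion can look like: each encoder stage is upsampled to the finest resolution by nearest-neighbour repetition, and the results are concatenated along the channel axis. Both choices, and every function name here, are assumptions made for illustration. The only point is that neither step has weights to learn.

```python
# Hypothetical parameter-free multiscale fusion (an assumption, not the actual EMFF).
# Feature maps are nested lists indexed as [channel][row][col].


def upsample_nearest(fmap, factor):
    """Repeat every pixel `factor` times along rows and columns."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(factor)]
            rows.extend([list(wide) for _ in range(factor)])
        out.append(rows)
    return out


def fuse_multiscale(features):
    """Bring all stages to the finest resolution and stack their channels."""
    target_h = max(len(f[0]) for f in features)
    fused = []
    for f in features:
        factor = target_h // len(f[0])  # assumes integer scale ratios
        fused.extend(upsample_nearest(f, factor))
    return fused


def constant_map(channels, size, value):
    """Build a constant feature map of shape [channels][size][size]."""
    return [[[value] * size for _ in range(size)] for _ in range(channels)]


# Three toy encoder stages: 8x8, 4x4 and 2x2 with 2, 4 and 8 channels.
stages = [constant_map(2, 8, 1.0), constant_map(4, 4, 2.0), constant_map(8, 2, 3.0)]
fused = fuse_multiscale(stages)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

The fused map has 2 + 4 + 8 = 14 channels at 8×8 resolution, and no trainable tensor was created along the way.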

03. Encoder Dominance Insight

We demonstrate that CD performance depends predominantly on the encoder, which makes the decoder an additional and often unnecessary component. This suggests a new direction for the RS community.

04. Broad Encoder Compatibility

Experiments with CNN, ViT, and Swin-based encoders confirm that EoCD achieves a favorable balance between performance and prediction speed across architectures.

Comparison of Various CD Frameworks


Comparison of various CD frameworks. (a) In late fusion, I_pre and I_post are fed to a Siamese encoder, so the backbone processes each image separately, which increases the computational cost. (b) Early fusion avoids this by concatenating the bitemporal images before passing them to the backbone; however, the sophisticated decoder still adds undesirable complexity. (c) EoCD introduces a simple design that avoids the overhead of both the Siamese encoder and the sophisticated decoder.
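The difference between panel (a) and panels (b) and (c) comes down to where the two images are combined. The sketch below is a toy illustration in plain Python, with images stored as nested lists in [channel][row][col] order. `toy_backbone` and the other names are hypothetical stand-ins, not the EoCD code. It shows only that late fusion runs the backbone twice on 3-channel inputs, while early fusion runs it once on a 6-channel input.

```python
# Toy illustration of late vs. early fusion (not the authors' implementation).
# Images are nested lists indexed as [channel][row][col].

calls = {"backbone": 0}  # counts how many times the backbone runs


def make_image(channels, height, width, value):
    """Build a constant-valued image of shape [channels][height][width]."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


def toy_backbone(image):
    """Hypothetical stand-in for an encoder: averages over channels."""
    calls["backbone"] += 1
    n = len(image)
    height, width = len(image[0]), len(image[0][0])
    return [[sum(image[c][y][x] for c in range(n)) / n for x in range(width)]
            for y in range(height)]


def late_fusion(pre, post):
    """Siamese style: encode each image separately, then compare features."""
    f_pre, f_post = toy_backbone(pre), toy_backbone(post)
    return [[abs(a - b) for a, b in zip(r1, r2)] for r1, r2 in zip(f_pre, f_post)]


def early_fusion(pre, post):
    """Concatenate along the channel axis, then encode once."""
    fused = pre + post  # list concatenation stacks channels: 3 + 3 = 6
    return toy_backbone(fused), len(fused)


i_pre = make_image(3, 4, 4, 0.2)
i_post = make_image(3, 4, 4, 0.8)

calls["backbone"] = 0
late_out = late_fusion(i_pre, i_post)
late_calls = calls["backbone"]

calls["backbone"] = 0
early_out, fused_channels = early_fusion(i_pre, i_post)
early_calls = calls["backbone"]

print(late_calls, early_calls, fused_channels)
```

In a real network the early-fusion backbone would need its first layer adapted to accept six input channels; how EoCD handles that is not stated on this page.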

EoCD Training Paradigm


Overall architecture of the proposed EoCD. It uses a student-teacher framework in which the student network has no decoder. The student performs early fusion of the bitemporal images and combines the resulting multiscale representations, which significantly improves the efficiency of the network.

State-of-the-art on LEVIR-CD

Performance comparison of state-of-the-art methods on LEVIR-CD. FLOPs and latency are computed using an RGB input size of 224×224. EoCD achieves superior performance in IoU, F1, and overall accuracy while remaining efficient relative to existing approaches.

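| Method | Backbone | Params (M) | FLOPs (G) | Latency (ms) | IoU (%) | F1 (%) | Accuracy (%) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| EATDer | Custom | 7.12 | 21.30 | 26.2 | - | 91.20 | 98.75 |
| ELGCNet-LW | Custom | 6.78 | 15.17 | 24.5 | 82.36 | 90.33 | 99.03 |
| ChangeFormer | Custom | 41.03 | 106.00 | 26.6 | 82.48 | 90.40 | 99.04 |
| Convformer-CD/48 | Custom | 49.31 | 5.30 | 48.8 | 84.23 | 91.44 | 99.13 |
| RSMamba † | Mamba | 27.90 | 15.70 | - | 83.66 | 91.10 | - |
| CDMamba † | Mamba | 11.91 | 49.26 | 54.77 | 83.07 | 90.75 | 99.06 |
| BIT | ResNet-18 | 12.40 | 8.32 | 13.3 | 80.68 | 89.31 | 98.92 |
| STRobustNet | ResNet-18 | 13.73 | 19.32 | 12.7 | 83.66 | 91.11 | 99.10 |
| TMSF | ResNet-18 | 12.92 | 8.90 | 40.4 | 83.29 | 90.88 | - |
| FSG-Net | ResNet-18 | 13.76 | - | - | 83.94 | 91.27 | 99.10 |
| RHighNet † | ResNet-50 + ViT-B/16 | 120.80 | 69.47 | 98.9 | 84.01 | 91.31 | 99.13 |
| SFEARNet | SegFormer | 5.56 | 3.64 | 26.7 | 83.23 | 90.85 | 99.07 |
| DSFDcd | U-Net | 8.94 | - | - | 80.34 | 89.11 | 98.93 |
| EoCD (Ours) | mit-b1 | 13.37 | 2.49 | 8.1 | 83.20 | 90.83 | 99.08 |
| EoCD (Ours) | ResNet-34 | 21.50 | 4.39 | 3.8 | 83.33 | 90.91 | 99.09 |
| EoCD (Ours) | FocalNet-T | 30.32 | 6.46 | 12.1 | 84.78 | 91.76 | 99.17 |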

† FLOPs and Latency reported using image size 256×256 due to model configuration constraints.
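For readers less familiar with the metrics, the sketch below computes IoU, F1, and overall accuracy for the change class from pixel-level confusion counts, using the standard definitions (the page itself does not restate them). It also uses the identity F1 = 2·IoU / (1 + IoU), which holds when both metrics are computed from the same counts, to sanity-check one row of the LEVIR-CD results. The confusion counts are invented for illustration.

```python
# Standard binary change-detection metrics from pixel-level confusion counts.
# The counts below are invented for illustration, not taken from any dataset.


def cd_metrics(tp, fp, fn, tn):
    """Return (IoU, F1, overall accuracy) for the change class."""
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    acc = (tp + tn) / (tp + fp + fn + tn)
    return iou, f1, acc


def f1_from_iou(iou):
    """F1 and IoU are tied by F1 = 2*IoU / (1 + IoU)."""
    return 2 * iou / (1 + iou)


iou, f1, acc = cd_metrics(tp=80, fp=10, fn=10, tn=900)
print(round(iou, 4), round(f1, 4), round(acc, 4))

# Sanity check against the EoCD FocalNet-T row: IoU 84.78% and F1 91.76%.
print(round(100 * f1_from_iou(0.8478), 2))
```

Because unchanged pixels dominate most CD datasets, accuracy stays near 99% for every method in the table, which is why IoU and F1 are the more discriminative columns.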

State-of-the-art on CDD-CD

EoCD achieves superior performance across all metrics on the CDD-CD dataset.

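| Method | IoU (%) | F1 (%) | Acc (%) |
| --- | --- | --- | --- |
| BIT | 80.01 | 88.90 | 97.47 |
| ChangeFormer | 81.53 | 89.83 | 97.68 |
| ChangeMamba | 81.99 | 90.10 | 97.72 |
| STRobustNet | 88.08 | 93.66 | 98.50 |
| ConvFormer-CD | 88.63 | 93.96 | 98.59 |
| CDMamba | 88.81 | 94.06 | 98.57 |
| DSFDcd | 88.81 | 94.06 | 98.57 |
| FSG-Net | 88.96 | 94.16 | 98.56 |
| TMSF | 90.44 | 94.98 | - |
| EATDer | - | 95.97 | 98.97 |
| ELGCNet-LW | 93.48 | 96.63 | 99.21 |
| RHighNet | 94.65 | 97.25 | 99.35 |
| EoCD (Ours) | 94.83 | 97.34 | 99.37 |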

State-of-the-art on SYSU-CD

EoCD outperforms existing CD methods on SYSU-CD across all reported metrics.

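| Method | IoU (%) | F1 (%) | Acc (%) |
| --- | --- | --- | --- |
| ChangeFormer | 60.60 | 75.46 | 89.20 |
| BIT | 61.40 | 76.08 | 88.95 |
| ConvFormer-CD/48 | 65.76 | 79.35 | 90.98 |
| ELGCNet | 66.62 | 79.97 | 90.72 |
| ChangeMamba | 66.39 | 79.80 | 90.85 |
| DSFDcd | 67.31 | 80.46 | 91.00 |
| RHighNet | 67.53 | 80.62 | 91.33 |
| STRobustNet | 67.59 | 80.66 | 91.13 |
| LCD-Net | 68.38 | 81.22 | - |
| EoCD (Ours) | 68.67 | 81.42 | 91.67 |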

State-of-the-art on WHU-CD

EoCD improves on existing methods across all metrics on WHU-CD, indicating a better ability to capture semantic changes.

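| Method | IoU (%) | F1 (%) | Accuracy (%) |
| --- | --- | --- | --- |
| BIT | 72.39 | 83.98 | 98.75 |
| ChangeFormer | 73.80 | 84.93 | 98.82 |
| ELGCNet | 80.86 | 89.42 | 99.20 |
| EATDer | - | 90.01 | 98.58 |
| TMSF | 80.09 | 88.95 | - |
| RSMamba | 84.96 | 91.87 | - |
| STRobustNet | 83.29 | 90.89 | 99.32 |
| RHighNet | 83.79 | 91.18 | 99.32 |
| ScratchFormer | 84.97 | 91.89 | 99.37 |
| ConvFormer-CD | 85.41 | 92.13 | 99.26 |
| SFEARNet | 85.81 | 92.36 | 99.38 |
| EoCD (Ours) | 87.17 | 93.15 | 99.47 |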

Qualitative Comparison


Qualitative comparison of EoCD with the BIT, ChangeFormer, and ELGC-Net CD methods. Rows one to four show samples from the LEVIR-CD, CDD-CD, SYSU-CD, and WHU-CD datasets, respectively. In the regions marked by yellow dotted boxes, EoCD detects the semantic changes more accurately than the existing methods.

BibTeX

@article{noman2026eocd,
  title   = {EoCD: Encoder only Remote Sensing Change Detection},
  author  = {Noman, Mubashir and Fiaz, Mustansar and Debary, Hiyam
             and Hannan, Abdul and Nawaz, Shah and Khan, Fahad Shahbaz
             and Khan, Salman},
  journal = {arXiv preprint arXiv:2602.05882},
  year    = {2026},
  url     = {https://arxiv.org/abs/2602.05882}
}