[Paper] Confidence is Not Reliability: Rethinking MC Dropout in Brain Tumour Segmentation

Published: June 17, 2026 at 01:24 PM EDT

Source: arXiv - 2606.19300v1

Overview

Glioma segmentation in multiparametric MRI is a critical component of treatment planning. A segmentation model that fails silently on treatment-critical sub-regions represents a patient safety risk that overlap-based metrics such as Dice scores cannot expose. We ask whether voxel-level uncertainty estimation via Monte Carlo (MC) Dropout can reliably identify segmentation errors in clinically critical sub-regions, and whether calibration failure modes are detectable from standard reporting metrics alone. In an empirical two-model case study on 126 BraTS21 patients, we evaluate a high-performance pretrained SegResNet and a locally trained UNet with residual units (UNet-Res). MC Dropout preserved segmentation accuracy ($|\Delta\text{Dice}| < 0.01$) while achieving strong uncertainty-error alignment (AUROC for entropy (H) $\approx 0.97$), indicating that uncertainty correctly ranks erroneous voxels above correct ones. Entropy-based patient stratification identified a high-uncertainty subgroup with substantially lower segmentation performance (median whole-tumour Dice $0.835$ vs. $0.925$), supporting uncertainty as a practical triage signal. However, global alignment can mask important region-specific differences. Despite similar AUROC, UNet-Res exhibited near-zero enhancing tumour entropy ($0.054$) and an Expected Calibration Error (ECE) of $0.915$, with a Dice of only $0.714$, indicating severely miscalibrated confidence on the most clinically critical sub-region, a failure mode invisible to standard Dice and AUROC reporting. These findings demonstrate that strong uncertainty-error alignment is necessary but insufficient for clinical safety: sub-region-specific calibration assessment must accompany AUROC evaluation when selecting models for clinical deployment.
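To make the voxel-level uncertainty in the abstract concrete, here is a minimal sketch of one common way to turn MC Dropout outputs into an entropy value. It assumes the stochastic forward passes have already been run and their softmax vectors collected for a single voxel. The function name `predictive_entropy`, the use of the entropy of the mean distribution, and the toy probabilities are all assumptions for illustration, not details taken from the paper.

```python
import math

def predictive_entropy(mc_probs):
    """Entropy (in nats) of the mean class distribution over T stochastic passes.

    mc_probs: list of T probability vectors for one voxel, one vector per
    MC Dropout forward pass. Each vector is assumed to sum to 1.
    """
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    # Average the per-pass softmax outputs to get the predictive distribution.
    mean_probs = [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]
    # H = -sum_c p_c * log(p_c), treating 0 * log(0) as 0.
    return -sum(p * math.log(p) for p in mean_probs if p > 0.0)

# Made-up voxels with three classes and four stochastic passes each.
confident = [[0.98, 0.01, 0.01]] * 4
disagreeing = [
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.2, 0.1, 0.7],
    [0.5, 0.3, 0.2],
]
h_confident = predictive_entropy(confident)
h_disagreeing = predictive_entropy(disagreeing)
```

In this toy case the voxel whose passes disagree gets the higher entropy. Note that a near-zero entropy only says the passes agree with each other; it does not by itself say the prediction is right, which is the gap the abstract highlights.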

Key Contributions

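Based on the abstract, the paper's main contributions are:

An empirical two-model case study on 126 BraTS21 patients comparing a pretrained SegResNet with a locally trained UNet-Res under MC Dropout.
Evidence that MC Dropout preserves segmentation accuracy ($|\Delta\text{Dice}| < 0.01$) while entropy ranks erroneous voxels above correct ones (AUROC $\approx 0.97$).
Entropy-based patient stratification that isolates a high-uncertainty subgroup with lower median whole-tumour Dice ($0.835$ vs. $0.925$).
A demonstration that similar AUROC can hide severe miscalibration on the enhancing tumour sub-region (UNet-Res: entropy $0.054$, ECE $0.915$, Dice $0.714$).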

Methodology

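The abstract describes an empirical two-model case study on 126 BraTS21 patients. A high-performance pretrained SegResNet and a locally trained UNet with residual units (UNet-Res) are evaluated with MC Dropout, using voxel-level entropy, AUROC for uncertainty-error alignment, Expected Calibration Error (ECE), and entropy-based patient stratification. Please refer to the full paper for detailed methodology.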
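The Overview reports two metrics that can disagree: AUROC for uncertainty-error alignment and Expected Calibration Error (ECE). The sketch below shows textbook versions of both so the difference is visible on a toy example. The pairwise AUROC, the equal-width 10-bin ECE, the function names, and the four made-up voxels are assumptions for illustration; the paper's exact binning and evaluation protocol are not given in the abstract.

```python
def auroc(uncertainty, is_error):
    """Probability that a random erroneous voxel is more uncertain than a random correct one."""
    pos = [u for u, e in zip(uncertainty, is_error) if e]
    neg = [u for u, e in zip(uncertainty, is_error) if not e]
    if not pos or not neg:
        raise ValueError("need at least one erroneous and one correct voxel")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def expected_calibration_error(confidence, is_correct, n_bins=10):
    """ECE with equal-width bins: size-weighted mean of |accuracy - confidence| per bin."""
    total = len(confidence)
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidence, is_correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((c, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece

# Made-up voxels: the model is very confident everywhere but mostly wrong.
confidence = [0.99, 0.98, 0.97, 0.96]
correct = [True, False, False, False]
uncertainty = [0.01, 0.02, 0.03, 0.04]
errors = [not ok for ok in correct]

rank_quality = auroc(uncertainty, errors)                         # 1.0: errors ranked perfectly
calibration_gap = expected_calibration_error(confidence, correct)  # about 0.725
```

The toy model ranks its errors perfectly, yet its confidence is far from its accuracy. This is the kind of pattern the abstract describes for UNet-Res on the enhancing tumour sub-region, where a ranking metric alone would not reveal the problem.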

Practical Implications

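According to the abstract, entropy can serve as a practical triage signal for flagging patients whose segmentations are likely to be poorer. Strong uncertainty-error alignment alone is not sufficient for clinical safety, however: sub-region-specific calibration assessment must accompany AUROC evaluation when selecting models for clinical deployment.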
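The triage use described in the Overview, where a high-uncertainty subgroup shows lower median whole-tumour Dice, can be sketched as a simple ranking step. The abstract does not say how patients were split, so the top-fraction rule, the `stratify_by_entropy` helper, the patient IDs, and every number below are hypothetical and not results from the paper.

```python
from statistics import median

def stratify_by_entropy(patient_entropy, top_fraction=0.25):
    """Split patient IDs into (high_uncertainty, remaining) by per-patient mean entropy."""
    # Rank patients from most to least uncertain.
    ranked = sorted(patient_entropy, key=patient_entropy.get, reverse=True)
    n_high = max(1, round(len(ranked) * top_fraction))
    return ranked[:n_high], ranked[n_high:]

# Made-up per-patient mean entropy and whole-tumour Dice values.
patient_entropy = {"p1": 0.02, "p2": 0.15, "p3": 0.03, "p4": 0.04}
patient_dice = {"p1": 0.93, "p2": 0.80, "p3": 0.92, "p4": 0.91}

high, rest = stratify_by_entropy(patient_entropy)
median_dice_high = median(patient_dice[p] for p in high)
median_dice_rest = median(patient_dice[p] for p in rest)
```

In a deployment setting the high-uncertainty group would be the one routed for manual review. As the abstract cautions, such a flag would still need to be paired with sub-region-specific calibration checks.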

Authors

Xin Ci Wong
Duygu Sarikaya
Kieran Zucker
Marc De Kamps
Nishant Ravikumar

Paper Information

arXiv ID: 2606.19300v1
Categories: cs.CV, cs.LG
Published: June 17, 2026
