[Paper] Multimodal Brain Tumour Classification Using Feature Fusion
Source: arXiv - 2606.11107v1
Overview
Clinicians diagnose brain tumors by synthesizing patient symptoms, medical history, and quantitative imaging data from modalities such as MRI and CT into a unified clinical judgement. However, most deep learning models rely on MRI/CT images alone, failing to replicate the clinicians' multimodal reasoning. We explore a two-branch multimodal network that combines raw MRI scans with 91 extracted radiomic features (intensity, texture, shape, and boundary descriptors) to classify scans into four classes: glioma, meningioma, pituitary tumor, and no tumor. A pre-trained CNN backbone encodes the image stream, whereas a dedicated MLP encodes the radiomic stream. The two streams are fused using one of three strategies: concatenation, gating, or bidirectional cross-modal attention. Across nine experimental runs on a balanced 7,200-image dataset, all multimodal configurations outperform the unimodal baselines, with gated fusion achieving the best accuracy of 96.13%.
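To make the fusion step concrete, the following is a minimal NumPy sketch of concatenation and gated fusion of an image embedding with a radiomic embedding. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the gate equation, so the sigmoid-gated convex blend used here is one common form of gating, and the embedding width, batch size, random weights, and function names (`concat_fusion`, `gated_fusion`) are all hypothetical. Only the 91 radiomic features and the four output classes come from the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def concat_fusion(img_emb, rad_emb):
    """Baseline fusion: place the two embeddings side by side."""
    return np.concatenate([img_emb, rad_emb], axis=-1)

def gated_fusion(img_emb, rad_emb, W_gate, b_gate):
    """Blend two same-width embeddings with a per-feature gate (assumed form)."""
    # The gate sees both streams and produces values in (0, 1).
    gate = sigmoid(concat_fusion(img_emb, rad_emb) @ W_gate + b_gate)
    # gate near 1 favours the image stream, gate near 0 the radiomic stream.
    return gate * img_emb + (1.0 - gate) * rad_emb

rng = np.random.default_rng(0)
batch, dim = 8, 64                          # hypothetical sizes

img_emb = rng.normal(size=(batch, dim))     # stand-in for the CNN backbone output
rad_raw = rng.normal(size=(batch, 91))      # 91 radiomic features per scan
W_mlp = rng.normal(size=(91, dim)) * 0.1    # one-layer stand-in for the radiomic MLP
rad_emb = np.tanh(rad_raw @ W_mlp)

W_gate = rng.normal(size=(2 * dim, dim)) * 0.1  # random, untrained gate weights
b_gate = np.zeros(dim)

fused = gated_fusion(img_emb, rad_emb, W_gate, b_gate)
W_cls = rng.normal(size=(dim, 4)) * 0.1     # 4 classes: glioma, meningioma, pituitary, no tumor
logits = fused @ W_cls

print(fused.shape, logits.shape)            # (8, 64) (8, 4)
```

Because the gate is a convex weight, each fused feature lies between the corresponding image and radiomic features, which lets the network lean on whichever stream is more informative for a given scan. Concatenation, by contrast, doubles the width and leaves that weighting to the classifier layers.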
Key Contributions
This paper presents research in the following areas:
- eess.IV
- cs.CV
- cs.LG
Methodology
Please refer to the full paper for detailed methodology.
Practical Implications
This research contributes to the advancement of image and video processing (eess.IV), specifically multimodal brain tumor classification from MRI.
Authors
- Wajih ul Islam
- Muhammad Yaqoob
- Javed Ali Khan
- Volker Steuber
Paper Information
- arXiv ID: 2606.11107v1
- Categories: eess.IV, cs.CV, cs.LG
- Published: June 9, 2026