[Paper] XtraLight-MedMamba for Classification of Neoplastic Tubular Adenomas
Source: arXiv - 2602.04819v1
Overview
A new ultra‑lightweight deep‑learning framework called XtraLight‑MedMamba tackles a real‑world problem in digital pathology: automatically distinguishing neoplastic tubular adenomas that are likely to progress to colorectal cancer (CRC) from those that are not. By achieving >97 % accuracy with only ~32 k trainable parameters, the work demonstrates that histopathology AI can be both accurate and deployable on modest hardware, an attractive prospect for hospitals and health‑tech startups alike.
Key Contributions
- Hybrid architecture that fuses a shallow ConvNeXt feature extractor with a parallel Vision‑Mamba module, capturing both local texture and long‑range spatial dependencies.
- Spatial‑and‑Channel Attention Bridge (SCAB) for multiscale feature enrichment without a heavy computational burden.
- Fixed Non‑Negative Orthogonal Classifier (FNOClassifier) that drastically reduces the number of trainable parameters while improving generalization.
- State‑of‑the‑art performance on a curated whole‑slide image (WSI) dataset of low‑grade tubular adenomas: 97.18 % accuracy and an F1‑score of 0.9767.
- Parameter efficiency: ~32 k parameters, far fewer than comparable transformer‑based or vanilla Mamba models, enabling inference on edge devices or low‑cost GPUs.
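To make the idea of spatial and channel attention more concrete, here is a minimal NumPy sketch of a generic attention gate applied to two fused feature streams. It is an illustration only, not the paper's SCAB: the function names, the squeeze‑and‑excitation style channel gate, the parameter‑free spatial gate, the single scale, and all tensor sizes are assumptions chosen for brevity.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style gate: one weight per channel.

    feat: (C, H, W) feature map; w1: (C // r, C); w2: (C, C // r).
    """
    squeezed = feat.mean(axis=(1, 2))          # (C,) global average pool
    hidden = np.maximum(w1 @ squeezed, 0.0)    # ReLU bottleneck
    gate = sigmoid(w2 @ hidden)                # (C,) values in (0, 1)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Parameter-free spatial gate from channel-wise mean and max maps."""
    pooled = 0.5 * (feat.mean(axis=0) + feat.max(axis=0))  # (H, W)
    gate = sigmoid(pooled - pooled.mean())                  # (H, W)
    return feat * gate[None, :, :]

def attention_bridge(local_feat, global_feat, w1, w2):
    """Hypothetical bridge: sum two same-shaped streams, then gate them."""
    fused = local_feat + global_feat
    return spatial_attention(channel_attention(fused, w1, w2))

# Toy sizes (assumed): 8 channels, 16x16 map, reduction ratio 4.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4
local_feat = rng.standard_normal((C, H, W))   # stand-in for the conv stream
global_feat = rng.standard_normal((C, H, W))  # stand-in for the Mamba stream
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

out = attention_bridge(local_feat, global_feat, w1, w2)
print(out.shape)          # (8, 16, 16)
print(w1.size + w2.size)  # 32 gate weights in this toy configuration
```

The point of the sketch is that both gates only rescale existing features, so the output keeps the input shape and the extra weight count stays small compared with adding convolutional layers.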
Methodology
- Data preparation – Whole‑slide images from colonoscopy biopsies were split into a case group (patients who later developed colorectal cancer, CRC) and a control group. Standard tiling and color normalization were applied to create a balanced training set.
- Feature extraction – A shallow ConvNeXt block extracts low‑level visual cues (edges, nuclei shapes). In parallel, a Vision‑Mamba (state‑space) module processes the same tiles, learning long‑range dependencies that are crucial for spotting subtle dysplastic patterns.
- Attention bridging – The SCAB module receives outputs from both streams, applying spatial attention (where to look) and channel attention (what features matter) across multiple scales. This step amplifies discriminative signals without adding many layers.
- Classification head – Instead of a conventional fully‑connected layer, the authors employ an FNOClassifier. Its weights are fixed, orthogonal, and constrained to be non‑negative, which forces the network to learn robust, linearly separable representations while keeping the parameter count tiny.
- Training & evaluation – Standard cross‑entropy loss with class‑balanced weighting was used. Performance was measured via accuracy, precision, recall, and F1‑score on a held‑out test set.
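The classification head and the class‑balanced loss described above can be sketched in a few lines of NumPy. This is a sketch under stated assumptions, not the authors' implementation: the block‑wise construction of the fixed weights, the "balanced" weighting formula `N / (K * n_k)`, the helper names, and the toy feature vectors are all illustrative choices. One fact the sketch relies on is general: non‑negative vectors can only be mutually orthogonal if they are non‑zero on disjoint sets of dimensions.

```python
import numpy as np

def fixed_nonneg_orthogonal_weights(num_classes, feat_dim):
    """Fixed (num_classes, feat_dim) matrix with non-negative,
    mutually orthogonal, unit-norm rows (never trained).

    Each class is assigned its own disjoint block of feature dimensions.
    """
    if feat_dim < num_classes:
        raise ValueError("feat_dim must be >= num_classes")
    W = np.zeros((num_classes, feat_dim))
    blocks = np.array_split(np.arange(feat_dim), num_classes)
    for k, idx in enumerate(blocks):
        W[k, idx] = 1.0 / np.sqrt(len(idx))
    return W

def classify(features, W):
    """Logits are projections of the features onto the fixed prototypes."""
    logits = features @ W.T
    return logits.argmax(axis=1), logits

def balanced_class_weights(labels, num_classes):
    """One common heuristic: weight_k = N / (K * n_k).
    Assumes every class appears at least once in `labels`."""
    counts = np.bincount(labels, minlength=num_classes)
    return len(labels) / (num_classes * counts)

def weighted_cross_entropy(logits, labels, class_weights):
    """Cross-entropy where each sample is weighted by its class weight."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]
    w = class_weights[labels]
    return float((w * per_sample).sum() / w.sum())

# Toy example (assumed sizes): 2 classes, 8-dimensional features.
W = fixed_nonneg_orthogonal_weights(num_classes=2, feat_dim=8)
features = np.array([
    [0.9, 0.8, 1.1, 1.0, 0.1, 0.0, 0.2, 0.1],  # mass on class-0 block
    [0.0, 0.1, 0.1, 0.2, 1.2, 0.9, 1.0, 0.8],  # mass on class-1 block
])
labels = np.array([0, 1])

pred, logits = classify(features, W)
print(pred)                              # [0 1]
print(np.allclose(W @ W.T, np.eye(2)))   # True: rows are orthonormal

weights = balanced_class_weights(np.array([0, 0, 0, 1]), num_classes=2)
print(weights)                           # minority class gets the larger weight
print(weighted_cross_entropy(logits, labels, balanced_class_weights(labels, 2)))
```

Because `W` is fixed, the head contributes no trainable parameters in this sketch; only the feature extractor upstream would be updated to push each class's features toward its prototype.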
Results & Findings
| Metric | XtraLight‑MedMamba | Prior Transformer‑based | Vanilla Mamba |
|---|---|---|---|
| Accuracy | 97.18 % | 93.4 % | 94.1 % |
| F1‑Score | 0.9767 | 0.938 | 0.951 |
| Parameters | ≈32 k | ~2 M | ~1.5 M |
| Inference time (per tile, 1080 Ti) | ~3 ms | ~12 ms | ~10 ms |
The model not only outperforms the heavier baselines but does so with far fewer parameters: going by the table, roughly 47 to 63 times fewer (≈32 k versus ~1.5 M and ~2 M). This suggests that the SCAB + FNO design acts as a strong regularizer and generalizes better on limited medical data.
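The size and speed gaps can be recomputed directly from the table's approximate figures. The short script below does only that arithmetic; the dictionary names are illustrative, and because the table values are rounded, the resulting ratios should be read as rough estimates.

```python
# Approximate figures copied from the results table above.
PARAMS = {
    "XtraLight-MedMamba": 32_000,
    "Prior Transformer-based": 2_000_000,
    "Vanilla Mamba": 1_500_000,
}
LATENCY_MS = {
    "XtraLight-MedMamba": 3,
    "Prior Transformer-based": 12,
    "Vanilla Mamba": 10,
}
OURS = "XtraLight-MedMamba"

def fold_change(baseline, ours):
    """How many times larger the baseline value is than ours."""
    return baseline / ours

param_ratio = {
    name: fold_change(n, PARAMS[OURS])
    for name, n in PARAMS.items() if name != OURS
}
speedup = {
    name: fold_change(t, LATENCY_MS[OURS])
    for name, t in LATENCY_MS.items() if name != OURS
}

for name in param_ratio:
    print(f"{name}: {param_ratio[name]:.1f}x parameters, "
          f"{speedup[name]:.1f}x latency")
# Prior Transformer-based: 62.5x parameters, 4.0x latency
# Vanilla Mamba: 46.9x parameters, 3.3x latency
```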
Practical Implications
- Edge deployment – With < 50 k parameters, the model can run on commodity CPUs, low‑end GPUs, or specialized inference chips, which could support near‑real‑time tile scoring in pathology labs once a biopsy slide has been digitized.
- Cost‑effective screening – Hospitals can integrate the AI into existing digital pathology pipelines without needing expensive GPU clusters, lowering the barrier for AI‑assisted risk stratification.
- Standardization of pathology – By providing an objective, reproducible readout for low‑grade dysplasia, the tool can reduce inter‑observer variability among pathologists and support tele‑pathology workflows.
- Data‑efficient training – The architecture’s parameter efficiency makes it suitable for other histopathology tasks where annotated WSIs are scarce, encouraging broader adoption across cancer types.
- Regulatory pathway – A lightweight model with a fixed orthogonal classifier has fewer learned components to validate and audit, which may simplify FDA clearance or CE marking.
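A quick back‑of‑the‑envelope calculation shows why a parameter count in this range is friendly to edge hardware. The sketch below estimates weight storage only, assuming the ~32 k figure quoted in this summary; activations, input tiles, and runtime overhead are not included, and the precision options are illustrative rather than taken from the paper.

```python
NUM_PARAMS = 32_000  # approximate trainable-parameter count from the summary

# Bytes per stored weight for a few common numeric formats (illustrative).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}

def weight_footprint_kib(num_params, bytes_per_param):
    """Storage needed for the weights alone, in KiB (1 KiB = 1024 bytes)."""
    return num_params * bytes_per_param / 1024

footprint = {
    dtype: weight_footprint_kib(NUM_PARAMS, nbytes)
    for dtype, nbytes in BYTES_PER_PARAM.items()
}

for dtype, kib in footprint.items():
    print(f"{dtype}: {kib:.2f} KiB")
# float32: 125.00 KiB
# float16: 62.50 KiB
# int8: 31.25 KiB
```

Even at full 32‑bit precision the weights occupy about 125 KiB under these assumptions, so memory pressure on small devices would come mainly from activations and image tiles rather than from the model itself.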
Limitations & Future Work
- Dataset scope – The study uses a single‑institution, curated cohort of low‑grade tubular adenomas; external validation on multi‑center datasets is needed to confirm robustness across staining protocols and scanner models.
- Binary focus – The current formulation distinguishes “high‑risk” vs. “low‑risk” adenomas; extending to multi‑class grading (e.g., high‑grade dysplasia, serrated lesions) would broaden clinical utility.
- Explainability – While attention maps are provided, deeper interpretability tools (e.g., concept activation vectors) could help clinicians understand why a tile is flagged as high‑risk.
- Integration with clinical data – Combining image features with patient metadata (age, genetics, lifestyle) may further improve predictive power and personalize surveillance intervals.
Overall, XtraLight‑MedMamba showcases how clever architectural choices can deliver state‑of‑the‑art performance in medical imaging while staying lightweight enough for real‑world deployment.
Authors
- Aqsa Sultana
- Rayan Afsar
- Ahmed Rahu
- Surendra P. Singh
- Brian Shula
- Brandon Combs
- Derrick Forchetti
- Vijayan K. Asari
Paper Information
- arXiv ID: 2602.04819v1
- Categories: cs.CV, cs.LG
- Published: February 4, 2026