[Paper] MORPHFED: Federated Learning for Cross-institutional Blood Morphology Analysis
Source: arXiv - 2601.04121v1
Overview
The paper presents MORPHFED, a federated learning (FL) framework that lets hospitals and labs train a shared AI model for white‑blood‑cell (WBC) morphology classification without ever moving patient images off‑site. By keeping data local, the approach respects privacy laws while still capturing the wide variability in staining, imaging hardware, and rare cell types that typically hampers a single‑center solution.
Key Contributions
- Privacy‑preserving cross‑institutional training – Demonstrates a practical FL pipeline for medical imaging that complies with data‑sharing restrictions common in low‑ and middle‑income countries (LMICs).
- Domain‑invariant feature learning – Shows that models trained across heterogeneous sites learn representations robust to staining and scanner differences.
- Comprehensive empirical study – Benchmarks both convolutional neural networks (CNNs) and vision transformers (ViTs) under federated vs. centralized training across multiple clinical sites.
- Improved generalization to unseen institutions – Federated models outperform centrally trained baselines when evaluated on data from hospitals not involved in training.
- Open‑source reference implementation – Provides code and a simulated multi‑site dataset to accelerate reproducibility and adoption in the community.
Methodology
- Data Partitioning – Blood‑film images from several hospitals are kept on their respective servers. Each site holds its own labeled WBC patches, reflecting local staining protocols and microscope settings.
- Model Architecture – The authors experiment with two families:
  - Classic CNNs (ResNet‑50, EfficientNet‑B3)
  - Vision Transformers (ViT‑Base, Swin‑Transformer)
- Federated Learning Loop – standard FedAvg‑style rounds (a minimal sketch of one round follows the implementation note below):
  - Local Update: Each site trains the current global model on its private data for a few epochs.
  - Secure Aggregation: Model weight updates are encrypted and sent to a central server that averages them, producing a new global model.
  - Repeat: The cycle runs for 50–100 communication rounds.
- Evaluation Protocol – After training, the global model is tested on (a sketch of this three‑way evaluation follows the results table):
  - In‑site test sets (same hospitals)
  - Cross‑site test sets (other participating hospitals)
  - Hold‑out institutions (completely unseen labs)
All steps are implemented with standard FL libraries (PySyft, Flower) and run on commodity GPUs, making the pipeline reproducible for other medical imaging tasks.
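For concreteness, here is a minimal sketch of one FedAvg communication round as described above, written in plain PyTorch rather than PySyft or Flower. The backbone choice (torchvision's ResNet‑50), the class count, the synthetic `site_loaders`, and the number of local epochs are illustrative assumptions rather than the authors' configuration, and the secure‑aggregation step is replaced by plain weighted averaging for clarity.

```python
# Minimal FedAvg sketch (one communication round). Illustrative stand-in for
# the paper's pipeline, not the authors' implementation: secure aggregation is
# replaced by plain weighted averaging, and the data loaders are synthetic.
import copy
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet50

NUM_WBC_CLASSES = 5   # assumed number of WBC classes (illustrative)
LOCAL_EPOCHS = 2      # "a few epochs" of local training per round
NUM_ROUNDS = 1        # the paper runs 50-100 rounds; 1 here for brevity

# Stand-ins for each hospital's private data (synthetic tensors here; in the
# real pipeline these are local WBC patches that never leave the site).
site_loaders = [
    DataLoader(TensorDataset(torch.randn(8, 3, 224, 224),
                             torch.randint(0, NUM_WBC_CLASSES, (8,))),
               batch_size=4)
    for _ in range(3)
]

def local_update(global_model, loader, device="cpu"):
    """Local update: train a copy of the current global model on one site's data."""
    model = copy.deepcopy(global_model).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(LOCAL_EPOCHS):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images.to(device)), labels.to(device))
            loss.backward()
            optimizer.step()
    return model.state_dict(), len(loader.dataset)

def fedavg_aggregate(updates):
    """Server side: average state dicts weighted by local sample counts."""
    total = sum(n for _, n in updates)
    ref = updates[0][0]
    avg = {k: torch.zeros_like(v, dtype=torch.float32) for k, v in ref.items()}
    for state, n in updates:
        for k, v in state.items():
            avg[k] += v.detach().float() * (n / total)
    # Restore original dtypes (e.g. integer BatchNorm counters).
    return {k: v.to(ref[k].dtype) for k, v in avg.items()}

global_model = resnet50(num_classes=NUM_WBC_CLASSES)  # one backbone studied
for _ in range(NUM_ROUNDS):
    updates = [local_update(global_model, loader) for loader in site_loaders]
    global_model.load_state_dict(fedavg_aggregate(updates))
```

In the paper's setting this loop repeats for 50–100 rounds, with only the (encrypted) weight updates crossing institutional boundaries.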
Results & Findings
| Setup | In‑site Accuracy | Cross‑site Accuracy | Unseen‑site Accuracy |
|---|---|---|---|
| Centralized CNN | 92.1 % | 78.4 % | 71.2 % |
| Federated CNN (FedAvg) | 91.8 % | 84.7 % | 78.5 % |
| Centralized ViT | 93.3 % | 80.1 % | 73.0 % |
| Federated ViT | 93.0 % | 86.2 % | 81.4 % |
- Cross‑site accuracy improves by roughly 6 percentage points (about 8 % relative) with FL, indicating better handling of domain shift.
- Unseen‑site generalization improves by 7–8 percentage points (roughly 10 % relative), suggesting the global model learns genuinely transferable features.
- Communication overhead stays modest (≈ 2 MB per round), and training time is comparable to centralized baselines because local epochs are performed in parallel.
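To make the three accuracy columns above concrete, here is a minimal sketch of how the in‑site / cross‑site / unseen‑site evaluation described in the Methodology section could be computed. The per‑institution test loaders and their grouping are assumptions for illustration; only the protocol itself follows the paper's description.

```python
# Illustrative three-way evaluation of a trained global model, assuming test
# DataLoaders grouped by institution type. The grouping and argument names are
# hypothetical; only the protocol mirrors the paper.
import torch

@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    """Top-1 accuracy of the global model on one institution's test set."""
    model.eval().to(device)
    correct = total = 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1).cpu()
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / max(total, 1)

def evaluate_protocol(model, in_site, cross_site, unseen_site):
    """Mean accuracy per group of institution-level test loaders."""
    def mean_acc(loaders):
        return sum(accuracy(model, l) for l in loaders) / len(loaders)
    return {
        "in_site": mean_acc(in_site),          # hospitals that took part in training
        "cross_site": mean_acc(cross_site),    # other participating hospitals
        "unseen_site": mean_acc(unseen_site),  # completely held-out institutions
    }
```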
Practical Implications
- Scalable AI for LMICs: Clinics can contribute to a shared diagnostic model without violating patient confidentiality or needing high‑bandwidth data transfers.
- Rapid Deployment: New hospitals can join the federation, download the latest global weights, and start local fine‑tuning immediately, shortening the time to clinical use (a minimal fine‑tuning sketch follows this list).
- Robust Diagnostics: The domain‑invariant model reduces false‑negative rates caused by staining variations, leading to more reliable automated blood‑film reads in resource‑constrained labs.
- Regulatory Alignment: By keeping raw images on‑premises, the approach aligns with GDPR, HIPAA, and emerging data‑sovereignty laws, easing legal clearance for AI‑assisted diagnostics.
- Reusable Blueprint: The same FL pipeline can be adapted to other microscopy tasks (e.g., malaria detection, histopathology), encouraging a broader ecosystem of privacy‑first medical AI.
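As a rough illustration of the Rapid Deployment workflow above, the sketch below loads shared global weights and fine‑tunes only the classification head on a new site's data. The checkpoint path, class count, and the decision to freeze the backbone are assumptions for illustration and are not described in the paper.

```python
# Hypothetical onboarding of a new hospital joining the federation: load the
# latest global weights and fine-tune locally. Checkpoint path, class count and
# the backbone-freezing strategy are illustrative assumptions.
import os
import torch
import torch.nn as nn
from torchvision.models import resnet50

NUM_WBC_CLASSES = 5                   # assumed label set size
CKPT = "global_round_100.pt"          # hypothetical path to the shared global weights

model = resnet50(num_classes=NUM_WBC_CLASSES)
if os.path.exists(CKPT):
    model.load_state_dict(torch.load(CKPT, map_location="cpu"))

# Keep the shared, domain-invariant backbone fixed; adapt only the classifier
# head to the new site's local staining and scanner characteristics.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_WBC_CLASSES)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
# ...then run a short training loop on the new site's local WBC patches,
# as in the local_update sketch above.
```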
Limitations & Future Work
- Simulated Network Conditions: Experiments used a stable LAN; real‑world WAN latency and intermittent connectivity could affect convergence speed.
- Label Heterogeneity: The study assumes consistent annotation guidelines across sites; future work should explore federated learning with noisy or partially overlapping label sets.
- Model Compression: While communication costs are low, deploying large ViTs on edge devices in low‑resource settings may require pruning or quantization, which the authors plan to investigate.
- Clinical Validation: The current evaluation is retrospective; prospective trials in actual diagnostic workflows are needed to confirm real‑world impact.
Bottom line: MORPHFED shows that federated learning isn’t just a theoretical privacy tool—it can materially improve the robustness and reach of AI‑driven blood‑cell analysis, paving the way for equitable, data‑secure medical imaging solutions worldwide.
Authors
- Gabriel Ansah
- Eden Ruffell
- Delmiro Fernandez-Reyes
- Petru Manescu
Paper Information
- arXiv ID: 2601.04121v1
- Categories: cs.LG, cs.CV
- Published: January 7, 2026