[Paper] FL-MHSM: Spatially-adaptive Fusion and Ensemble Learning for Flood-Landslide Multi-Hazard Susceptibility Mapping at Regional Scale
Source: arXiv - 2604.16265v1
Overview
The paper introduces FL‑MHSM, a deep‑learning workflow that jointly predicts flood and landslide susceptibility over large regions. It combines two‑level spatial partitioning with early‑ and late‑fusion strategies and blends them through a soft‑gating Mixture‑of‑Experts (MoE) model. The authors report higher accuracy and better‑calibrated probabilities than traditional uniform‑grid approaches.
Key Contributions
- Two‑level spatial partitioning that respects natural heterogeneity (zonal partitions + overlapping lattice grids) while still enabling data‑parallel inference.
- Probabilistic Early Fusion (EF) and a tree‑based Late Fusion (LF) baseline, demonstrating complementary strengths for different hazards.
- Soft‑gating Mixture‑of‑Experts (MoE) that weights the two experts (EF and LF) per location rather than picking one outright, delivering the strongest overall performance.
- Extensive empirical validation on two contrasting regions (Kerala, India and Nepal) showing consistent gains in AUC‑ROC, recall, and Brier score.
- Interpretability layer using GeoDetector to reveal how dominant drivers (topography, land‑cover, drainage, glacier proximity) vary across zones.
Methodology
- Data preparation – The authors gather raster layers (elevation, slope, land‑cover, precipitation, etc.) and hazard occurrence points for floods and landslides.
- Spatial partitioning –
  - Zonal level: The study area is split into homogeneous zones based on terrain and climate clusters.
  - Lattice level: Within each zone, overlapping square tiles are created so that predictions can be run in parallel on GPUs.
- Model families –
  - Early Fusion (EF): All raster channels are concatenated and fed into a single convolutional neural network (CNN). The network outputs probabilistic hazard maps for both flood and landslide simultaneously.
  - Late Fusion (LF): Separate CNNs are trained per hazard; their predictions are later combined by a gradient‑boosted decision tree (GBDT) that learns how to weight each hazard’s output.
- Mixture‑of‑Experts (MoE) – A lightweight gating network looks at the same input tiles and learns a soft probability of “which expert (EF or LF) should dominate here.” The final prediction is a weighted sum of the two experts’ outputs, allowing the system to adapt locally.
- Training & evaluation – Models are trained with a class‑balanced cross‑entropy loss. Calibration is assessed with the Brier score, and discrimination with AUC‑ROC, recall, and F1‑score on held‑out test tiles.
- Interpretability – GeoDetector quantifies the explanatory power of each covariate per zone on the MoE’s final susceptibility scores.
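The soft‑gating step described above can be illustrated with a minimal NumPy sketch. It assumes each expert has already produced a per‑pixel probability map and that a gating network has emitted two logits per pixel; the function names, array shapes, and toy values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def moe_combine(p_ef, p_lf, gate_logits):
    """Blend two expert probability maps with per-pixel soft gates.

    p_ef, p_lf  : (H, W) susceptibility probabilities from each expert
    gate_logits : (H, W, 2) unnormalized gate scores (hypothetical gating output)
    """
    weights = softmax(gate_logits, axis=-1)  # each pixel's two weights sum to 1
    return weights[..., 0] * p_ef + weights[..., 1] * p_lf

# Toy 2x2 tile: made-up expert outputs, for illustration only.
p_ef = np.array([[0.9, 0.2], [0.6, 0.1]])
p_lf = np.array([[0.7, 0.4], [0.2, 0.3]])

gate_logits = np.zeros((2, 2, 2))   # equal weights everywhere...
gate_logits[0, 0] = [5.0, -5.0]     # ...except one pixel that strongly favors EF

fused = moe_combine(p_ef, p_lf, gate_logits)
```

Because the gate weights are a convex combination, the fused value at every pixel stays between the two experts' outputs, which is what lets the blend shift smoothly from one expert to the other across a region.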
Results & Findings
| Region | Hazard | Model | AUC‑ROC | Recall | Brier score (lower is better) | F1 |
|---|---|---|---|---|---|---|
| Kerala | Flood | LF → EF | – | 0.816 → 0.840 | 0.092 → 0.086 | – |
| Kerala | Flood | MoE (final) | 0.905 | 0.930 | – | 0.722 |
| Nepal | Flood | LF → EF | – | 0.820 → 0.858 | 0.057 → 0.049 | – |
| Nepal | Landslide | MoE (final) | 0.914 | 0.901 | – | 0.559 |
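For readers unfamiliar with the calibration metric in the table, the Brier score is the mean squared difference between predicted probabilities and binary outcomes, so lower values are better. The sketch below uses made‑up labels and predictions purely to show the computation; it is not data from the paper.

```python
import numpy as np

def brier_score(y_true, p_pred):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return float(np.mean((p_pred - y_true) ** 2))

# Illustrative values only: 1 = hazard occurred, 0 = no hazard.
y = [1, 0, 1, 0]
p_sharp = [0.9, 0.1, 0.8, 0.2]   # confident and correct
p_vague = [0.6, 0.4, 0.6, 0.4]   # correct side of 0.5, but hedged

score_sharp = brier_score(y, p_sharp)
score_vague = brier_score(y, p_vague)
```

Both toy prediction sets classify every point correctly at a 0.5 threshold, yet the sharper one scores lower. This is why the Brier score complements recall and AUC‑ROC.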
- Early Fusion consistently improved flood recall and calibration compared with Late Fusion.
- Mixture‑of‑Experts outperformed both individual experts, with the highest AUC‑ROC in the reported cases (0.905 for Kerala floods, 0.914 for Nepal landslides).
- GeoDetector revealed that in Kerala, different zones are driven by distinct factor combinations (e.g., drainage in coastal zones, land‑cover in mid‑hills), whereas Nepal’s zones share a strong topographic/glacier signal.
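The explanatory power that GeoDetector reports is its factor‑detector q‑statistic: q = 1 − (Σ N_h σ_h²) / (N σ²), where the strata h are the classes of a discretized covariate. The sketch below is a minimal version of that formula. How the paper discretizes covariates or samples pixels is not stated in this summary, so the inputs are toy values.

```python
import numpy as np

def geodetector_q(values, strata):
    """Factor-detector q-statistic: share of variance in `values`
    explained by the grouping in `strata` (0 = none, 1 = all)."""
    values = np.asarray(values, dtype=float)
    strata = np.asarray(strata)
    total_ss = values.size * values.var()       # N * sigma^2 (population variance)
    if total_ss == 0:
        return float("nan")                     # undefined when values are constant
    within_ss = 0.0
    for s in np.unique(strata):
        group = values[strata == s]
        within_ss += group.size * group.var()   # N_h * sigma_h^2
    return 1.0 - within_ss / total_ss

# Toy susceptibility scores and two candidate stratifications (illustrative).
scores = [0.1, 0.1, 0.9, 0.9]
q_aligned = geodetector_q(scores, [0, 0, 1, 1])    # strata separate the scores
q_unrelated = geodetector_q(scores, [0, 1, 0, 1])  # strata ignore the scores
```

Computing q per covariate within each zone, then comparing the rankings across zones, is one way to arrive at the kind of zone‑specific driver comparison described above.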
Practical Implications
| Audience | Takeaway |
|---|---|
| GIS / Remote‑Sensing Engineers | The two‑level tiling scheme lets you run massive susceptibility maps on commodity GPU clusters without sacrificing spatial nuance. |
| Disaster‑Risk Planners | Joint flood‑landslide maps with calibrated probabilities enable more informed, multi‑hazard evacuation and infrastructure siting decisions. |
| ML Practitioners | The MoE gating pattern is a lightweight way to blend heterogeneous models (CNN vs tree‑based) and can be repurposed for other multi‑task geospatial problems (e.g., wildfire‑earthquake risk). |
| Software Vendors | The workflow can be packaged as a micro‑service: ingest raster tiles → run EF/LF → MoE gating → output GeoTIFFs with per‑pixel uncertainty. |
| Policy Makers | Zone‑level factor analysis (via GeoDetector) provides a data‑driven narrative to justify region‑specific mitigation investments. |
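The lattice‑level tiling mentioned in the table can be sketched as a small window generator. The tile size, overlap, and function names below are assumptions chosen for illustration, since the summary gives no concrete values. The sketch also stops short of merging predictions in the overlap regions.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of 1-D windows of size `tile` sharing `overlap` pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if length <= tile:
        return [0]                        # one (possibly short) window covers everything
    stride = tile - overlap
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] + tile < length:
        starts.append(length - tile)      # extra window flush with the far edge
    return starts

def overlapping_tiles(height, width, tile=256, overlap=32):
    """Return (row0, col0, row1, col1) windows covering a height x width raster."""
    return [
        (r, c, min(r + tile, height), min(c + tile, width))
        for r in tile_starts(height, tile, overlap)
        for c in tile_starts(width, tile, overlap)
    ]

# Hypothetical 600 x 500 raster for one zone.
tiles = overlapping_tiles(600, 500)
```

Each window can be sent to a worker independently. A common follow‑up, not shown here, is to average predictions where windows overlap so that tile seams do not appear in the final map.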
Limitations & Future Work
- Data dependence – The approach assumes high‑quality, uniformly processed raster layers; noisy or missing covariates could degrade MoE gating.
- Model complexity – Training two deep CNNs plus a GBDT and a gating network increases compute cost; scaling to continental extents may require model pruning or distillation.
- Temporal dynamics – The current pipeline is static; incorporating time‑varying predictors (e.g., seasonal precipitation forecasts) could improve early warning capabilities.
- Generalization – Validation is limited to two regions; testing on other tectonic or climatic settings would confirm robustness.
- Interpretability depth – GeoDetector offers zone‑level importance but does not capture interaction effects; future work could integrate SHAP or counterfactual analyses for richer explanations.
TL;DR: FL‑MHSM shows that a spatially‑aware mixture‑of‑experts can fuse early‑ and late‑fusion deep models to produce more accurate, calibrated flood‑landslide risk maps, while also revealing which environmental drivers matter where. For developers building large‑scale geospatial AI pipelines, the paper offers a practical blueprint for balancing performance, interpretability, and scalability.
Authors
- Aswathi Mundayatt
- Jaya Sreevalsan-Nair
Paper Information
- arXiv ID: 2604.16265v1
- Categories: cs.LG
- Published: April 17, 2026