[Paper] Interpretable Plant Leaf Disease Detection Using Attention-Enhanced CNN

Published: December 19, 2025
Source: arXiv (2512.17864v1)

Overview

Plant diseases can devastate crops and threaten food security, yet many growers still rely on manual visual inspections that are slow and error‑prone. The paper Interpretable Plant Leaf Disease Detection Using Attention‑Enhanced CNN proposes a new deep‑learning model—CBAM‑VGG16—that not only pushes detection accuracy past 98 % on several benchmark datasets but also offers clear visual explanations of why it makes each decision. By marrying a classic VGG16 backbone with modern attention modules, the authors deliver a system that is both high‑performing and trustworthy for real‑world farming applications.

Key Contributions

  • Attention‑augmented architecture: Integrates a Convolutional Block Attention Module (CBAM) after every convolutional block of VGG16, improving focus on disease‑relevant leaf regions.
  • State‑of‑the‑art performance: Achieves up to 98.87 % accuracy across five heterogeneous plant disease datasets, outperforming recent CNN‑based baselines.
  • Interpretability pipeline: Combines CBAM attention maps with post‑hoc explainability tools (Grad‑CAM, Grad‑CAM++, LRP) to produce human‑readable visualizations of disease cues.
  • Robust generalization: Demonstrates consistent results on cross‑dataset validation, indicating the model can handle variations in lighting, background, and leaf morphology.
  • Open‑source release: Provides full training and inference code (GitHub link) to accelerate adoption and reproducibility in the ag‑tech community.

Methodology

  1. Base Network – The authors start with VGG16, a well‑understood CNN known for its simplicity and strong feature hierarchy.
  2. CBAM Integration – After each convolutional block, a lightweight CBAM is inserted. CBAM sequentially applies channel‑wise and spatial‑wise attention, letting the network amplify informative feature maps (e.g., spots, discolorations) while suppressing background noise.
  3. Training Regime – Standard data augmentation (random flips, rotations, color jitter) is used to mimic field conditions. The model is trained with cross‑entropy loss and Adam optimizer, fine‑tuned on each dataset separately.
  4. Interpretability Suite – During inference, the built‑in CBAM attention maps are visualized alongside Grad‑CAM, Grad‑CAM++, and Layer‑wise Relevance Propagation (LRP) heatmaps. This multi‑view approach helps users verify that the model’s focus aligns with agronomists’ expectations.
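The CBAM block in step 2 can be sketched in a few lines of PyTorch. This is a minimal illustration of the channel-then-spatial attention described above, not the authors' released code; the reduction ratio (16) and spatial kernel size (7) are common defaults assumed here.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze feature maps to per-channel weights via avg- and max-pooling."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max pooling
        scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * scale


class SpatialAttention(nn.Module):
    """Produce a 2-D attention map from channel-pooled descriptors."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)
        mx = x.amax(dim=1, keepdim=True)
        scale = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * scale


class CBAM(nn.Module):
    """Channel attention followed by spatial attention, as in CBAM."""

    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```

Because the block preserves the input's shape, it can be dropped in after any VGG16 convolutional block without altering the rest of the network.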

The pipeline remains straightforward: feed a leaf image → CBAM‑VGG16 predicts disease class → visual explanations are generated automatically.

Results & Findings

| Dataset (samples) | Accuracy | F1-Score | Notable Observation |
| --- | --- | --- | --- |
| Apple Scab (2k) | 98.87% | 0.987 | CBAM highlights lesion edges, matching expert annotations |
| Tomato Early Blight (3k) | 97.94% | 0.979 | Spatial attention suppresses soil/background clutter |
| Grape Black Rot (1.5k) | 98.31% | 0.982 | Grad-CAM++ confirms focus on vein discoloration |
| … (3 other datasets) | >96% | >0.95 | Consistent cross-dataset performance |

Across the board, the CBAM‑enhanced model outperformed a vanilla VGG16 and several recent attention‑based classifiers by 1.5–3 % absolute accuracy. The interpretability analysis showed that attention maps consistently overlapped with disease symptoms identified by domain experts, reinforcing trustworthiness.

Practical Implications

  • Smart Farming Apps – Developers can embed the pre‑trained CBAM‑VGG16 model into mobile or edge devices for on‑site disease scouting, delivering instant, explainable results to farmers.
  • Decision Support Systems – The visual heatmaps can be displayed alongside predictions in farm management dashboards, helping agronomists validate AI suggestions before taking action (e.g., targeted pesticide application).
  • Reduced Data Collection Costs – Because the model generalizes well across varied lighting and backgrounds, growers need not invest heavily in controlled imaging setups; ordinary smartphone photos suffice.
  • Regulatory & Trust Barriers – Explainable AI is increasingly required for agricultural AI tools. The built‑in attention visualizations satisfy many compliance and adoption hurdles by making the “black box” transparent.
  • Open‑source Ecosystem – The released codebase enables rapid prototyping, fine‑tuning for region‑specific crops, and integration with existing IoT pipelines (e.g., drone‑based imaging).

Limitations & Future Work

  • Dataset Diversity – While five datasets were used, they still represent a limited set of crops and disease stages; performance on rare or mixed infections remains untested.
  • Computation Overhead – Adding CBAM modules increases inference time modestly (~10 % slower than vanilla VGG16), which may be a bottleneck on ultra‑low‑power edge hardware.
  • Explainability Depth – The current visualizations are qualitative; quantitative metrics linking attention intensity to disease severity were not explored.
  • Future Directions – The authors suggest extending the architecture to lightweight backbones (e.g., MobileNet) for real‑time deployment, incorporating multi‑spectral imagery (e.g., NIR), and developing a severity‑estimation module that leverages attention scores.

Authors

  • Balram Singh
  • Ram Prakash Sharma
  • Somnath Dey

Paper Information

  • arXiv ID: 2512.17864v1
  • Categories: cs.CV, cs.AI
  • Published: December 19, 2025
