[Paper] An explainable hybrid deep learning-enabled intelligent fault detection and diagnosis approach for automotive software systems validation

Published: March 9, 2026 at 05:46 AM EDT

Source: arXiv - 2603.08165v1

Overview

The paper presents an explainable hybrid deep‑learning framework for fault detection, diagnosis, and localization in automotive software systems (ASS) during validation. By combining a 1‑D CNN‑GRU model with state‑of‑the‑art XAI techniques, the authors deliver a solution that not only spots failures in real‑time test drives but also tells engineers why a particular fault was flagged, bridging the gap between black‑box accuracy and practical interpretability.

Key Contributions

  • Hybrid 1‑D CNN‑GRU architecture tailored for time‑series sensor data from hardware‑in‑the‑loop (HiL) test drives.
  • Integration of multiple XAI methods (Integrated Gradients, DeepLIFT, Gradient SHAP, DeepLIFT‑SHAP) to generate per‑feature attribution maps for each fault prediction.
  • End‑to‑end pipeline that goes from raw validation recordings to fault detection, identification, and precise localization of the root cause.
  • Real‑world evaluation on a live HiL dataset collected during a virtual test drive, demonstrating both high detection accuracy and actionable explanations.
  • Guidelines for model adaptation based on the explanations, enabling developers to iteratively refine the fault detection model without retraining from scratch.

Methodology

  1. Data Collection – The authors recorded multivariate time‑series signals (e.g., CAN bus messages, sensor readings) from a virtual test drive executed on a HiL platform.
  2. Pre‑processing – Signals were synchronized, normalized, and segmented into fixed‑length windows that capture the temporal dynamics of a driving scenario.
  3. Hybrid Model
    • 1‑D CNN layers act as feature extractors, learning local patterns (spikes, ramps) across the sensor channels.
    • GRU (Gated Recurrent Unit) layers capture longer‑range temporal dependencies, crucial for detecting faults that manifest over several seconds.
    • The combined network outputs a softmax over fault classes (including “no‑fault”).
  4. Explainability Layer – After a prediction, the model is probed with four XAI algorithms:
    • Integrated Gradients (IG) – accumulates gradients along a straight‑line path from a baseline to the input.
    • DeepLIFT – compares activation differences to a reference, attributing contributions to each input.
    • Gradient SHAP – merges SHAP values with gradient information for smoother attributions.
    • DeepLIFT‑SHAP – a hybrid that leverages both DeepLIFT and SHAP concepts.
      These produce heat‑maps that highlight which sensor channels and time steps drove the decision.
  5. Root‑Cause Analysis (RCA) – By overlaying attribution maps on the original signals, engineers can pinpoint the exact subsystem (e.g., brake controller, throttle actuator) responsible for the fault.
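The pre‑processing step (2) can be sketched as a sliding‑window segmentation with per‑channel normalization; `window_len` and `stride` here are illustrative placeholders, not the paper's actual settings:

```python
import numpy as np

def segment(signals, window_len=200, stride=100):
    """Split a multivariate recording (T, C) into overlapping
    fixed-length windows (N, window_len, C), z-scored per channel."""
    # Per-channel normalization over the whole recording
    mu = signals.mean(axis=0)
    sigma = signals.std(axis=0) + 1e-8
    normed = (signals - mu) / sigma
    # Sliding windows that preserve the temporal dynamics
    starts = range(0, signals.shape[0] - window_len + 1, stride)
    return np.stack([normed[s:s + window_len] for s in starts])

# Example: a 1000-step recording with 4 sensor channels
x = np.random.default_rng(0).normal(size=(1000, 4))
windows = segment(x)
print(windows.shape)  # (9, 200, 4)
```

Each window then becomes one input sample for the hybrid model; overlapping strides trade storage for denser coverage of transient faults.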
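The forward pass of step (3) can be illustrated with a minimal NumPy sketch; the layer sizes and random weights below are toy placeholders, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

def conv1d_relu(x, W, b):
    """1-D convolution over time. x: (T, C_in), W: (k, C_in, C_out)."""
    k = W.shape[0]
    out = np.stack([np.tensordot(x[t:t + k], W, axes=([0, 1], [0, 1])) + b
                    for t in range(x.shape[0] - k + 1)])
    return np.maximum(out, 0.0)  # ReLU

def gru_last_state(x, P):
    """Run a GRU cell over the sequence, return the final hidden state."""
    h = np.zeros(P["Uz"].shape[0])
    for xt in x:
        z = sigmoid(P["Wz"] @ xt + P["Uz"] @ h)          # update gate
        r = sigmoid(P["Wr"] @ xt + P["Ur"] @ h)          # reset gate
        h_cand = np.tanh(P["Wh"] @ xt + P["Uh"] @ (r * h))
        h = (1.0 - z) * h + z * h_cand
    return h

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# Toy sizes: 200-step window, 4 channels, 8 conv filters,
# 16 GRU units, 6 classes (e.g. 5 fault types + "no-fault")
T, C, F, H, K = 200, 4, 8, 16, 6
Wc, bc = rng.normal(0, 0.1, (5, C, F)), np.zeros(F)
P = {n: rng.normal(0, 0.1, (H, F if n[0] == "W" else H))
     for n in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
Wo, bo = rng.normal(0, 0.1, (K, H)), np.zeros(K)

window = rng.normal(size=(T, C))
feats = conv1d_relu(window, Wc, bc)   # local patterns (spikes, ramps)
h = gru_last_state(feats, P)          # longer-range temporal context
probs = softmax(Wo @ h + bo)          # softmax over fault classes
print(probs.round(3))
```

The CNN front‑end compresses each window into local pattern activations, and the GRU summarizes how those patterns evolve, which is why the combination handles faults that only emerge over several seconds.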
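The attribution idea behind step (4) can be shown with Integrated Gradients on a toy differentiable model; `f` and `grad_f` below are hypothetical stand‑ins for the trained network and its gradient, not the paper's implementation:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=64):
    """Riemann-sum approximation of Integrated Gradients:
    attr_i = (x_i - b_i) * mean over alpha of dF/dx_i at b + alpha*(x - b)."""
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Toy differentiable "model": f(x) = sum(w * x^2), gradient 2*w*x
w = np.array([0.5, -1.0, 2.0])
f = lambda v: float((w * v ** 2).sum())
grad_f = lambda v: 2.0 * w * v

x = np.array([1.0, 2.0, -1.0])
baseline = np.zeros_like(x)
attr = integrated_gradients(grad_f, x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline)
print(attr, attr.sum(), f(x) - f(baseline))
```

Applied per time step and per sensor channel, the same attribution vector becomes the heat‑map that the paper overlays on the raw signals for root‑cause analysis.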

Results & Findings

| Metric | Value |
| --- | --- |
| Fault detection accuracy | ≈ 96 % (across 5 fault types) |
| False‑positive rate | < 2 % |
| Explanation fidelity (deletion/insertion tests) | > 0.85 for all XAI methods |
| Localization precision (offset between true fault source and top‑attributed signal region) | ≤ 0.3 s of signal time |
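The "explanation fidelity" row refers to deletion/insertion tests: if an attribution is faithful, zeroing out the most‑attributed inputs first should degrade the model's score fastest. A minimal sketch, using a toy linear scorer with exact attributions (names hypothetical):

```python
import numpy as np

def deletion_curve(score, x, attr, baseline=0.0):
    """Zero out features from most- to least-attributed and record the
    model score after each deletion; a faithful attribution drives the
    score down quickly (small area under this curve)."""
    order = np.argsort(-np.abs(attr))  # most important features first
    xs = x.astype(float).copy()
    curve = [score(xs)]
    for i in order:
        xs[i] = baseline
        curve.append(score(xs))
    return np.array(curve)

# Toy linear scorer; for a linear model, w * x is the exact attribution
w = np.array([3.0, 0.1, -2.0, 0.5])
score = lambda v: float(w @ v)
x = np.array([1.0, 1.0, -1.0, 1.0])
attr = w * x

curve = deletion_curve(score, x, attr)
print(curve)  # [5.6, 2.6, 0.6, 0.1, 0.0] -- steep early drop
```

The insertion test is the mirror image (start from the baseline and add features back in attribution order); summarizing either curve as a normalized area gives a fidelity score comparable across XAI methods.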

Key takeaways:

  • The hybrid CNN‑GRU outperformed pure CNN or pure RNN baselines, especially on faults with subtle temporal signatures.
  • Attribution maps were consistent across XAI methods, giving developers confidence that the model was focusing on the right signals.
  • Using the explanations, the team could manually adjust a mis‑behaving subsystem and observe the model’s prediction shift without retraining, confirming the “model‑adaptation” claim.

Practical Implications

  • Faster Validation Cycles – Engineers can run automated fault detection on HiL test runs and instantly receive a diagnostic report, substantially cutting manual log‑analysis time.
  • Safety‑Critical Confidence – Explainable outputs satisfy regulatory demands (e.g., ISO 26262) for traceability of safety decisions, making it easier to certify autonomous driving software.
  • Continuous Integration – The pipeline can be embedded into CI/CD for automotive ECUs, flagging regressions the moment a new firmware version is deployed to the test bench.
  • Model Maintenance – Because the XAI layer reveals which features dominate predictions, developers can prune irrelevant sensors or recalibrate thresholds, reducing model size and inference latency for on‑board deployment.
  • Cross‑Domain Reuse – The same architecture can be repurposed for other cyber‑physical systems (e.g., aerospace, industrial robotics) where time‑series fault detection with interpretability is required.

Limitations & Future Work

  • Dataset Scope – The study used a single HiL scenario; broader validation across diverse vehicle platforms and real‑world driving data is needed to confirm generalizability.
  • Explanation Granularity – While the XAI methods highlight influential sensors, they do not yet provide causal graphs linking multiple subsystems, which could further aid root‑cause isolation.
  • Real‑Time Constraints – The current implementation runs offline; optimizing the model and XAI calculations for on‑board, sub‑millisecond inference remains an open challenge.
  • User Study – The paper does not include a formal evaluation of how engineers interact with the explanations; future work could assess usability and decision‑making impact.

Bottom line: By delivering both high‑accuracy fault detection and transparent, actionable explanations, this hybrid deep‑learning approach paves the way for more trustworthy, efficient validation pipelines in modern automotive software development.

Authors

  • Mohammad Abboush
  • Ehab Ghannoum
  • Andreas Rausch

Paper Information

  • arXiv ID: 2603.08165v1
  • Categories: cs.SE, cs.AI
  • Published: March 9, 2026
  • PDF: available via arXiv