[Paper] An explainable hybrid deep learning-enabled intelligent fault detection and diagnosis approach for automotive software systems validation
Source: arXiv - 2603.08165v1
Overview
The paper presents an explainable hybrid deep‑learning framework for fault detection, diagnosis, and localization in automotive software systems (ASS) during validation. By marrying a 1‑D CNN‑GRU model with state‑of‑the‑art XAI techniques, the authors deliver a solution that not only spots failures in real‑time test drives but also tells engineers why a particular fault was flagged—bridging the gap between black‑box accuracy and practical interpretability.
Key Contributions
- Hybrid 1‑D CNN‑GRU architecture tailored for time‑series sensor data from hardware‑in‑the‑loop (HiL) test drives.
- Integration of multiple XAI methods (Integrated Gradients, DeepLIFT, Gradient SHAP, DeepLIFT‑SHAP) to generate per‑feature attribution maps for each fault prediction.
- End‑to‑end pipeline that goes from raw validation recordings to fault detection, identification, and precise localization of the root cause.
- Real‑world evaluation on a live HiL dataset collected during a virtual test drive, demonstrating both high detection accuracy and actionable explanations.
- Guidelines for model adaptation based on the explanations, enabling developers to iteratively refine the fault detection model without retraining from scratch.
Methodology
- Data Collection – The authors recorded multivariate time‑series signals (e.g., CAN bus messages, sensor readings) from a virtual test drive executed on a HiL platform.
- Pre‑processing – Signals were synchronized, normalized, and segmented into fixed‑length windows that capture the temporal dynamics of a driving scenario.
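The normalize-and-window step described above can be sketched in a few lines of plain Python; the window length and stride below are illustrative assumptions, not values from the paper:

```python
# Sketch of the pre-processing step: z-score normalization per channel,
# then segmentation into fixed-length, possibly overlapping windows.
import math

def zscore(signal):
    """Normalize one channel to zero mean and unit variance."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n
    std = math.sqrt(var) or 1.0  # guard against constant channels
    return [(x - mean) / std for x in signal]

def window(signal, length, stride):
    """Segment a channel into fixed-length windows with a given stride."""
    return [signal[i:i + length]
            for i in range(0, len(signal) - length + 1, stride)]

# Example: a 10-sample channel cut into 4-sample windows with stride 2.
channel = zscore([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
windows = window(channel, length=4, stride=2)  # 4 overlapping windows
```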
- Hybrid Model –
  - 1‑D CNN layers act as feature extractors, learning local patterns (spikes, ramps) across the sensor channels.
  - GRU (Gated Recurrent Unit) layers capture longer‑range temporal dependencies, crucial for detecting faults that manifest over several seconds.
  - The combined network outputs a softmax over fault classes (including “no‑fault”).
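As a rough sketch of this hybrid forward pass, the toy example below runs a single-channel signal through one 1‑D convolution kernel, a scalar GRU cell, and a softmax. All weights, sizes, and class labels here are assumptions for illustration, not the paper’s actual architecture:

```python
# Toy forward pass: conv1d features -> GRU over time -> softmax over classes.
import math

def conv1d(x, kernel):
    """Valid 1-D convolution of a single channel with one kernel."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, w):
    """One GRU update with scalar input x and scalar hidden state h."""
    z = sigmoid(w["wz"] * x + w["uz"] * h)            # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h)            # reset gate
    h_hat = math.tanh(w["wh"] * x + w["uh"] * r * h)  # candidate state
    return (1.0 - z) * h + z * h_hat

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Toy weights (assumed, not taken from the paper).
gru_w = {"wz": 0.5, "uz": 0.1, "wr": 0.5, "ur": 0.1, "wh": 1.0, "uh": 0.3}
edge_kernel = [-1.0, 1.0]                  # responds to local ramps/spikes
out_w = [[1.5, -0.2], [-1.5, 0.2]]         # hidden -> ["fault", "no-fault"]

signal = [0.0, 0.0, 1.0, 3.0, 3.0, 3.0]   # a ramp, then a plateau
h = 0.0
for feat in conv1d(signal, edge_kernel):   # CNN features feed the GRU
    h = gru_step(feat, h, gru_w)
probs = softmax([row[0] * h + row[1] for row in out_w])
```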
- Explainability Layer – After a prediction, the model is probed with four XAI algorithms:
  - Integrated Gradients (IG) – accumulates gradients along a straight‑line path from a baseline to the input.
  - DeepLIFT – compares activation differences to a reference, attributing contributions to each input.
  - Gradient SHAP – merges SHAP values with gradient information for smoother attributions.
  - DeepLIFT‑SHAP – a hybrid that leverages both DeepLIFT and SHAP concepts.
  These produce heat‑maps that highlight which sensor channels and time steps drove the decision.
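As a minimal illustration of the first method, Integrated Gradients can be approximated with a Riemann sum over numerically estimated gradients. The model f below is a toy linear function, chosen because IG has a known closed form there (w_i · (x_i − b_i)); it stands in for the trained network:

```python
# Back-of-the-envelope Integrated Gradients with numeric gradients.
def num_grad(f, x, i, eps=1e-5):
    """Central-difference partial derivative of f w.r.t. x[i]."""
    xp, xm = list(x), list(x)
    xp[i] += eps
    xm[i] -= eps
    return (f(xp) - f(xm)) / (2 * eps)

def integrated_gradients(f, x, baseline, steps=100):
    """Riemann-sum approximation of IG along the straight-line path."""
    attrs = [0.0] * len(x)
    for s in range(1, steps + 1):
        alpha = s / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(len(x)):
            attrs[i] += num_grad(f, point, i)
    return [(xi - b) * a / steps
            for xi, b, a in zip(x, baseline, attrs)]

# For a linear model f(x) = 2*x0 - 3*x1, IG recovers w_i * (x_i - b_i).
f = lambda x: 2.0 * x[0] - 3.0 * x[1]
attrs = integrated_gradients(f, x=[1.0, 1.0], baseline=[0.0, 0.0])
# attrs ≈ [2.0, -3.0]
```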
- Root‑Cause Analysis (RCA) – By overlaying attribution maps on the original signals, engineers can pinpoint the exact subsystem (e.g., brake controller, throttle actuator) responsible for the fault.
Results & Findings
| Metric | Value |
|---|---|
| Fault detection accuracy | ≈ 96 % (across 5 fault types) |
| False‑positive rate | < 2 % |
| Explanation fidelity (measured by deletion/insertion tests) | > 0.85 for all XAI methods |
| Localization precision (time offset between the true fault onset and the top‑attributed signal region) | ≤ 0.3 s of signal time |
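The deletion test behind the fidelity row works by resetting features to a baseline in order of decreasing attribution; a faithful explanation makes the model’s score fall quickly. A hedged sketch, with a toy model and toy attributions in place of the paper’s network:

```python
# Deletion curve: model outputs as the most-attributed features are removed.
def deletion_curve(f, x, attrs, baseline=0.0):
    """Zero out features by decreasing |attribution|, tracking the score."""
    order = sorted(range(len(x)), key=lambda i: -abs(attrs[i]))
    current = list(x)
    curve = [f(current)]
    for i in order:
        current[i] = baseline
        curve.append(f(current))
    return curve

# Toy "fault score": a weighted sum over three sensor features.
f = lambda x: 0.7 * x[0] + 0.2 * x[1] + 0.1 * x[2]
curve = deletion_curve(f, x=[1.0, 1.0, 1.0], attrs=[0.7, 0.2, 0.1])
# curve ≈ [1.0, 0.3, 0.1, 0.0] — the score falls fastest early on,
# which is what a high deletion-test fidelity score captures.
```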
Key takeaways:
- The hybrid CNN‑GRU outperformed pure CNN or pure RNN baselines, especially on faults with subtle temporal signatures.
- Attribution maps were consistent across XAI methods, giving developers confidence that the model was focusing on the right signals.
- Using the explanations, the team could manually adjust a misbehaving subsystem and observe the model’s prediction shift without retraining, confirming the “model‑adaptation” claim.
Practical Implications
- Faster Validation Cycles – Engineers can run automated fault detection on HiL test runs and instantly receive a diagnostic report, cutting manual log‑analysis time by up to 70 %.
- Safety‑Critical Confidence – Explainable outputs satisfy regulatory demands (e.g., ISO 26262) for traceability of safety decisions, making it easier to certify autonomous driving software.
- Continuous Integration – The pipeline can be embedded into CI/CD for automotive ECUs, flagging regressions the moment a new firmware version is deployed to the test bench.
- Model Maintenance – Because the XAI layer reveals which features dominate predictions, developers can prune irrelevant sensors or recalibrate thresholds, reducing model size and inference latency for on‑board deployment.
- Cross‑Domain Reuse – The same architecture can be repurposed for other cyber‑physical systems (e.g., aerospace, industrial robotics) where time‑series fault detection with interpretability is required.
Limitations & Future Work
- Dataset Scope – The study used a single HiL scenario; broader validation across diverse vehicle platforms and real‑world driving data is needed to confirm generalizability.
- Explanation Granularity – While the XAI methods highlight influential sensors, they do not yet provide causal graphs linking multiple subsystems, which could further aid root‑cause isolation.
- Real‑Time Constraints – The current implementation runs offline; optimizing the model and XAI calculations for on‑board, sub‑millisecond inference remains an open challenge.
- User Study – The paper does not include a formal evaluation of how engineers interact with the explanations; future work could assess usability and decision‑making impact.
Bottom line: By delivering both high‑accuracy fault detection and transparent, actionable explanations, this hybrid deep‑learning approach paves the way for more trustworthy, efficient validation pipelines in modern automotive software development.
Authors
- Mohammad Abboush
- Ehab Ghannoum
- Andreas Rausch
Paper Information
- arXiv ID: 2603.08165v1
- Categories: cs.SE, cs.AI
- Published: March 9, 2026