[Paper] Data-Driven Methods and AI in Engineering Design: A Systematic Literature Review Focusing on Challenges and Opportunities
Source: arXiv - 2511.20730v1
Overview
A new systematic literature review maps how data‑driven methods (DDMs) and AI are being used across the engineering design lifecycle. By analysing 114 peer‑reviewed studies from the past decade, the authors expose where machine‑learning (ML) techniques are thriving, where they’re still rare, and what hurdles engineers face when trying to embed AI into real‑world product development.
Key Contributions
- Comprehensive mapping of DDMs (ML, statistical, deep learning, surrogate models) onto the four stages of the V‑model: system design, implementation, integration, and validation.
- Quantitative trends showing ML and classic statistical tools dominate today, while deep learning usage is accelerating.
- Identification of gaps: the validation stage receives far fewer AI‑driven contributions, and cross‑stage traceability remains weak.
- Challenge taxonomy covering interpretability, data quality, model transferability, and real‑world validation.
- Road‑map for future work, calling for interpretable hybrid models and tighter alignment between computer‑science algorithms and engineering design tasks.
Methodology
The authors followed the PRISMA systematic review protocol:
- Scope definition – Adopted a simplified V‑model (design → implementation → integration → validation) as the reference framework.
- Database search – Queried Scopus, Web of Science, and IEEE Xplore for papers published between 2014 and 2024, using keywords around “data‑driven”, “AI”, and “engineering design”.
- Screening – From an initial 1,689 records, duplicates and out‑of‑scope papers were removed, leaving 114 studies for full‑text analysis.
- Classification – Each study was coded for:
  - Type of DDM (e.g., supervised learning, clustering, deep learning, surrogate modeling)
  - Lifecycle stage where it was applied
  - Reported challenges and validation approach
- Synthesis – Aggregated counts, trend lines, and thematic analysis produced the final insights.
The process is deliberately transparent so that other researchers can replicate or extend the review.
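To make the classification and synthesis steps concrete, here is a minimal Python sketch of how per‑study coding and per‑stage aggregation could look. The field names, category labels, and sample entries are hypothetical illustrations of the process described above, not the authors’ actual coding scheme or tooling.

```python
# Hypothetical coding-and-synthesis sketch: tag each study with a DDM type
# and a V-model stage, then count method prevalence per stage.
from collections import Counter
from dataclasses import dataclass

@dataclass
class Study:
    title: str
    ddm: str    # e.g. "supervised", "clustering", "deep_learning", "surrogate"
    stage: str  # "design", "implementation", "integration", "validation"
    physically_validated: bool

# Placeholder corpus standing in for the 114 coded studies.
corpus = [
    Study("Study A", "supervised", "design", False),
    Study("Study B", "surrogate", "integration", False),
    Study("Study C", "deep_learning", "design", True),
]

# Aggregate counts per (stage, method) pair -- the raw material for the
# trend lines and gap analysis reported in the review.
by_stage = Counter((s.stage, s.ddm) for s in corpus)
for (stage, ddm), n in sorted(by_stage.items()):
    print(f"{stage:>14} | {ddm:<13} | {n}")
```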
Results & Findings
| Lifecycle Stage | Dominant DDMs | Emerging Techniques | Notable Gaps |
|---|---|---|---|
| System Design | Supervised learning (regression, classification), clustering, surrogate models | Deep learning (e.g., generative design) – still <10% | Limited use of reinforcement learning for concept exploration |
| System Implementation | Regression, statistical DOE, surrogate modeling | DL for component‑level prediction | Few studies address real‑time model updates |
| System Integration | Multi‑objective optimization, clustering, surrogate models | DL for system‑level performance prediction | Sparse coverage of cross‑disciplinary data fusion |
| Validation | Mostly statistical validation (cross‑validation, error metrics) | Almost no DL‑based validation | Lack of field‑testing or hardware‑in‑the‑loop experiments |
- Trend: Deep learning citations grew from <2% in 2014 to >15% in 2023, indicating rising confidence but still early‑stage adoption.
- Challenges:
  - Interpretability: Engineers struggle to trust black‑box models for safety‑critical decisions.
  - Traceability: Linking a model’s output back to design requirements across stages is cumbersome.
  - Real‑world validation: Most papers stop at simulated validation; few deploy models on physical prototypes.
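Because surrogate models appear as a dominant DDM in three of the four rows above, a minimal sketch may help readers unfamiliar with the technique: fit a cheap Gaussian‑process regressor on a handful of expensive simulation runs, then query it across the design space. The objective function, sample sizes, and kernel below are illustrative assumptions, not taken from any reviewed study.

```python
# Minimal surrogate-modeling sketch: replace an expensive simulation
# (here a stand-in analytic function) with a Gaussian-process regressor
# fitted on a few sampled evaluations. All names are illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def expensive_simulation(x):
    # Stand-in for a costly FEM/CFD run over one design parameter.
    return np.sin(3 * x) + 0.5 * x

rng = np.random.default_rng(0)
X_train = rng.uniform(0, 2, size=(12, 1))  # sampled design points
y_train = expensive_simulation(X_train).ravel()

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(X_train, y_train)

# Cheap predictions (with uncertainty) across the design space enable
# fast trade-off studies without re-running the simulation.
X_query = np.linspace(0, 2, 5).reshape(-1, 1)
mean, std = gp.predict(X_query, return_std=True)
print(np.c_[X_query, mean, std])
```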
Practical Implications
- For Product Development Teams – Off‑the‑shelf ML tools (regression, clustering) are mature enough for early‑stage design and integration tasks. Teams can start small, using these methods to accelerate trade‑off studies without heavy infrastructure.
- Tool Vendors – There’s a market opportunity for platforms that embed interpretability (e.g., SHAP, LIME) and version‑controlled model provenance directly into PLM/ALM systems, closing the traceability gap (a hedged SHAP sketch follows this list).
- AI Engineers – The upward trend in deep learning suggests a need to develop domain‑specific architectures (e.g., graph neural nets for CAD geometry) that can be safely transferred to later stages.
- Quality & Safety Assurance – The scarcity of validation‑stage AI calls for new testing frameworks that combine simulation, digital twins, and hardware‑in‑the‑loop experiments, enabling regulators and certifiers to assess AI‑augmented designs.
- Education & Training – Curriculum designers should emphasize hybrid modeling (combining physics‑based and data‑driven components) to produce engineers who can both build and critically evaluate AI models.
Limitations & Future Work
- Scope restriction – The review considered only papers indexed in three major databases; relevant industry white papers or conference demos may be missing.
- V‑model simplification – Real‑world development often follows iterative or agile cycles, which the four‑stage mapping may not fully capture.
- Depth of analysis – While the study quantifies method prevalence, it does not evaluate the comparative performance of different DDMs on identical design problems.
Future research should (1) create a taxonomy that directly links specific AI algorithms to concrete engineering design tasks, (2) develop guidelines for building interpretable hybrid models, and (3) design robust validation pipelines that bring AI from simulation into physical prototypes.
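For item (2), one widely used hybrid pattern is a physics‑based baseline plus a data‑driven residual learner: the first‑principles term carries the interpretable bulk of the prediction, and the ML part corrects only what physics leaves unexplained. The analytic “physics” term and synthetic data below are assumptions for illustration only.

```python
# Minimal hybrid (physics + data) sketch: an ML model learns only the
# residual that the physics-based estimate cannot explain. All functions
# and data here are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def physics_model(x):
    # Stand-in for a first-principles estimate, e.g. an analytic stiffness formula.
    return 1.2 * x

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, size=(100, 1))
y_true = 1.2 * x.ravel() + 0.3 * np.sin(6 * x.ravel())  # physics + unmodeled effect

# Train the data-driven component on the residual only.
residual = y_true - physics_model(x).ravel()
ml = RandomForestRegressor(n_estimators=50, random_state=0).fit(x, residual)

def hybrid_predict(x_new):
    # Interpretable physics term plus learned correction.
    return physics_model(x_new).ravel() + ml.predict(x_new)

print(hybrid_predict(np.array([[0.5]])))
```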
Authors
- Nehal Afifi
- Christoph Wittig
- Lukas Paehler
- Andreas Lindenmann
- Kai Wolter
- Felix Leitenberger
- Melih Dogru
- Patric Grauberger
- Tobias Düser
- Albert Albers
- Sven Matthiesen
Paper Information
- arXiv ID: 2511.20730v1
- Categories: cs.SE, cs.AI, cs.LG
- Published: November 25, 2025