[Paper] Model-based Development for Autonomous Driving Software Considering Parallelization

Published: December 29, 2025 at 11:16 AM EST
3 min read
Source: arXiv - 2512.23575v1

Overview

The paper presents a model‑based parallelization technique tailored for autonomous‑driving software. By extending the existing Model‑Based Parallelizer (MBP) within a Model‑Based Development (MBD) workflow, the authors demonstrate measurable reductions in execution time, making real‑time performance more attainable for complex driving stacks.

Key Contributions

  • Extended MBP framework that supports the intricate data‑flow and control‑flow patterns typical of autonomous‑driving pipelines.
  • Automated parallelization pipeline integrated into the MBD process, allowing developers to generate multi‑threaded code directly from high‑level models.
  • Empirical evaluation showing significant execution‑time savings on representative perception‑planning‑control workloads.
  • Guidelines for mapping model elements to parallel constructs, helping engineers reason about concurrency without deep low‑level expertise.

Methodology

  1. Model‑Based Development (MBD) foundation – The authors start with a system model (e.g., Simulink/Stateflow) that captures perception, prediction, planning, and control functions.
  2. Analysis of parallelizable regions – Using static data‑dependency analysis, the tool identifies sub‑graphs that can run concurrently without violating safety‑critical timing constraints.
  3. Extension of the Model‑Based Parallelizer (MBP) – The original MBP handled simple pipelines; the extension adds support for:
    • Conditional branches and loops common in sensor fusion and decision making.
    • Resource annotations (CPU core count, memory bandwidth) to guide load balancing.
  4. Code generation – The parallelized model is automatically transformed into C/C++ code that employs standard threading libraries (e.g., OpenMP, std::thread) or task‑based runtimes.
  5. Evaluation – Benchmarks derived from a typical autonomous‑driving stack (camera/LiDAR processing, trajectory planning, actuator command generation) are executed on a multicore development board. Execution time, CPU utilization, and latency are measured and compared against a baseline single‑threaded implementation.
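The generated code relies on standard threading primitives such as std::thread or OpenMP. As a rough illustration of the fork/join pattern the parallelizer emits (the stage names and logic below are stand-ins, not taken from the paper), independent sub-graphs identified by the dependency analysis can run concurrently, with dependent stages placed after the join barrier:

```cpp
#include <thread>

// Illustrative stand-ins for independent model sub-graphs
// (e.g., camera and LiDAR processing). The real generated code
// would contain the actual perception kernels.
int process_camera() { return 1; }
int process_lidar()  { return 2; }

// Dependent stage: may only run once both inputs are available.
int plan_trajectory(int cam, int lid) { return cam + lid; }

int run_frame() {
    int cam = 0, lid = 0;
    // Fork: the dependency analysis found no data flow between
    // these two sub-graphs, so they execute on separate threads.
    std::thread t1([&] { cam = process_camera(); });
    std::thread t2([&] { lid = process_lidar(); });
    // Join barrier: planning reads both results, so it must wait.
    t1.join();
    t2.join();
    return plan_trajectory(cam, lid);
}
```

The key property is that the fork/join structure is derived mechanically from the model's data-flow graph, so the developer never writes the thread management by hand.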

Results & Findings

| Metric | Baseline (single-thread) | Parallelized (extended MBP) | Improvement |
|---|---|---|---|
| End-to-end latency (per frame) | 45 ms | 28 ms | 38 % reduction |
| CPU core utilization | ~30 % (one core) | ~80 % (4 cores) | Better hardware usage |
| Power consumption (average) | 7 W | 8.5 W | Slight increase, offset by faster completion |

The authors highlight that the parallelized version consistently meets the typical 30 ms real‑time deadline for perception‑planning loops, whereas the baseline occasionally exceeds it under heavy sensor load. The results confirm that model‑driven parallelization can bridge the gap between algorithmic complexity and real‑time constraints without requiring developers to hand‑craft thread management code.
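Checking a per-frame deadline like the 30 ms bound mentioned above amounts to timing each frame and comparing against the budget. A minimal sketch (the function name and monitoring approach are my own, not part of the paper's toolchain):

```cpp
#include <chrono>
#include <functional>

// Hypothetical per-frame deadline monitor: runs one frame of the
// pipeline and reports whether it finished within the budget.
bool frame_meets_deadline(const std::function<void()>& frame_fn,
                          std::chrono::milliseconds deadline) {
    auto start = std::chrono::steady_clock::now();
    frame_fn();  // e.g., the perception-planning loop for one frame
    auto elapsed = std::chrono::steady_clock::now() - start;
    return elapsed <= deadline;
}
```

In the paper's setting, the parallelized 28 ms frame would pass such a check against a 30 ms budget, while the 45 ms baseline would fail it.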

Practical Implications

  • Faster development cycles – Engineers can stay within their familiar modeling environment and let the tool handle concurrency, reducing bugs associated with manual threading.
  • Scalable performance – As autonomous platforms adopt more powerful multicore CPUs or heterogeneous SoCs (CPU+GPU), the same model can be re‑targeted with minimal effort, simply by adjusting resource annotations.
  • Safety‑critical compliance – Because the parallelization decisions are derived from a formal analysis of data dependencies, the generated code preserves determinism, easing certification (e.g., ISO 26262) and verification.
  • Integration with existing stacks – The generated C/C++ modules can be dropped into ROS 2 nodes, AUTOSAR Adaptive applications, or proprietary pipelines, making the approach compatible with current industry tooling.
  • Resource‑aware deployment – Developers can quickly explore “what‑if” scenarios (e.g., fewer cores, lower power budgets) by re‑running the model‑to‑code flow, aiding hardware‑software co‑design.

Limitations & Future Work

  • Scope of evaluation – Benchmarks focus on a subset of the autonomous stack (mainly perception and planning). Full‑stack integration, including high‑frequency control loops and V2X communication, remains untested.
  • Hardware diversity – Experiments were conducted on a single multicore CPU platform; performance on GPUs, DSPs, or heterogeneous accelerators was not explored.
  • Runtime adaptability – The current pipeline generates static parallel schedules; dynamic load‑balancing at runtime (e.g., when sensor rates fluctuate) is left for future research.
  • Toolchain maturity – The extended MBP is a prototype; tighter integration with mainstream modeling tools (Simulink, Modelica) and CI pipelines would be needed for industrial adoption.

Future work outlined by the authors includes extending the parallelizer to heterogeneous architectures, incorporating runtime monitoring for adaptive scheduling, and conducting large‑scale field trials on production‑grade autonomous vehicles.

Authors

  • Kenshin Obi
  • Takumi Onozawa
  • Hiroshi Fujimoto
  • Takuya Azumi

Paper Information

  • arXiv ID: 2512.23575v1
  • Categories: cs.SE
  • Published: December 29, 2025
