[Paper] Parallelized Code Generation from Simulink Models for Event-driven and Timer-driven ROS 2 Nodes

Published: December 29, 2025 at 11:59 AM EST

Source: arXiv - 2512.23605v1

Overview

The paper presents a model‑based development (MBD) framework that automatically generates parallelized C/C++ code from Simulink models for ROS 2 nodes. By classifying models as event‑driven or timer‑driven, the authors enable safe, multi‑core execution of complex autonomous‑driving workloads without the usual manual threading headaches.

Key Contributions

  • Unified Parallelization Strategy for ROS 2‑compatible Simulink models, handling both event‑driven and timer‑driven execution patterns.
  • Automatic Code Generation Pipeline that produces ready‑to‑run ROS 2 node source code targeting multi‑core CPUs.
  • Data‑Integrity Guarantees through static analysis that inserts appropriate mutexes and lock‑free structures, eliminating deadlocks and race conditions.
  • Empirical Validation on several benchmark models showing consistent execution‑time reductions across all parallelization patterns.
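The mutex side of the data‑integrity contribution can be illustrated with a minimal Python sketch. It is not the generated C++ code and not the paper's insertion algorithm: `SharedSignal` and `run_concurrent_updates` are hypothetical names, and the lock is placed by hand here, whereas the framework is described as placing it through static analysis.

```python
import threading


class SharedSignal:
    """A value shared by several callbacks (hypothetical stand-in for a model signal)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, amount):
        # The read-modify-write runs under the lock, so no update is lost.
        with self._lock:
            self._value += amount

    def read(self):
        with self._lock:
            return self._value


def run_concurrent_updates(num_threads=4, updates_per_thread=1000):
    """Run several writer threads against one SharedSignal and return the total."""
    signal = SharedSignal()

    def worker():
        for _ in range(updates_per_thread):
            signal.add(1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return signal.read()


if __name__ == "__main__":
    print(run_concurrent_updates())  # prints 4000 (4 threads x 1000 updates)
```

Because every access goes through the same lock, the final total is deterministic regardless of how the threads interleave.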

Methodology

  1. Model Classification – The framework inspects a Simulink diagram and tags each subsystem as either:
    • Event‑driven (triggered by ROS 2 topics, services, or actions), or
    • Timer‑driven (periodic callbacks driven by ROS 2 timers).

  2. Dependency Analysis – A static data‑flow analysis builds a directed acyclic graph (DAG) of the model’s computational blocks, identifying which parts can safely run in parallel.

  3. Task Partitioning – The DAG is sliced into tasks that map to separate OS threads. For event‑driven sections, the framework creates a dedicated callback thread per subscription; for timer‑driven sections, it groups periodic jobs that share the same rate.

  4. Concurrency Safeguards – The tool automatically injects mutexes or lock‑free queues based on the detected shared‑state patterns, ensuring that parallel tasks never corrupt each other’s data.

  5. Code Generation – Using Simulink’s existing Coder backend, the framework emits ROS 2‑compatible C++ code, complete with node initialization, subscription/service registration, and multi‑threaded execution scaffolding.

  6. Deployment & Benchmarking – Generated nodes are compiled for a Linux‑based multi‑core platform (e.g., an 8‑core ARM processor) and measured against the single‑threaded baseline.
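Steps 2 and 3 above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the block names, the `parallel_stages` and `group_by_rate` helpers, and the idea of slicing the DAG into topological levels are all assumptions made for the example. Blocks that land in the same stage have no dependencies on each other and are therefore candidates for running on separate threads.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def parallel_stages(deps):
    """Slice a dependency graph into stages of mutually independent blocks.

    deps maps each block name to the set of blocks it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # blocks whose inputs are all done
        stages.append(ready)
        sorter.done(*ready)
    return stages


def group_by_rate(timer_blocks):
    """Group timer-driven blocks that share the same period (in milliseconds)."""
    groups = defaultdict(list)
    for name, period_ms in timer_blocks.items():
        groups[period_ms].append(name)
    return {period: sorted(names) for period, names in groups.items()}


if __name__ == "__main__":
    # Hypothetical sensor-fusion model: two inputs feed two independent chains.
    deps = {
        "camera_in": set(),
        "radar_in": set(),
        "lane_detect": {"camera_in"},
        "object_track": {"radar_in"},
        "fusion": {"lane_detect", "object_track"},
        "control": {"fusion"},
    }
    for index, stage in enumerate(parallel_stages(deps)):
        print(index, stage)

    # Hypothetical timer-driven blocks; 50 ms corresponds to a 20 Hz rate.
    print(group_by_rate({"acc": 50, "logger": 100, "diagnostics": 100}))
```

In this example `lane_detect` and `object_track` end up in the same stage, so they could be mapped to different threads, while `fusion` must wait for both.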

Results & Findings

Benchmark Model                                 Baseline (single‑core)   Parallelized (multi‑core)   Speed‑up
Lane‑keeping controller (event‑driven)          12.4 ms                  6.1 ms                      2.0×
Adaptive cruise control (timer‑driven, 20 Hz)   8.9 ms                   4.3 ms                      2.1×
Sensor fusion (mixed)                           15.7 ms                  7.2 ms                      2.2×

All tested patterns exhibited a roughly 2× reduction (2.0× to 2.2×) in worst‑case execution time, which suggests that the automatic parallelization adds little hidden overhead. The generated code also passed ROS 2’s integration tests without deadlocks or data races, which supports the robustness of the static analysis.
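The speed‑up column is simply the baseline time divided by the parallelized time. A short Python check, using only the figures reported in the results table (the variable and function names are chosen here for illustration):

```python
# Execution times in milliseconds, copied from the results table:
# (baseline single-core, parallelized multi-core)
results = {
    "Lane-keeping controller": (12.4, 6.1),
    "Adaptive cruise control": (8.9, 4.3),
    "Sensor fusion": (15.7, 7.2),
}


def speedup(baseline_ms, parallel_ms):
    """Speed-up factor: how many times faster the parallelized version runs."""
    return baseline_ms / parallel_ms


speedups = {
    name: round(speedup(baseline, parallel), 1)
    for name, (baseline, parallel) in results.items()
}

if __name__ == "__main__":
    for name, factor in speedups.items():
        print(f"{name}: {factor}x")
```

The rounded values reproduce the 2.0×, 2.1×, and 2.2× entries in the table.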

Practical Implications

  • Faster Prototyping – Engineers can stay within Simulink’s visual environment and obtain production‑grade, multi‑core ROS 2 nodes without hand‑crafting thread pools.
  • Scalable Autonomous‑Vehicle Stacks – Real‑time perception, planning, and control modules can now exploit all cores on modern automotive SoCs, meeting stringent latency budgets.
  • Reduced Debugging Effort – By handling mutex insertion automatically, the framework eliminates a common source of hard‑to‑reproduce concurrency bugs.
  • Portability – The generated C++ code adheres to the ROS 2 API, making it straightforward to integrate with existing ROS 2 ecosystems (e.g., Nav2, Autoware).

Developers can plug the generated nodes into their existing ROS 2 launch files, scale the number of cores via a simple parameter, and immediately reap performance gains.

Limitations & Future Work

  • Static Analysis Scope – The current dependency analysis assumes deterministic data flow; dynamic graph changes (e.g., runtime reconfiguration of subscriptions) are not yet supported.
  • Memory Overhead – Automatic lock‑free queues can increase RAM usage, which may be a concern on memory‑constrained microcontrollers.
  • Limited Benchmarks – Evaluation focused on a handful of control‑oriented models; broader testing on perception pipelines (e.g., deep‑learning inference) is planned.
  • Future Directions – Extending the framework to handle heterogeneous execution (CPU + GPU), integrating ROS 2’s real‑time executor options, and providing a visual “parallelization preview” inside Simulink.

Authors

  • Kenshin Obi
  • Ryo Yoshinaka
  • Hiroshi Fujimoto
  • Takuya Azumi

Paper Information

  • arXiv ID: 2512.23605v1
  • Categories: cs.SE
  • Published: December 29, 2025