[Paper] Drivora: A Unified and Extensible Infrastructure for Search-based Autonomous Driving Testing

Published: January 9, 2026
4 min read
Source: arXiv - 2601.05685v1

Overview

The paper introduces Drivora, an open‑source platform that unifies and streamlines search‑based testing for autonomous‑driving systems (ADSs) on the popular CARLA simulator. By providing a single, extensible scenario language and a modular architecture, Drivora cuts down the engineering overhead that currently hampers large‑scale, reproducible testing across different simulators, scenario spaces, and ADS implementations.

Key Contributions

  • Unified scenario definition (OpenScenario) – a low‑level, parameter‑driven format that works with existing test‑generation methods while remaining open to new testing paradigms (e.g., multi‑vehicle interactions).
  • Modular architecture – clean separation of the evolutionary testing engine, scenario execution layer, and ADS integration layer, enabling plug‑and‑play of components.
  • Scalable parallel execution – a batch‑simulation scheduler that maximizes CPU/GPU utilization for massive scenario runs.
  • Multi‑ADS support – out‑of‑the‑box connectors for 12 different autonomous‑driving stacks through a common API, simplifying comparative studies and regression testing.
  • Open‑source release – full code, documentation, and example workloads are publicly available on GitHub, encouraging community contributions and reproducibility.

Methodology

Drivora builds on CARLA, a high‑fidelity open‑source driving simulator. The workflow is:

  1. Scenario Specification – Test engineers write scenarios in OpenScenario, a JSON/YAML‑based schema that lists concrete, actionable parameters (e.g., vehicle speed, lane offset, weather).
  2. Search‑Based Engine – An evolutionary algorithm (EA) treats each scenario as a chromosome. The EA mutates and recombines parameters to explore the space, guided by fitness functions such as collision count, lane‑departure distance, or safety‑metric violations.
  3. Parallel Execution Layer – Drivora launches many CARLA instances in parallel, each consuming a distinct scenario from the EA’s population. Results are streamed back to the engine in real time, allowing the EA to evolve the next generation quickly.
  4. ADS Integration – A thin adaptor abstracts the communication between CARLA and any supported ADS (e.g., Apollo, Autoware, proprietary stacks). The adaptor translates sensor feeds and control commands, making the ADS appear as a black‑box module.
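Steps 2 and 3 of the workflow can be sketched as a small evolutionary loop. This is a minimal illustration, not Drivora's actual API: the parameter names (`npc_speed_mps`, `lane_offset_m`, `fog_density`), the mutation operator, and the stand-in fitness function are all hypothetical, and real fitness values would come from executing each scenario in CARLA.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical scenario "chromosome": a dict of concrete parameters,
# mirroring the parameter-driven format described in step 1.
PARAM_RANGES = {
    "npc_speed_mps": (0.0, 30.0),
    "lane_offset_m": (-1.5, 1.5),
    "fog_density": (0.0, 1.0),
}

def random_scenario(rng):
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in PARAM_RANGES.items()}

def mutate(scenario, rng, rate=0.3):
    """Perturb each parameter with probability `rate`, clamped to its range."""
    child = dict(scenario)
    for k, (lo, hi) in PARAM_RANGES.items():
        if rng.random() < rate:
            child[k] = min(hi, max(lo, child[k] + rng.gauss(0, (hi - lo) * 0.1)))
    return child

def fitness(scenario):
    # Placeholder: a real fitness would be measured from a CARLA run
    # (e.g. minimum time-to-collision). Here we simply reward fast NPCs
    # with large lane offsets as a stand-in for "risky".
    return scenario["npc_speed_mps"] * abs(scenario["lane_offset_m"])

def evolve(generations=20, pop_size=10, seed=0):
    rng = random.Random(seed)
    population = [random_scenario(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Step 3 of the workflow: evaluate the population in parallel,
        # standing in for a batch of concurrent simulator instances.
        with ThreadPoolExecutor() as pool:
            scores = list(pool.map(fitness, population))
        ranked = [s for _, s in sorted(zip(scores, population),
                                       key=lambda p: -p[0])]
        elite = ranked[: pop_size // 2]  # keep the riskiest half
        population = elite + [mutate(rng.choice(elite), rng) for _ in elite]
    return max(population, key=fitness)

best = evolve()
```

In a real deployment, `fitness` would block on a simulator run, so the parallel map is where the batch scheduler earns its speed-up.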

The design deliberately keeps each component interchangeable: developers can swap the EA for a reinforcement‑learning generator, replace CARLA with another simulator, or add a new ADS by implementing the unified interface.
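The unified ADS interface might look like the following abstract base class; this is a hedged sketch rather than Drivora's real connector API, and the method names (`feed_sensors`, `read_control`) and field names are assumptions made for illustration.

```python
from abc import ABC, abstractmethod

class ADSAdapter(ABC):
    """Hypothetical unified interface: each supported stack (Apollo,
    Autoware, ...) is wrapped so the engine only ever sees sensors in,
    controls out, treating the ADS as a black box."""

    @abstractmethod
    def feed_sensors(self, frame: dict) -> None:
        """Translate simulator sensor data into the stack's input format."""

    @abstractmethod
    def read_control(self) -> dict:
        """Return the stack's latest command as {throttle, brake, steer}."""

class EchoAdapter(ADSAdapter):
    """Toy adapter for testing the harness itself: steers toward lane center."""

    def __init__(self):
        self._last = {"throttle": 0.0, "brake": 0.0, "steer": 0.0}

    def feed_sensors(self, frame):
        # Proportional steering on a (hypothetical) lane-offset signal.
        self._last = {
            "throttle": 0.3,
            "brake": 0.0,
            "steer": -0.5 * frame.get("lane_offset_m", 0.0),
        }

    def read_control(self):
        return self._last
```

Adding a new ADS then amounts to writing one such adapter subclass, which is what makes comparative studies across the 12 bundled stacks feasible.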

Results & Findings

  • Efficiency Gains – In benchmark experiments, Drivora’s parallel scheduler achieved up to 6× speed‑up compared with naïve sequential simulation, enabling the generation of thousands of test scenarios within a few hours on a modest GPU cluster.
  • Scenario Diversity – The unified OpenScenario format allowed the same evolutionary engine to produce both single‑vehicle edge cases (e.g., sudden pedestrian crossing) and multi‑vehicle interaction cases (e.g., cut‑in maneuvers) without code changes.
  • Cross‑ADS Comparisons – Using the 12 bundled ADS connectors, the authors demonstrated that the same set of generated scenarios exposed distinct failure patterns across different stacks, highlighting the value of a common testing harness for comparative safety analysis.
  • Reproducibility – All experiments were fully reproducible from the public repository, confirming that the infrastructure can be adopted by external teams with minimal setup effort.

Practical Implications

  • Accelerated QA Pipelines – Companies can plug Drivora into their continuous‑integration pipelines to automatically generate high‑risk driving scenarios each night, catching regressions before on‑road testing.
  • Standardized Benchmarking – Researchers and OEMs can use the unified scenario language and ADS API to run head‑to‑head safety benchmarks, fostering transparent performance reporting.
  • Cost‑Effective Scaling – By leveraging commodity hardware for parallel simulation, firms can run large‑scale search‑based testing without investing in expensive proprietary simulators.
  • Extensibility for New Domains – The modular design makes it straightforward to add emerging testing dimensions—such as V2X communication failures or sensor spoofing attacks—by extending the OpenScenario schema and providing a custom fitness function.
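A custom fitness function for a new testing dimension could be as small as the sketch below. The trace format (`ttc` for time-to-collision in seconds, `lane_dev` for lateral deviation in meters) and the weighting are invented for this example, not taken from the paper.

```python
def custom_fitness(trace):
    """Hypothetical fitness over one executed scenario trace.

    `trace` is assumed to be a list of per-tick records with keys
    `ttc` (time-to-collision, s) and `lane_dev` (lateral deviation, m).
    Higher fitness means a riskier scenario, which the search rewards.
    """
    min_ttc = min(t["ttc"] for t in trace)
    max_dev = max(abs(t["lane_dev"]) for t in trace)
    # Invert TTC so near-collisions score high; weight lane deviation lightly.
    return 1.0 / (min_ttc + 1e-3) + 0.2 * max_dev
```

Because the engine only consumes a scalar per scenario, swapping in a metric for, say, V2X packet loss would not require touching the search or execution layers.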

Limitations & Future Work

  • Simulator Dependency – Drivora currently hinges on CARLA; while the architecture is theoretically simulator‑agnostic, porting to other platforms would require non‑trivial effort.
  • Fitness Function Design – The quality of generated scenarios heavily depends on well‑crafted fitness metrics; the paper notes that automated metric synthesis remains an open challenge.
  • Real‑World Transferability – As with any simulation‑based testing, the fidelity gap between CARLA and real‑world driving conditions can limit the direct applicability of discovered bugs.
  • Future Directions – The authors plan to (i) add support for additional simulators (e.g., LGSVL), (ii) integrate learning‑based test generators alongside the EA, and (iii) develop a cloud‑native orchestration layer to further simplify large‑scale deployments.

Authors

  • Mingfei Cheng
  • Lionel Briand
  • Yuan Zhou

Paper Information

  • arXiv ID: 2601.05685v1
  • Categories: cs.SE
  • Published: January 9, 2026