[Paper] LEAD: Minimizing Learner-Expert Asymmetry in End-to-End Driving

Published: December 23, 2025 at 01:07 PM EST
4 min read

Source: arXiv - 2512.20563v1

Overview

The paper LEAD investigates why imitation‑learning (IL) agents trained in high‑fidelity simulators still stumble when they have to drive autonomously. The authors pinpoint a fundamental “learner‑expert asymmetry”: the expert driver in the simulator enjoys privileged information (perfect visibility, knowledge of other agents’ intents) that the sensor‑limited student never sees. By narrowing this information gap, they push end‑to‑end driving performance to new heights on the CARLA benchmark and even improve real‑world vision‑based driving tests.

Key Contributions

  • Empirical analysis of learner‑expert asymmetry – quantifies how the expert’s perfect perception and low uncertainty hurt IL when the student only has raw camera/LiDAR data.
  • Practical interventions to reduce the asymmetry, including:
    • Adding realistic occlusion handling for the expert.
    • Providing the student with richer navigational cues (beyond a single target point).
    • Aligning uncertainty modeling between expert and student.
  • TransFuser v6 (TFv6) – a revised end‑to‑end architecture that incorporates the above fixes and achieves state‑of‑the‑art closed‑loop scores on all major CARLA benchmarks (e.g., 95 DS on Bench2Drive, >2× prior scores on Longest6 v2 and Town13).
  • Cross‑domain validation – integrates the same perception supervision into a sim‑to‑real pipeline, yielding consistent gains on NAVSIM and Waymo Vision‑Based End‑to‑End driving challenges.
  • Open‑source release – code, data, and pretrained models are publicly available, encouraging reproducibility and further research.

Methodology

  1. Diagnosing the asymmetry

    • The authors compare the expert’s observation space (full 3‑D map, perfect detection of other agents) with the student’s sensor suite (front camera and LiDAR with a limited field of view).
    • They measure performance drops when the expert’s “privilege” is removed (e.g., artificially occluding the expert’s view).
  2. Bridging the gap

    • Perception alignment: augment the expert’s data with realistic sensor noise and occlusions, making its demonstrations more representative of what the student will see (a minimal sketch follows this list).
    • Intent specification: feed the student a short‑term waypoint trajectory derived from the navigation graph instead of a single target point (see the waypoint‑resampling sketch after this list).
    • Uncertainty modeling: train both expert and student to predict a distribution over future actions, encouraging the student to cope with ambiguous situations.
  3. Model architecture (TFv6)

    • Builds on the TransFuser backbone (multi‑modal transformer that fuses camera, LiDAR, and map inputs).
    • Adds a navigation encoder for the waypoint sequence and a confidence head that outputs action uncertainty.
    • Trains with a combined loss: imitation loss on expert actions + perception loss (segmentation, depth) + uncertainty regularization (see the loss sketch after this list).
  4. Evaluation pipeline

    • Closed‑loop driving tests in CARLA (Bench2Drive, Longest6 v2, Town13).
    • Sim‑to‑real transfer experiments on NAVSIM and Waymo Vision‑Based benchmarks, using the same perception‑supervised weights.
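To make the perception‑alignment step concrete, below is a minimal sketch of how privileged expert observations could be perturbed before collecting demonstrations. The function name, sensor range, and noise parameters are illustrative assumptions, not the paper’s implementation.

```python
# Toy illustration of "de-privileging" the expert: before rolling out the
# expert to collect demonstrations, perturb its privileged observations so
# they look more like what a sensor-limited student would perceive.
# All names and parameters here are illustrative assumptions.
import numpy as np

def deprivilege_observations(agents, ego_xy, max_range=40.0,
                             drop_prob=0.2, pos_noise=0.3, rng=None):
    """agents: list of dicts with ground-truth 'xy' positions (meters)."""
    rng = rng or np.random.default_rng()
    noisy = []
    for agent in agents:
        dist = np.linalg.norm(np.asarray(agent["xy"]) - np.asarray(ego_xy))
        if dist > max_range:          # beyond plausible sensor range
            continue
        if rng.random() < drop_prob:  # crude stand-in for occlusion
            continue
        jittered = dict(agent)
        jittered["xy"] = np.asarray(agent["xy"]) + rng.normal(0, pos_noise, 2)
        noisy.append(jittered)
    return noisy
```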
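The intent‑specification change (a short waypoint horizon instead of a single target point) can likewise be sketched as a small resampling step along the planned route; the helper below and its spacing parameters are assumptions for illustration only.

```python
# Illustrative helper: instead of a single distant goal, give the student the
# next few route waypoints, resampled at a fixed spacing along the route.
import numpy as np

def upcoming_waypoints(route_xy, ego_xy, n_points=8, spacing=2.0):
    """route_xy: (N, 2) dense polyline of the planned route."""
    route_xy = np.asarray(route_xy, dtype=float)
    dists = np.linalg.norm(route_xy - np.asarray(ego_xy), axis=1)
    start = int(np.argmin(dists))              # closest route point to the ego
    ahead = route_xy[start:]
    # Cumulative arc length along the remaining route.
    seg = np.linalg.norm(np.diff(ahead, axis=0), axis=1)
    arclen = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.arange(1, n_points + 1) * spacing
    xs = np.interp(targets, arclen, ahead[:, 0])
    ys = np.interp(targets, arclen, ahead[:, 1])
    return np.stack([xs, ys], axis=1)          # (n_points, 2) short horizon
```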
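Finally, here is a rough sketch of the combined training objective, assuming a PyTorch‑style model with waypoint, segmentation, depth, and log‑variance outputs. The head names, tensor shapes, and loss weights are hypothetical; the paper’s exact formulation may differ.

```python
# Minimal sketch of the combined training objective described above.
# Shapes, weights, and head names are illustrative assumptions, not the
# authors' exact implementation.
import torch
import torch.nn.functional as F

def combined_loss(pred, target, w_perc=1.0, w_unc=0.01):
    """pred/target are dicts produced by a TransFuser-style model.

    pred["wp_mean"], pred["wp_logvar"]: (B, T, 2) future waypoints.
    pred["seg_logits"]: (B, C, H, W); pred["depth"]: (B, 1, H, W).
    """
    # Heteroscedastic Gaussian NLL: imitation loss that also trains an
    # uncertainty estimate per predicted waypoint.
    inv_var = torch.exp(-pred["wp_logvar"])
    nll = 0.5 * (inv_var * (pred["wp_mean"] - target["wp"]) ** 2
                 + pred["wp_logvar"]).mean()

    # Auxiliary perception supervision (semantic segmentation + depth).
    seg = F.cross_entropy(pred["seg_logits"], target["seg"])
    depth = F.l1_loss(pred["depth"], target["depth"])

    # Mild regularization that discourages collapsing to zero variance.
    unc_reg = pred["wp_logvar"].abs().mean()

    return nll + w_perc * (seg + depth) + w_unc * unc_reg
```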

Results & Findings

| Benchmark | Metric (higher is better unless marked ↓) | TFv6 Score | Prior SOTA | Improvement |
| --- | --- | --- | --- | --- |
| Bench2Drive (CARLA) | Driving Score (DS) | 95 | 78 | +22 % |
| Longest6 v2 (CARLA) | Success Rate | 92 % | 44 % | >2× |
| Town13 (CARLA) | Completion % | 88 % | 41 % | >2× |
| NAVSIM (sim‑to‑real) | Route Completion | – | – | +8 % over baseline |
| Waymo Vision‑Based | Collision Rate ↓ | 0.12 % | 0.27 % | – |

  • Removing expert privilege (adding occlusions) drops the expert’s own performance by ~15 %, confirming that the asymmetry is a real bottleneck.
  • The perception‑supervised TFv6 model learns more robust visual features, leading to fewer off‑road events and collisions in both simulation and real‑world datasets.

Practical Implications

  • Better data generation pipelines: When creating synthetic expert demonstrations, deliberately inject realistic sensor noise and occlusions to make the data more “student‑friendly.”
  • Richer navigation inputs: Providing a short waypoint horizon (instead of a single goal) is a low‑cost way to dramatically improve IL stability for autonomous driving stacks.
  • Uncertainty‑aware policies: Training the model to output confidence estimates helps downstream safety modules (e.g., fallback planners) make smarter decisions; a minimal gating sketch follows this list.
  • Sim‑to‑real transfer: The same perception supervision that improves simulation performance also boosts real‑world benchmarks, suggesting a unified training regime for companies building vision‑based driving stacks.
  • Open‑source toolkit: The released LEAD repository can be plugged into existing end‑to‑end pipelines (e.g., CARLA, AirSim) to quickly evaluate the impact of learner‑expert alignment on any new model.
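As a rough illustration of how such confidence estimates could feed a fallback planner, the hypothetical gating rule below defers to a conservative action when the policy’s predicted uncertainty is high; the threshold and interface are assumptions, not part of the paper.

```python
# Hypothetical safety gate: if the policy's own uncertainty exceeds a
# threshold, defer to a conservative fallback (e.g., hand control to a
# rule-based planner). Threshold and interface are illustrative assumptions.
def select_action(policy_out, fallback_action, logvar_threshold=0.5):
    """policy_out: dict with 'action' and per-dimension 'logvar' tensors."""
    mean_logvar = float(policy_out["logvar"].mean())
    if mean_logvar > logvar_threshold:
        return fallback_action   # low confidence: play it safe
    return policy_out["action"]  # confident: follow the learned policy
```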

Limitations & Future Work

  • The study is confined to the CARLA simulator and two real‑world benchmarks; broader validation on diverse sensor suites (radar, event cameras) remains open.
  • The navigation encoder relies on a pre‑computed waypoint graph; dynamic route changes (e.g., traffic‑aware re‑planning) are not yet explored.
  • Uncertainty modeling is limited to a simple Gaussian head; richer distributional predictions (mixture models, Bayesian networks) could further improve safety (a speculative sketch of such a head follows this list).
  • Scaling the approach to full‑scale city‑wide simulations and long‑duration drives will require more efficient data pipelines and possibly curriculum learning strategies.
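A speculative sketch of one such richer head, a small mixture density output over future waypoints, is shown below; this is a possible extension, not something the paper implements.

```python
# Speculative extension beyond a single Gaussian head: a small mixture
# density head over future waypoints. Dimensions and names are assumed.
import torch.nn as nn

class MixtureWaypointHead(nn.Module):
    def __init__(self, feat_dim, n_modes=3, horizon=8):
        super().__init__()
        self.n_modes, self.horizon = n_modes, horizon
        self.logits = nn.Linear(feat_dim, n_modes)                  # mode weights
        self.means = nn.Linear(feat_dim, n_modes * horizon * 2)     # (x, y) per step
        self.logvars = nn.Linear(feat_dim, n_modes * horizon * 2)   # per-dim variance

    def forward(self, feat):
        B = feat.shape[0]
        return {
            "mode_logits": self.logits(feat),                                   # (B, K)
            "mean": self.means(feat).view(B, self.n_modes, self.horizon, 2),
            "logvar": self.logvars(feat).view(B, self.n_modes, self.horizon, 2),
        }
```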

LEAD demonstrates that the “secret sauce” for high‑performing imitation‑learning drivers isn’t just more data—it’s making the expert’s perspective realistic enough for the student to actually learn from it. By aligning perception, intent, and uncertainty, the authors set a new benchmark for end‑to‑end autonomous driving and provide a practical roadmap for developers looking to bridge the simulation‑to‑reality gap.

Authors

  • Long Nguyen
  • Micha Fauth
  • Bernhard Jaeger
  • Daniel Dauner
  • Maximilian Igl
  • Andreas Geiger
  • Kashyap Chitta

Paper Information

  • arXiv ID: 2512.20563v1
  • Categories: cs.CV, cs.AI, cs.LG, cs.RO
  • Published: December 23, 2025