[Paper] PredMapNet: Future and Historical Reasoning for Consistent Online HD Vectorized Map Construction
Source: arXiv - 2602.16669v1
Overview
The paper presents PredMapNet, an end‑to‑end framework that builds high‑definition (HD) vectorized maps on‑the‑fly while keeping them temporally consistent. By explicitly reasoning about both past observations and short‑term future motion, the system overcomes the jitter and drift that plague existing query‑based map construction pipelines.
Key Contributions
- Semantic‑Aware Query Generator – initializes map queries with spatially aligned semantic masks, giving the model a global scene context right from the start.
- History Rasterized Map Memory – a lightweight, per‑instance raster store that preserves fine‑grained past map geometry for explicit temporal priors.
- History‑Map Guidance Module – injects the rasterized history into current track queries to improve continuity across frames.
- Short‑Term Future Guidance – predicts immediate future positions of map elements and feeds them back as hints, preventing implausible jumps.
- State‑of‑the‑art performance on nuScenes and Argoverse‑2 with competitive inference speed, demonstrating that richer temporal reasoning does not have to sacrifice efficiency.
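To make the query-initialization idea concrete, here is a minimal NumPy sketch. It assumes the queries are obtained by masked average pooling of a feature map under each class mask; the summary only says the masks are turned into geometry-aligned queries, so the pooling step, the function name `init_queries_from_masks`, and all shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def init_queries_from_masks(features, masks, eps=1e-6):
    """Masked average pooling: one initial query vector per semantic class.

    features: (C, H, W) dense feature map (hypothetical shape).
    masks:    (K, H, W) soft class masks with values in [0, 1].
    Returns:  (K, C) array of initial queries.
    """
    C, H, W = features.shape
    K = masks.shape[0]
    flat_feat = features.reshape(C, H * W)   # (C, HW)
    flat_mask = masks.reshape(K, H * W)      # (K, HW)
    # Normalize each mask so its weights sum to (almost exactly) one.
    weights = flat_mask / (flat_mask.sum(axis=1, keepdims=True) + eps)
    return weights @ flat_feat.T             # (K, C)

# Toy example: two classes covering the left and right halves of a 4x4 grid.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))
masks = np.zeros((2, 4, 4))
masks[0, :, :2] = 1.0
masks[1, :, 2:] = 1.0
queries = init_queries_from_masks(features, masks)
print(queries.shape)  # (2, 8)
```

Each query starts as the mean feature of the region its mask covers, which is one simple way to replace random initialization with something tied to scene content.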
Methodology
- Input & Backbone – A sequence of front‑camera or LiDAR frames is processed by a standard vision backbone (e.g., ResNet or Swin) to extract dense feature maps.
- Semantic‑Aware Query Generation
  - A semantic segmentation head produces class masks (road, lane, crosswalk, etc.).
  - These masks are flattened into a set of queries that are already aligned with the underlying geometry, replacing the random initialization used in prior works.
- History Rasterized Map Memory
  - For each tracked map instance (e.g., a lane segment), the system maintains a small raster canvas that accumulates its past vectorized shape.
  - This memory is updated every timestep, providing a high‑resolution “ghost” of where the instance has been.
- History‑Map Guidance Module
  - The raster memory is projected back into the query space via a cross‑attention layer, allowing the current query to “see” its own history.
- Short‑Term Future Guidance
  - A lightweight motion predictor (e.g., a GRU‑based regressor) forecasts the next few meters of each instance based on its stored trajectory.
  - The predicted future points are concatenated to the query, nudging the decoder toward temporally plausible outputs.
- Decoder & Vectorization
  - A transformer decoder refines the queries and outputs a set of Bézier curves or polylines that constitute the vectorized HD map.
  - The whole pipeline is trained end‑to‑end with a combination of classification, regression, and consistency losses.
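The temporal parts of the pipeline above can be sketched in a few NumPy functions. This is a toy illustration under stated assumptions, not the paper's code: the decay factor, the canvas size, the `[occupancy, x, y]` token layout, the random projection matrices, and the constant-velocity extrapolation (a stand-in for the learned motion predictor) are all hypothetical choices made to keep the example self-contained.

```python
import numpy as np

def rasterize_polyline(points, size=32):
    """Draw a polyline with coordinates in [0, 1] onto a (size, size) canvas."""
    canvas = np.zeros((size, size), dtype=np.float32)
    pts = np.asarray(points, dtype=np.float32)
    t = np.linspace(0.0, 1.0, 2 * size)[:, None]    # dense samples per segment
    for start, end in zip(pts[:-1], pts[1:]):
        seg = start + t * (end - start)
        idx = np.clip((seg * (size - 1)).round().astype(int), 0, size - 1)
        canvas[idx[:, 1], idx[:, 0]] = 1.0           # row = y, column = x
    return canvas

def update_memory(memory, polyline, decay=0.8):
    """Fade the stored canvas, then overlay the newest shape (assumed rule)."""
    return np.maximum(decay * memory, rasterize_polyline(polyline, memory.shape[0]))

def memory_tokens(memory):
    """One token per cell: [occupancy, x, y], so attention can use position."""
    size = memory.shape[0]
    ys, xs = np.mgrid[0:size, 0:size] / (size - 1)
    return np.stack([memory.ravel(), xs.ravel(), ys.ravel()], axis=1)

def history_guidance(query, memory, w_key, w_value):
    """Single-head cross-attention from one track query to its raster memory."""
    tokens = memory_tokens(memory)                   # (HW, 3)
    keys, values = tokens @ w_key, tokens @ w_value  # (HW, D) each
    scores = keys @ query / np.sqrt(query.shape[0])
    scores = scores - scores.max()                   # numerically stable softmax
    attn = np.exp(scores) / np.exp(scores).sum()
    return query + attn @ values                     # residual update

def predict_future(past_positions, steps=2):
    """Constant-velocity extrapolation, standing in for a learned predictor."""
    past = np.asarray(past_positions, dtype=np.float64)
    velocity = past[-1] - past[-2]
    return past[-1] + velocity * np.arange(1, steps + 1)[:, None]

# Toy usage: two frames of history for one instance, then guide its query.
rng = np.random.default_rng(0)
memory = np.zeros((32, 32), dtype=np.float32)
memory = update_memory(memory, [(0.1, 0.5), (0.9, 0.5)])
memory = update_memory(memory, [(0.5, 0.1), (0.5, 0.9)])
dim = 16
query = rng.normal(size=dim)
guided = history_guidance(query, memory,
                          rng.normal(size=(3, dim)), rng.normal(size=(3, dim)))
future = predict_future([(0.0, 0.0), (1.0, 0.5)], steps=2)
augmented = np.concatenate([guided, future.ravel()])  # query plus future hints
print(augmented.shape)  # (20,)
```

In a trained model the projection matrices and the motion predictor would be learned, and the attention would typically be multi-head; the sketch only shows how the three pieces of data (raster history, query, future hint) fit together.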
Results & Findings
| Dataset | mAP | Improvement vs. SOTA |
|---|---|---|
| nuScenes | 71.4 % | +3.2 pts |
| Argoverse‑2 | 68.9 % | +2.8 pts |
- Inference Time: ≈ 45 ms per frame, comparable to prior query‑based methods.
- Temporal Consistency: Qualitative visualizations show smooth lane continuations across frames, with far fewer flickering artifacts.
- Efficiency: The added history memory and future predictor increase FLOPs by < 10 %, keeping the system suitable for real‑time deployment on automotive GPUs.
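As a quick sanity check on the real-time claim, the reported per-frame latency can be converted to a frame rate and compared against a sensor's frame interval. The sensor rates below (10 Hz and 30 Hz) are hypothetical examples, not figures from the paper.

```python
def frames_per_second(latency_ms):
    """Throughput if frames are processed one after another."""
    return 1000.0 / latency_ms

def fits_budget(latency_ms, sensor_hz):
    """True if one frame finishes before the next one arrives."""
    return latency_ms <= 1000.0 / sensor_hz

print(round(frames_per_second(45.0), 1))   # 22.2
print(fits_budget(45.0, sensor_hz=10.0))   # True
print(fits_budget(45.0, sensor_hz=30.0))   # False
```

At roughly 22 frames per second, the model would keep up with a 10 Hz stream but not with a 30 Hz one unless frames are skipped or batched.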
Practical Implications
- Robust Map Updates for Autonomous Vehicles: Fleet operators could refine HD maps continuously from live sensor streams instead of relying on offline batch processing, potentially reducing map latency from days to minutes.
- Developer‑Friendly API: Because the model consumes raw sensor frames and outputs vectorized polylines, it could be wrapped as a service in existing perception stacks (e.g., ROS nodes or NVIDIA DRIVE SDK).
- Better Planning & Control: Consistent lane geometry over time leads to more reliable downstream trajectory planning, especially in complex urban intersections where temporary occlusions often cause map drift.
- Edge Deployment: The modest computational overhead means the approach can run on embedded platforms (e.g., NVIDIA Jetson AGX) alongside other perception modules, enabling on‑vehicle map construction without cloud off‑load.
Limitations & Future Work
- Short‑Term Horizon: The future guidance predicts only a few meters ahead; longer‑range anticipation (e.g., for highway merges) remains unexplored.
- Memory Scaling: While the rasterized memory is lightweight per instance, handling thousands of concurrent instances in dense cityscapes may require hierarchical or compressed storage schemes.
- Sensor Modality: Experiments focus on camera‑centric pipelines; extending the framework to fuse LiDAR or radar could further improve robustness under adverse weather.
- Generalization: The model is evaluated on two datasets; testing on unseen cities or map styles would help assess cross‑domain adaptability.
PredMapNet demonstrates that combining explicit historical priors with short‑term prediction yields more stable, accurate HD maps, an advance that could bring online map maintenance within reach of today’s autonomous driving stacks.
Authors
- Bo Lang
- Nirav Savaliya
- Zhihao Zheng
- Jinglun Feng
- Zheng‑Hang Yeh
- Mooi Choo Chuah
Paper Information
- arXiv ID: 2602.16669v1
- Categories: cs.CV
- Published: February 18, 2026