[Paper] SafeGen-LLM: Enhancing Safety Generalization in Task Planning for Robotic Systems
Source: arXiv - 2602.24235v1
Overview
The paper introduces SafeGen‑LLM, a new class of large language models that are explicitly trained to generate safe, constraint‑aware task plans for robots. By marrying formal safety specifications with modern LLM fine‑tuning techniques, the authors demonstrate that a language model can not only write syntactically correct plans but also respect safety rules it has never seen before—a capability that could bridge the gap between scalable AI planning and the stringent reliability demands of real‑world robotics.
Key Contributions
- Multi‑domain safety benchmark: A comprehensive PDDL3 (Planning Domain Definition Language 3) suite covering several robotic domains, each annotated with explicit safety constraints.
- Two‑stage post‑training pipeline:
  - Supervised Fine‑Tuning (SFT) on a curated dataset of constraint‑compliant plans to teach the model the syntax and semantics of planning.
  - Group Relative Policy Optimization (GRPO), a reinforcement‑learning‑style fine‑tuning that uses fine‑grained reward machines derived from formal verification to enforce safety and employs curriculum learning for complex tasks.
- Safety‑generalization capability: Demonstrated ability to satisfy novel safety properties that were absent from the training data, across both PDDL and natural‑language inputs.
- Empirical superiority: Outperforms leading proprietary baselines (e.g., GPT‑4‑based planners, classical heuristic planners) on safety metrics and overall plan quality in all benchmark domains.
Methodology
- Benchmark Construction – The authors built a diverse set of planning problems (e.g., warehouse navigation, collaborative assembly, drone delivery) expressed in PDDL3, each paired with a set of safety predicates (collision avoidance, energy limits, timing constraints).
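To make the benchmark setup concrete, safety predicates such as collision avoidance and energy limits can be modeled as checks evaluated over every state a plan visits. The sketch below is illustrative only: the `State` fields, predicate names, and the flat-list trajectory representation are assumptions, not the paper's PDDL3 encoding.

```python
# Illustrative sketch: safety predicates checked against each state of a plan.
# State fields and predicate names are hypothetical, not from the paper.
from dataclasses import dataclass
from typing import Callable, List

@dataclass(frozen=True)
class State:
    robot_pos: tuple       # (x, y) grid cell
    obstacle_pos: tuple    # (x, y) grid cell of a known obstacle
    battery: float         # remaining charge, percent

def no_collision(s: State) -> bool:
    """Collision-avoidance predicate: robot never shares a cell with an obstacle."""
    return s.robot_pos != s.obstacle_pos

def battery_above(limit: float) -> Callable[[State], bool]:
    """Energy-limit predicate factory: battery must stay above `limit` percent."""
    return lambda s: s.battery > limit

def plan_is_safe(trajectory: List[State], predicates: list) -> bool:
    """A plan is safe iff every predicate holds in every visited state."""
    return all(p(s) for s in trajectory for p in predicates)

trajectory = [
    State((0, 0), (2, 2), 90.0),
    State((1, 0), (2, 2), 80.0),
    State((1, 1), (2, 2), 70.0),
]
print(plan_is_safe(trajectory, [no_collision, battery_above(20.0)]))  # True
```

In PDDL3 these predicates would instead appear as trajectory constraints (e.g. `always`-style conditions) inside the problem definition; the Python form above just shows the semantics of checking them against a candidate plan.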
- Supervised Fine‑Tuning (SFT) – A large pre‑trained LLM (e.g., Llama‑2) is fine‑tuned on a dataset of safe plans. This stage teaches the model the grammar of PDDL and the typical structure of feasible robot actions.
- Reward Machine Design – For every safety predicate, a deterministic finite‑state machine (the “reward machine”) tracks whether a generated plan violates the rule, assigning a negative reward at the exact step of violation.
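The reward-machine idea can be sketched as a two-state automaton per safety rule: it stays in a "safe" state while the rule holds and transitions to an absorbing "violated" state, emitting a negative reward, at the exact step where the rule first breaks. The transition structure and reward values below are assumptions for illustration; the paper derives its machines from formal verification of each safety predicate.

```python
# Hedged sketch of a reward machine as a deterministic finite-state machine.
# Reward values and the two-state layout are illustrative assumptions.

class RewardMachine:
    """Tracks one safety rule over a plan, step by step.

    States: 'safe' (no violation yet) and 'violated' (absorbing).
    A negative reward is emitted exactly at the step where the rule breaks.
    """

    def __init__(self, predicate, violation_reward=-1.0):
        self.predicate = predicate
        self.violation_reward = violation_reward
        self.state = "safe"

    def step(self, plan_state) -> float:
        if self.state == "violated":
            return 0.0  # already penalized once; remain absorbed
        if not self.predicate(plan_state):
            self.state = "violated"
            return self.violation_reward
        return 0.0

# Example: penalize the first step whose battery level drops below 20%.
rm = RewardMachine(lambda s: s["battery"] > 20)
rewards = [rm.step(s) for s in [{"battery": 90}, {"battery": 15}, {"battery": 10}]]
print(rewards)  # [0.0, -1.0, 0.0]
```

Because the penalty lands at the violating step rather than at the end of the episode, the learning signal localizes the error, which is what makes this reward shape useful for policy optimization.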
- Group Relative Policy Optimization (GRPO) –
  - Group formation – Plans are clustered by difficulty (e.g., number of constraints).
  - Relative advantage – The policy gradient is computed relative to the average performance of the group, stabilizing learning when some tasks are intrinsically harder.
  - Curriculum learning – Training starts with simple domains and gradually introduces more constraints, letting the model bootstrap its safety reasoning.
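The group-relative advantage above can be sketched as follows: each sampled plan's reward is normalized against the mean (and spread) of its own difficulty group, so intrinsically harder groups do not dominate the gradient. The exact normalization (mean/std with a small epsilon) is an assumption here, not the paper's implementation.

```python
# Minimal sketch of a group-relative advantage, GRPO-style: normalize each
# sample's reward within its difficulty group. Normalization details assumed.
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Return per-sample advantages normalized within one difficulty group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Rewards for four plans sampled on the same group of tasks.
adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
print([round(a, 2) for a in adv])  # [1.0, -1.0, 1.0, -1.0]
```

Note that a group where every plan earns the same reward yields near-zero advantages, so the update concentrates on groups where the policy's behavior actually varies.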
- Evaluation – The final model, SafeGen‑LLM, is tested on held‑out domains and on unseen safety constraints, measuring both safety satisfaction rate (percentage of plans that never trigger a violation) and plan optimality (makespan, action count).
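The two headline metrics can be made precise with a short sketch, under the assumption that each evaluated plan is summarized by a record of whether it violated any constraint, its length, and the optimal length (the record format is hypothetical):

```python
# Sketch of the evaluation metrics; the per-plan record format is an assumption.

def safety_satisfaction_rate(results):
    """Fraction of plans that never trigger a safety violation."""
    return sum(1 for r in results if not r["violated"]) / len(results)

def mean_length_ratio(results):
    """Average plan length relative to the optimal plan, over safe plans only."""
    safe = [r for r in results if not r["violated"]]
    return sum(r["length"] / r["optimal_length"] for r in safe) / len(safe)

results = [
    {"violated": False, "length": 11, "optimal_length": 10},
    {"violated": False, "length": 10, "optimal_length": 10},
    {"violated": True,  "length": 12, "optimal_length": 10},
    {"violated": False, "length": 13, "optimal_length": 10},
]
print(safety_satisfaction_rate(results))          # 0.75
print(round(mean_length_ratio(results), 3))       # 1.133
```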
Results & Findings
| Metric | SafeGen‑LLM | GPT‑4 Planner | Classical Heuristic Planner |
|---|---|---|---|
| Safety satisfaction (seen constraints) | 96.8 % | 71.2 % | 84.5 % |
| Safety satisfaction (unseen constraints) | 89.3 % | 42.7 % | 61.4 % |
| Average plan length (steps) | 1.07 × optimal | 1.23 × optimal | 1.15 × optimal |
| Inference latency (per problem) | ~0.45 s | ~0.38 s | ~0.12 s |
Key takeaways
- Safety generalization: SafeGen‑LLM retains high safety compliance even when the safety rule is novel, confirming that the GRPO stage successfully internalizes the principle of safety rather than memorizing specific constraints.
- Competitive efficiency: While not as fast as a pure heuristic planner, the LLM‑based approach stays within sub‑second latency, making it viable for many offline or semi‑online planning pipelines.
- Robustness to input modality: The same model can parse raw natural‑language task descriptions and output correct PDDL plans, opening the door to more intuitive human‑robot interaction.
Practical Implications
- Safer autonomous fleets – Warehouse robots or delivery drones can rely on a single LLM service to generate task schedules that automatically respect newly added safety policies (e.g., a temporary no‑fly zone) without retraining the whole system.
- Rapid prototyping – Engineers can describe a new robotic task in plain English, get a safety‑checked plan instantly, and focus on low‑level control rather than hand‑crafting domain‑specific planners.
- Regulatory compliance – Formal safety constraints encoded in reward machines provide an audit trail; developers can trace exactly which rule a plan satisfies or violates, simplifying certification processes.
- Hybrid planning architectures – SafeGen‑LLM can serve as a high‑level planner that feeds safe sub‑goals to existing motion‑planning or RL controllers, combining the scalability of LLMs with the precision of low‑level controllers.
Limitations & Future Work
- Scalability to extremely large domains – The current benchmark caps at a few dozen actions per problem; scaling to hundreds of actions may increase inference time and memory footprint.
- Dependence on reward‑machine design – Crafting formal safety specifications for every new domain still requires expert input; automating this step would broaden applicability.
- Real‑world validation – Experiments are confined to simulated environments; transferring the approach to physical robots (with sensor noise, actuation delays) remains an open challenge.
- Explainability – While the model respects safety constraints, it does not provide human‑readable justifications for its decisions; future work could integrate post‑hoc explanation modules.
Bottom line: SafeGen‑LLM showcases that large language models, when guided by formal safety rewards and curriculum learning, can become trustworthy planners for safety‑critical robotics—an exciting step toward more reliable, AI‑driven automation.
Authors
- Jialiang Fan
- Weizhe Xu
- Mengyu Liu
- Oleg Sokolsky
- Insup Lee
- Fangxin Kong
Paper Information
- arXiv ID: 2602.24235v1
- Categories: cs.RO, cs.AI
- Published: February 27, 2026