[Paper] HomeWorld: A Unified Floorplan-to-Furnished Framework for Generating Controllable, Densely Interactive Whole-Home Scenes
Source: arXiv - 2606.06390v1
Overview
Indoor scene generation is crucial for robot simulation and modern interior design. However, complex layouts together with scarce 3D scene data make learning‑based generation challenging. Existing methods often rely on hand‑crafted rules or focus on isolated sub‑tasks (e.g., floorplan synthesis or single‑room furnishing), producing whole‑home scenes that lack global coherence, realism, and simulation readiness.
To mitigate these limitations, we propose a unified hierarchical framework that decomposes indoor scene synthesis into controllable stages:
- Floorplan generation – We curate a large‑scale dataset of 300 K real residential floorplans and train a large language model for whole‑home floorplan generation. Detailed descriptions and a K‑D tree‑based representation enable fine‑grained, controllable floorplan synthesis.
- Furniture layout – Building upon the generated floorplan, image generation models draft furniture layouts from multi‑level roaming viewpoints.
- Object placement – Layouts of small manipulable objects on supporting surfaces (e.g., cabinets, desks, dining tables) are generated for embodied AI simulation.
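The summary does not spell out the K‑D tree‑based floorplan representation, so the following is only a minimal sketch of the general idea: a binary tree of axis‑aligned cuts whose leaves are rooms. The names `Room`, `Split`, and `room_rects`, the `ratio` field, and the example plan are all assumptions made for illustration, not the authors' format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass
class Room:
    """Leaf node: a single labelled room (hypothetical schema)."""
    label: str


@dataclass
class Split:
    """Internal node: one axis-aligned cut, as in a K-D tree."""
    axis: str      # "x" cuts along the width, "y" cuts along the height
    ratio: float   # fraction of the extent assigned to the first child
    first: "Node"
    second: "Node"


Node = Union[Room, Split]
Rect = Tuple[float, float, float, float]  # (x, y, width, height)


def room_rects(node: Node, x: float, y: float, w: float, h: float) -> Dict[str, Rect]:
    """Recursively turn the split tree into one rectangle per room."""
    if isinstance(node, Room):
        return {node.label: (x, y, w, h)}
    if node.axis == "x":
        w1 = w * node.ratio
        rects = room_rects(node.first, x, y, w1, h)
        rects.update(room_rects(node.second, x + w1, y, w - w1, h))
    else:
        h1 = h * node.ratio
        rects = room_rects(node.first, x, y, w, h1)
        rects.update(room_rects(node.second, x, y + h1, w, h - h1))
    return rects


# Hypothetical 10 x 8 home: living room on the left half,
# bedroom and kitchen stacked on the right half.
plan = Split("x", 0.5, Room("living_room"),
             Split("y", 0.5, Room("bedroom"), Room("kitchen")))
rects = room_rects(plan, 0.0, 0.0, 10.0, 8.0)
```

A tree like this is compact to serialize as text, which is one plausible reason such a representation pairs well with a language model: each split is a small, local decision, and the rooms tile the footprint without gaps or overlaps by construction.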
During furniture and object layout generation, a VLM‑based refiner iteratively corrects placements, and a 3D generative model enables flexible replacement of individual assets. We also attach basic physical attributes, surface textures, and lighting setups to complete the pipeline for embodied AI use.
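The iterative refinement described above can be pictured as a critique‑and‑fix loop. The sketch below is an assumption‑laden illustration only: a simple bounding‑box overlap test stands in for the VLM critic, and the `Box`, `critique`, and `refine` names, the shift‑right repair rule, and the round limit are all hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned 2D footprint of one placed item (hypothetical schema)."""
    name: str
    x: float
    y: float
    w: float
    h: float


def overlaps(a: Box, b: Box) -> bool:
    """True if the two footprints intersect with non-zero area."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def critique(boxes: List[Box]) -> Optional[Tuple[Box, Box]]:
    """Stand-in for the VLM critic: report the first overlapping pair, if any."""
    for i, a in enumerate(boxes):
        for b in boxes[i + 1:]:
            if overlaps(a, b):
                return a, b
    return None


def refine(boxes: List[Box], room_w: float, max_rounds: int = 10) -> List[Box]:
    """Repeat critique -> fix until the critic is satisfied or rounds run out."""
    boxes = list(boxes)
    for _ in range(max_rounds):
        issue = critique(boxes)
        if issue is None:
            break
        a, b = issue
        # Assumed repair rule: slide the second item to the right edge of
        # the first, clamped so it stays inside the room width.
        new_x = min(a.x + a.w, room_w - b.w)
        boxes[boxes.index(b)] = replace(b, x=new_x)
    return boxes


draft = [Box("sofa", 0.0, 0.0, 2.0, 1.0), Box("table", 1.0, 0.0, 1.0, 1.0)]
refined = refine(draft, room_w=5.0)
```

The round limit matters in a loop like this: a repair can be clamped by the room boundary and fail to resolve the conflict, so the loop needs a bounded number of attempts rather than running until the critic is fully satisfied.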
Experiments and user studies demonstrate that our pipeline produces indoor spaces with greater layout diversity and stronger 3D design appeal, outperforming prior methods on both quantitative and qualitative metrics. Alongside the generation pipeline, we will release the floorplan dataset and 5 K fully furnished scenes to the community.
Project Page: https://kairos-homeworld.github.io/
Key Contributions
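- Unified hierarchical framework that decomposes whole‑home indoor scene synthesis into floorplan generation, furniture layout, and object placement
- VLM‑based refiner that iteratively corrects furniture and object placements, paired with a 3D generative model for replacing individual assets
- Large‑scale dataset of 300 K real residential floorplans and 5 K fully furnished scenes, both planned for public release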
Methodology
Please refer to the full paper for detailed methodology.
Practical Implications
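The generated scenes carry basic physical attributes, surface textures, and lighting setups, so they are intended to serve directly as environments for embodied AI and robot simulation, while the controllable floorplan stage also targets interior design. The planned release of the floorplan dataset and the 5 K furnished scenes addresses the scarcity of 3D scene data that the authors identify as a bottleneck for learning‑based generation.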
Authors
- Wenbo Li
- Xiaoliang Ju
- Zipeng Qin
- Rongyao Fang
- Hongsheng Li
Paper Information
- arXiv ID: 2606.06390v1
- Categories: cs.CV, cs.AI
- Published: June 4, 2026