[Paper] Under One Sun: Multi-Object Generative Perception of Materials and Illumination

Published: March 19, 2026 at 01:59 PM EDT
2 min read
Source: arXiv


Overview

We introduce Multi-Object Generative Perception (MultiGP), a generative inverse rendering method for stochastic sampling of all radiometric constituents — reflectance, texture, and illumination — underlying object appearance from a single image. Our key idea to solve this inherently ambiguous radiometric disentanglement is to leverage the fact that while their texture and reflectance may differ, objects in the same scene are all lit by the same illumination. MultiGP exploits this consensus to produce samples of reflectance, texture, and illumination from a single image of known shapes based on four key technical contributions: a cascaded end-to-end architecture that combines image-space and angular-space disentanglement; Coordinated Guidance for diffusion convergence to a single consistent illumination estimate; Axial Attention applied to facilitate “cross-talk” between objects of different reflectance; and a Texture Extraction ControlNet to preserve high-frequency texture details while ensuring decoupling from estimated lighting. Experimental results demonstrate that MultiGP effectively leverages the complementary spatial and frequency characteristics of multiple object appearances to recover individual texture and reflectance as well as the common illumination.
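The "cross-talk" idea in the overview, where Axial Attention lets objects of different reflectance share information, can be sketched as attention applied only along the object axis of a feature tensor. This is an illustrative NumPy sketch, not the paper's actual layer: the shapes, the single head, and the identity projections are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention_over_objects(feats):
    """Self-attention restricted to the object axis.

    feats: (num_objects, num_tokens, dim). For each token position,
    attention mixes information across objects only, which is the kind
    of cross-object "cross-talk" the overview describes. Hypothetical
    sketch: projections are identity, single head.
    """
    n_obj, n_tok, dim = feats.shape
    # Treat each token position independently: (n_tok, n_obj, dim).
    x = feats.transpose(1, 0, 2)
    # Scaled dot-product attention along the object axis.
    scores = x @ x.transpose(0, 2, 1) / np.sqrt(dim)  # (n_tok, n_obj, n_obj)
    attn = softmax(scores, axis=-1)
    out = attn @ x                                    # (n_tok, n_obj, dim)
    return out.transpose(1, 0, 2)                     # (n_obj, n_tok, dim)

feats = np.random.default_rng(0).normal(size=(3, 8, 16))
out = axial_attention_over_objects(feats)
print(out.shape)  # (3, 8, 16)
```

Restricting attention to one axis keeps the cost linear in the number of token positions while still letting every object see the others at the same location, which is why axial layers are a common choice for this kind of per-scene consensus.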

Key Contributions

Based on the overview, the paper's four key technical contributions are:

  • A cascaded end-to-end architecture combining image-space and angular-space disentanglement
  • Coordinated Guidance, which drives diffusion sampling toward a single consistent illumination estimate
  • Axial Attention that enables "cross-talk" between objects of differing reflectance
  • A Texture Extraction ControlNet that preserves high-frequency texture detail while decoupling it from the estimated lighting

Primary subject area: cs.CV
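The shared-illumination consensus that motivates Coordinated Guidance can be pictured as repeatedly nudging each object's illumination estimate toward the group mean. The update rule, the `strength` parameter, and the scalar estimates below are all hypothetical; the paper's actual diffusion guidance is not specified in this summary.

```python
import numpy as np

def coordinated_step(illum_estimates, strength=0.5):
    """Pull each per-object illumination estimate toward the group mean.

    Hypothetical sketch of the 'objects in one scene share one
    illumination' constraint: each estimate moves a fraction
    `strength` of the way toward the consensus. Not the paper's
    exact guidance rule.
    """
    consensus = illum_estimates.mean(axis=0, keepdims=True)
    return illum_estimates + strength * (consensus - illum_estimates)

# Three objects start with disagreeing (scalar) illumination estimates.
ests = np.array([[1.0], [2.0], [4.0]])
for _ in range(20):
    ests = coordinated_step(ests)
print(np.allclose(ests, ests.mean()))  # True: estimates have converged
```

Because the update preserves the group mean while shrinking each deviation geometrically, the estimates converge to a single shared value, mirroring the consensus the method enforces during sampling.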

Methodology

Please refer to the full paper for detailed methodology.

Practical Implications

This research advances computer vision (cs.CV), specifically generative inverse rendering: recovering per-object reflectance and texture together with the scene's shared illumination from a single image.

Authors

  • Nobuo Yoshii
  • Xinran Nicole Han
  • Ryo Kawahara
  • Todd Zickler
  • Ko Nishino

Paper Information

  • arXiv ID: 2603.19226v1
  • Categories: cs.CV
  • Published: March 19, 2026
  • PDF: Download PDF
