[Paper] Surflo: Consistent 3D Surface Flow Model with Global State
Source: arXiv - 2606.13644v1
Overview
Geometry is invariant to viewpoint, which makes any collection of images a redundant encoding of a single 3D state. Existing feed-forward reconstruction models fail to exploit this: per-view methods emit overlapping, unaligned pointmaps that grow linearly with input count, while global-latent methods commit to a fixed, low-resolution output. We introduce Surflo, which compresses a variable number of unposed RGB views into K latent tokens (one global state) and decodes oriented 3D surface points by independently transporting them from noise onto the surface via flow matching. This frees the output from any fixed grid or token budget: the same latent yields from a few thousand to a million points in a single forward pass. To suppress the local inconsistencies inherent to independent per-point decoding, an inference-time guidance term correlates nearby points by injecting a photometric gradient during ODE integration. Surflo matches or surpasses feed-forward baselines on surface metrics, runs an order of magnitude faster than optimization-based methods that require hundreds of views, and is the only feed-forward approach to combine a global latent with arbitrary-resolution decoding.
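The decoding idea in the abstract, transporting each point independently from noise onto the surface by integrating a velocity field, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `toy_velocity` is a hypothetical analytic stand-in for the learned network, the "latent" is just a sphere's center and radius, integration uses plain Euler steps, and point orientation and the photometric guidance term are omitted.

```python
import numpy as np


def toy_velocity(x, t, latent):
    """Hypothetical stand-in for a learned velocity network.

    Points each sample toward the nearest point on a sphere described by
    `latent` = (center, radius), scaled so that a straight-line path
    starting at time t arrives on the sphere at t = 1.
    """
    center, radius = latent
    offset = x - center
    dist = np.linalg.norm(offset, axis=1, keepdims=True)
    target = center + radius * offset / dist
    return (target - x) / (1.0 - t)


def decode_points(latent, num_points, num_steps=16, seed=0):
    """Transport `num_points` noise samples onto the surface with Euler steps."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((num_points, 3))  # noise samples at t = 0
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so 1 - t is never zero
        x = x + dt * toy_velocity(x, t, latent)  # each row is updated independently
    return x


# One toy latent, decoded at two different resolutions.
latent = (np.zeros(3), 1.0)
coarse = decode_points(latent, 2_000)
dense = decode_points(latent, 200_000)
```

Because every point is integrated on its own, the number of output points is a free argument of the decoder rather than a property of the latent, which is the behaviour the abstract describes as arbitrary-resolution decoding.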
Key Contributions
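According to the abstract, the paper contributes:
- Surflo, a feed-forward model that compresses a variable number of unposed RGB views into K latent tokens forming one global state.
- A flow-matching decoder that independently transports points from noise onto the surface, yielding oriented 3D surface points at any resolution, from a few thousand to a million in a single forward pass.
- An inference-time guidance term that injects a photometric gradient during ODE integration to correlate nearby points and suppress local inconsistencies.
- Reported results that match or surpass feed-forward baselines on surface metrics while running an order of magnitude faster than optimization-based methods that require hundreds of views.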
Methodology
Please refer to the full paper for detailed methodology.
Practical Implications
This research contributes to computer vision (cs.CV), specifically feed-forward 3D surface reconstruction from unposed RGB images.
Authors
- Antoine Guédon
- Shu Nakamura
- Nicolas Dufour
- Jiahui Lei
- Ko Nishino
- Angjoo Kanazawa
Paper Information
- arXiv ID: 2606.13644v1
- Categories: cs.CV
- Published: June 11, 2026