[Paper] Reducing Domain Gap with Diffusion-Based Domain Adaptation for Cell Counting
Generating realistic synthetic microscopy images is critical for training deep learning models in label-scarce environments, such as cell counting with many cel...
Visual generation grounded in Visual Foundation Model (VFM) representations offers a highly promising unified pathway for integrating visual understanding, perc...
Reliable interpretation of multimodal data in dentistry is essential for automated oral healthcare, yet current multimodal large language models (MLLMs) struggl...
Key frame selection in video understanding presents significant challenges. Traditional top-K selection methods, which score frames independently, often fail to...
The growing demand for real-time DNN applications on edge devices necessitates faster inference of increasingly complex models. Although many devices include sp...
We introduce StereoSpace, a diffusion-based framework for monocular-to-stereo synthesis that models geometry purely through viewpoint conditioning, without expl...
Generative world models are reshaping embodied AI, enabling agents to synthesize realistic 4D driving environments that look convincing but often fail physicall...
The success of foundation models in language and vision motivated research in fully end-to-end robot navigation foundation models (NFMs). NFMs directly map mono...
Visual concept personalization aims to transfer only specific image attributes, such as identity, expression, lighting, and style, into unseen contexts. However...
In this work, we propose SceneMaker, a decoupled 3D scene generation framework. Due to the lack of sufficient open-set de-occlusion and pose estimation pri...
Normalizing Flows (NFs) have been established as a principled framework for generative modeling. Standard NFs consist of a forward process and a reverse process...
In this work, we explore an untapped signal in diffusion model inference. While all previous methods generate images independently at inference, we instead ask ...