computer-vision — Page 28

2 months ago · ai

AI image generators are getting better by getting worse

Real ones will know that Mount Rainier looks too big in this image, but the re-creation of a Washington State ferry in this AI image is uncanny. This is The Ste...

#AI image generation #diffusion models #generative AI #computer vision #deep learning #stable diffusion #AI art
2 months ago · ai

The Evolution of AI Surveillance

AI Surveillance on British Roads: On a grey morning along the A38 near Plymouth, a white van equipped with twin cameras captures thousands of images per hour, i...

#AI surveillance #computer vision #privacy #road safety #emotion recognition
2 months ago · ai

AI Background Remover: Image Quality and Edge Accuracy

Introduction: An AI background remover can feel almost magical when it works well, and frustrating when it doesn’t. The difference usually comes down to two thin...

#background removal #image quality #edge accuracy #computer vision #AI models #image segmentation #deep learning
2 months ago · ai

[Paper] Moment-Based 3D Gaussian Splatting: Resolving Volumetric Occlusion with Order-Independent Transmittance

The recent success of 3D Gaussian Splatting (3DGS) has reshaped novel view synthesis by enabling fast optimization and real-time rendering of high-quality radia...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties

Large-scale video generation models have shown remarkable potential in modeling photorealistic appearance and lighting interactions in real-world scenes. Howeve...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] Particulate: Feed-Forward 3D Object Articulation

We present Particulate, a feed-forward approach that, given a single static 3D mesh of an everyday object, directly infers all attributes of the underlying arti...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai

[Paper] AnchorDream: Repurposing Video Diffusion for Embodiment-Aware Robot Data Synthesis

The collection of large-scale and diverse robot demonstrations remains a major bottleneck for imitation learning, as real-world data acquisition is costly and s...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] Structure From Tracking: Distilling Structure-Preserving Motion for Video Generation

Reality is a dance between rigid constraints and deformable structures. For video models, that means generating motion that preserves fidelity as well as struct...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] Uncertainty-Aware Domain Adaptation for Vitiligo Segmentation in Clinical Photographs

Accurately quantifying vitiligo extent in routine clinical photographs is crucial for longitudinal monitoring of treatment response. We propose a trustworthy, f...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] MatAnyone 2: Scaling Video Matting via a Learned Quality Evaluator

Video matting remains limited by the scale and realism of existing datasets. While leveraging segmentation data can enhance semantic stability, the lack of effe...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] Smudged Fingerprints: A Systematic Evaluation of the Robustness of AI Image Fingerprints

Model fingerprint detection techniques have emerged as a promising approach for attributing AI-generated images to their source models, but their robustness und...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai

[Paper] Reducing Domain Gap with Diffusion-Based Domain Adaptation for Cell Counting

Generating realistic synthetic microscopy images is critical for training deep learning models in label-scarce environments, such as cell counting with many cel...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] SVG-T2I: Scaling Up Text-to-Image Latent Diffusion Model Without Variational Autoencoder

Visual generation grounded in Visual Foundation Model (VFM) representations offers a highly promising unified pathway for integrating visual understanding, perc...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry

Reliable interpretation of multimodal data in dentistry is essential for automated oral healthcare, yet current multimodal large language models (MLLMs) struggl...

#research #paper #ai #machine-learning #nlp #computer-vision
2 months ago · ai

[Paper] HFS: Holistic Query-Aware Frame Selection for Efficient Video Reasoning

Key frame selection in video understanding presents significant challenges. Traditional top-K selection methods, which score frames independently, often fail to...

#research #paper #ai #nlp #computer-vision
2 months ago · ai

[Paper] Parallax: Runtime Parallelization for Operator Fallbacks in Heterogeneous Edge Systems

The growing demand for real-time DNN applications on edge devices necessitates faster inference of increasingly complex models. Although many devices include sp...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai

[Paper] StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space

We introduce StereoSpace, a diffusion-based framework for monocular-to-stereo synthesis that models geometry purely through viewpoint conditioning, without expl...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World

Generative world models are reshaping embodied AI, enabling agents to synthesize realistic 4D driving environments that look convincing but often fail physicall...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] Empowering Dynamic Urban Navigation with Stereo and Mid-Level Vision

The success of foundation models in language and vision motivated research in fully end-to-end robot navigation foundation models (NFMs). NFMs directly map mono...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization

Visual concept personalization aims to transfer only specific image attributes, such as identity, expression, lighting, and style, into unseen contexts. However...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model

We propose a decoupled 3D scene generation framework called SceneMaker in this work. Due to the lack of sufficient open-set de-occlusion and pose estimation pri...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai

[Paper] Bidirectional Normalizing Flow: From Data to Noise and Back

Normalizing Flows (NFs) have been established as a principled framework for generative modeling. Standard NFs consist of a forward process and a reverse process...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai

[Paper] Group Diffusion: Enhancing Image Generation by Unlocking Cross-Sample Collaboration

In this work, we explore an untapped signal in diffusion model inference. While all previous methods generate images independently at inference, we instead ask ...

#research #paper #ai #computer-vision
2 months ago · ai

[Paper] E-RayZer: Self-supervised 3D Reconstruction as Spatial Visual Pre-training

Self-supervised pre-training has revolutionized foundation models for languages, individual 2D images and videos, but remains largely unexplored for learning 3D...

#research #paper #ai #computer-vision
