paper — Page 96 | EUNO.NEWS

1 month ago · ai

[Paper] DiffusionBrowser: Interactive Diffusion Previews via Multi-Branch Decoders

Video diffusion models have revolutionized generative video synthesis, but they are imprecise, slow, and can be opaque during generation -- keeping users in the...

#research #paper #ai #machine-learning #computer-vision

1 month ago · ai

[Paper] LitePT: Lighter Yet Stronger Point Transformer

Modern neural architectures for 3D point cloud processing contain both convolutional layers and attention blocks, but the best way to assemble them remains uncl...

#research #paper #ai #computer-vision

1 month ago · ai

[Paper] Towards Scalable Pre-training of Visual Tokenizers for Generation

The quality of the latent space in visual tokenizers (e.g., VAEs) is crucial for modern generative models. However, the standard reconstruction-based training p...

#research #paper #ai #computer-vision

1 month ago · ai

[Paper] Beyond surface form: A pipeline for semantic analysis in Alzheimer's Disease detection from spontaneous speech

Alzheimer's Disease (AD) is a progressive neurodegenerative condition that adversely affects cognitive abilities. Language-related changes can be automatically ...

#research #paper #ai #nlp

1 month ago · ai

[Paper] Recurrent Video Masked Autoencoders

We present Recurrent Video Masked-Autoencoders (RVM): a novel video representation learning approach that uses a transformer-based recurrent neural network to a...

#research #paper #ai #computer-vision

1 month ago · ai

[Paper] I-Scene: 3D Instance Models are Implicit Generalizable Spatial Learners

Generalization remains the central challenge for interactive 3D scene generation. Existing learning-based approaches ground spatial understanding in limited sce...

#research #paper #ai #computer-vision

1 month ago · ai

[Paper] LASER: Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction

Recent feed-forward reconstruction models like VGGT and π^3 achieve impressive reconstruction quality but cannot process streaming videos due to quadratic memor...

#research #paper #ai #computer-vision

1 month ago · ai

[Paper] Feedforward 3D Editing via Text-Steerable Image-to-3D

Recent progress in image-to-3D has opened up immense possibilities for design, AR/VR, and robotics. However, to use AI-generated 3D assets in real applications,...

#research #paper #ai #machine-learning #computer-vision

1 month ago · ai

[Paper] JoVA: Unified Multimodal Learning for Joint Video-Audio Generation

In this paper, we present JoVA, a unified framework for joint video-audio generation. Despite recent encouraging advances, existing methods face two critical li...

#research #paper #ai #computer-vision

1 month ago · ai

[Paper] Towards Effective Model Editing for LLM Personalization

Personalization is becoming indispensable for LLMs to align with individual user preferences and needs. Yet current approaches are often computationally expensi...

#research #paper #ai #nlp

1 month ago · ai

[Paper] Towards Interactive Intelligence for Digital Humans

We introduce Interactive Intelligence, a novel paradigm of digital human that is capable of personality-aligned expression, adaptive interaction, and self-evolu...

#research #paper #ai #nlp #computer-vision

1 month ago · ai

[Paper] Directional Textual Inversion for Personalized Text-to-Image Generation

Textual Inversion (TI) is an efficient approach to text-to-image personalization but often fails on complex prompts. We trace these failures to embedding norm i...

#research #paper #ai #machine-learning #computer-vision
