EUNO.NEWS
  • All (21181) +146
  • AI (3169) +10
  • DevOps (940) +5
  • Software (11185) +102
  • IT (5838) +28
  • Education (48)
  • Notice
  • 1 month ago · ai

    [Paper] LitePT: Lighter Yet Stronger Point Transformer

    Modern neural architectures for 3D point cloud processing contain both convolutional layers and attention blocks, but the best way to assemble them remains uncl...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] Towards Scalable Pre-training of Visual Tokenizers for Generation

    The quality of the latent space in visual tokenizers (e.g., VAEs) is crucial for modern generative models. However, the standard reconstruction-based training p...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] Recurrent Video Masked Autoencoders

    We present Recurrent Video Masked-Autoencoders (RVM): a novel video representation learning approach that uses a transformer-based recurrent neural network to a...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] I-Scene: 3D Instance Models are Implicit Generalizable Spatial Learners

    Generalization remains the central challenge for interactive 3D scene generation. Existing learning-based approaches ground spatial understanding in limited sce...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] LASER: Layer-wise Scale Alignment for Training-Free Streaming 4D Reconstruction

    Recent feed-forward reconstruction models like VGGT and π^3 achieve impressive reconstruction quality but cannot process streaming videos due to quadratic memor...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] Feedforward 3D Editing via Text-Steerable Image-to-3D

    Recent progress in image-to-3D has opened up immense possibilities for design, AR/VR, and robotics. However, to use AI-generated 3D assets in real applications,...

    #research #paper #ai #machine-learning #computer-vision
  • 1 month ago · ai

    [Paper] JoVA: Unified Multimodal Learning for Joint Video-Audio Generation

    In this paper, we present JoVA, a unified framework for joint video-audio generation. Despite recent encouraging advances, existing methods face two critical li...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] Towards Interactive Intelligence for Digital Humans

    We introduce Interactive Intelligence, a novel paradigm of digital human that is capable of personality-aligned expression, adaptive interaction, and self-evolu...

    #research #paper #ai #nlp #computer-vision
  • 1 month ago · ai

    [Paper] Directional Textual Inversion for Personalized Text-to-Image Generation

    Textual Inversion (TI) is an efficient approach to text-to-image personalization but often fails on complex prompts. We trace these failures to embedding norm i...

    #research #paper #ai #machine-learning #computer-vision
  • 1 month ago · ai

    [Paper] World Models Can Leverage Human Videos for Dexterous Manipulation

    Dexterous manipulation is challenging because it requires understanding how subtle hand motion influences the environment through contact with objects. We intro...

    #research #paper #ai #machine-learning #computer-vision
  • 1 month ago · ai

    [Paper] From Code to Field: Evaluating the Robustness of Convolutional Neural Networks for Disease Diagnosis in Mango Leaves

    The validation and verification of artificial intelligence (AI) models through robustness assessment are essential to guarantee the reliable performance of inte...

    #research #paper #ai #machine-learning #computer-vision
  • 1 month ago · ai

    [Paper] Do-Undo: Generating and Reversing Physical Actions in Vision-Language Models

    We introduce the Do-Undo task and benchmark to address a critical gap in vision-language models: understanding and generating physically plausible scene transfo...

    #research #paper #ai #machine-learning #computer-vision

RSS GitHub © 2026