EUNO.NEWS
  • All (21181) +146
  • AI (3169) +10
  • DevOps (940) +5
  • Software (11185) +102
  • IT (5838) +28
  • Education (48)
  • Notice
  • 1 month ago · ai

    [Paper] Multimodal RewardBench 2: Evaluating Omni Reward Models for Interleaved Text and Image

    Reward models (RMs) are essential for training large language models (LLMs), but remain underexplored for omni models that handle interleaved image and text seq...

    #research #paper #ai #nlp #computer-vision
  • 1 month ago · ai

    [Paper] LinkedOut: Linking World Knowledge Representation Out of Video LLM for Next-Generation Video Recommendation

    Video Large Language Models (VLLMs) unlock world-knowledge-aware video understanding through pretraining on internet-scale data and have already shown promise o...

    #research #paper #ai #machine-learning #computer-vision
  • 1 month ago · ai

    [Paper] Training Together, Diagnosing Better: Federated Learning for Collagen VI-Related Dystrophies

    The application of Machine Learning (ML) to the diagnosis of rare diseases, such as collagen VI-related dystrophies (COL6-RD), is fundamentally limited by the s...

    #research #paper #ai #machine-learning #computer-vision
  • 1 month ago · ai

    [Paper] Spatia: Video Generation with Updatable Spatial Memory

    Existing video generation models struggle to maintain long-term spatial and temporal consistency due to the dense, high-dimensional nature of video signals. To ...

    #research #paper #ai #machine-learning #computer-vision
  • 1 month ago · ai

    [Paper] In Pursuit of Pixel Supervision for Visual Pre-training

    At the most basic level, pixels are the source of the visual information through which we perceive the world. Pixels contain information at all levels, ranging ...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models

    In recent multimodal research, the diffusion paradigm has emerged as a promising alternative to the autoregressive paradigm (AR), owing to its unique decoding a...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] Gaussian Pixel Codec Avatars: A Hybrid Representation for Efficient Rendering

    We present Gaussian Pixel Codec Avatars (GPiCA), photorealistic head avatars that can be generated from multi-view images and efficiently rendered on mobile dev...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] Multi-View Foundation Models

    Foundation models are vital tools in various Computer Vision applications. They take as input a single RGB image and output a deep feature representation that i...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] GateFusion: Hierarchical Gated Cross-Modal Fusion for Active Speaker Detection

    Active Speaker Detection (ASD) aims to identify who is currently speaking in each frame of a video. Most state-of-the-art approaches rely on late fusion to comb...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] End-to-End Training for Autoregressive Video Diffusion via Self-Resampling

    Autoregressive video diffusion models hold promise for world simulation but are vulnerable to exposure bias arising from the train-test mismatch. While recent w...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] VLIC: Vision-Language Models As Perceptual Judges for Human-Aligned Image Compression

    Evaluations of image compression performance which include human preferences have generally found that naive distortion functions such as MSE are insufficiently...

    #research #paper #ai #computer-vision
  • 1 month ago · ai

    [Paper] Skyra: AI-Generated Video Detection via Grounded Artifact Reasoning

    The misuse of AI-driven video generation technologies has raised serious social concerns, highlighting the urgent need for reliable AI-generated video detectors...

    #research #paper #ai #computer-vision

EUNO.NEWS © 2026