EUNO.NEWS
  • All (21181) +146
  • AI (3169) +10
  • DevOps (940) +5
  • Software (11185) +102
  • IT (5838) +28
  • Education (48)
  • Notice
  • 3 weeks ago · ai

    [Paper] Snapshot 3D image projection using a diffractive decoder

    3D image display is essential for next-generation volumetric imaging; however, dense depth multiplexing for 3D image projection remains challenging because diff...

    #research #paper #ai #computer-vision
  • 3 weeks ago · ai

    [Paper] Generative Digital Twins: Vision-Language Simulation Models for Executable Industrial Systems

    We propose a Vision-Language Simulation Model (VLSM) that unifies visual and textual understanding to synthesize executable FlexScript from layout sketches and ...

    #research #paper #ai #machine-learning #nlp #computer-vision
  • less than a month ago · ai

    [Paper] The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

    Deep representations across modalities are inherently intertwined. In this paper, we systematically analyze the spectral characteristics of various semantic and...

    #research #paper #ai #computer-vision
  • less than a month ago · ai

    [Paper] Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models

    Generating realistic human-human interactions is a challenging task that requires not only high-quality individual body and hand motions, but also coherent coor...

    #research #paper #ai #computer-vision
  • less than a month ago · ai

    [Paper] Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning

    We introduce Perception Encoder Audiovisual, PE-AV, a new family of encoders for audio and video understanding trained with scaled contrastive learning. Built o...

    #research #paper #ai #machine-learning #computer-vision
  • less than a month ago · ai

    [Paper] Visual-Aware CoT: Achieving High-Fidelity Visual Consistency in Unified Models

    Recently, the introduction of Chain-of-Thought (CoT) has largely improved the generation ability of unified models. However, it is observed that the current thi...

    #research #paper #ai #computer-vision
  • less than a month ago · ai

    [Paper] Zero-shot Reconstruction of In-Scene Object Manipulation from Video

    We build the first system to address the problem of reconstructing in-scene object manipulation from a monocular RGB video. It is challenging due to ill-posed s...

    #research #paper #ai #computer-vision
  • less than a month ago · ai

    [Paper] From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs

    While Multimodal Large Language Models (MLLMs) have achieved impressive performance on semantic tasks, their spatial intelligence--crucial for robust and ground...

    #research #paper #ai #computer-vision
  • less than a month ago · ai

    [Paper] VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation

    Autoregressive (AR) visual generation relies on tokenizers to map images to and from discrete sequences. However, tokenizers are trained to reconstruct clean im...

    #research #paper #ai #computer-vision
  • less than a month ago · ai

    [Paper] WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

    Generating long-range, geometrically consistent video presents a fundamental dilemma: while consistency demands strict adherence to 3D geometry in pixel space, ...

    #research #paper #ai #machine-learning #computer-vision
  • less than a month ago · ai

    [Paper] Efficient Vision Mamba for MRI Super-Resolution via Hybrid Selective Scanning

    Background: High-resolution MRI is critical for diagnosis, but long acquisition times limit clinical use. Super-resolution (SR) can enhance resolution post-scan...

    #research #paper #ai #computer-vision
  • less than a month ago · ai

    [Paper] Multimodal LLMs for Historical Dataset Construction from Archival Image Scans: German Patents (1877-1918)

    We leverage multimodal large language models (LLMs) to construct a dataset of 306,070 German patents (1877-1918) from 9,562 archival image scans using our LLM-b...

    #research #paper #ai #computer-vision

EUNO.NEWS © 2026