EUNO.NEWS
  • All (21181) +146
  • AI (3169) +10
  • DevOps (940) +5
  • Software (11185) +102
  • IT (5838) +28
  • Education (48)
  • Notice
  • 0 month ago · ai

    [Paper] Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models

    Generating realistic human-human interactions is a challenging task that requires not only high-quality individual body and hand motions, but also coherent coor...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] Scalably Enhancing the Clinical Validity of a Task Benchmark with Physician Oversight

    Automating the calculation of clinical risk scores offers a significant opportunity to reduce physician administrative burden and enhance patient care. The curr...

    #research #paper #ai #machine-learning
  • 0 month ago · ai

    [Paper] Pushing the Frontier of Audiovisual Perception with Large-Scale Multimodal Correspondence Learning

    We introduce Perception Encoder Audiovisual, PE-AV, a new family of encoders for audio and video understanding trained with scaled contrastive learning. Built o...

    #research #paper #ai #machine-learning #computer-vision
  • 0 month ago · ai

    [Paper] Visual-Aware CoT: Achieving High-Fidelity Visual Consistency in Unified Models

    Recently, the introduction of Chain-of-Thought (CoT) has largely improved the generation ability of unified models. However, it is observed that the current thi...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] Zero-shot Reconstruction of In-Scene Object Manipulation from Video

    We build the first system to address the problem of reconstructing in-scene object manipulation from a monocular RGB video. It is challenging due to ill-posed s...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs

    While Multimodal Large Language Models (MLLMs) have achieved impressive performance on semantic tasks, their spatial intelligence--crucial for robust and ground...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

    Training capable Large Language Model (LLM) agents is critically bottlenecked by the high cost and static nature of real-world interaction data. We address this...

    #research #paper #ai #nlp
  • 0 month ago · ai

    [Paper] VA-$π$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation

    Autoregressive (AR) visual generation relies on tokenizers to map images to and from discrete sequences. However, tokenizers are trained to reconstruct clean im...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

    Generating long-range, geometrically consistent video presents a fundamental dilemma: while consistency demands strict adherence to 3D geometry in pixel space, ...

    #research #paper #ai #machine-learning #computer-vision
  • 0 month ago · ai

    [Paper] Efficient Vision Mamba for MRI Super-Resolution via Hybrid Selective Scanning

    Background: High-resolution MRI is critical for diagnosis, but long acquisition times limit clinical use. Super-resolution (SR) can enhance resolution post-scan...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] Multimodal LLMs for Historical Dataset Construction from Archival Image Scans: German Patents (1877-1918)

    We leverage multimodal large language models (LLMs) to construct a dataset of 306,070 German patents (1877-1918) from 9,562 archival image scans using our LLM-b...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

    Existing reinforcement learning (RL) approaches treat large language models (LLMs) as a single unified policy, overlooking their internal mechanisms. Understand...

    #research #paper #ai #machine-learning #nlp

EUNO.NEWS © 2026