EUNO.NEWS
  • All (21181) +146
  • AI (3169) +10
  • DevOps (940) +5
  • Software (11185) +102
  • IT (5838) +28
  • Education (48)
  • Notice
  • 1 week ago · ai

    [Paper] InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields

    Existing depth estimation methods are fundamentally limited to predicting depth on discrete image grids. Such representations restrict their scalability to arbi...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] A Versatile Multimodal Agent for Multimedia Content Generation

    With the advancement of AIGC (AI-generated content) technologies, an increasing number of generative models are revolutionizing fields such as video editing, mu...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] LTX-2: Efficient Joint Audio-Visual Foundation Model

    Recent text-to-video diffusion models can generate compelling video sequences, yet they remain silent -- missing the semantic, emotional, and atmospheric cues t...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

    While Unified Multimodal Models (UMMs) have achieved remarkable success in cross-modal comprehension, a significant gap persists in their ability to leverage su...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] AnatomiX, an Anatomy-Aware Grounded Multimodal Large Language Model for Chest X-Ray Interpretation

    Multimodal medical large language models have shown impressive progress in chest X-ray interpretation but continue to face challenges in spatial reasoning and a...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] Multi-Modal Data-Enhanced Foundation Models for Prediction and Control in Wireless Networks: A Survey

    Foundation models (FMs) are recognized as a transformative breakthrough that has started to reshape the future of artificial intelligence (AI) across both acade...

    #research #paper #ai #machine-learning #nlp #computer-vision
  • 1 week ago · ai

    [Paper] DiffBench Meets DiffAgent: End-to-End LLM-Driven Diffusion Acceleration Code Generation

    Diffusion models have achieved remarkable success in image and video generation. However, their inherently multi-step inference process imposes substantial c...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] LSP-DETR: Efficient and Scalable Nuclei Segmentation in Whole Slide Images

    Precise and scalable instance segmentation of cell nuclei is essential for computational pathology, yet gigapixel Whole-Slide Images pose major computational ch...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] Unified Thinker: A General Reasoning Modular Core for Image Generation

    Despite impressive progress in high-fidelity image synthesis, generative models still struggle with logic-intensive instruction following, exposing a persistent...

    #research #paper #ai #machine-learning #computer-vision
  • 2 weeks ago · ai

    Global Attention Mechanism: Retain Information to Enhance Channel-Spatial Interactions

    Overview: Global attention helps computers see pictures better without losing the details. By retaining information across the whole image, models can preserve...

    #global attention #computer vision #image recognition #channel-spatial interaction #deep learning #neural networks #mobile AI
  • 2 weeks ago · ai

    [Paper] ExposeAnyone: Personalized Audio-to-Expression Diffusion Models Are Robust Zero-Shot Face Forgery Detectors

    Detecting unknown deepfake manipulations remains one of the most challenging problems in face forgery detection. Current state-of-the-art approaches fail to gen...

    #research #paper #ai #computer-vision
  • 2 weeks ago · ai

    [Paper] VINO: A Unified Visual Generator with Interleaved OmniModal Context

    We present VINO, a unified visual generator that performs image and video generation and editing within a single framework. Instead of relying on task-specific ...

    #research #paper #ai #computer-vision

EUNO.NEWS
RSS GitHub © 2026