EUNO.NEWS
  • All (2650) +327
  • AI (584) +25
  • DevOps (153) +5
  • Software (1126) +191
  • IT (781) +105
  • Education (6) +1
  • Notice
  • 1 week ago · ai

    [Paper] Toward Automatic Safe Driving Instruction: A Large-Scale Vision Language Model Approach

    Large-scale Vision Language Models (LVLMs) exhibit advanced capabilities in tasks that require visual information, including object detection. These capabilitie...

    #research #paper #ai #machine-learning #nlp #computer-vision
  • 1 week ago · ai

    [Paper] Canvas-to-Image: Compositional Image Generation with Multimodal Controls

    While modern diffusion models excel at generating high-quality and diverse images, they still struggle with high-fidelity compositional and multimodal control, ...

    #image generation #diffusion models #multimodal control #computer vision #research
  • 1 week ago · ai

    [Paper] TraceGen: World Modeling in 3D Trace Space Enables Learning from Cross-Embodiment Videos

    Learning new robot tasks on new platforms and in new scenes from only a handful of demonstrations remains challenging. While videos of other embodiments - human...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] G$^2$VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning

    Vision-Language Models (VLMs) still lack robustness in spatial intelligence, demonstrating poor performance on spatial understanding and reasoning tasks. We att...

    #research #paper #ai #machine-learning #nlp #computer-vision
  • 1 week ago · ai

    [Paper] Seeing without Pixels: Perception from Camera Trajectories

    Can one perceive a video's content without seeing its pixels, just from the camera trajectory, the path it carves through space? This paper is the first to syste...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] Revolutionizing Glioma Segmentation & Grading Using 3D MRI - Guided Hybrid Deep Learning Models

    Gliomas are brain tumors with a high mortality rate, which makes early and accurate diagnosis important for therapeutic intervention for the tumors....

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] Uncertainty Quantification for Visual Object Pose Estimation

    Quantifying the uncertainty of an object's pose estimate is essential for robust control and planning. Although pose estimation is a well-studied robotics probl...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following

    Large multimodal models (LMMs) are increasingly adopted as judges in multimodal evaluation systems due to their strong instruction following and consistency wit...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] CaFlow: Enhancing Long-Term Action Quality Assessment with Causal Counterfactual Flow

    Action Quality Assessment (AQA) predicts fine-grained execution scores from action videos and is widely applied in sports, rehabilitation, and skill evaluation....

    #action-quality-assessment #causal-inference #video-analysis #computer-vision #long-term-temporal-modeling
  • 1 week ago · ai

    [Paper] Mechanisms of Non-Monotonic Scaling in Vision Transformers

    Deeper Vision Transformers often perform worse than shallower ones, which challenges common scaling assumptions. Through a systematic empirical analysis of ViT-...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] Qwen3-VL Technical Report

    We introduce Qwen3-VL, the most capable vision-language model in the Qwen series to date, achieving superior performance across a broad range of multimodal benc...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] Active Learning for GCN-based Action Recognition

    Despite the notable success of graph convolutional networks (GCNs) in skeleton-based action recognition, their performance often depends on large volumes of lab...

    #active learning #graph convolutional networks #action recognition #skeleton-based vision #computer vision

RSS GitHub © 2025