EUNO.NEWS
  • All (2682) +359
  • AI (585) +26
  • DevOps (156) +8
  • Software (1140) +205
  • IT (795) +119
  • Education (6) +1
  • Notice
  • 1 week ago · ai

    [Paper] Frequency-Aware Token Reduction for Efficient Vision Transformer

    Vision Transformers have demonstrated exceptional performance across various computer vision tasks, yet their quadratic computational complexity concerning toke...

    #vision transformers #token reduction #frequency-aware pruning #computer vision #model efficiency
  • 1 week ago · ai

    [Paper] MobileI2V: Fast and High-Resolution Image-to-Video on Mobile Devices

    Recently, video generation has witnessed rapid advancements, drawing increasing attention to image-to-video (I2V) synthesis on mobile devices. However, the subs...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] EvRainDrop: HyperGraph-guided Completion for Effective Frame and Event Stream Aggregation

    Event cameras produce asynchronous event streams that are spatially sparse yet temporally dense. Mainstream event representation learning algorithms typically u...

    #event cameras #hypergraph neural network #multimodal fusion #computer vision #deep learning
  • 1 week ago · ai

    [Paper] E-M3RF: An Equivariant Multimodal 3D Re-assembly Framework

    3D reassembly is a fundamental geometric problem, and in recent years it has increasingly been challenged by deep learning methods rather than classical optimiz...

    #equivariant neural networks #multimodal 3D reconstruction #point cloud processing #computer vision
  • 1 week ago · ai

    [Paper] SAM Guided Semantic and Motion Changed Region Mining for Remote Sensing Change Captioning

    Remote sensing change captioning is an emerging and popular research task that aims to describe, in natural language, the content of interest that has changed b...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] Monet: Reasoning in Latent Visual Space Beyond Images and Language

    'Thinking with images' has emerged as an effective paradigm for advancing visual reasoning, extending beyond text-only chains of thought by injecting visual evi...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] Thinking With Bounding Boxes: Enhancing Spatio-Temporal Video Grounding via Reinforcement Fine-Tuning

    Spatio-temporal video grounding (STVG) requires localizing a target object in untrimmed videos both temporally and spatially from natural language descriptions....

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] Endo-G$^{2}$T: Geometry-Guided & Temporally Aware Time-Embedded 4DGS For Endoscopic Scenes

    Endoscopic (endo) video exhibits strong view-dependent effects such as specularities, wet reflections, and occlusions. Pure photometric supervision misaligns wi...

    #4D Gaussian Splatting #endoscopic reconstruction #computer vision #depth estimation #real-time rendering
  • 1 week ago · ai

    [Paper] PFF-Net: Patch Feature Fitting for Point Cloud Normal Estimation

    Estimating the normal of a point requires constructing a local patch to provide center-surrounding context, but determining the appropriate neighborhood size is...

    #research #paper #ai #computer-vision
  • 1 week ago · ai

    [Paper] SurgMLLMBench: A Multimodal Large Language Model Benchmark Dataset for Surgical Scene Understanding

    Recent advances in multimodal large language models (LLMs) have highlighted their potential for medical and surgical applications. However, existing surgical da...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] Hybrid SIFT-SNN for Efficient Anomaly Detection of Traffic Flow-Control Infrastructure

    This paper presents the SIFT-SNN framework, a low-latency neuromorphic signal-processing pipeline for real-time detection of structural anomalies in transport i...

    #research #paper #ai #machine-learning #computer-vision
  • 1 week ago · ai

    [Paper] The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment

    Learning joint representations across multiple modalities remains a central challenge in multimodal machine learning. Prevailing approaches predominantly operat...

    #research #paper #ai #machine-learning #computer-vision

RSS GitHub © 2025