EUNO.NEWS
  • All (21181) +146
  • AI (3169) +10
  • DevOps (940) +5
  • Software (11185) +102
  • IT (5838) +28
  • Education (48)
  • Notice
  • 0 month ago · ai

    [Paper] Beyond CLIP: Knowledge-Enhanced Multimodal Transformers for Cross-Modal Alignment in Diabetic Retinopathy Diagnosis

    Diabetic retinopathy (DR) is a leading cause of preventable blindness worldwide, demanding accurate automated diagnostic systems. While general-domain vision-la...

    #research #paper #ai #machine-learning #computer-vision
  • 0 month ago · ai

    [Paper] MapTrace: Scalable Data Generation for Route Tracing on Maps

    While Multimodal Large Language Models have achieved human-like performance on many visual and textual reasoning tasks, their proficiency in fine-grained spatia...

    #research #paper #ai #machine-learning #computer-vision
  • 0 month ago · ai

    [Paper] KerJEPA: Kernel Discrepancies for Euclidean Self-Supervised Learning

    Recent breakthroughs in self-supervised Joint-Embedding Predictive Architectures (JEPAs) have established that regularizing Euclidean representations toward iso...

    #research #paper #ai #machine-learning #computer-vision
  • 0 month ago · ai

    YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications

    Overview: YOLOv6 is a new step in object detection designed for factories, stores, and cameras everywhere. Built by a team focused on speed and reliability, it...

    #YOLOv6 #object detection #computer vision #real‑time AI #edge computing #industrial AI #open source
  • 0 month ago · ai

    [Paper] Point What You Mean: Visually Grounded Instruction Policy

    Vision-Language-Action (VLA) models align vision and language with embodied control, but their object referring ability remains limited when relying solely on t...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] LouvreSAE: Sparse Autoencoders for Interpretable and Controllable Style Transfer

    Artistic style transfer in generative models remains a significant challenge, as existing methods often introduce style only via model fine-tuning, additional a...

    #research #paper #ai #machine-learning #computer-vision
  • 0 month ago · ai

    [Paper] Delta-LLaVA: Base-then-Specialize Alignment for Token-Efficient Vision-Language Models

    Multimodal Large Language Models (MLLMs) combine visual and textual representations to enable rich reasoning capabilities. However, the high computational cost ...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] Thinking Beyond Labels: Vocabulary-Free Fine-Grained Recognition using Reasoning-Augmented LMMs

    Vocabulary-free fine-grained image recognition aims to distinguish visually similar categories within a meta-class without a fixed, human-defined label set. Exi...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    [Paper] Localising Shortcut Learning in Pixel Space via Ordinal Scoring Correlations for Attribution Representations (OSCAR)

    Deep neural networks often exploit shortcuts. These are spurious cues which are associated with output labels in the training data but are unrelated to task sem...

    #research #paper #ai #computer-vision
  • 0 month ago · ai

    Myth: Computer Vision is only effective for images and not for videos

    Myth: Computer Vision is only effective for images and not for videos. Reality: Computer Vision can handle both images and videos, thanks to advancements in tem...

    #computer vision #video analysis #deep learning #temporal processing #AI myths
  • 0 month ago · ai

    [Paper] Application of deep learning approaches for medieval historical documents transcription

    Handwritten text recognition and optical character recognition solutions show excellent results when processing modern-era data, but efficiency drops with La...

    #research #paper #ai #machine-learning #nlp #computer-vision
  • 1 month ago · ai

    In Defense of the Triplet Loss for Person Re-Identification

    Introduction: Person re-identification (re-ID) is the task of finding the same individual across different camera views. It has important applications in security...

    #triplet loss #person re-identification #computer vision #deep learning #metric learning #end-to-end training

RSS GitHub © 2026