ai — Page 69 | EUNO.NEWS

0 month ago · ai

[Paper] From Indoor to Open World: Revealing the Spatial Reasoning Gap in MLLMs

While Multimodal Large Language Models (MLLMs) have achieved impressive performance on semantic tasks, their spatial intelligence--crucial for robust and ground...

#research #paper #ai #computer-vision

0 month ago · ai

[Paper] GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

Training capable Large Language Model (LLM) agents is critically bottlenecked by the high cost and static nature of real-world interaction data. We address this...

#research #paper #ai #nlp

0 month ago · ai

[Paper] VA-$π$: Variational Policy Alignment for Pixel-Aware Autoregressive Generation

Autoregressive (AR) visual generation relies on tokenizers to map images to and from discrete sequences. However, tokenizers are trained to reconstruct clean im...

#research #paper #ai #computer-vision

0 month ago · ai

[Paper] WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion

Generating long-range, geometrically consistent video presents a fundamental dilemma: while consistency demands strict adherence to 3D geometry in pixel space, ...

#research #paper #ai #machine-learning #computer-vision

0 month ago · ai

[Paper] Efficient Vision Mamba for MRI Super-Resolution via Hybrid Selective Scanning

Background: High-resolution MRI is critical for diagnosis, but long acquisition times limit clinical use. Super-resolution (SR) can enhance resolution post-scan...

#research #paper #ai #computer-vision

0 month ago · ai

[Paper] Multimodal LLMs for Historical Dataset Construction from Archival Image Scans: German Patents (1877-1918)

We leverage multimodal large language models (LLMs) to construct a dataset of 306,070 German patents (1877-1918) from 9,562 archival image scans using our LLM-b...

#research #paper #ai #computer-vision

0 month ago · ai

[Paper] Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Existing reinforcement learning (RL) approaches treat large language models (LLMs) as a single unified policy, overlooking their internal mechanisms. Understand...

#research #paper #ai #machine-learning #nlp

0 month ago · ai

[Paper] Beyond CLIP: Knowledge-Enhanced Multimodal Transformers for Cross-Modal Alignment in Diabetic Retinopathy Diagnosis

Diabetic retinopathy (DR) is a leading cause of preventable blindness worldwide, demanding accurate automated diagnostic systems. While general-domain vision-la...

#research #paper #ai #machine-learning #computer-vision

0 month ago · ai

[Paper] Clustering with Label Consistency

Designing efficient, effective, and consistent metric clustering algorithms is a significant challenge attracting growing attention. Traditional approaches focu...

#research #paper #ai #machine-learning

0 month ago · ai

[Paper] Exploring Zero-Shot ACSA with Unified Meaning Representation in Chain-of-Thought Prompting

Aspect-Category Sentiment Analysis (ACSA) provides granular insights by identifying specific themes within reviews and their associated sentiment. While supervi...

#research #paper #ai #nlp

0 month ago · ai

[Paper] Deep Legendre Transform

We introduce a novel deep learning algorithm for computing convex conjugates of differentiable convex functions, a fundamental operation in convex analysis with...

#research #paper #ai #machine-learning

0 month ago · ai

[Paper] The Best of Both Worlds: Hybridizing Neural Operators and Solvers for Stable Long-Horizon Inference

Numerical simulation of time-dependent partial differential equations (PDEs) is central to scientific and engineering applications, but high-fidelity solvers ar...

#research #paper #ai #machine-learning
