ai — Page 81 | EUNO.NEWS

2 weeks ago · ai

Claude 4.5 Opus' Soul Document

Article URL: https://simonwillison.net/2025/Dec/2/claude-soul-document/
Comments URL: https://news.ycombinator.com/item?id=46125184
Points: 79
Comments: 37...

#Claude 4.5 Opus #Anthropic #large language model #LLM documentation #AI research #model architecture
2 weeks ago · ai

Amazon launches Trainium3

Article URL: https://techcrunch.com/2025/12/02/amazon-releases-an-impressive-new-ai-chip-and-teases-a-nvidia-friendly-roadmap/
Comments URL: https://news.ycombi...

#amazon #trainium3 #ai-chip #machine-learning #hardware #cloud #inference #training
2 weeks ago · ai

[Paper] MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues

We propose MagicQuill V2, a novel system that introduces a layered composition paradigm to generative image editing, bridging the gap between the sema...

#research #paper #ai #computer-vision
2 weeks ago · ai

[Paper] CAMEO: Correspondence-Attention Alignment for Multi-View Diffusion Models

Multi-view diffusion models have recently emerged as a powerful paradigm for novel view synthesis, yet the underlying mechanism that enables their view-consiste...

#research #paper #ai #computer-vision
2 weeks ago · ai

[Paper] OneThinker: All-in-one Reasoning Model for Image and Video

Reinforcement learning (RL) has recently achieved remarkable success in eliciting visual reasoning within Multimodal Large Language Models (MLLMs). However, exi...

#research #paper #ai #computer-vision
2 weeks ago · ai

[Paper] PPTArena: A Benchmark for Agentic PowerPoint Editing

We introduce PPTArena, a benchmark for PowerPoint editing that measures reliable modifications to real slides under natural-language instructions. In contrast t...

#research #paper #ai #machine-learning #computer-vision
2 weeks ago · ai

[Paper] MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

Current video generation techniques excel at single-shot clips but struggle to produce narrative multi-shot videos, which require flexible shot arrangement, coh...

#research #paper #ai #computer-vision
2 weeks ago · ai

[Paper] Video4Spatial: Towards Visuospatial Intelligence with Context-Guided Video Generation

We investigate whether video generative models can exhibit visuospatial intelligence, a capability central to human cognition, using only visual data. To this e...

#research #paper #ai #machine-learning #computer-vision
2 weeks ago · ai

[Paper] ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation

Despite progress in video-to-audio generation, the field focuses predominantly on mono output, lacking spatial immersion. Existing binaural approaches remain co...

#research #paper #ai #machine-learning #computer-vision
2 weeks ago · ai

[Paper] Learning Physically Consistent Lagrangian Control Models Without Acceleration Measurements

This article investigates the modeling and control of Lagrangian systems involving non-conservative forces using a hybrid method that does not require accelerat...

#research #paper #ai #machine-learning
2 weeks ago · ai

[Paper] MAViD: A Multimodal Framework for Audio-Visual Dialogue Understanding and Generation

We propose MAViD, a novel Multimodal framework for Audio-Visual Dialogue understanding and generation. Existing approaches primarily focus on non-interactive sy...

#research #paper #ai #computer-vision
2 weeks ago · ai

[Paper] SMP: Reusable Score-Matching Motion Priors for Physics-Based Character Control

Data-driven motion priors that can guide agents toward producing naturalistic behaviors play a pivotal role in creating life-like virtual characters. Adversaria...

#research #paper #ai #machine-learning #computer-vision
