2 months ago · ai

[Paper] Video-CoM: Interactive Video Reasoning via Chain of Manipulations

Recent multimodal large language models (MLLMs) have advanced video understanding, yet most still 'think about videos', i.e., once a video is encoded, reasoning unf...

#research #paper #ai #computer-vision

2 months ago · ai

[Paper] Thinking by Doing: Building Efficient World Model Reasoning in LLMs via Multi-turn Interaction

Developing robust world model reasoning is crucial for large language model (LLM) agents to plan and interact in complex environments. While multi-turn interact...

#research #paper #ai #machine-learning

2 months ago · ai

[Paper] AnyTalker: Scaling Multi-Person Talking Video Generation with Interactivity Refinement

Recently, multi-person video generation has started to gain prominence. While a few preliminary works have explored audio-driven multi-person talking video gene...

#research #paper #ai #computer-vision

2 months ago · ai

[Paper] ThetaEvolve: Test-time Learning on Open Problems

Recent advances in large language models (LLMs) have enabled breakthroughs in mathematical discovery, exemplified by AlphaEvolve, a closed-source system that ev...

#research #paper #ai #machine-learning #nlp

2 months ago · ai

[Paper] Visual Generation Tuning

Large Vision Language Models (VLMs) effectively bridge the modality gap through extensive pretraining, acquiring sophisticated visual representations aligned wi...

#research #paper #ai #computer-vision

2 months ago · ai

[Paper] SmallWorlds: Assessing Dynamics Understanding of World Models in Isolated Environments

Current world models lack a unified and controlled setting for systematic evaluation, making it difficult to assess whether they truly capture the underlying ru...

#research #paper #ai #machine-learning

2 months ago · ai

[Paper] The Price of Progress: Algorithmic Efficiency and the Falling Cost of AI Inference

Language models have seen enormous progress on advanced benchmarks in recent years, but much of this progress has only been possible by using more costly models...

#research #paper #ai #machine-learning

2 months ago · ai

[Paper] Object-Centric Data Synthesis for Category-level Object Detection

Deep learning approaches to object detection have achieved reliable detection of specific object classes in images. However, extending a model's detection capab...

#research #paper #ai #computer-vision

2 months ago · ai

[Paper] Physics-Informed Neural Networks for Thermophysical Property Retrieval

Inverse heat problems refer to the estimation of material thermophysical properties given observed or known heat diffusion behaviour. Inverse heat problems have...

#research #paper #ai #machine-learning #computer-vision

2 months ago · ai

[Paper] Provable Benefits of Sinusoidal Activation for Modular Addition

This paper studies the role of activation functions in learning modular addition with two-layer neural networks. We first establish a sharp expressivity gap: si...

#research #paper #ai #machine-learning

2 months ago · ai

[Paper] ASTRO: Adaptive Stitching via Dynamics-Guided Trajectory Rollouts

Offline reinforcement learning (RL) enables agents to learn optimal policies from pre-collected datasets. However, datasets containing suboptimal and fragmented...

#research #paper #ai #machine-learning

2 months ago · ai

[Paper] Accelerated Execution of Bayesian Neural Networks using a Single Probabilistic Forward Pass and Code Generation

Machine learning models perform well across domains such as diagnostics, weather forecasting, NLP, and autonomous driving, but their limited uncertainty handlin...

#research #paper #ai #machine-learning
