Source

arXiv

1364 posts from this source

4 days ago · ai · - · -

[Paper] SkillJuror: Measuring How Agent Skill Organization Changes Runtime Behavior

Agent Skills augment large language model (LLM) agents with procedural knowledge at inference time, but current benchmarks rarely distinguish what a Skill says ...

#research #paper #ai #machine-learning
5 days ago · software · - · -

[Paper] SentTrack: Sentiment-Driven Bottleneck Detection in GitHub Issue Repositories

Software engineering teams increasingly depend on GitHub issue threads to coordinate work, report bugs, and negotiate technical decisions, yet most repository h...

#research #paper #software
5 days ago · software · - · -

[Paper] Defeater Cards: Characterizing and Managing Safety Assurance Case Defeaters

Safety assurance cases provide structured justifications that safety-critical systems meet their safety requirements. Recently, the notion of defeaters has emer...

#research #paper #software
5 days ago · software · - · -

[Paper] Web-Native Graphical EMF Model Editors

Graphical model editing is shifting from desktop applications to web-based tools. We analyze the characteristics of existing frameworks and, based on this analy...

#research #paper #software
5 days ago · ai · - · -

[Paper] A Scalable PyTorch Abstraction for Multi-GPU Gaussian Splatting

Gaussian splatting methods have become increasingly popular for neural reconstruction of the real world. However, they are often limited in scale and resolution...

#research #paper #ai #machine-learning #computer-vision
5 days ago · ai · - · -

[Paper] TileFuse: A Fused Mixed-Precision Kernel Library for Efficient Quantized LLM Inference on AMD NPUs

With the growing demand for on-device LLM inference, edge SoCs increasingly integrate NPUs to improve performance and energy efficiency under tight power and th...

#research #paper #ai #machine-learning
5 days ago · devops · - · -

[Paper] An Ocean Model Ported by a Large Language Model: Experience and Lessons from FESOM2 (Fortran to C to C++/Kokkos)

Large language models (LLMs) can translate and modify source code, and have been shown to do so for codes of different complexity. Whether they can port a compl...

#research #paper #devops
5 days ago · ai · - · -

[Paper] When to Align, When to Predict: A Phase Diagram for Multimodal Learning

Cross-modal alignment (CA) and cross-modal prediction (CP) are the dominant paradigms for multimodal representation learning, yet there is no systematic underst...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] A Unifying Lens on Supervised Fine-Tuning Through Target Distribution Design

Supervised fine-tuning (SFT) typically maximizes the likelihood of every token in a demonstrated trajectory. However, an observed token can be non-unique, noisy...

#research #paper #ai #machine-learning #nlp
5 days ago · ai · - · -

[Paper] ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations

This paper introduces ARM, a discrete representation-based AutoRegressive Model that unifies image understanding, generation, and editing within a next-token pr...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] Next Forcing: Causal World Modeling with Multi-Chunk Prediction

Autoregressive video generation has emerged as a powerful paradigm for World Action Models (WAMs). However, existing approaches suffer from slow training conver...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] AnyMod-LLVE: Low-Light Video Enhancement with Modality-Agnostic Inference

Low-light video enhancement (LLVE) remains a challenging task due to severe information degradation under low-illumination conditions. Recent multimodal approac...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents

In this paper, we propose EEVEE, the first multi-dataset test-time prompt learning framework for LLM agents, enabling test-time prompt learning under real-world...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Lip Forcing: Few-Step Autoregressive Diffusion for Real-time Lip Synchronization

Diffusion-based lip synchronization models achieve strong visual quality and audio-visual alignment, but full-sequence bidirectional attention and many denoisin...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] Data Journalist Agent: Transforming Data into Verifiable Multimodal Stories

Data tells stories that shape society; the data journalist's job is to turn raw information into stories non-experts can trust. A high-quality news feature take...

#research #paper #ai #nlp #computer-vision
5 days ago · ai · - · -

[Paper] The Role of Feedback Alignment in Self-Distillation

Conditioning a language model on additional context, such as feedback on a previous attempt, typically improves its response. Self-distillation trains the model...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Predicting Future Behaviors in Reasoning Models Enables Better Steering

Deployed large reasoning models (LRMs) often behave unexpectedly. Test-time steering controls LRM outputs by intervening on their hidden representations, but it...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Algorithmic and Minimax Complexities in Kernel Bandits

Gaussian-process upper confidence bound (GP-UCB) and decision-estimation-coefficient (DEC) methods may appear, at first sight, to belong to different theories. ...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Piper: A Programmable Distributed Training System

Large-scale model training increasingly relies on composing multiple parallelism strategies, such as data, pipeline, and expert parallelism, together with memor...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Multi-Faceted Interactivity Alignment in Full-Duplex Speech Models

Full-duplex spoken dialogue models can listen and speak simultaneously, making them a promising architecture for natural conversation. However, current models a...

#research #paper #ai #nlp
5 days ago · ai · - · -

[Paper] Flaws in the LLM Automation Narrative

Large Language Models (LLMs) are increasingly described as performing at the level of human experts on knowledge economy tasks. These claims are primarily based...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models

Long chain-of-thought (CoT) trajectories in large language model (LLM) reasoning cause severe inference bottlenecks due to rapid key-value (KV) cache growth. Cu...

#research #paper #ai #machine-learning
5 days ago · devops · - · -

[Paper] Revisiting 'Cooler is Better': ITD-Aware Per-CPU Thermal Optimization for Sustainable Data Center Operation

As data center energy demand approaches grid-level constraints, optimizing conventional server infrastructure is essential for sustainable growth. The long-stan...

#research #paper #devops
5 days ago · ai · - · -

[Paper] COGENT: Continuous Graph Emulators with Neural Ordinary Differential Equations for Long-Term Physical Forecasting

In this work, we present COGENT, a continuous graph emulator with Neural Ordinary Differential Equations for long-term physical forecasting on irregular geospat...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Itô maps for any-step SDEs

Recent one-step generative models accelerate sampling by learning deterministic flow maps of the underlying dynamics. These methods rely on learning from ordina...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Mean Flow Distillation: Robust and Stable Distillation for Flow Matching Models

Flow Matching models have demonstrated strong performance across a wide range of generative tasks. However, their reliance on ODE-based iterative sampling incur...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] P3D-Bench: Benchmarking MLLMs for Parametric 3D Generation and Structural Reasoning

Multimodal large language models can write code to produce complex programs as well as use programs to do 3D modeling, which opens up a new avenue for 3D genera...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity

Large language models (LLMs) are rapidly acquiring capabilities relevant to biological research, from literature synthesis to interpretation of experimental dat...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Efficiently Learning Drifting Halfspaces with Massart Noise

We study the problem of learning a drifting concept in the presence of Massart noise. In this framework, an online learner has access to a history of independen...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] MOFA-VTON: More Fashion Possibilities with Fine-Grained Adaptations in Virtual Try-On

Virtual try-on aims to fit an in-shop clothing image onto a specific human body. An optimal virtual try-on method should provide diverse and flexible dressing o...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] OncoTraj: a public benchmark for longitudinal resistance prediction in EGFR-mutant non-small-cell lung cancer on osimertinib

Resistance to first-line osimertinib in EGFR-mutant non-small-cell lung cancer (NSCLC) is the canonical example of predictable clonal evolution under therapeuti...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Data assimilation for subsurface flow using latent diffusion model parameterization: performance of ensemble-Kalman and Monte Carlo techniques

Data assimilation (DA) in subsurface flow entails calibrating model parameters to match observed data, typically at wells, while preserving geological realism. ...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] First-Order Trajectory Matching: Fast Ensemble Predictions of Chaotic, Turbulent, Stochastic Systems

We introduce First-Order Trajectory Matching (FTM), a surrogate-modeling method that learns the first-order local transport of probability mass from trajectorie...

#research #paper #ai #machine-learning
5 days ago · software · - · -

[Paper] Operationalizing Property-Based Testing for Data-Intensive Scalable Computing Systems

While fuzzing effectively catches crashes, its shallow oracles often miss semantic drifts and optimization-related errors in data-intensive scalable computing (...

#research #paper #software
5 days ago · ai · - · -

[Paper] UniPET: a universal network for high-quality PET image denoising across varied dose reduction factors

Most existing deep learning-based PET image denoising methods assume a fixed and known dose reduction factor (DRF) for low-dose PET images. However, these metho...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] Robust Regression of General ReLUs with Queries

We study the task of agnostically learning general (as opposed to homogeneous) ReLUs under the Gaussian distribution with respect to the squared loss. In the pa...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] WorldOlympiad: Can Your World Model Survive a Triathlon?

We introduce WorldOlympiad, a benchmark for diagnosing video-based world models across physical faithfulness, geometric consistency, and interaction fidelity. W...

#research #paper #ai #computer-vision
5 days ago · ai · - · -

[Paper] Provenance-Grounded Gating and Adaptive Recovery in Synthetic Post-Training Data Curation

Synthetic post-training pipelines commonly filter generated samples with reward models or holistic LLM judges, yet two practices remain rarely examined together...

#research #paper #ai #machine-learning #nlp
5 days ago · ai · - · -

[Paper] DMT: Demographic Conditioning, Morphology-Enhanced Transformer for Cuffless Blood Pressure Estimation from PPG Signals

Blood pressure (BP) is a key marker for cardiovascular risk assessment and therapeutic decision-making, and Photoplethysmography (PPG) enables low-cost, wearabl...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Overcoming Rank Collapse in Feedback Alignment

Backpropagation (BP) is widely viewed as biologically implausible, in part because it requires feedback weights to be the transpose of forward weights for error...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Monte Carlo Pass Search: Using Trajectory Generation for 3D Counterfactual Pass Evaluation in Football

We recast pass evaluation in football (soccer) as a Monte Carlo Tree Search (MCTS)-like evaluation problem whose components mostly exist in the literature under...

#research #paper #ai #machine-learning #computer-vision
5 days ago · ai · - · -

[Paper] TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning

Reinforcement learning with verifiable rewards (RLVR) is a promising approach for enhancing reasoning and agentic behavior in large language models. However, ro...

#research #paper #ai #machine-learning #nlp
5 days ago · ai · - · -

[Paper] Data-Driven Dynamic Assortment in Online Platforms: Learning about Two Sides

We study a dynamic assortment problem on a two-sided service platform with incomplete information and heterogeneous customers in a discrete-time setting. In eac...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Towards Autonomous Accelerator Design: FPGA Accelerator Generation with SECDA

Designing FPGA-based accelerators for modern artificial intelligence workloads requires exploring a large and complex hardware design space that involves archit...

#research #paper #ai #machine-learning
5 days ago · ai · - · -

[Paper] Designed by Journalists, but Is It for Readers? Rethinking AI Disclosures and Transparency in News

As newsrooms integrate generative AI, journalists face a disclosure challenge: how to communicate AI involvement in ways that maintain reader trust. Current pra...

#research #paper #ai #machine-learning
5 days ago · devops · - · -

[Paper] A Neurosymbolic Prolog Skill for LLM-Driven Service Placement

Service placement in the cloud-edge continuum requires assigning application components to heterogeneous resources under multiple constraints, including latency...

#research #paper #devops
5 days ago · ai · - · -

[Paper] Multimodal Brain Tumour Classification Using Feature Fusion

Clinicians diagnose brain tumors by synthesizing patient symptoms, medical history, and quantitative imaging data from modalities such as MRI and CT scans into ...

#research #paper #ai #machine-learning #computer-vision
5 days ago · ai · - · -

[Paper] FADA: Accessible fetal ultrasound interpretation and annotation with a selectively distilled unified vision-language model

A global shortage of trained sonographers limits prenatal ultrasound screening in low- and middle-income countries, where over half of pregnant women receive no...

#research #paper #ai #machine-learning #computer-vision
