Source

arXiv

1659 posts from this source

1 week ago · ai · - · -

[Paper] Declarative Skills for AI Agents in Knowledge-Grounded Tool-Use Workflows

We study orchestration mechanisms for tool-using AI agents in realistic customer-service workflows over an unstructured knowledge base. We argue that declarativ...

#research #paper #ai #machine-learning
1 week ago · software · - · -

[Paper] From Custom Logic to APIs: Understanding and Recommending API Replacement Refactorings

Software refactoring is essential for maintaining code quality. However, API replacement refactoring, which replaces custom logic with API calls, remains undere...

#research #paper #software
1 week ago · devops · - · -

[Paper] Communication Strategy Selection for Multi-GPU 3D FDTD with Convolutional Perfectly Matched Boundary Layers

In this paper we describe a communication-strategy study for multi-GPU three-dimensional finite-difference time-domain computation with convolutional perfectly ...

#research #paper #devops
1 week ago · software · - · -

[Paper] Empirical Study on the Characteristics and Evolution of AI-usage in GitHub Repositories: Evidence from Code Comments

Developers increasingly use AI tools such as ChatGPT, Copilot, and Claude in everyday software workflows, but prior studies often evaluate LLM outputs in isolat...

#research #paper #software
1 week ago · ai · - · -

[Paper] LLM Agent-Assisted Reverse Engineering with Quantitative Readability Metrics

Automatic decompilers produce functionally correct but often unreadable C code. This paper addresses one stage of the reverse engineering workflow: improving th...

#research #paper #ai #machine-learning
1 week ago · software · - · -

[Paper] SkelDPO: A Skeleton-Guided Direct Preference Optimization Framework for Efficient Code Generation

With the remarkable progress of Code Large Language Models (Code LLMs) in achieving semantic correctness, execution efficiency has become an increasingly import...

#research #paper #software
1 week ago · software · - · -

[Paper] Chiseling Out Efficiency: Structured Skeleton Supervision for Efficient Code Generation

Large Language Models (LLMs) are capable of generating syntactically correct and functionally complete programs, greatly streamlining software development. Howe...

#research #paper #software
1 week ago · ai · - · -

[Paper] Terastal: Layer-Variant-based Scheduling for Real-Time Multi-DNN Workloads on Heterogeneous Accelerators

Heterogeneous DNN accelerators improve soft real-time multi-DNN execution by mapping each layer to its preferred accelerator to reduce latency. However, under s...

#research #paper #ai #machine-learning
1 week ago · software · - · -

[Paper] The Custody Envelope Threshold: Authority-Scaled Admission of External Artifacts in Institutional Infrastructure

Modern infrastructure depends on externally maintained artifacts such as package-registry dependencies, CI/CD actions, container images, Terraform providers and...

#research #paper #software
1 week ago · software · - · -

[Paper] Pomona: Continuous Code Quality Improvement via Small, Automated Changes at Bloomberg

In this short experience paper, we present Pomona, a lightweight agentic tool that utilises agent skills for continuous automated code quality improvement. Insp...

#research #paper #software
1 week ago · devops · - · -

[Paper] StageFrontier: Synchronization-Aware Stage Accounting for Distributed ML Training

When a distributed training job slows down, the hard part is knowing where to look. Synchronization hides the cause: a stall on one rank shows up as a wait on t...

#research #paper #devops
1 week ago · ai · - · -

[Paper] Towards Serverless Semi-Decentralized Federated Learning with Heterogeneous Optimizers

We investigate cluster formation, involving the number and composition of clusters, in decentralized federated learning (FL) with heterogeneous machine learning...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] TailLoR: Protecting Principal Components in Parameter-Efficient Continual Learning

Parameter-efficient finetuning methods based on spectral decomposition have enabled progress in Continual Learning. In this paper we introduce TailLoR, which ut...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] HANDOFF: Humanoid Agentic Task-Space Whole-Body Control via Distilled Complementary Teachers

For a humanoid robot to be deployed in the real world, the choice of command space (i.e., the interface between task planning and whole-body control) is crucial...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

Code language models need repository-level context to resolve imports, APIs, and project conventions. Existing methods inject this knowledge as long inputs (ret...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] TempoVLA: Learning Speed-Controllable Vision-Language-Action Policies

Robot manipulation alternates between low-risk transit phases that call for fast execution and high-risk contact stages that demand slow, precise motion. Yet ex...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Regret Minimization with Adaptive Opponents in Repeated Games

In this paper, we study regret minimization in repeated games with adaptive opponents who can respond based on histories of play. The standard metric of externa...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] PAR3D: A Unified 3D-MLLM with Part-Aware Representation for Scene Understanding

Recent advances in 3D multimodal large language models (3D-MLLMs) have enabled unified solutions for 3D scene understanding tasks, including visual question ans...

#research #paper #ai #computer-vision
1 week ago · ai · - · -

[Paper] Operation-Guided Progressive Human-to-AI Text Transformation Benchmark for Multi-Granularity AI-Text Detection

As AI writing assistants become increasingly integrated into real-world drafting and revision workflows, many documents are no longer purely human-written or AI...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] DNQ: Deep Nash Q-Network for Partially Observable n-Player Games

Many real-world competitive systems require multiple decision-makers to act simultaneously under shared constraints, limited information, and repeated interacti...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Pretraining Recurrent Networks without Recurrence

Training recurrent neural networks (RNNs) requires assigning credit across long sequences of computations. Standard backpropagation through time (BPTT) addresse...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Complexity-Balanced Diffusion Splitting

Standard continuous-time generative models rely on monolithic architectures that must navigate vastly different signal regimes, from isotropic noise to intricat...

#research #paper #ai #computer-vision
1 week ago · ai · - · -

[Paper] Thinking with Imagination: Agentic Visual Spatial Reasoning with World Simulators

While Vision-Language Models (VLMs) have shown strong visual reasoning capabilities, their spatial reasoning abilities remain largely constrained to the observe...

#research #paper #ai #computer-vision
1 week ago · ai · - · -

[Paper] RREDCoT: Segment-Level Reward Redistribution for Reasoning Models

Recent advancements in reasoning language models have been driven by Reinforcement Learning (RL) fine-tuning. Most often, these rely on the Group Relative Polic...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Self-Augmenting Retrieval for Diffusion Language Models

Discrete diffusion language models generate text by iteratively denoising an entire response in parallel. At each step, they predict tentative tokens for every ...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery

Large language model (LLM) agents are increasingly applied to long-horizon tasks such as scientific discovery and machine learning engineering (MLE), where sust...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training

We propose a preconditioning (PC) layer, a weight parameterization via polynomial preconditioner that ensures stable weight conditioning throughout LLM training...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] How abundant are good interpolators?

Let $S$ be the set of unit norm linear classifiers $\theta \in \mathbb{R}^d$ which correctly classify every point of a labeled dataset $(X_i, y_i)_{i=1}^n$, $X_i \in \mathbb{R}^d$...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Goedel-Architect: Streamlining Formal Theorem Proving with Blueprint Generation and Refinement

We introduce Goedel-Architect, an agentic framework for formal theorem proving in Lean 4 centered on blueprint generation and refinement. A blueprint is a depen...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] You Only Index Once: Cross-Layer Sparse Attention with Shared Routing

Long-context inference in modern LLMs is increasingly constrained by decoding efficiency, especially in reasoning-heavy settings where models generate long inte...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] Human Adults and LLMs as Scientists: Who Benefits from Active Exploration?

A long-standing finding in the causal learning literature is that adults struggle to identify conjunctive causal rules, where an effect requires the simultaneou...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] Benchmark Everything Everywhere All at Once

Benchmarks are fundamental for evaluating and advancing LLMs and MLLMs by providing standardized and explicit measures of performance. However, their constructi...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals

As autonomous LLM agents increasingly hold real credentials and operate infrastructure without a human in the loop, operators have no standard way to tell an ag...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Event Detection for Parameter-to-KPI Dependency Learning for AI-RAN

Next-generation wireless networks are expected to rely on multiple concurrent AI-driven control functions that optimize different network objectives simultaneou...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] In-Context Multiple Instance Learning

Multiple Instance Learning (MIL) addresses problems where supervision is available at the level of bags of instances and has been successfully applied in fields...

#research #paper #ai #machine-learning #computer-vision
1 week ago · ai · - · -

[Paper] Scaffold, Not Vocabulary? A Controlled, Two-Tier, Pre-Registered Study of a Popperian Code-Generation Skill

Large language models increasingly write, review, and judge code, and a fast-growing practice equips them with prompt 'skills' that ask the model to reason like...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] Vortex: Efficient and Programmable Sparse Attention Serving for AI Agents

Sparse attention is becoming increasingly important for serving large language models (LLMs) as generation lengths continue to grow. However, deploying and eval...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Agent Memory: Characterization and System Implications of Stateful Long-Horizon Workloads

LLM agents are increasingly deployed on long-horizon tasks requiring sustained reasoning over extended interaction histories. Realizing this at scale requires a...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Latent Reasoning with Normalizing Flows

Large language models often improve reasoning by generating explicit chain-of-thought (CoT), demonstrating the importance of intermediate computation. However, ...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] USAD 2.0: Scaling Representation Distillation for Universal Audio Understanding

Audio encoders are critical to modern audio applications as large language models (LLMs) increasingly rely on a single encoder for diverse inputs. While self-su...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions

Large language models are increasingly used to simulate social media users and infer how individuals may respond to online discussions. However, it remains uncl...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] Causal Atlases from Entropic Inference: Bayesian Networks beyond Optimal DAGs

Data-driven causal relationship identification is pertinent to advancing understanding of complex systems both within and beyond science. Bayesian networks offe...

#research #paper #ai #machine-learning
1 week ago · devops · - · -

[Paper] CarbonSim: A Lifecycle-Aware Framework for Evaluating Carbon Tradeoffs in Hardware Upgrade Decisions

As the demand for information and communication technologies (ICT) continues to rise, the environmental impact of computing systems is becoming an increasingly ...

#research #paper #devops
1 week ago · ai · - · -

[Paper] Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Prior work has shown that large language models (LLMs) can translate unseen or low-resource languages by undergoing continued training or even by encoding a gra...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] RiskFlow: Fast and Faithful Safety-Critical Traffic Scenario Generation

Safety-critical traffic scenario generation is essential for evaluating autonomous driving systems under rare but high-risk interactions. Existing diffusion-bas...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] A Komi-Yazva–Russian Parallel Corpus and Evaluation Protocol for Zero- and Few-Shot LLM Translation

We present the first Komi-Yazva–Russian parallel corpus together with an explicit evaluation protocol for studying LLM translation in an endangered, extremely ...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] Double Preconditioning (DoPr): Optimization for Test-Time Performance, not Validation Loss

Many modern applications of deep learning involve training a neural network via a one-step prediction loss (e.g., L^2 regression, cross-entropy), but deploy the...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Unsupervised Skill Discovery for Agentic Data Analysis

Inference-time skill augmentation provides a lightweight way to improve data-analytic agents by injecting reusable procedural knowledge without updating model p...

#research #paper #ai #machine-learning #nlp
