Source

arXiv

5644 posts from this source

2 months ago · ai · - · -

[Paper] Controllable Reasoning Models Are Private Thinkers

AI agents powered by reasoning models require access to sensitive user data. However, their reasoning traces are difficult to control, which can result in the u...

#privacy #large-language-models #chain-of-thought #LoRA #instruction-following
2 months ago · ai · - · -

[Paper] SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching

Diffusion models achieve state-of-the-art video generation quality, but their inference remains expensive due to the large number of sequential denoising steps....

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai · - · -

[Paper] Uncertainty Quantification for Multimodal Large Language Models with Incoherence-adjusted Semantic Volume

Despite their capabilities, Multimodal Large Language Models (MLLMs) may produce plausible but erroneous outputs, hindering reliable deployment. Accurate uncert...

#research #paper #ai #machine-learning #nlp #computer-vision
2 months ago · ai · - · -

[Paper] MT-PingEval: Evaluating Multi-Turn Collaboration with Private Information Games

We present a scalable methodology for evaluating language models in multi-turn interactions, using a suite of collaborative games that require effective communi...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] Task-Centric Acceleration of Small-Language Models

Small language models (SLMs) have emerged as efficient alternatives to large language models for task-specific applications. However, they are often employed in...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] ArgLLM-App: An Interactive System for Argumentative Reasoning with Large Language Models

Argumentative LLMs (ArgLLMs) are an existing approach leveraging Large Language Models (LLMs) and computational argumentation for decision-making, with the aim ...

#argumentation #large-language-models #explainable-ai #interactive-visualization #user-study
2 months ago · devops · - · -

[Paper] Advanced Scheduling Strategies for Distributed Quantum Computing Jobs

Scaling the number of qubits available across multiple quantum devices is an active area of research within distributed quantum computing (DQC). This includes q...

#quantum computing #distributed scheduling #reinforcement learning #resource allocation #quantum hardware
2 months ago · ai · - · -

[Paper] CoME: Empowering Channel-of-Mobile-Experts with Informative Hybrid-Capabilities Reasoning

Mobile Agents can autonomously execute user instructions, which requires hybrid-capabilities reasoning, including screen summary, subtask planning, action decis...

#mobile AI agents #mixture of experts #chain-of-thought #reinforcement learning #information gain
2 months ago · ai · - · -

[Paper] AgenticOCR: Parsing Only What You Need for Efficient Retrieval-Augmented Generation

The expansion of retrieval-augmented generation (RAG) into multimodal domains has intensified the challenge for processing complex visual documents, such as fin...

#OCR #retrieval-augmented generation #multimodal AI #document understanding #efficient inference
2 months ago · software · - · -

[Paper] Context-Aware Functional Test Generation via Business Logic Extraction and Adaptation

Functional testing is essential for verifying that the business logic of mobile applications aligns with user requirements, serving as the primary methodology f...

#functional testing #mobile app testing #test generation #business logic extraction #AI-powered testing
2 months ago · ai · - · -

[Paper] CIRCLE: A Framework for Evaluating AI from a Real-World Lens

This paper proposes CIRCLE, a six-stage, lifecycle-based framework to bridge the reality gap between model-centric performance metrics and AI's materialized out...

#AI evaluation #MLOps #real-world testing #model validation #framework
2 months ago · ai · - · -

[Paper] Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving

Large Language Model (LLM) adapters enable low-cost model specialization, but introduce complex caching and scheduling challenges in distributed serving systems...

#LLM adapters #GPU resource optimization #digital twin simulation #ML performance modeling #distributed inference
2 months ago · software · - · -

[Paper] LeGend: A Data-Driven Framework for Lemma Generation in Hardware Model Checking

Property checking of RTL designs is a central task in formal verification. Among available engines, IC3/PDR is a widely used backbone whose performance critical...

#research #paper #software
2 months ago · software · - · -

[Paper] The Vocabulary of Flaky Tests in the Context of SAP HANA

Background. Automated test execution is an important activity to gather information about the quality of a software project. So-called flaky tests, however, neg...

#research #paper #software
2 months ago · ai · - · -

[Paper] Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing

Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency req...

#research #paper #ai #machine-learning
2 months ago · devops · - · -

[Paper] Mixed Choice in Asynchronous Multiparty Session Types

We present a multiparty session type (MST) framework with asynchronous mixed choice (MC). We propose a core construct for MC that allows transient inconsistenci...

#research #paper #devops
2 months ago · software · - · -

[Paper] Invariant-Driven Automated Testing

Microservice architectures are an emergent technology that builds business logic into a suite of small services. Each microservice runs in its process and the c...

#research #paper #software
2 months ago · software · - · -

[Paper] Novice Developers Produce Larger Review Overhead for Project Maintainers while Vibe Coding

AI coding agents allow software developers to generate code quickly, which raises a practical question for project managers and open source maintainers: can vib...

#research #paper #software
2 months ago · ai · - · -

[Paper] SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

Software engineering agents (SWE) are improving rapidly, with recent gains largely driven by reinforcement learning (RL). However, RL training is constrained by...

#research #paper #ai #nlp
2 months ago · ai · - · -

[Paper] MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models

Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the cl...

#research #paper #ai #machine-learning
2 months ago · devops · - · -

[Paper] Hestia: Hyperthread-Level Scheduling for Cloud Microservices with Interference-Aware Attention

Modern cloud servers routinely co-locate multiple latency-sensitive microservice instances to improve resource efficiency. However, the diversity of microservic...

#research #paper #devops
2 months ago · software · - · -

[Paper] Peeling Off the Cocoon: Unveiling Suppressed Golden Seeds for Mutational Greybox Fuzzing

PoCo is a technique that aims to enhance modern coverage-based seed selection (CSS) techniques (such as afl-cmin) by gradually removing obstacle conditional sta...

#research #paper #software
2 months ago · ai · - · -

[Paper] From Flat Logs to Causal Graphs: Hierarchical Failure Attribution for LLM-based Multi-Agent Systems

LLM-powered Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in complex domains but suffer from inherent fragility and opaque failure mechani...

#LLM #multi-agent systems #causal graph #failure attribution #research paper
2 months ago · devops · - · -

[Paper] QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models

With the increasing importance of distributed scientific workflows, there is a critical need to ensure Quality of Service (QoS) constraints, such as minimizing ...

#research #paper #devops
2 months ago · ai · - · -

[Paper] All Mutation Rates $c/n$ for the $(1+1)$ Evolutionary Algorithm

For every real number $c \geq 1$ and for all $\varepsilon > 0$, there is a fitness function $f : \{0,1\}^n \to \mathbb{R}$ for which the optimal mutation rate for the (1...

#evolutionary algorithms #mutation rate #theoretical analysis #optimization
2 months ago · ai · - · -

[Paper] Rudder: Steering Prefetching in Distributed GNN Training using LLM Agents

Large-scale Graph Neural Networks (GNNs) are typically trained by sampling a vertex's neighbors to a fixed distance. Because large input graphs are distributed,...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] FedDAG: Clustered Federated Learning via Global Data and Gradient Integration for Heterogeneous Environments

Federated Learning (FL) enables a group of clients to collaboratively train a model without sharing individual data, but its performance drops when client data ...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] MediX-R1: Open Ended Medical Reinforcement Learning

We introduce MediX-R1, an open-ended Reinforcement Learning (RL) framework for medical multimodal large language models (MLLMs) that enables clinically grounded...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] VGG-T$^3$: Offline Feed-Forward 3D Reconstruction at Scale

We present a scalable 3D reconstruction model that addresses a critical limitation in offline feed-forward methods: their computational and memory requirements ...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] Model Agreement via Anchoring

Numerous lines of work aim to control model disagreement -- the extent to which two machine learning models disagree in their predictions. We adopt a simple and stan...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] SeeThrough3D: Occlusion Aware 3D Control in Text-to-Image Generation

We identify occlusion reasoning as a fundamental yet overlooked aspect for 3D layout-conditioned generation. It is essential for synthesizing partially occluded...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai · - · -

[Paper] A Dataset is Worth 1 MB

A dataset server must often distribute the same large payload to many clients, incurring massive communication costs. Since clients frequently operate on divers...

#dataset compression #pseudo‑labels #data distillation #computer vision #few‑shot learning
2 months ago · ai · - · -

[Paper] Sensor Generalization for Adaptive Sensing in Event-based Object Detection via Joint Distribution Training

Bio-inspired event cameras have recently attracted significant research due to their asynchronous and low-latency capabilities. These features provide a high dy...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] SOTAlign: Semi-Supervised Alignment of Unimodal Vision and Language Models via Optimal Transport

The Platonic Representation Hypothesis posits that neural networks trained on different modalities converge toward a shared statistical model of the world. Rece...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] EvoX: Meta-Evolution for Automated Discovery

Recent work such as AlphaEvolve has shown that combining LLM-driven optimization with evolutionary search can effectively improve programs, prompts, and algorit...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] Scale Can't Overcome Pragmatics: The Impact of Reporting Bias on Vision-Language Reasoning

The lack of reasoning capabilities in Vision-Language Models (VLMs) has remained at the forefront of research discourse. We posit that this behavior stems from ...

#research #paper #ai #nlp #computer-vision
2 months ago · ai · - · -

[Paper] FlashOptim: Optimizers for Memory Efficient Training

Standard mixed-precision training of neural networks requires many bytes of accelerator memory for each model parameter. These bytes reflect not just the parame...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Mean Estimation from Coarse Data: Characterizations and Efficient Algorithms

Coarse data arise when learners observe only partial information about samples; namely, a set containing the sample rather than its exact value. This occurs nat...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Retrieve and Segment: Are a Few Examples Enough to Bridge the Supervision Gap in Open-Vocabulary Segmentation?

Open-vocabulary segmentation (OVS) extends the zero-shot recognition capabilities of vision-language models (VLMs) to pixel-level prediction, enabling segmentat...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] Differentiable Zero-One Loss via Hypersimplex Projections

Recent advances in machine learning have emphasized the integration of structured optimization components into end-to-end differentiable models, enabling richer...

#zero-one loss #hypersimplex projection #soft argmax #differentiable surrogate #large-batch training
2 months ago · ai · - · -

[Paper] Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset

AI-powered scientific research tools are rapidly being integrated into research workflows, yet the field lacks a clear lens into how researchers use these syste...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators

Neural network accelerators have been widely applied to edge devices for complex tasks like object tracking, image recognition, etc. Previous works have explore...

#quantized neural networks #systolic array #FPGA accelerator #runtime reconfiguration #mixed-precision
2 months ago · ai · - · -

[Paper] Utilizing LLMs for Industrial Process Automation

A growing number of publications address the best practices to use Large Language Models (LLMs) for software engineering in recent years. However, most of this ...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks

The advancement of large language models (LLMs) has accelerated the development of autonomous financial trading systems. While mainstream approaches deploy mult...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] LLM Novice Uplift on Dual-Use, In Silico Biology Tasks

Large language models (LLMs) perform increasingly well on biology benchmarks, but it remains unclear whether they uplift novice users -- i.e., enable humans to ...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] Deep ensemble graph neural networks for probabilistic cosmic-ray direction and energy reconstruction in autonomous radio arrays

Using advanced machine learning techniques, we developed a method for reconstructing precisely the arrival direction and energy of ultra-high-energy cosmic rays...

#graph neural networks #deep ensemble #uncertainty quantification #cosmic ray reconstruction #radio antenna arrays
2 months ago · ai · - · -

[Paper] ParamMem: Augmenting Language Agents with Parametric Reflective Memory

Self-reflection enables language agents to iteratively refine solutions, yet often produces repetitive outputs that limit reasoning performance. Recent studies ...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Generalized Rapid Action Value Estimation in Memory-Constrained Environments

Generalized Rapid Action Value Estimation (GRAVE) has been shown to be a strong variant within the Monte-Carlo Tree Search (MCTS) family of algorithms for Gener...

#Monte Carlo Tree Search #General Game Playing #memory optimization #node recycling #GRAVE algorithm
