Source

arXiv

1659 posts from this source

1 week ago · ai · - · -

[Paper] Sparsely gated tiny linear experts

Sparsity allows scaling model parameters without proportionally increasing computational cost. While mixture of experts (MoE) models are made increasingly spars...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Socratic-SWE: Self-Evolving Coding Agents via Trace-Derived Agent Skills

LLM-driven software engineering agents have become a central testbed for real-world language-model capability, yet their training remains limited by the availab...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] A Comprehensive Anatomy of Human and DeepSeek-R1 LLM Mathematical Reasoning

The emergence of 'Aha moments' in large language models, particularly DeepSeek-R1-0120, has raised the question of whether these systems genuinely reason or mer...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Reversible Foundations: Training a 120B Sparse MoE through State-Preserving Scaling

This paper reports on training a hundred-billion-parameter sparse mixture of experts on a single eight-GPU node, end to end. LightningLM 0.1V is a recurrence-ba...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] The Proxy Benders Decomposition

Benders decomposition is a fundamental framework for solving large-scale mixed-integer optimization problems with complicating variables that, when fixed, yield...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] M$^3$Exam: Benchmarking Multimodal Memory for Realistic User-Agent Interactions

Language agents are increasingly deployed over accumulating multimodal information, yet existing benchmarks assume a human-human form with sparse visuals and st...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] RealDocBench: A Benchmark for Field-Level QA and Layout Understanding on Real-World Regulated Documents

Document parsing systems are increasingly deployed in high-stakes, regulated workflows such as mortgage underwriting, financial reporting, supply-chain logistic...

#research #paper #ai #computer-vision
1 week ago · ai · - · -

[Paper] Generative Modeling of Discrete Latent Structures via Dynamic Policy Gradients

Many scientific problems require inferring unobserved mechanistic latent states from indirect observations. While classical approaches, including expectation ma...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Mind the Gap: Disentangling Performance Bottlenecks in Video Instance Segmentation

In Video Instance Segmentation (VIS), classification, segmentation, and tracking objectives are jointly evaluated, but their individual contributions to perform...

#research #paper #ai #computer-vision
1 week ago · software · - · -

[Paper] Is US Defense Acquisition Ready to Acquire AI-Enabled Capabilities? Assessing the DoD Software Acquisition Pathway Through a Scenario-Based Policy Analysis

As AI systems transition from experimental prototypes to mission-critical tools, their dependence on dynamic data, evolving models, and governance raises questi...

#research #paper #software
1 week ago · ai · - · -

[Paper] Online Pandora's Box for Contextual LLM Cascading

Motivated by Large Language Model (LLM) cascading, we propose an online contextual Pandora's Box model for adaptively querying and selecting LLM APIs. In each p...

#research #paper #ai #machine-learning
1 week ago · ai · - · -

[Paper] Impact of Synthetic Lesional MR Images in Automated Focal Cortical Dysplasia Detection in Low-Data Scenarios

Background and Purpose: Automated detection of focal cortical dysplasia (FCD) requires large volumes of voxelwise lesion-delineated MRI data, which are difficul...

#research #paper #ai #machine-learning #computer-vision
1 week ago · ai · - · -

[Paper] Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests

A growing failure mode in agent evaluation and training is that models can achieve high evaluation scores by exploiting shortcuts instead of solving the intende...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] Beyond Backscatter: InSAR coherence from detected SAR images

In this work, we propose a deep learning framework for coherence regression directly from detected SAR images, without the need for accurate coregistration. A R...

#research #paper #ai #computer-vision
1 week ago · software · - · -

[Paper] On the Shoulders of Giants: Empowering Automated Smart Contract Auditing via the GiAnt Corpus

High-quality smart contract auditing datasets are crucial for evaluating security tools and advancing smart contract security research. Two major limitations of...

#research #paper #software
1 week ago · ai · - · -

[Paper] Combinatorial Landscape Analysis for Dominating Set and Vertex Coloring

We analyze the two combinatorial problems of Dominating Set and Vertex Coloring regarding what kind of local optima are present for various instances. For a var...

#research #paper #ai
1 week ago · ai · - · -

[Paper] DirectAudioEdit: Inversion-Free Text-Guided Audio Editing via Diffusion Prediction Contrast

Text-guided audio editing aims to modify the language-specified acoustic content while preserving edit-irrelevant source components. Existing training-free meth...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] LLM-Guided Evolution for Medical Decision Pipelines

Adapting large language models (LLMs) to clinical workflows often requires costly fine-tuning or manual prompt and pipeline engineering. We study LLM-guided MAP...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] Hierarchical Certified Semantic Commitment for Byzantine-Resilient LLM-Agent Collaboration

Byzantine collaboration among large-language-model agents requires a finality-control primitive: given delivered stochastic, structured natural-language proposa...

#research #paper #ai #machine-learning
1 week ago · software · - · -

[Paper] QBugLM: An Agentic Benchmarking Framework for LLM-based Quantum Software Debugging

Quantum software bugs often yield silent, incorrect outputs rather than explicit errors, making them particularly difficult to detect and repair with convention...

#research #paper #software
1 week ago · ai · - · -

[Paper] SV-Detect: AI-generated Text Detection with Steering Vectors

Detecting machine-generated text is especially difficult under distribution shift, such as transfer across domains, source models, and editing attacks. We propo...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] Acoustic Cue Alignment in Audio Language Models for Speech Emotion Recognition

Instruction-following audio language models (ALMs) can be augmented with explicit acoustic cues, yet it remains unclear whether such cues are used in a grounded...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] Phun-Bench: Evaluating LLMs on Phonological Understanding in Chinese

Language is a vehicle for thought, intricately tied to sounds, symbols, and meaning. However, most large language model (LLM) research focuses on meaning (seman...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] SWE-Explore: Benchmarking How Coding Agents Explore Repositories

Repository-level coding benchmarks such as SWE-bench have driven a rapid surge in the capabilities of coding agents. Yet they usually treat coding tasks as a ho...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] The Whale That Outswam Evolution: Swarm Intelligence Maximises Memory in Connectome Reservoirs

Reservoir computing exploits the fixed dynamics of a recurrent network for temporal processing, requiring only a trained linear readout. Biological neural conne...

#research #paper #ai #machine-learning
1 week ago · devops · - · -

[Paper] Clairvoyant: Predictive SJF Scheduling to Mitigate Head-of-Line Blocking in Serial LLM Backends

Serial LLM inference backends -- such as Ollama -- process requests one at a time under FCFS admission, causing Head-of-Line Blocking (HOLB) under mixed workloa...

#research #paper #devops
1 week ago · ai · - · -

[Paper] KIT's Submission to Cross-Lingual Voice Cloning in IWSLT 2026

Cross-lingual voice cloning aims to generate speech in a target language while preserving speaker identity from a source-language reference. This task is centra...

#research #paper #ai #nlp
1 week ago · ai · - · -

[Paper] When Large Language Models Fail in Healthcare: Evaluating Sensitivity to Prompt Variations

Large Language Models (LLMs) are increasingly used in healthcare for tasks such as clinical question answering, diagnosis support, and report summarization. Des...

#research #paper #ai #machine-learning #nlp
1 week ago · ai · - · -

[Paper] MMAE: A Massive Multitask Audio Editing Benchmark

We introduce MMAE, a Massive Multitask Audio Editing benchmark, serving as the first comprehensive evaluation testbed designed for general-purpose instruction-b...

#research #paper #ai #nlp
1 week ago · software · - · -

[Paper] A Causal Probabilistic Framework for Perception-Informed Closed-Loop Simulation of Autonomous Driving

Software-in-the-loop (SIL) simulation is a cornerstone for the validation of modern automotive safety functions. However, many current frameworks utilize ideal ...

#research #paper #software
1 week ago · ai · - · -

[Paper] A Data-Free Symbolic Regression Approach for Solving Equations

Many equations arising in science currently cannot be solved by available analytical techniques and are therefore solved numerically, without yielding explicit ...

#research #paper #ai
1 week ago · software · - · -

[Paper] MalSkillBench: A Runtime-Verified Benchmark of Malicious Agent Skills

AI coding agents such as Claude Code and Gemini CLI increasingly extend themselves with third-party skills: markdown packages bundling natural-language instruct...

#research #paper #software
1 week ago · ai · - · -

[Paper] MetaConfigurator: AI-Assisted RDF Authoring from JSON Data

Scientific workflows increasingly generate structured JSON data that is easy to exchange but difficult to interpret consistently across systems due to lacking s...

#research #paper #ai #machine-learning
1 week ago · software · - · -

[Paper] Porting Declarative UI to HarmonyOS: A Heuristic-guided LLM Approach

As an emerging operating system, HarmonyOS has a significant demand for software migration from platforms such as Android and iOS, where the User Interface (UI)...

#research #paper #software
1 week ago · devops · - · -

[Paper] Predictive Autoscaling in Cloud-Native and Federated Cloud-Edge Computing Environments: A Taxonomy and Future Directions

Autoscaling is a key capability in cloud-native systems, where dynamic workloads, heterogeneous environments, and latency-sensitive applications require efficie...

#research #paper #devops
1 week ago · devops · - · -

[Paper] PCCL: Process Group-Aware Scalable and Generic Collective Algorithm Synthesizer

Distributed machine learning has become increasingly important due to the massive scale of large-scale generative models. Both model parameters and data are dis...

#research #paper #devops
1 week ago · devops · - · -

[Paper] Mission-Level Runtime Assurance Framework for Autonomous Driving

This paper studies runtime safety for autonomous driving when high-level driving commands become faulty or unreliable. Unlike conventional runtime-safety approa...

#research #paper #devops
