Source

arXiv

5644 posts from this source

2 months ago · ai · - · -

[Paper] Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training

Pass@k is a widely used performance metric for verifiable large language model tasks, including mathematical reasoning, code generation, and short-answer reason...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Human Video Generation from a Single Image with 3D Pose and View Control

Recent diffusion methods have made significant progress in generating videos from single images due to their powerful visual generation capabilities. However, c...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] Spa3R: Predictive Spatial Field Modeling for 3D Visual Reasoning

While Vision-Language Models (VLMs) exhibit exceptional 2D visual understanding, their ability to comprehend and reason about 3D space--a cornerstone of spatial...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] The Diffusion Duality, Chapter II: $Ψ$-Samplers and Efficient Curriculum

Uniform-state discrete diffusion models excel at few-step generation and guidance due to their ability to self-correct, making them preferred over autoregressiv...

#research #paper #ai #machine-learning
2 months ago · devops · - · -

[Paper] Circumventing the CAP Theorem with Open Atomic Ethernet

The CAP theorem is routinely treated as a systems law: under network partition, a replicated service must sacrifice either consistency or availability. The theo...

#CAP theorem #Ethernet protocol #distributed systems #network topology #data‑center networking
2 months ago · ai · - · -

[Paper] Mask-HybridGNet: Graph-based segmentation with emergent anatomical correspondence from pixel-level supervision

Graph-based medical image segmentation represents anatomical structures using boundary graphs, providing fixed-topology landmarks and inherent population-level ...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] XMorph: Explainable Brain Tumor Analysis Via LLM-Assisted Hybrid Deep Intelligence

Deep learning has significantly advanced automated brain tumor diagnosis, yet clinical adoption remains limited by interpretability and computational constraint...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai · - · -

[Paper] Seeing Through Words: Controlling Visual Retrieval Quality with Language Models

Text-to-image retrieval is a fundamental task in vision-language learning, yet in real-world scenarios it is often challenged by short and underspecified user q...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] Efficient Hierarchical Any-Angle Path Planning on Multi-Resolution 3D Grids

Hierarchical, multi-resolution volumetric mapping approaches are widely used to represent large and complex environments as they can efficiently capture their o...

#path-planning #any-angle #multi-resolution #robotics #ROS
2 months ago · ai · - · -

[Paper] NoRD: A Data-Efficient Vision-Language-Action Model that Drives without Reasoning

Vision-Language-Action (VLA) models are advancing autonomous driving by replacing modular pipelines with unified end-to-end architectures. However, current VLAs...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai · - · -

[Paper] Sequential Counterfactual Inference for Temporal Clinical Data: Addressing the Time Traveler Dilemma

Counterfactual inference enables clinicians to ask 'what if' questions about patient outcomes, but standard methods assume feature independence and simultaneous...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] PVminer: A Domain-Specific Tool to Detect the Patient Voice in Patient Generated Data

Patient-generated text such as secure messages, surveys, and interviews contains rich expressions of the patient voice (PV), reflecting communicative behaviors ...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] Not Just How Much, But Where: Decomposing Epistemic Uncertainty into Per-Class Contributions

In safety-critical classification, the cost of failure is often asymmetric, yet Bayesian deep learning summarises epistemic uncertainty with a single scalar, mu...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] SELAUR: Self Evolving LLM Agent via Uncertainty-aware Rewards

Large language models (LLMs) are increasingly deployed as multi-step decision-making agents, where effective reward design is essential for guiding learning. Al...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] CG-DMER: Hybrid Contrastive-Generative Framework for Disentangled Multimodal ECG Representation Learning

Accurate interpretation of electrocardiogram (ECG) signals is crucial for diagnosing cardiovascular diseases. Recent multimodal approaches that integrate ECGs w...

#multimodal learning #contrastive generative models #ECG representation #medical AI #self‑supervised pretraining
2 months ago · ai · - · -

[Paper] Scaling State-Space Models on Multiple GPUs with Tensor Parallelism

Selective state space models (SSMs) have rapidly become a compelling backbone for large language models, especially for long-context workloads. Yet in deploymen...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] A Benchmark for Deep Information Synthesis

Large language model (LLM)-based agents are increasingly used to solve complex tasks involving tool use, such as web browsing, code execution, and data analysis...

#LLM benchmark #information synthesis #hallucination analysis #multi‑step reasoning #evaluation dataset
2 months ago · devops · - · -

[Paper] ReviveMoE: Fast Recovery for Hardware Failures in Large-Scale MoE LLM Inference Deployments

As LLM deployments scale over more hardware, the probability of a single failure in a system increases significantly, and cloud operators must consider robust c...

#research #paper #devops
2 months ago · ai · - · -

[Paper] Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning

Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency and substantial test-time inference costs. Th...

#research #paper #ai #nlp
2 months ago · ai · - · -

[Paper] Beyond the Star Rating: A Scalable Framework for Aspect-Based Sentiment Analysis Using LLMs and Text Classification

Customer-provided reviews have become an important source of information for business owners and other customers alike. However, effectively analyzing millions ...

#aspect-based sentiment analysis #large language models #text classification #scalable NLP #sentiment mining
2 months ago · software · - · -

[Paper] Validation of an analyzability model for quantum software: a family of experiments

The analyzability of hybrid software, which integrates both classical and quantum components, is a key factor in ensuring its maintainability and industrial ado...

#research #paper #software
2 months ago · ai · - · -

[Paper] An Expert Schema for Evaluating Large Language Model Errors in Scholarly Question-Answering Systems

Large Language Models (LLMs) are transforming scholarly tasks like search and summarization, but their reliability remains uncertain. Current evaluation metrics...

#large-language-models #error-taxonomy #scholarly-qa #evaluation-metrics
2 months ago · software · - · -

[Paper] Automated Detection and Mitigation of Dependability Failures in Healthcare Scenarios through Digital Twins

Medical Cyber-Physical Systems (CPSs) integrating Patients, Devices, and healthcare personnel (Physicians) form safety-critical PDP triads whose dependability i...

#digital twin #cyber‑physical systems #formal verification #machine learning #healthcare safety
2 months ago · ai · - · -

[Paper] MIP Candy: A Modular PyTorch Framework for Medical Image Processing

Medical image processing demands specialized software that handles high-dimensional volumetric data, heterogeneous file formats, and domain-specific training pr...

#research #paper #ai #machine-learning #computer-vision
2 months ago · software · - · -

[Paper] A Modular Multi-Document Framework for Scientific Visualization and Simulation in Java

This paper presents the design and implementation of a modular multi-document interface (MDI) framework for scientific visualization and simulation in the Java ...

#Java #scientific visualization #simulation framework #modular architecture #Maven
2 months ago · devops · - · -

[Paper] Is a LOCAL algorithm computable?

Common definitions of the 'standard' LOCAL model tend to be sloppy and even self-contradictory on one point: do the nodes update their state using an arbitrary ...

#research #paper #devops
2 months ago · ai · - · -

[Paper] Toward an Agentic Infused Software Ecosystem

Fully leveraging the capabilities of AI agents in software development requires a rethinking of the software ecosystem itself. To this end, this paper outlines ...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Airavat: An Agentic Framework for Internet Measurement

Internet measurement faces twin challenges: complex analyses require expert-level orchestration of tools, yet even syntactically correct implementations can hav...

#research #paper #ai #machine-learning
2 months ago · software · - · -

[Paper] A Morton-Type Space-Filling Curve for Pyramid Subdivision and Hybrid Adaptive Mesh Refinement

The forest-of-refinement-trees approach allows for dynamic adaptive mesh refinement (AMR) at negligible cost. While originally developed for quadrilateral and h...

#adaptive mesh refinement #space-filling curve #p4est library #high-performance computing
2 months ago · ai · - · -

[Paper] Body-Reservoir Governance in Repeated Games: Embodied Decision-Making, Dynamic Sentinel Adaptation, and Complexity-Regularized Optimization

Standard game theory explains cooperation in repeated games through conditional strategies such as Tit-for-Tat (TfT), but these require continuous computation t...

#embodied AI #echo state networks #repeated games #adaptive governance #computational efficiency
2 months ago · software · - · -

[Paper] Unseen-Codebases-Domain Data Synthesis and Training Based on Code Graphs

In the context of newly released software frameworks, large language models (LLMs) often exhibit poor performance and a high rate of hallucination, as they are n...

#research #paper #software
2 months ago · software · - · -

[Paper] PackMonitor: Enabling Zero Package Hallucinations Through Decoding-Time Monitoring

As Large Language Models (LLMs) are increasingly integrated into software development workflows, their trustworthiness has become a critical concern. However, i...

#research #paper #software
2 months ago · ai · - · -

[Paper] Agile V: A Compliance-Ready Framework for AI-Augmented Engineering -- From Concept to Audit-Ready Delivery

Current AI-assisted engineering workflows lack a built-in mechanism to maintain task-level verification and regulatory traceability at machine-speed delivery. A...

#AI agents #software engineering #Agile #compliance automation #devops
2 months ago · devops · - · -

[Paper] Lagom: Unleashing the Power of Communication and Computation Overlapping for Distributed LLM Training

Overlapping communication with computation is crucial for distributed large-model training, yet optimizing it - especially when computation becomes the bottlene...

#research #paper #devops
2 months ago · software · - · -

[Paper] An LLM-driven Scenario Generation Pipeline Using an Extended Scenic DSL for Autonomous Driving Safety Validation

Real-world crash reports, which combine textual summaries and sketches, are valuable for scenario-based testing of autonomous driving systems (ADS). However, cu...

#research #paper #software
2 months ago · devops · - · -

[Paper] A Granularity Characterization of Task Scheduling Effectiveness

Task-based runtime systems provide flexible load balancing and portability for parallel scientific applications, but their strong scaling is highly sensitive to...

#research #paper #devops
2 months ago · ai · - · -

[Paper] Heterogeneity-Aware Client Selection Methodology For Efficient Federated Learning

Federated Learning (FL) enables a distributed client-server architecture where multiple clients collaboratively train a global Machine Learning (ML) model witho...

#research #paper #ai #machine-learning
2 months ago · devops · - · -

[Paper] Circumventing the FLP Impossibility Result with Open Atomic Ethernet

The Fischer--Lynch--Paterson (FLP) impossibility result is widely regarded as one of the most fundamental negative results in distributed computing: no determin...

#research #paper #devops
2 months ago · ai · - · -

[Paper] Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device

Unified multimodal models can both understand and generate visual content within a single architecture. Existing models, however, remain data-hungry and too hea...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction

We propose tttLRM, a novel large 3D reconstruction model that leverages a Test-Time Training (TTT) layer to enable long-context, autoregressive 3D reconstructio...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] A Very Big Video Reasoning Suite

Rapid progress in video models has largely focused on visual quality, leaving their reasoning capabilities underexplored. Video reasoning grounds intelligence i...

#video reasoning #large-scale dataset #computer vision #benchmark #AI research
2 months ago · ai · - · -

[Paper] Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning

Current feed-forward 3D/4D reconstruction systems rely on dense geometry and pose supervision -- expensive to obtain at scale and particularly scarce for dynami...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks

LLM agents are evolving rapidly, powered by code execution, tools, and the recently introduced agent skills feature. Skills allow users to extend LLM applicatio...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] JUCAL: Jointly Calibrating Aleatoric and Epistemic Uncertainty in Classification Tasks

We study post-calibration uncertainty for trained ensembles of classifiers. Specifically, we consider both aleatoric (label noise) and epistemic (model) uncerta...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Behavior Learning (BL): Learning Hierarchical Optimization Structures from Data

Inspired by behavioral science, we propose Behavior Learning (BL), a novel general-purpose machine learning framework that learns interpretable and identifiable...

#behavior learning #hierarchical optimization #interpretable AI #utility maximization #machine learning research
2 months ago · ai · - · -

[Paper] Conformal Risk Control for Non-Monotonic Losses

Conformal risk control is an extension of conformal prediction for controlling risk functions beyond miscoverage. The original algorithm controls the expected v...

#conformal prediction #risk control #non-monotonic loss #machine learning
2 months ago · ai · - · -

[Paper] Simulation-Ready Cluttered Scene Estimation via Physics-aware Joint Shape and Pose Optimization

Estimating simulation-ready scenes from real-world observations is crucial for downstream planning and policy learning tasks. Regretfully, existing methods stru...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] Agentic AI for Scalable and Robust Optical Systems Control

We present AgentOptics, an agentic AI framework for high-fidelity, autonomous optical system control built on the Model Context Protocol (MCP). AgentOptics inte...

#agentic AI #optical hardware control #LLM agents #tool abstraction protocol #benchmark evaluation
