Source

arXiv

5750 posts from this source

2 months ago · ai · - · -

[Paper] Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device

Unified multimodal models can both understand and generate visual content within a single architecture. Existing models, however, remain data-hungry and too hea...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction

We propose tttLRM, a novel large 3D reconstruction model that leverages a Test-Time Training (TTT) layer to enable long-context, autoregressive 3D reconstructio...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] A Very Big Video Reasoning Suite

Rapid progress in video models has largely focused on visual quality, leaving their reasoning capabilities underexplored. Video reasoning grounds intelligence i...

#video reasoning #large-scale dataset #computer vision #benchmark #AI research
2 months ago · ai · - · -

[Paper] Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning

Current feed-forward 3D/4D reconstruction systems rely on dense geometry and pose supervision -- expensive to obtain at scale and particularly scarce for dynami...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks

LLM agents are evolving rapidly, powered by code execution, tools, and the recently introduced agent skills feature. Skills allow users to extend LLM applicatio...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] JUCAL: Jointly Calibrating Aleatoric and Epistemic Uncertainty in Classification Tasks

We study post-calibration uncertainty for trained ensembles of classifiers. Specifically, we consider both aleatoric (label noise) and epistemic (model) uncerta...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Behavior Learning (BL): Learning Hierarchical Optimization Structures from Data

Inspired by behavioral science, we propose Behavior Learning (BL), a novel general-purpose machine learning framework that learns interpretable and identifiable...

#behavior learning #hierarchical optimization #interpretable AI #utility maximization #machine learning research
2 months ago · ai · - · -

[Paper] Conformal Risk Control for Non-Monotonic Losses

Conformal risk control is an extension of conformal prediction for controlling risk functions beyond miscoverage. The original algorithm controls the expected v...

#conformal prediction #risk control #non-monotonic loss #machine learning
2 months ago · ai · - · -

[Paper] Simulation-Ready Cluttered Scene Estimation via Physics-aware Joint Shape and Pose Optimization

Estimating simulation-ready scenes from real-world observations is crucial for downstream planning and policy learning tasks. Regretfully, existing methods stru...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] Agentic AI for Scalable and Robust Optical Systems Control

We present AgentOptics, an agentic AI framework for high-fidelity, autonomous optical system control built on the Model Context Protocol (MCP). AgentOptics inte...

#agentic AI #optical hardware control #LLM agents #tool abstraction protocol #benchmark evaluation
2 months ago · ai · - · -

[Paper] Recurrent Structural Policy Gradient for Partially Observable Mean Field Games

Mean Field Games (MFGs) provide a principled framework for modeling interactions in large population models: at scale, population dynamics become deterministic,...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Do Large Language Models Understand Data Visualization Rules?

Data visualization rules -- derived from decades of research in design and perception -- ensure trustworthy chart communication. While prior work has shown that large...

#research #paper #ai #computer-vision
2 months ago · ai · - · -

[Paper] KNIGHT: Knowledge Graph-Driven Multiple-Choice Question Generation with Adaptive Hardness Calibration

With the rise of large language models (LLMs), they have become instrumental in applications such as Retrieval-Augmented Generation (RAG). Yet evaluating these ...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] Modeling Epidemiological Dynamics Under Adversarial Data and User Deception

Epidemiological models increasingly rely on self-reported behavioral data such as vaccination status, mask usage, and social distancing adherence to forecast di...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization

The paradigm of automated program generation is shifting from one-shot generation to inference-time search, where Large Language Models (LLMs) function as seman...

#LLM #evolutionary optimization #meta-learning #bandit algorithms #zeroth-order optimization
2 months ago · ai · - · -

[Paper] LAD: Learning Advantage Distribution for Reasoning

Current reinforcement learning objectives for large-model reasoning primarily focus on maximizing expected rewards. This paradigm can lead to overfitting to dom...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering

Objective: To improve the efficiency of medical question answering (MedQA) with large language models (LLMs) by avoiding unnecessary reasoning while maintaining...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] Adaptation to Intrinsic Dependence in Diffusion Language Models

Diffusion language models (DLMs) have recently emerged as a promising alternative to autoregressive (AR) approaches, enabling parallel token generation beyond a...

#diffusion language models #adaptive unmasking schedule #total correlation #theoretical convergence #AI research
2 months ago · ai · - · -

[Paper] NanoKnow: How to Know What Your Language Model Knows

How do large language models (LLMs) know what they know? Answering this question has been difficult because pre-training data is often a 'black box' -- unknown ...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] NovaPlan: Zero-Shot Long-Horizon Manipulation via Closed-Loop Video Language Planning

Solving long-horizon tasks requires robots to integrate high-level semantic reasoning with low-level physical interaction. While vision-language models (VLMs) a...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai · - · -

[Paper] ReSyn: Autonomously Scaling Synthetic Environments for Reasoning Models

Reinforcement learning with verifiable rewards (RLVR) has emerged as a promising approach for training reasoning language models (RLMs) by leveraging supervisio...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Benchmarking Unlearning for Vision Transformers

Research in machine unlearning (MU) has gained strong momentum: MU is now widely regarded as a critical capability for building safe and fair AI. In parallel, r...

#machine-unlearning #vision-transformers #benchmark #computer-vision #research
2 months ago · ai · - · -

[Paper] Reliable Abstention under Adversarial Injections: Tight Lower Bounds and New Upper Bounds

We study online learning in the adversarial injection model introduced by [Goel et al. 2017], where a stream of labeled examples is predominantly drawn i.i.d. f...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Transcending the Annotation Bottleneck: AI-Powered Discovery in Biology and Medicine

The dependence on expert annotation has long constituted the primary rate-limiting step in the application of artificial intelligence to biomedicine. While supe...

#research #paper #ai #machine-learning #computer-vision
2 months ago · devops · - · -

[Paper] Mitigating Artifacts in Pre-quantization Based Scientific Data Compressors with Quantization-aware Interpolation

Error-bounded lossy compression has been regarded as a promising way to address the ever-increasing amount of scientific data in today's high-performance comput...

#research #paper #devops
2 months ago · ai · - · -

[Paper] BabyLM Turns 4: Call for Papers for the 2026 BabyLM Workshop

BabyLM aims to dissolve the boundaries between cognitive modeling and language modeling. We call for both workshop papers and for researchers to join the 4th Ba...

#language models #data-efficient NLP #multilingual #cognitive NLP #workshop call for papers
2 months ago · ai · - · -

[Paper] How Retrieved Context Shapes Internal Representations in RAG

Retrieval-augmented generation (RAG) enhances large language models (LLMs) by conditioning generation on retrieved external documents, but the effect of retriev...

#research #paper #ai #nlp
2 months ago · ai · - · -

[Paper] StructXLIP: Enhancing Vision-language Models with Multimodal Structural Cues

Edge-based representations are fundamental cues for visual understanding, a principle rooted in early vision research and still central today. We extend this pr...

#research #paper #ai #machine-learning #computer-vision
2 months ago · ai · - · -

[Paper] Multilingual Large Language Models do not comprehend all natural languages to equal degrees

Large Language Models (LLMs) play a critical role in how humans access information. While their core use relies on comprehending written requests, our understan...

#research #paper #ai #machine-learning #nlp
2 months ago · ai · - · -

[Paper] Entropy in Large Language Models

In this study, the output of large language models (LLM) is considered an information source generating an unlimited sequence of symbols drawn from a finite alp...

#research #paper #ai #nlp
2 months ago · ai · - · -

[Paper] CodeCompass: Navigating the Navigation Paradox in Agentic Code Intelligence

Modern code intelligence agents operate in contexts exceeding 1 million tokens--far beyond the scale where humans manually locate relevant files. Yet agents con...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] Position: General Alignment Has Hit a Ceiling; Edge Alignment Must Be Taken Seriously

Large language models are being deployed in complex socio-technical systems, which exposes limits in current alignment practice. We take the position that the d...

#research #paper #ai #nlp
2 months ago · ai · - · -

[Paper] AgenticSum: An Agentic Inference-Time Framework for Faithful Clinical Text Summarization

Large language models (LLMs) offer substantial promise for automating clinical text summarization, yet maintaining factual consistency remains challenging due t...

#clinical summarization #large language models #fact-checking #agentic inference #medical NLP
2 months ago · devops · - · -

[Paper] A Context-Aware Knowledge Graph Platform for Stream Processing in Industrial IoT

Industrial IoT ecosystems bring together sensors, machines and smart devices operating collaboratively across industrial environments. These systems generate la...

#knowledge graph #stream processing #industrial IoT #Kafka #Flink
2 months ago · ai · - · -

[Paper] LLM-enabled Applications Require System-Level Threat Monitoring

LLM-enabled applications are rapidly reshaping the software ecosystem by using large language models as core reasoning components for complex task execution. Th...

#research #paper #ai #machine-learning
2 months ago · ai · - · -

[Paper] MAS-FIRE: Fault Injection and Reliability Evaluation for LLM-Based Multi-Agent Systems

As LLM-based Multi-Agent Systems (MAS) are increasingly deployed for complex tasks, ensuring their reliability has become a pressing challenge. Since MAS coordi...

#research #paper #ai #machine-learning
2 months ago · devops · - · -

[Paper] Linear Reservoir: A Diagonalization-Based Optimization

We introduce a diagonalization-based optimization for Linear Echo State Networks (ESNs) that reduces the per-step computational complexity of reservoir state up...

#research #paper #devops #computer-vision
2 months ago · ai · - · -

[Paper] Unsupervised Anomaly Detection in NSL-KDD Using $β$-VAE: A Latent Space and Reconstruction Error Approach

As Operational Technology increasingly integrates with Information Technology, the need for Intrusion Detection Systems becomes more important. This paper explo...

#research #paper #ai #machine-learning
2 months ago · devops · - · -

[Paper] A Risk-Aware UAV-Edge Service Framework for Wildfire Monitoring and Emergency Response

Wildfire monitoring demands timely data collection and processing for early detection and rapid response. UAV-assisted edge computing is a promising approach, b...

#UAV #edge computing #wildfire monitoring #routing optimization #risk-aware
2 months ago · ai · - · -

[Paper] Carbon-Aware Governance Gates: An Architecture for Sustainable GenAI Development

The rapid adoption of Generative AI (GenAI) in the software development life cycle (SDLC) increases computational demand, which can raise the carbon footprint o...

#sustainable AI #carbon-aware governance #generative AI #energy efficiency #software development lifecycle
2 months ago · software · - · -

[Paper] Git Takes Two: Split-View Awareness for Collaborative Learning of Distributed Workflows in Git

Git is widely used for collaborative software development, but it can be challenging for newcomers. While most learning tools focus on individual workflows, Git...

#research #paper #software
2 months ago · devops · - · -

[Paper] GPU-Resident Gaussian Process Regression Leveraging Asynchronous Tasks with HPX

Gaussian processes (GPs) are a widely used regression tool, but the cubic complexity of exact solvers limits their scalability. To address this challenge, we ex...

#research #paper #devops
2 months ago · software · - · -

[Paper] Towards Understanding Views on Combining Videos and Gamification in Software Engineering Training

Watching training videos passively leads to superficial learning. Adding gamification can increase engagement. We study how software engineering students and in...

#research #paper #software
2 months ago · ai · - · -

[Paper] Workflow-Level Design Principles for Trustworthy GenAI in Automotive System Engineering

The adoption of large language models in safety-critical system engineering is constrained by trustworthiness, traceability, and alignment with established veri...

#research #paper #ai #machine-learning
2 months ago · software · - · -

[Paper] FuzzySQL: Uncovering Hidden Vulnerabilities in DBMS Special Features with LLM-Driven Fuzzing

Traditional database fuzzing techniques primarily focus on syntactic correctness and general SQL structures, leaving critical yet obscure DBMS features, such as...

#research #paper #software
2 months ago · software · - · -

[Paper] 'Write in English, Nobody Understands Your Language Here': A Study of Non-English Trends in Open-Source Repositories

The open-source software (OSS) community has historically been dominated by English as the primary language for code, documentation, and developer interactions....

#research #paper #software
2 months ago · ai · - · -

[Paper] When AI Teammates Meet Code Review: Collaboration Signals Shaping the Integration of Agent-Authored Pull Requests

Autonomous coding agents increasingly contribute to software development by submitting pull requests on GitHub; yet, little is known about how these contributio...

#research #paper #ai #machine-learning
2 months ago · devops · - · -

[Paper] Why iCloud Fails: The Category Mistake of Cloud Synchronization

iCloud Drive presents a filesystem interface but implements cloud synchronization semantics that diverge from POSIX in fundamental ways. This divergence is not ...

#research #paper #devops
