Source

arXiv

1312 posts from this source

6 days ago · ai · - · -

[Paper] Hybrid Robustness Verification for Spatio-Temporal Neural Networks

With AI increasingly deployed in safety-critical systems, providing formal robustness guarantees for the underlying models is essential. Existing verification m...

#research #paper #ai #machine-learning #computer-vision
6 days ago · ai · - · -

[Paper] Learning Dynamics Reveal a Hierarchy of Weight-Induced Layerwise Gram Metrics

We study feed-forward ReLU networks with fixed readout and quadratic loss. The aim is to rewrite gradient descent not primarily as a dynamics in weight space, b...

#research #paper #ai #machine-learning
6 days ago · ai · - · -

[Paper] HDSL: A Hierarchical Domain-Specific Language for Structured 3D Indoor Scene Generation and Localized Editing with LLM Agents

Text-driven indoor scene generation and editing require an intermediate representation that language models can both produce and revise. Existing LLM-based syst...

#research #paper #ai #computer-vision
6 days ago · ai · - · -

[Paper] The Neutral Mask: How RLHF Provides Shallow Alignment while Leaving Partisan Structure Intact in a Large Language Model

The ambition behind alignment training is to make large language models safe and useful. The primary mechanism, reinforcement learning from human feedback (RLHF...

#research #paper #ai #nlp
6 days ago · ai · - · -

[Paper] Adaptive directional gradients for parameterised quantum circuits

Training parameterised quantum circuits (PQCs) on quantum hardware is bottlenecked by the measurement cost of gradient estimation, which under the parameter-shi...

#research #paper #ai #machine-learning
6 days ago · ai · - · -

[Paper] Tight Sample Complexity of Transformers

We tightly characterize the VC dimension of depth-L Transformers with a total of W parameters, mapping an input sequence of length T to a single output, establi...

#research #paper #ai #machine-learning
6 days ago · ai · - · -

[Paper] SearchSwarm: Towards Delegation Intelligence in Agentic LLMs for Long-Horizon Deep Research

Large language models are increasingly expected to handle complex, long-horizon real-world tasks whose context demands can grow without bound, yet model context...

#research #paper #ai #machine-learning
6 days ago · ai · - · -

[Paper] Disentanglement with Holographic Reduced Representations

Disentanglement, the separation of factors of variation in data using neural networks, remains a long-standing challenge in machine learning. Prior work has add...

#research #paper #ai #machine-learning
6 days ago · ai · - · -

[Paper] Beyond Probabilistic Similarity: Structural, Temporal, and Causal Limitations of Retrieval-Augmented Generation in the Legal Domain

Retrieval-Augmented Generation (RAG) has become a standard architectural response to unreliability in legal AI, yet high-profile failures, including fabricated ...

#research #paper #ai #machine-learning
6 days ago · ai · - · -

[Paper] Evaluating the Representation Space of Diffusion Models via Self-Supervised Principles

Diffusion models have demonstrated remarkable generative capabilities and have also emerged as powerful self-supervised representation learners, yet the connect...

#research #paper #ai #machine-learning #computer-vision
6 days ago · ai · - · -

[Paper] Proxy Reward Internalization and Mechanistic Exploitation: A Learned Precursor to Reward Hacking and Its Generalization

Reward hacking is usually studied after it becomes visible, once a model earns high proxy reward while failing the intended task. We instead study what proxy RL...

#research #paper #ai #machine-learning
6 days ago · ai · - · -

[Paper] IS-CoT: Breaking the Long-form Generation Collapse via Interleaved Structural Thinking

Generating coherent and controllable long-form content remains a persistent challenge for Large Language Models (LLMs). While reasoning-enhanced models have dem...

#research #paper #ai #nlp
6 days ago · ai · - · -

[Paper] BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling

As deep learning models scale, managing, inspecting, and modifying large checkpoints has become increasingly challenging. Researchers often need to alter model ...

#research #paper #ai #machine-learning #nlp
6 days ago · ai · - · -

[Paper] Learning to Attack and Defend: Adaptive Red Teaming of Language Models via GRPO

AI red teaming must continually adapt to evolving attackers and defenders. Reinforcement learning offers a promising approach to discovering novel attacks, and ...

#research #paper #ai #machine-learning #nlp
6 days ago · ai · - · -

[Paper] Cranio-Diff: Diffusion-based Cross-domain Craniofacial Reconstruction with 2D X-ray Skull Guidance and Structural Identity Constraints

The state-of-the-art generative models, such as CycleGAN, Pix2Pix, and diffusion models have demonstrated remarkable performance in the face generation task. Ho...

#research #paper #ai #computer-vision
6 days ago · ai · - · -

[Paper] PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

Large language models (LLMs) routinely face requests that should be refused, creating a trade-off between helpfulness and harm prevention. However, refusals the...

#research #paper #ai #nlp
6 days ago · ai · - · -

[Paper] AutoMegaKernel: A Statically-Checked Agent Harness for Self-Retargeting Megakernel Synthesis

AutoMegaKernel (AMK) compiles a HuggingFace Llama-family model into a single persistent cooperative CUDA kernel that runs the whole forward pass in one launch, ...

#research #paper #ai #machine-learning
6 days ago · ai · - · -

[Paper] GenEyePose: Patient-Free, Knowledge-Based Saccadic Eye Movement Modeling for Digital Neurophysiologic Biomarker Development

Eye movements, including saccades, are widely regarded as highly sensitive and objective biomarkers of neurophysiologic states. Detecting saccadic signatures in...

#research #paper #ai #computer-vision
6 days ago · ai · - · -

[Paper] SoccerNet 2026 Player-Centric Ball-Action Spotting: Retraining and Post-Processing Extensions to the FOOTPASS Baselines

We describe our system for the SoccerNet 2026 Player-Centric Ball-Action Spotting Challenge, which requires predicting who performs which action and when, acros...

#research #paper #ai #computer-vision
6 days ago · ai · - · -

[Paper] Correlation Is Not Enough: Embedding Human Metadata for Individual Causal Discovery

Ask a pretrained biomedical language model whether 'cortisol 28 ug/dL' and 'stock-market volatility' are related, and it returns a cosine similarity of 0.83 on ...

#research #paper #ai #machine-learning #nlp
6 days ago · ai · - · -

[Paper] Visual Prompting Meets Feature Reconstruction-Based Anomaly Detection with Dual-Teacher Supervision

Recent Anomaly Detection methods achieve perfect detection and segmentation scores on well-established datasets, such as MVTec. However, many of these methods f...

#research #paper #ai #machine-learning #computer-vision
6 days ago · ai · - · -

[Paper] SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks

Spatial reasoning is a foundational capability for multimodal large language models (MLLMs) to perceive and operate within the physical world. However, existing...

#research #paper #ai #machine-learning #nlp
6 days ago · ai · - · -

[Paper] Cross-Modal Masking for Robust Silent Speech Synthesis Using sEMG and Lipreading

Speech restoration through silent speech interfaces (SSIs) has emerged as a promising assistive technology for individuals with impaired or absent laryngeal voi...

#research #paper #ai #nlp
6 days ago · ai · - · -

[Paper] When Built-in Thinking Helps and Hurts: Constraint-Level Error Shifts in Instruction Following

Large reasoning models (LRMs) often improve math and coding performance, but their effect on instruction following is unclear. We study IFEval with Qwen3 models...

#research #paper #ai #nlp
6 days ago · ai · - · -

[Paper] End-to-End Context Compression at Scale

Long-context language model inference is bottlenecked by memory, as the KV cache grows with context length. Recent techniques to compress the KV cache fall shor...

#research #paper #ai #machine-learning #nlp
6 days ago · ai · - · -

[Paper] Beyond Accuracy: Community Perspectives on Machine Translation

Despite remarkable progress in machine translation (MT), non-AI communities have raised growing concerns about MT systems, suggesting a noticeable gap between t...

#research #paper #ai #nlp
6 days ago · ai · - · -

[Paper] Do Video Foundation Models Understand Intuitive Physics? A Layerwise Probing Analysis

We study whether pretrained video foundation models encode intuitive-physics information in their frozen representations, and how this information varies across...

#research #paper #ai #machine-learning #computer-vision
6 days ago · software · - · -

[Paper] Modeling Components and Connections in Cyber-Physical Systems

Text-based configuration files for cyber-physical systems show the hierarchy of component modules well but often hide the details of connections and interfaces ...

#research #paper #software
6 days ago · ai · - · -

[Paper] Where Does the Answer Come From? Benchmarking View-Level Visual Evidence Identification in Multi-View MLLMs for Autonomous Driving

Multimodal large language models (MLLMs) achieve strong results on visual reasoning benchmarks, but answer accuracy alone does not indicate whether a model reli...

#research #paper #ai #nlp #computer-vision
6 days ago · ai · - · -

[Paper] FMplex: Model Virtualization for Serving Extensible Foundation Models

Foundation models (FMs) are increasingly used as backbones for downstream tasks across language, vision, time-series, and multimodal applications. Yet existing ...

#research #paper #ai #machine-learning
6 days ago · software · - · -

[Paper] Agentic Persona Generation with Critique-Refinement: An Industrial Evaluation

Personas are widely used in software engineering to support requirements elicitation, design, and validation, but their manual creation is costly, time-consumin...

#research #paper #software
6 days ago · ai · - · -

[Paper] Gradient-Guided Reward Optimization for Inference-time Alignment

Ensuring the reliability of Large Language Models (LLMs) under distribution drift requires inference-time adaptation. While inference-time alignment methods suc...

#research #paper #ai #nlp
6 days ago · ai · - · -

[Paper] Civil Court Simulation with Large Language Models

Court simulation bridges legal education and judicial practice, yet human-based simulations are costly and difficult to scale. Large language models (LLMs) offe...

#research #paper #ai #nlp
6 days ago · devops · - · -

[Paper] Parent-Hash DAG: A Cost Analysis of Constant-Time Append for On-Chain Registries

Provenance trees are append-only directed acyclic graphs of artifact registrations anchored on a public blockchain, recently introduced as the data substrate of...

#research #paper #devops
6 days ago · ai · - · -

[Paper] Code Is More Than Text: Uncertainty Estimation for Code Generation

Large language models (LLMs) are increasingly deployed as code generators, where silently wrong programs pose real safety and reliability risks. Reliable uncert...

#research #paper #ai #machine-learning #nlp
6 days ago · ai · - · -

[Paper] Hybrid Metaheuristic Combining the Dragonfly Algorithm and Tabu Search for the Traveling Salesman Problem

The Traveling Salesman Problem (TSP) is a classical NP-hard combinatorial optimization problem that aims to find the shortest Hamiltonian cycle visiting each ci...

#research #paper #ai
6 days ago · software · - · -

[Paper] Relocate and Emulate: Re-Hosting Android's Application Layer

Dynamic analysis of Android's application layer typically relies on physical devices, limiting scalability and reproducibility. To compensate, we introduce a sy...

#research #paper #software
6 days ago · ai · - · -

[Paper] Local Search on Vertex Coloring for Bipartite Graphs

Local search is a well-known heuristic method used in optimization. In this thesis, we explore its capabilities on the vertex coloring problem, an NP-hard probl...

#research #paper #ai
6 days ago · ai · - · -

[Paper] Harness Engineering for Physical AI: Robot Middleware Is the Harness Layer

Robot middleware faces a new role in the era of Physical AI. Learned policies, planners, and vision-language-action (VLA) models now enter deployed robots as ca...

#research #paper #ai #machine-learning
6 days ago · software · - · -

[Paper] Empirical Study for Structured Output Control in LLMs for Software Engineering

LLM-generated outputs in software engineering rarely exist in isolation. They must plug into toolchains, APIs, and data pipelines that impose strict, often orga...

#research #paper #software
6 days ago · devops · - · -

[Paper] Coupling Complementary Simulations for Combined Performance and Energy Optimization

Polymer simulations are among the most computationally demanding workloads in soft-matter research, often requiring days of execution and high energy consumptio...

#research #paper #devops
6 days ago · devops · - · -

[Paper] Engineering Scalable Distributed List Ranking

The list ranking problem is one of the classical problems of parallel computing, with nontrivial algorithms and many applications as a subroutine for solving ot...

#research #paper #devops
