Source

arXiv

5644 posts from this source

1 month ago · ai · - · -

[Paper] Long-form RewardBench: Evaluating Reward Models for Long-form Generation

The widespread adoption of reinforcement learning-based alignment highlights the growing importance of reward models. Various benchmarks have been built to eval...

#research #paper #ai #nlp
1 month ago · software · - · -

[Paper] Teaching Agile Requirements Engineering: A Stakeholder Simulation with Generative AI

Context: The active involvement of users and customers in agile software development remains a persistent challenge in practice. For this reason, it is importan...

#research #paper #software
1 month ago · ai · - · -

[Paper] Human-Centered Evaluation of an LLM-Based Process Modeling Copilot: A Mixed-Methods Study with Domain Experts

Integrating Large Language Models (LLMs) into business process management tools promises to democratize Business Process Model and Notation (BPMN) modeling for ...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Finite Difference Flow Optimization for RL Post-Training of Text-to-Image Models

Reinforcement learning (RL) has become a standard technique for post-training diffusion-based image synthesis models, as it enables learning from reward signals...

#research #paper #ai #machine-learning #computer-vision
1 month ago · devops · - · -

[Paper] A New Kernel Regularity Condition for Distributed Mirror Descent: Broader Coverage and Simpler Analysis

Existing convergence analyses of distributed optimization methods in non-Euclidean geometries typically rely on kernel assumptions: (i) global Lipschitz smoothness and (...

#research #paper #devops
1 month ago · devops · - · -

[Paper] Serving Hybrid LLM Loads with SLO Guarantees Using CPU-GPU Attention Piggybacking

Nowadays, service providers often deploy multiple types of LLM services within shared clusters. While the service colocation improves resource utilization, it i...

#research #paper #devops
1 month ago · ai · - · -

[Paper] SRAM-Based Compute-in-Memory Accelerator for Linear-decay Spiking Neural Networks

Spiking Neural Networks (SNNs) have emerged as a biologically inspired alternative to conventional deep networks, offering event-driven and energy-efficient com...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Design-Specification Tiling for ICL-based CAD Code Generation

Large language models (LLMs) have demonstrated remarkable capabilities in code generation, yet they underperform on domain-specific tasks such as Computer-Aided...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity

Multimodal large language model (MLLM) inference splits into two phases with opposing hardware demands: vision encoding is compute-bound, while language generat...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Federated Hierarchical Clustering with Automatic Selection of Optimal Cluster Numbers

Federated Clustering (FC) is an emerging and promising solution in exploring data distribution patterns from distributed and privacy-protected data in an unsupe...

#research #paper #ai #machine-learning
1 month ago · software · - · -

[Paper] ChainFuzzer: Greybox Fuzzing for Workflow-Level Multi-Tool Vulnerabilities in LLM Agents

Tool-augmented LLM agents increasingly rely on multi-step, multi-tool workflows to complete real tasks. This design expands the attack surface, because data pro...

#research #paper #software
1 month ago · ai · - · -

[Paper] Feynman: Knowledge-Infused Diagramming Agent for Scalable Visual Designs

Visual design is an essential application of state-of-the-art multi-modal AI systems. Improving these systems requires high-quality vision-language data at scal...

#research #paper #ai #machine-learning
1 month ago · software · - · -

[Paper] Linguistic Similarity Within Centralized FLOSS Development

When free/libre and open source software (FLOSS) stewards centralize project development, they potentially undermine project sustainability and impact how contr...

#research #paper #software
1 month ago · devops · - · -

[Paper] Streaming REST APIs for Large Financial Transaction Exports from Relational Databases

Financial platforms and enterprise systems frequently provide transaction export capabilities to support reporting, reconciliation, auditing, and regulatory com...

#research #paper #devops
1 month ago · ai · - · -

[Paper] TaxBreak: Unmasking the Hidden Costs of LLM Inference Through Overhead Decomposition

Large Language Model (LLM) inference is widely used in interactive assistants and agentic systems. In latency-sensitive deployments, inference time can become d...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] KernelFoundry: Hardware-aware evolutionary GPU kernel optimization

Optimizing GPU kernels presents a significantly greater challenge for large language models (LLMs) than standard code generation tasks, as it requires understan...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Alternating Gradient Flow Utility: A Unified Metric for Structural Pruning and Dynamic Routing in Deep Networks

Efficient deep learning traditionally relies on static heuristics like weight magnitude or activation awareness (e.g., Wanda, RIA). While successful in unstruct...

#research #paper #ai #machine-learning #computer-vision
1 month ago · ai · - · -

[Paper] Pruning-induced phases in fully-connected neural networks: the eumentia, the dementia, and the amentia

Modern neural networks are heavily overparameterized, and pruning, which removes redundant neurons or connections, has emerged as a key approach to compressing ...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation

Autoregressive (AR) video generative models rely on video tokenizers that compress pixels into discrete token sequences. The length of these token sequences is ...

#research #paper #ai #computer-vision
1 month ago · ai · - · -

[Paper] MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning

Multimodal Large Language Models (MLLMs) are increasingly used to carry out visual workflows such as navigating GUIs, where the next step depends on verified vi...

#research #paper #ai #computer-vision
1 month ago · ai · - · -

[Paper] OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams

Modern visual agents require representations that are general, causal, and physically structured to operate in real-time streaming environments. However, curren...

#research #paper #ai #computer-vision
1 month ago · ai · - · -

[Paper] GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing

Unified multimodal models target joint understanding, reasoning, and generation, but current image editing benchmarks are largely confined to natural images and...

#research #paper #ai #computer-vision
1 month ago · ai · - · -

[Paper] Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously

Online Video Large Language Models (VideoLLMs) play a critical role in supporting responsive, real-time interaction. Existing methods focus on streaming percept...

#research #paper #ai #computer-vision
1 month ago · ai · - · -

[Paper] The Latent Color Subspace: Emergent Order in High-Dimensional Chaos

Text-to-image generation models have advanced rapidly, yet achieving fine-grained control over generated images remains difficult, largely due to limited unders...

#research #paper #ai #machine-learning #computer-vision
1 month ago · ai · - · -

[Paper] DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning

While large-scale diffusion models have revolutionized video synthesis, achieving precise control over both multi-subject identity and multi-granularity motion ...

#research #paper #ai #computer-vision
1 month ago · ai · - · -

[Paper] Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Humans perceive and understand real-world spaces through a stream of visual observations. Therefore, the ability to streamingly maintain and update spatial evid...

#research #paper #ai #machine-learning #computer-vision
1 month ago · ai · - · -

[Paper] Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing

Multi-modal large language models (MLLMs) have advanced general-purpose video understanding but struggle with long, high-resolution videos -- they process every...

#research #paper #ai #computer-vision
1 month ago · ai · - · -

[Paper] EndoCoT: Scaling Endogenous Chain-of-Thought Reasoning in Diffusion Models

Recently, Multimodal Large Language Models (MLLMs) have been widely integrated into diffusion frameworks primarily as text encoders to tackle complex tasks such...

#research #paper #ai #nlp #computer-vision
1 month ago · ai · - · -

[Paper] DVD: Deterministic Video Depth Estimation with Generative Priors

Existing video depth estimation faces a fundamental trade-off: generative models suffer from stochastic geometric hallucinations and scale drift, while discrimi...

#research #paper #ai #computer-vision
1 month ago · ai · - · -

[Paper] SciMDR: Benchmarking and Advancing Scientific Multimodal Document Reasoning

Constructing scientific multimodal document reasoning datasets for foundation model training involves an inherent trade-off among scale, faithfulness, and reali...

#research #paper #ai #machine-learning #nlp #computer-vision
1 month ago · ai · - · -

[Paper] Matching Features, Not Tokens: Energy-Based Fine-Tuning of Language Models

Cross-entropy (CE) training provides dense and scalable supervision for language models, but it optimizes next-token prediction under teacher forcing rather tha...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training

Reasoning LLMs-as-Judges, which can benefit from inference-time scaling, provide a promising path for extending the success of reasoning models to non-verifiabl...

#research #paper #ai #machine-learning #nlp
1 month ago · ai · - · -

[Paper] Separable neural architectures as a primitive for unified predictive and generative intelligence

Intelligent systems across physics, language and perception often exhibit factorisable structure, yet are typically modelled by monolithic neural architectures ...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] BiGain: Unified Token Compression for Joint Generation and Classification

Acceleration methods for diffusion models (e.g., token merging or downsampling) typically optimize synthesis quality under reduced compute, yet often ignore dis...

#research #paper #ai #machine-learning #computer-vision
1 month ago · ai · - · -

[Paper] STAMP: Selective Task-Aware Mechanism for Text Privacy

We present STAMP (Selective Task-Aware Mechanism for Text Privacy), a new framework for task-aware text privatization that achieves an improved privacy-utility ...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Incremental Neural Network Verification via Learned Conflicts

Neural network verification is often used as a core component within larger analysis procedures, which generate sequences of closely related verification querie...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Temporal Straightening for Latent Planning

Learning good representations is essential for latent planning with world models. While pretrained visual encoders produce strong semantic visual features, they...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Security Considerations for Artificial Intelligence Agents

This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations c...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

Pretraining produces a learned parameter vector that is typically treated as a starting point for further iterative adaptation. In this work, we instead view th...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration

Despite interdisciplinary research leading to larger and longer-term impact, most work remains confined to single-domain academic silos. Recent AI-based approac...

#research #paper #ai #machine-learning #nlp
1 month ago · ai · - · -

[Paper] Portfolio of Solving Strategies in CEGAR-based Object Packing and Scheduling for Sequential 3D Printing

Computing power that used to be available only in supercomputers decades ago, especially their parallelism, is currently available in standard personal computer C...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] RDNet: Region Proportion-Aware Dynamic Adaptive Salient Object Detection Network in Optical Remote Sensing Images

Salient object detection (SOD) in remote sensing images faces significant challenges due to large variations in object sizes, the computational cost of self-att...

#research #paper #ai #machine-learning #computer-vision
1 month ago · ai · - · -

[Paper] WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows

This work pursues automated planning and scheduling of distributed data pipelines, or workflows. We develop a general workflow and resource graph representation...

#research #paper #ai #machine-learning
1 month ago · ai · - · -

[Paper] CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks

State space models (SSMs) like Mamba have gained significant traction as efficient alternatives to Transformers, achieving linear complexity while maintaining c...

#research #paper #ai #nlp
1 month ago · ai · - · -

[Paper] IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse

Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and ...

#research #paper #ai #machine-learning #nlp
1 month ago · ai · - · -

[Paper] Long-Context Encoder Models for Polish Language Understanding

While decoder-only Large Language Models (LLMs) have recently dominated the NLP landscape, encoder-only architectures remain a cost-effective and parameter-effi...

#research #paper #ai #nlp
1 month ago · ai · - · -

[Paper] Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Multimodal agents offer a promising path to automating complex document-intensive workflows. Yet, a critical question remains: do these agents demonstrate genui...

#research #paper #ai #machine-learning #nlp
1 month ago · ai · - · -

[Paper] QAQ: Bidirectional Semantic Coherence for Selecting High-Quality Synthetic Code Instructions

Synthetic data has become essential for training code generation models, yet it introduces significant noise and hallucinations that are difficult to detect wit...

#research #paper #ai #nlp
