Source

arXiv

5785 posts from this source

3 months ago · ai · - · -

[Paper] WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora

Graph-based Retrieval-Augmented Generation (GraphRAG) organizes external knowledge as a hierarchical graph, enabling efficient retrieval and aggregation of scat...

#research #paper #ai #nlp
3 months ago · ai · - · -

[Paper] SIDiffAgent: Self-Improving Diffusion Agent

Text-to-image diffusion models have revolutionized generative AI, enabling high-quality and photorealistic image synthesis. However, their practical deployment ...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Dissecting Outlier Dynamics in LLM NVFP4 Pretraining

Training large language models using 4-bit arithmetic enhances throughput and memory efficiency. Yet, the limited dynamic range of FP4 increases sensitivity to ...

#LLM #quantization #outlier analysis #training optimization
3 months ago · ai · - · -

[Paper] On Stability and Robustness of Diffusion Posterior Sampling for Bayesian Inverse Problems

Diffusion models have recently emerged as powerful learned priors for Bayesian inverse problems (BIPs). Diffusion-based solvers rely on a presumed likelihood fo...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Twinning Complex Networked Systems: Data-Driven Calibration of the mABCD Synthetic Graph Generator

The increasing availability of relational data has contributed to a growing reliance on network-based representations of complex systems. Over time, these model...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Auto-Comp: An Automated Pipeline for Scalable Compositional Probing of Contrastive Vision-Language Models

Modern Vision-Language Models (VLMs) exhibit a critical flaw in compositional reasoning, often confusing 'a red cube and a blue sphere' with 'a blue cube and a ...

#vision-language models #compositional reasoning #benchmarking #synthetic data #contrastive learning
3 months ago · ai · - · -

[Paper] Hunt Instead of Wait: Evaluating Deep Data Research on Large Language Models

The agency expected of Agentic Large Language Models goes beyond answering correctly, requiring autonomy to set goals and decide what to explore. We term this i...

#research #paper #ai #machine-learning #nlp
3 months ago · ai · - · -

[Paper] One Size, Many Fits: Aligning Diverse Group-Wise Click Preferences in Large-Scale Advertising Image Generation

Advertising image generation has increasingly focused on online metrics like Click-Through Rate (CTR), yet existing approaches adopt a "one-size-fits-all" stra...

#research #paper #ai #machine-learning #computer-vision
3 months ago · ai · - · -

[Paper] Scale-covariant spiking wavelets

We establish a theoretical connection between wavelet transforms and spiking neural networks through scale-space theory. We rely on the scale-covariant guarante...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Rethinking Genomic Modeling Through Optical Character Recognition

Recent genomic foundation models largely adopt large language model architectures that treat DNA as a one-dimensional token sequence. However, exhaustive sequen...

#research #paper #ai #machine-learning #nlp #computer-vision
3 months ago · ai · - · -

[Paper] NEAT: Neuron-Based Early Exit for Large Reasoning Models

Large Reasoning Models (LRMs) often suffer from overthinking, a phenomenon in which redundant reasoning steps are generated after a correct solution has already...

#research #paper #ai #nlp
3 months ago · ai · - · -

[Paper] Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation

Agent memory systems often adopt the standard Retrieval-Augmented Generation (RAG) pipeline, yet its underlying assumptions differ in this setting. RAG targets ...

#retrieval-augmented generation #agent memory #hierarchical retrieval #large language models #natural language processing
3 months ago · ai · - · -

[Paper] ClueTracer: Question-to-Vision Clue Tracing for Training-Free Hallucination Suppression in Multimodal Reasoning

Large multimodal reasoning models solve challenging visual problems via explicit long-chain inference: they gather visual clues from images and decode clues int...

#multimodal reasoning #hallucination suppression #attention tracing #research paper
3 months ago · ai · - · -

[Paper] UniDriveDreamer: A Single-Stage Multimodal World Model for Autonomous Driving

World models have demonstrated significant promise for data synthesis in autonomous driving. However, existing methods predominantly concentrate on single-modal...

#research #paper #ai #computer-vision
3 months ago · ai · - · -

[Paper] SurfSplat: Conquering Feedforward 2D Gaussian Splatting with Surface Continuity Priors

Reconstructing 3D scenes from sparse images remains a challenging task due to the difficulty of recovering accurate geometry and texture without optimization. R...

#research #paper #ai #machine-learning #computer-vision
3 months ago · ai · - · -

[Paper] SpikingGamma: Surrogate-Gradient Free and Temporally Precise Online Training of Spiking Neural Networks with Smoothed Delays

Neuromorphic hardware implementations of Spiking Neural Networks (SNNs) promise energy-efficient, low-latency AI through sparse, event-driven computation. Yet, ...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Grappa: Gradient-Only Communication for Scalable Graph Neural Network Training

Cross-partition edges dominate the cost of distributed GNN training: fetching remote features and activations per iteration overwhelms the network as graphs dee...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] FUPareto: Bridging the Forgetting-Utility Gap in Federated Unlearning via Pareto Augmented Optimization

Federated Unlearning (FU) aims to efficiently remove the influence of specific client data from a federated model while preserving utility for the remaining cli...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Enhancing Generalization in Evolutionary Feature Construction for Symbolic Regression through Vicinal Jensen Gap Minimization

Genetic programming-based feature construction has achieved significant success in recent years as an automated machine learning technique to enhance learning p...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Dynamic Heuristic Neuromorphic Solver for the Edge User Allocation Problem with Bayesian Confidence Propagation Neural Network

We propose a neuromorphic solver for the NP-hard Edge User Allocation problem using an attractor network with Winner-Takes-All (WTA) mechanism implemented with ...

#research #paper #ai
3 months ago · ai · - · -

[Paper] Unleashing the Potential of Differential Evolution through Individual-Level Strategy Diversity

Since Differential Evolution (DE) is sensitive to strategy choice, most existing variants pursue performance through adaptive mechanisms or intricate designs. W...

#research #paper #ai
3 months ago · ai · - · -

[Paper] VideoGPA: Distilling Geometry Priors for 3D-Consistent Video Generation

While recent video diffusion models (VDMs) produce visually impressive results, they fundamentally struggle to maintain 3D structural consistency, often resulti...

#research #paper #ai #machine-learning #computer-vision
3 months ago · ai · - · -

[Paper] End-to-end Optimization of Belief and Policy Learning in Shared Autonomy Paradigms

Shared autonomy systems require principled methods for inferring user intent and determining appropriate assistance levels. This is a central challenge in human...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] User Prompting Strategies and Prompt Enhancement Methods for Open-Set Object Detection in XR Environments

Open-set object detection (OSOD) localizes objects while identifying and rejecting unknown classes at inference. While recent OSOD models perform well on benchm...

#research #paper #ai #computer-vision
3 months ago · ai · - · -

[Paper] Decoupled Diffusion Sampling for Inverse Problems on Function Spaces

We propose a data-efficient, physics-aware generative framework in function space for inverse PDE problems. Existing plug-and-play diffusion posterior samplers ...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] FOCUS: DLLMs Know How to Tame Their Compute Bound

Diffusion Large Language Models (DLLMs) offer a compelling alternative to Auto-Regressive models, but their deployment is constrained by high decoding cost. In ...

#research #paper #ai #machine-learning #nlp
3 months ago · ai · - · -

[Paper] Denoising the Deep Sky: Physics-Based CCD Noise Formation for Astronomical Imaging

Astronomical imaging remains noise-limited under practical observing constraints, while standard calibration pipelines mainly remove structured artifacts and le...

#research #paper #ai #machine-learning #computer-vision
3 months ago · ai · - · -

[Paper] UPA: Unsupervised Prompt Agent via Tree-Based Search and Selection

Prompt agents have recently emerged as a promising paradigm for automated prompt optimization, framing refinement as a sequential decision-making problem over a...

#research #paper #ai #nlp
3 months ago · ai · - · -

[Paper] IRL-DAL: Safe and Adaptive Trajectory Planning for Autonomous Driving via Energy-Guided Diffusion Models

This paper proposes a novel inverse reinforcement learning framework using a diffusion-based adaptive lookahead planner (IRL-DAL) for autonomous vehicles. Train...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] PaperBanana: Automating Academic Illustration for AI Scientists

Despite rapid advances in autonomous AI scientists powered by language models, generating publication-ready illustrations remains a labor-intensive bottleneck i...

#research #paper #ai #nlp #computer-vision
3 months ago · ai · - · -

[Paper] Particle-Guided Diffusion Models for Partial Differential Equations

We introduce a guided stochastic sampling method that augments sampling from diffusion models with physics-based guidance derived from partial differential equa...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] TEON: Tensorized Orthonormalization Beyond Layer-Wise Muon for Large Language Model Pre-Training

The Muon optimizer has demonstrated strong empirical performance in pre-training large language models by performing matrix-level gradient (or momentum) orthogo...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Agnostic Language Identification and Generation

Recent works on language identification and generation have established tight statistical rates at which these tasks can be achieved. These works typically oper...

#research #paper #ai #machine-learning #nlp
3 months ago · software · - · -

[Paper] Outcome-Conditioned Reasoning Distillation for Resolving Software Issues

Software issue resolution in large repositories is a long-range decision process: choices made during localization shape the space of viable edits, and missteps...

#research #paper #software
3 months ago · ai · - · -

[Paper] Now You Hear Me: Audio Narrative Attacks Against Large Audio-Language Models

Large audio-language models increasingly operate on raw speech inputs, enabling more seamless integration across domains such as voice assistants, education, an...

#research #paper #ai #machine-learning #nlp
3 months ago · software · - · -

[Paper] GrepRAG: An Empirical Study and Optimization of Grep-Like Retrieval for Code Completion

Repository-level code completion remains challenging for large language models (LLMs) due to cross-file dependencies and limited context windows. Prior work add...

#research #paper #software
3 months ago · ai · - · -

[Paper] Training-Free Test-Time Adaptation with Brownian Distance Covariance in Vision-Language Models

Vision-language models suffer performance degradation under domain shift, limiting real-world applicability. Existing test-time adaptation methods are computati...

#research #paper #ai #machine-learning #computer-vision
3 months ago · ai · - · -

[Paper] Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference

Model comparison and calibrated uncertainty quantification often require integrating over parameters, but scalable inference can be challenging for complex, mul...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Structured Over Scale: Learning Spatial Reasoning from Educational Video

Vision-language models (VLMs) demonstrate impressive performance on standard video understanding benchmarks yet fail systematically on simple reasoning tasks th...

#research #paper #ai #computer-vision
3 months ago · ai · - · -

[Paper] YuriiFormer: A Suite of Nesterov-Accelerated Transformers

We propose a variational framework that interprets transformer layers as iterations of an optimization algorithm acting on token embeddings. In this view, self-...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] ShotFinder: Imagination-Driven Open-Domain Video Shot Retrieval via Web Search

In recent years, large language models (LLMs) have made rapid progress in information retrieval, yet existing research has mainly focused on text or static mult...

#research #paper #ai #machine-learning #computer-vision
3 months ago · ai · - · -

[Paper] Strongly Polynomial Time Complexity of Policy Iteration for $L_\infty$ Robust MDPs

Markov decision processes (MDPs) are a fundamental model in sequential decision making. Robust MDPs (RMDPs) extend this framework by allowing uncertainty in tra...

#research #paper #ai #machine-learning
3 months ago · ai · - · -

[Paper] Scaling Multiagent Systems with Process Rewards

While multiagent systems have shown promise for tackling complex tasks via specialization, finetuning multiple agents simultaneously faces two key challenges: (...

#research #paper #ai #machine-learning #nlp
3 months ago · ai · - · -

[Paper] Video-o3: Native Interleaved Clue Seeking for Long Video Multi-Hop Reasoning

Existing multimodal large language models for long-video understanding predominantly rely on uniform sampling and single-turn inference, limiting their ability ...

#research #paper #ai #computer-vision
3 months ago · ai · - · -

[Paper] Are you going to finish that? A Practical Study of the Tokenization Boundary Problem

Language models (LMs) are trained over sequences of tokens, whereas users interact with LMs via text. This mismatch gives rise to the partial token problem, whi...

#research #paper #ai #nlp
3 months ago · ai · - · -

[Paper] Region-Normalized DPO for Medical Image Segmentation under Noisy Judges

While dense pixel-wise annotations remain the gold standard for medical image segmentation, they are costly to obtain and limit scalability. In contrast, many d...

#research #paper #ai #computer-vision
3 months ago · ai · - · -

[Paper] Med-Scout: Curing MLLMs' Geometric Blindness in Medical Perception via Geometry-Aware RL Post-Training

Despite recent Multimodal Large Language Models (MLLMs)' linguistic prowess in medical diagnosis, we find even state-of-the-art MLLMs suffer from a critical per...

#research #paper #ai #machine-learning #computer-vision
3 months ago · ai · - · -

[Paper] Deep Search with Hierarchical Meta-Cognitive Monitoring Inspired by Cognitive Neuroscience

Deep search agents powered by large language models have demonstrated strong capabilities in multi-step retrieval, reasoning, and long-horizon task execution. H...

#research #paper #ai #nlp