[Paper] WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora
Graph-based Retrieval-Augmented Generation (GraphRAG) organizes external knowledge as a hierarchical graph, enabling efficient retrieval and aggregation of scat...
5785 posts from this source
Graph-based Retrieval-Augmented Generation (GraphRAG) organizes external knowledge as a hierarchical graph, enabling efficient retrieval and aggregation of scat...
Text-to-image diffusion models have revolutionized generative AI, enabling high-quality and photorealistic image synthesis. However, their practical deployment ...
Training large language models using 4-bit arithmetic enhances throughput and memory efficiency. Yet, the limited dynamic range of FP4 increases sensitivity to ...
Diffusion models have recently emerged as powerful learned priors for Bayesian inverse problems (BIPs). Diffusion-based solvers rely on a presumed likelihood fo...
The increasing availability of relational data has contributed to a growing reliance on network-based representations of complex systems. Over time, these model...
Modern Vision-Language Models (VLMs) exhibit a critical flaw in compositional reasoning, often confusing 'a red cube and a blue sphere' with 'a blue cube and a ...
The agency expected of Agentic Large Language Models goes beyond answering correctly, requiring autonomy to set goals and decide what to explore. We term this i...
Advertising image generation has increasingly focused on online metrics like Click-Through Rate (CTR), yet existing approaches adopt a ``one-size-fits-all' stra...
We establish a theoretical connection between wavelet transforms and spiking neural networks through scale-space theory. We rely on the scale-covariant guarante...
Recent genomic foundation models largely adopt large language model architectures that treat DNA as a one-dimensional token sequence. However, exhaustive sequen...
Large Reasoning Models (LRMs) often suffer from overthinking, a phenomenon in which redundant reasoning steps are generated after a correct solution has already...
Agent memory systems often adopt the standard Retrieval-Augmented Generation (RAG) pipeline, yet its underlying assumptions differ in this setting. RAG targets ...
Large multimodal reasoning models solve challenging visual problems via explicit long-chain inference: they gather visual clues from images and decode clues int...
World models have demonstrated significant promise for data synthesis in autonomous driving. However, existing methods predominantly concentrate on single-modal...
Reconstructing 3D scenes from sparse images remains a challenging task due to the difficulty of recovering accurate geometry and texture without optimization. R...
Neuromorphic hardware implementations of Spiking Neural Networks (SNNs) promise energy-efficient, low-latency AI through sparse, event-driven computation. Yet, ...
Cross-partition edges dominate the cost of distributed GNN training: fetching remote features and activations per iteration overwhelms the network as graphs dee...
Federated Unlearning (FU) aims to efficiently remove the influence of specific client data from a federated model while preserving utility for the remaining cli...
Genetic programming-based feature construction has achieved significant success in recent years as an automated machine learning technique to enhance learning p...
We propose a neuromorphic solver for the NP-hard Edge User Allocation problem using an attractor network with Winner-Takes-All (WTA) mechanism implemented with ...
Since Differential Evolution (DE) is sensitive to strategy choice, most existing variants pursue performance through adaptive mechanisms or intricate designs. W...
While recent video diffusion models (VDMs) produce visually impressive results, they fundamentally struggle to maintain 3D structural consistency, often resulti...
Shared autonomy systems require principled methods for inferring user intent and determining appropriate assistance levels. This is a central challenge in human...
Open-set object detection (OSOD) localizes objects while identifying and rejecting unknown classes at inference. While recent OSOD models perform well on benchm...
We propose a data-efficient, physics-aware generative framework in function space for inverse PDE problems. Existing plug-and-play diffusion posterior samplers ...
Diffusion Large Language Models (DLLMs) offer a compelling alternative to Auto-Regressive models, but their deployment is constrained by high decoding cost. In ...
Astronomical imaging remains noise-limited under practical observing constraints, while standard calibration pipelines mainly remove structured artifacts and le...
Prompt agents have recently emerged as a promising paradigm for automated prompt optimization, framing refinement as a sequential decision-making problem over a...
This paper proposes a novel inverse reinforcement learning framework using a diffusion-based adaptive lookahead planner (IRL-DAL) for autonomous vehicles. Train...
Despite rapid advances in autonomous AI scientists powered by language models, generating publication-ready illustrations remains a labor-intensive bottleneck i...
We introduce a guided stochastic sampling method that augments sampling from diffusion models with physics-based guidance derived from partial differential equa...
The Muon optimizer has demonstrated strong empirical performance in pre-training large language models by performing matrix-level gradient (or momentum) orthogo...
Recent works on language identification and generation have established tight statistical rates at which these tasks can be achieved. These works typically oper...
Software issue resolution in large repositories is a long-range decision process: choices made during localization shape the space of viable edits, and missteps...
Large audio-language models increasingly operate on raw speech inputs, enabling more seamless integration across domains such as voice assistants, education, an...
Repository-level code completion remains challenging for large language models (LLMs) due to cross-file dependencies and limited context windows. Prior work add...
Vision-language models suffer performance degradation under domain shift, limiting real-world applicability. Existing test-time adaptation methods are computati...
Model comparison and calibrated uncertainty quantification often require integrating over parameters, but scalable inference can be challenging for complex, mul...
Vision-language models (VLMs) demonstrate impressive performance on standard video understanding benchmarks yet fail systematically on simple reasoning tasks th...
We propose a variational framework that interprets transformer layers as iterations of an optimization algorithm acting on token embeddings. In this view, self-...
In recent years, large language models (LLMs) have made rapid progress in information retrieval, yet existing research has mainly focused on text or static mult...
Markov decision processes (MDPs) are a fundamental model in sequential decision making. Robust MDPs (RMDPs) extend this framework by allowing uncertainty in tra...
While multiagent systems have shown promise for tackling complex tasks via specialization, finetuning multiple agents simultaneously faces two key challenges: (...
Existing multimodal large language models for long-video understanding predominantly rely on uniform sampling and single-turn inference, limiting their ability ...
Language models (LMs) are trained over sequences of tokens, whereas users interact with LMs via text. This mismatch gives rise to the partial token problem, whi...
While dense pixel-wise annotations remain the gold standard for medical image segmentation, they are costly to obtain and limit scalability. In contrast, many d...
Despite recent Multimodal Large Language Models (MLLMs)' linguistic prowess in medical diagnosis, we find even state-of-the-art MLLMs suffer from a critical per...
Deep search agents powered by large language models have demonstrated strong capabilities in multi-step retrieval, reasoning, and long-horizon task execution. H...