[Paper] Self-Evolving Distributed Memory Architecture for Scalable AI Systems
Distributed AI systems face critical memory management challenges across computation, communication, and deployment layers. RRAM based in memory computing suffe...
3944 posts from this source
Distributed AI systems face critical memory management challenges across computation, communication, and deployment layers. RRAM based in memory computing suffe...
We propose Mesh4D, a feed-forward model for monocular 4D mesh reconstruction. Given a monocular video of a dynamic object, our model reconstructs the object's c...
Recently, Quantum Visual Fields (QVFs) have shown promising improvements in model compactness and convergence speed for learning the provided 2D or 3D signals. ...
Nighttime color constancy remains a challenging problem in computational photography due to low-light noise and complex illumination conditions. We present RL-A...
Recovering clean and accurate geometry from images is essential for robotics and augmented reality. However, existing geometry foundation models still suffer se...
We prove tight lower bounds for online multicalibration, establishing an information-theoretic separation from marginal calibration. In the general setting wher...
Functional grasping with dexterous robotic hands is a key capability for enabling tool use and complex manipulation, yet progress has been constrained by two pe...
Referring Expression Segmentation (RES) and Comprehension (REC) respectively segment and detect the object described by an expression, while Referring Expressio...
As language models become increasingly capable, users expect them to provide not only accurate responses but also behaviors aligned with diverse human preferenc...
The diversity, quantity, and quality of manipulation data are critical for training effective robot policies. However, due to hardware and physical setup constr...
Large language models suffer from 'hallucinations'-logical inconsistencies induced by semantic noise. We propose that current architectures operate in a 'Metric...
Camera-controlled generative video re-rendering methods, such as ReCamMaster, have achieved remarkable progress. However, despite their success in single-view s...
Humans can effortlessly anticipate how objects might move or change through interaction--imagining a cup being lifted, a knife slicing, or a lid being closed. W...
We used machine learning and artificial intelligence: 1) to measure levels of peace in countries from news and social media and 2) to develop on-line tools that...
Agents capable of reasoning and planning in the real world require the ability of predicting the consequences of their actions. While world models possess this ...
I propose a novel framework that integrates stochastic differential equations (SDEs) with deep generative models to improve uncertainty quantification in machin...
One-shot prediction enables rapid adaptation of pretrained foundation models to new tasks using only one labeled example, but lacks principled uncertainty quant...
We present textsc{MineNPC-Task}, a user-authored benchmark and evaluation harness for testing memory-aware, mixed-initiative LLM agents in open-world Minecraft....
Large Language Models (LLMs) have shown remarkable capabilities in tool calling and tool usage, but suffer from hallucinations where they choose incorrect tools...
Brain Magnetic Resonance Imaging (MRI) plays a central role in studying neurological development, aging, and diseases. One key application is Brain Age Predicti...
MoE3D is a mixture-of-experts module designed to sharpen depth boundaries and mitigate flying-point artifacts (highlighted in red) of existing feed-forward 3D r...
Pervasive AI increasingly depends on on-device learning systems that deliver low-latency and energy-efficient computation under strict resource constraints. Liq...
Stock market price prediction is a significant interdisciplinary research domain that depends at the intersection of finance, statistics, and economics. Forecas...
Large vision-language models (VLMs) are highly capable, yet often hallucinate by favoring textual prompts over visual evidence. We study this failure mode in a ...
In this study, we aim to better align fall risk prediction from the Johns Hopkins Fall Risk Assessment Tool (JHFRAT) with additional clinically meaningful measu...
Entity linking (mapping ambiguous mentions in text to entities in a knowledge base) is a foundational step in tasks such as knowledge graph construction, questi...
When researchers deploy large language models for autonomous tasks like reviewing literature or generating hypotheses, the computational bills add up quickly. A...
Large language models (LLMs) have revolutionized text-based code automation, but their potential in graph-oriented engineering workflows remains under-explored....
The rapid advancement of large language models (LLMs) has led to growing interest in using synthetic data to train future models. However, this creates a self-c...
Chain-of-thought (CoT) reasoning has emerged as a powerful tool for multimodal large language models on video understanding tasks. However, its necessity and ad...
Embodied question answering (EQA) in 3D environments often requires collecting context that is distributed across multiple viewpoints and partially occluded. Ho...
Existing long-term personalized dialogue systems struggle to reconcile unbounded interaction streams with finite context constraints, often succumbing to memory...
Natural Language Inference (NLI) has been an important task for evaluating language models for Natural Language Understanding, but the logical properties of the...
Large Language Models (LLMs) for complex reasoning is often hindered by high computational costs and latency, while resource-efficient Small Language Models (SL...
Document Question Answering (DocQA) focuses on answering questions grounded in given documents, yet existing DocQA agents lack effective tool utilization and la...
Visual question answering for crop disease analysis requires accurate visual understanding and reliable language generation. This work presents a lightweight vi...
LLM-driven agentic applications increasingly automate complex, multi-step tasks, but serving them efficiently remains challenging due to heterogeneous component...
Designing scientific instrumentation often requires exploring large, highly constrained design spaces using computationally expensive physics simulations. These...
Epilepsy is a chronic neurological disorder characterized by recurrent unprovoked seizures, affects over 50 million people worldwide, and poses significant risk...
Epilepsy is a chronic neurological disorder characterized by recurrent unprovoked seizures, affects over 50 million people worldwide, and poses significant risk...
Privacy-preserving federated averaging is a central approach for protecting client privacy in federated learning. In this paper, we study this problem in an asy...
A cross-configuration benchmark is proposed to explore the capacities and limitations of AVX / NEON intrinsic functions in a generic context of development proj...
Driven by Moore's Law, the dimensions of transistors have been pushed down to the nanometer scale. Advanced quantum transport (QT) solvers are required to accur...
Pull request (PR) descriptions generated by AI coding agents are the primary channel for communicating code changes to human reviewers. However, the alignment b...
This paper presents a longitudinal ethical analysis of Untappd, a social drinking application that gamifies beer consumption through badges, streaks, and social...
Permissionless consensus protocols require a scarce resource to regulate leader election and provide Sybil resistance. Existing paradigms such as Proof of Work ...
Neural-Symbolic (NeSy) Artificial Intelligence has emerged as a promising approach for combining the learning capabilities of neural networks with the interpret...
This work presents DCIM 3.0, a unified framework integrating semantic reasoning, predictive analytics, autonomous orchestration, and unified connectivity for ne...