[Paper] HorizonStream: Long-Horizon Attention for Streaming 3D Reconstruction
Online 3D reconstruction requires estimating camera pose and scene geometry under strict causal and bounded-memory constraints. Existing methods often suffer fr...
We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D ...
Temporal knowledge-graph data marketplaces face three coupled failures in static designs: stale hybrid index shortcuts reduce recall as edges evolve, stationary...
Cross-lingual knowledge transfer is critical for building high-performing multilingual language models for languages with insufficient training data. When targe...
Despite remarkable progress in Multimodal Large Language Models (MLLMs), these models still struggle with fine-grained understanding tasks. In this work, we pro...
Gradient-flow sampling interprets a Gibbs distribution as the minimizer of an energy functional over probability measures and generates dynamics converging to t...
Modern video generators produce visually compelling clips but still struggle with physical and motion consistency, limiting their use as reliable world simulato...
We introduce training-free looped transformers, in which a lightweight inference-time wrapper loops a contiguous mid-stack block of layers of a frozen checkpoin...
We develop a gradient flow on the space of probability measures defined on matrix-valued parameters induced by regularized Muon, an analytically smoothed versio...
Large language models (LLMs) have the potential to aid and improve human decision-making in classification tasks, not only by providing fairly accurate predicti...
Causal generative modeling is essential for developing reliable and transparent AI systems capable of counterfactual reasoning. While existing approaches focus ...
Knowledge distillation generally assumes a strong-to-weak relationship where stronger teachers yield better students. In this work, we examine this assumption a...
Bradley-Terry-Luce (BTL) model estimation is a well-established strategy to rank a collection of items given a dataset of pairwise comparisons. Although the the...
The growing complexity and scale of scientific workflows in high performance computing (HPC) environments have led to significant challenges in managing energy ...
Keyframe selection is a direct way to provide verifiable visual evidence for long-video question answering (QA). Queries differ in what they require, and findin...
We propose a distributional theory of how hypernymy -- the "is-a" relation between general and specific concepts -- is encoded geometrically in language repre...
An SDN-like centralized control architecture is increasingly popular and has been widely explored in cyber-physical systems (CPS) such as manufacturing, interne...
Learned indexes have emerged as a promising alternative to traditional index structures, offering higher throughput and lower memory usage by approximating the ...
Many-core neuromorphic systems accelerate Spiking Neural Networks (SNNs), yet their packet-based spike communication can spend substantial traffic and energy re...
Agentic systems have recently emerged as state-of-the-art approaches for automated theorem proving in formal mathematics. To assess how far these capabilities e...
Modern Mixture-of-Experts (MoE) models increasingly rely on large-scale AI accelerator clusters for efficient training. Ascend NPUs expose heterogeneous on-chip...
Natural Language Generation (NLG) evaluation has changed dramatically since 1990, and will continue to evolve in the future. In 1990, when NLG had close ties to...
Semantic type mismatch between a noun and its context is central to coercion phenomena. This paper introduces a graph-based method to examine how lexical and co...
Online services strive to maintain application responsiveness even when the traffic is unpredictable and fluctuating. Today's online services are commonly deplo...
We study a protocol-level test for weak-label benchmarks: whether benchmark outputs change when the provided evidence is intervened on. Metadata-only shortcut c...
Chart descriptions are essential for accessibility, cross-modal retrieval, and assisting readers in extracting insights from complex visualizations. As multimod...
Blockchain systems that settle financial transactions face a structural tension: the single validator that assembles each block holds unilateral power over tran...
Transaction ordering attacks extract billions of dollars annually from decentralized finance users in the form of Maximal Extractable Value (MEV). Byzantine Fau...
We introduce the Preisach Attention Layer (PAL), a novel sequence modelling architecture grounded in the classical Preisach hysteresis operator from mathematica...
Long-horizon language agents can make many plausible local tool calls yet fail to persist until a requested count is actually complete. We study this gap as Qua...
Developers often use microbenchmarks to choose the most performant implementation of a method or a class. On the Java Virtual Machine (JVM), this is commonly do...
Cloud-based Digital Twin (DT) platforms enable real-time monitoring, simulation, and collaborative decision-making across distributed clients. However, ensuring...
The Java Stream API aims at increasing developer productivity thanks to an easy-to-read declarative syntax to express computations. It also simplifies parallel ...
The Rust programming language is increasingly being considered for safety-critical system development. However, established safety standards such as ISO 26262 r...
Enterprise AI systems, built on large language models, retrieval pipelines and autonomous agents, introduce a class of risks that traditional software quality a...
Compiler phase ordering has a strong effect on program performance. Finding an effective sequence of passes is still a difficult task because the search space i...
Directed acyclic graph (DAG)-based Byzantine Fault-Tolerant (BFT) protocols achieve high throughput by decoupling dissemination from agreement and allowing many...
High-throughput inference serving is essential for applications built on large language models (LLMs). Existing serving frameworks reduce request-level and batc...
Spiking Neural Networks (SNNs) provide an energy-efficient paradigm for visual recognition. We present SpikingMoE, which integrates a spike-driven Transformer w...
AI coding assistants have become prolific in recent years. Through a longitudinal mixed-methods investigation, we examined how professional software engineers p...
AI-assisted code review tools typically operate as generic 'expert reviewer' agents, producing homogeneous findings regardless of the analysis type needed. We p...
The majority of software developers use or are planning to use Artificial Intelligence (AI) tools in their development processes. Their top reasons include impr...
Tokenisation is an integral part of the current NLP pipeline. Current tokenisation algorithms such as BPE and Unigram are greedy algorithms -- they make locally...
Video Large Language Models (Video-LLMs) have made rapid progress on temporal video understanding, yet many fail at a basic perceptual primitive: signed image-p...
We propose the Integrable Context-Dependent Demand Network (ICDN), a demand-first neural model for multiproduct retail demand. The model learns log-demand as a ...
Camera pose matters. The position and orientation of each viewpoint define a shared spatial coordinate frame that relates observations across video frames. Yet ...
Current motion-controlled image-to-video generation models rigidly follow user-provided trajectories that are often sparse, imprecise, and causally incomplete. ...
Language models must now generalize out of the box to novel environments and work inside inference-scaling search procedures, such as AlphaEvolve, that select r...