[Paper] Momentum SVGD-EM for Accelerated Maximum Marginal Likelihood Estimation
Maximum marginal likelihood estimation (MMLE) can be formulated as the optimization of a free energy functional. From this viewpoint, the Expectation-Maximisati...
We tackle the challenging task of generating complete 3D facial animations for two interacting, co-located participants from a mixed audio stream. While existin...
In the forthcoming years the LHC experiments are going to be upgraded to benefit from the substantial increase of the LHC instantaneous luminosity, which will l...
Recent advancements in 3D Gaussian Splatting (3DGS) have shifted the focus toward balancing reconstruction fidelity with computational efficiency. In this work,...
Unsupervised reinforcement learning with verifiable rewards (URLVR) offers a pathway to scale LLM training beyond the supervision bottleneck by deriving rewards...
The emergence of large reasoning models demonstrates that scaling inference-time compute significantly enhances performance on complex tasks. However, it often ...
In this paper, we present a context-free unsupervised approach based on a self-conditioned GAN to learn different modes from 2D trajectories. Our intuition is t...
We introduce OfficeQA Pro, a benchmark for evaluating AI agents on grounded, multi-document reasoning over a large and heterogeneous document corpus. The corpus...
Recent advancements in Unified Multimodal Models (UMMs) have significantly advanced text-to-image (T2I) generation, particularly through the integration of Chai...
As video content creation shifts toward long-form narratives, composing short clips into coherent storylines becomes increasingly important. However, prevailing...
Template-free animatable head avatars can achieve high visual fidelity by learning expression-dependent facial deformation directly from a subject's capture, av...
AI agents have become surprisingly proficient at software engineering over the past year, largely due to improvements in reasoning capabilities. This raises a d...
Ensuring trustworthiness in open-world visual recognition requires models that are interpretable, fair, and robust to distribution shifts. Yet modern vision sys...
Streaming video understanding often involves time-sensitive scenarios where models need to answer exactly when the supporting visual evidence appears: answering...
Coverage-guided fuzzing has proven effective for software testing, but targeting library code requires specialized fuzz harnesses that translate fuzzer-generate...
Deployed machine learning systems face distribution drift, yet most monitoring pipelines stop at alarms and leave the response underspecified under labeling, co...
The application of large language models to code generation has evolved from one-shot generation to iterative refinement, yet the evolution of security througho...
Large language models (LLMs) can answer religious knowledge queries fluently, yet they often hallucinate and misattribute sources, which is especially consequen...
Selecting an optimization algorithm requires comparing candidates across problem instances, but the computational budget for deployment is often unknown at benc...
This report documents the work of our group (named SymBa) at the ALICE 2026 workshop in Copenhagen. Inspired by the pioneering work by Nils Aall Barricelli on s...
The quadratic complexity of the attention mechanism and the substantial memory footprint of the Key-Value (KV) cache present severe computational and memory cha...
Translations often carry traces of the source language, a phenomenon known as translationese. We introduce the first freely available English-to-Swedish dataset...
Large language model (LLM)-based AI systems have shown promise for patient-facing diagnostic and management conversations in simulated settings. Translating the...
Visual entity tracking is an innate cognitive ability in humans, yet it remains a critical bottleneck for Vision-Language Models (VLMs). This deficit is often o...
Understanding how structured sequence information can be represented and generalized in neural systems is key to modeling the transition from acoustic input to ...
The Digital Markets Act (DMA) regulates very large digital platforms like Meta's Facebook or Apple's iOS with the goal of promoting fairness, contestability (of m...
Aircraft engine blade maintenance relies on inspection records shared across manufacturers, airlines, maintenance organizations, and regulators. Yet current sys...
Accurate and interpretable mortality risk prediction in intensive care units (ICUs) remains a critical challenge due to the irregular temporal structure of elec...
The multiple-choice knapsack problem (MCKP) is a classic combinatorial optimization problem with wide practical applications. This paper investigates a significant yet ...
In this paper we propose a method for analyzing services deployed in serverless platforms. These services typically consist of orchestrated functions that can ...
Agile organizations increasingly rely on automated regression testing to sustain rapid, high-quality software delivery. However, as systems grow and requirement...
Advancements in data-driven machine learning have emerged as a pivotal element in supporting automotive software systems (ASSs) engineering across various level...
Recently, there has been increased interest in globally distributed training, which has the promise to both reduce training costs and democratize participation ...
Data replication is a critical aspect of data center design, as it ensures high availability, scalability, and fault tolerance. However, replicas need to be coo...
In high-speed rail (HSR) systems, federated learning (FL) enables cross-departmental flow prediction without sharing raw data. However, existing schemes suffer ...
In post-quantum blockchain settings, objects that require validity proofs (e.g., blob roots, execution-layer or consensus-layer signature aggregates) must be br...
Post-quantum signature schemes introduce kilobyte-scale authorization artifacts when applied directly to blockchain transaction validation. A widely considered ...
Vision Language Action (VLA) models are mainstream in embodied intelligence but face high inference costs. Edge-Cloud Collaborative (ECC) inference offers an ef...
Large language models (LLMs) have transformed the software engineering landscape. Recently, numerous LLM-based agents have been developed to address real-world ...
Open-source software is widely used in commercial applications. Pair that with the fact that when choosing open-source software for a new problem, developers of...
Efficient LLM inference scheduling is crucial for user experience. However, LLM inferences exhibit remarkable demand uncertainty (with unknown output length befo...
Integrating Internet of Things (IoT) data with business process event logs is crucial for analysing IoT-enhanced processes, yet remains challenging due to diffe...
The widespread integration of AI technologies has intensified concerns about fairness and bias, as these systems often perpetuate societal inequalities through ...
This paper introduces a novel class of model-driven evolutionary frameworks for near-field multi-source localization, addressing the major limitations of grid-b...
Human-vehicle interaction in safety-critical traffic environments increasingly incorporates neural sensing to infer user intent and cognitive state, yet most ex...
Adaptive Large Neighborhood Search (ALNS) is a prominent metaheuristic and a widely adopted approach for production and logistics optimization. However, it has ...
Multimodal Large Language Model (MLLM) classification performance depends critically on evaluation protocol and ground truth quality. Studies comparing MLLMs w...
While recent multimodal large language models (MLLMs) have made impressive strides, they predominantly employ a conventional autoregressive architecture as thei...