[Paper] Categorical Reparameterization with Denoising Diffusion models
Gradient-based optimization with categorical variables typically relies on score-function estimators, which are unbiased but noisy, or on continuous relaxations...
While Vision-Language Models (VLMs) and Multimodal Large Language Models (MLLMs) have shown strong generalisation in detecting image and video deepfakes, their ...
Investment portfolio optimization is a task conducted in all major financial institutions. The Cardinality Constrained Mean-Variance Portfolio Optimization (CCP...
Structured shape completion recovers missing geometry as primitives rather than as unstructured points, which enables primitive-based surface reconstruction. In...
Large Language Models (LLMs) have become a mainstay for many everyday applications. However, as data evolve their knowledge quickly becomes outdated. Continual ...
As autonomous AI agents transition from code completion tools to full-fledged teammates capable of opening pull requests (PRs) at scale, software maintainers fa...
Evaluating off-ball defensive performance in football is challenging, as traditional metrics do not capture the nuanced coordinated movements that limit opponen...
State-of-the-art large language model (LLM) pipelines rely on bootstrapped reasoning loops: sampling diverse chains of thought and reinforcing the highest-scori...
Integrating symbolic constraints into deep learning models could make them more robust, interpretable, and data-efficient. Still, it remains a time-consuming an...
Off-policy actor-critic methods in reinforcement learning train a critic with temporal-difference updates and use it as a learning signal for the policy (actor)...
Identifying relevant text spans is important for several downstream tasks in NLP, as it contributes to model explainability. While most span identification appr...
Handwritten STEM exams capture open-ended reasoning and diagrams, but manual grading is slow and difficult to scale. We present an end-to-end workflow for gradi...
We propose a reinforcement learning (RL) framework for adaptive precision tuning of linear solvers, which can be extended to general algorithms. The framework is ...
Deep neural networks show great potential for automating various visual quality inspection tasks in manufacturing. However, their applicability is limited in mo...
Vision-Language Models have demonstrated strong potential in medical image analysis and disease diagnosis. However, after deployment, their performance may dete...
In digital imaging, image demosaicing is a crucial first step which recovers the RGB information from a color filter array (CFA). Oftentimes, deep learning is u...
Long-term time series forecasting using transformers is hampered by the quadratic complexity of self-attention and the rigidity of uniform patching, which may b...
Existing paradigms for inferring pedestrian crossing behavior, ranging from statistical models to supervised learning methods, demonstrate limited generalizabil...
Ticket troubleshooting refers to the process of analyzing and resolving problems that are reported through a ticketing system. In large organizations offering a...
This paper presents a genetic algorithm (GA) approach to cost-optimal task scheduling in a production line. The system consists of a set of serial processing ta...
Language model (LM) probability is not a reliable quality estimator, as natural language is ambiguous. When multiple output options are valid, the model's proba...
Large Language Models (LLMs) have been emerging as prominent AI models for solving many natural language tasks due to their high performance (e.g., accuracy) an...
Generative Reward Models (GRMs) have attracted considerable research interest in reward modeling due to their interpretability, inference-time scalability, and ...
Sequence modeling layers in modern language models typically face a trade-off between storage capacity and computational efficiency. While Softmax attention off...
Spiking Neural Networks (SNNs) are dynamical systems that operate on spatiotemporal data, yet their learnable parameters are often limited to synaptic weights, ...
Large Protein Language Models have shown strong potential for generative protein design, yet they frequently produce structural hallucinations, generating seque...
Deploying large language models (LLMs) in mobile and edge computing environments is constrained by limited on-device resources, scarce wireless bandwidth, and f...
Large language models (LLMs) frequently produce contextual hallucinations, where generated content contradicts or ignores information explicitly stated in the p...
Integrating Artificial Intelligence into Software Engineering (SE) requires having a curated collection of models suited to SE tasks. With millions of models ho...
Real-time log analysis is the cornerstone of observability for modern infrastructure. However, existing online parsers are architecturally unsuited for the dyna...
Intelligent Connected Vehicles (ICVs) are a core component of modern transportation systems, and their security is crucial as it directly relates to user safety...
Traditional customer support systems, such as Interactive Voice Response (IVR), rely on rigid scripts and lack the flexibility required for handling complex, po...
Event-related potential (ERP), a specialized paradigm of electroencephalography (EEG), reflects neurological responses to external stimuli or events, generally...
Although there is little empirical research on platform-specific performance for retail workloads, the digital transformation of the retail industry has accelerated th...
In this article, we explore federated customization of large models and highlight the key challenges it poses within the federated learning framework. We review...
Large Language Model (LLM)-based applications are increasingly deployed across various domains, including customer service, education, and mobility. However, th...
The primary value of AI agents in software development lies in their ability to extend the developer's capacity for reasoning and action, not to supplant human ...
Autonomous coding agents are increasingly deployed as AI teammates in modern software engineering, independently authoring pull requests (PRs) that modify produ...
Model-driven engineering (MDE) provides abstraction and analytical rigour, but industrial adoption in many domains has been limited by the cost of developing an...
Advances in artificial intelligence (AI) and deep learning have raised concerns about its increasing energy consumption, while demand for deploying AI in mobile...
This paper explores the complexities of automatic detection of software similarities, in relation to the unique challenges of digital artifacts, and introduces ...
The quadratic complexity of the self-attention mechanism presents a significant impediment to applying Transformer models to long sequences. This work explores comp...
We propose the Consensus-Based Privacy-Preserving Data Distribution (CPPDD) framework, a lightweight and post-setup autonomous protocol for secure multi-client ...
Deploying LLMs efficiently requires testing hundreds of serving configurations, but evaluating each one on a GPU cluster takes hours and costs thousands of doll...
With the increasing demand for high-performance and high-efficiency computing, cloud computing, especially serverless computing, has gradually become a research...
Human biological systems sustain life through extraordinary resilience, continually detecting damage, orchestrating targeted responses, and restoring function t...
In recent decades, the RAFT distributed consensus algorithm has become a main pillar of the distributed systems ecosystem, ensuring data consistency and fault t...
In vehicle production factories, the vehicle painting process employs multiple robotic arms to simultaneously apply paint to car bodies advancing along a convey...