[Paper] Benign Overfitting in Adversarial Training for Vision Transformers
Despite the remarkable success of Vision Transformers (ViTs) across a wide range of vision tasks, recent studies have revealed that they remain vulnerable to ad...
5586 posts from this source
Despite the remarkable success of Vision Transformers (ViTs) across a wide range of vision tasks, recent studies have revealed that they remain vulnerable to ad...
The discretization of continuous numerical attributes remains a persistent computational bottleneck in the induction of decision trees, particularly as dataset ...
Human video generation remains challenging due to the difficulty of jointly modeling human appearance, motion, and camera viewpoint under limited multi-view dat...
Large Language Models (LLMs) still struggle with multi-step logical reasoning. Existing approaches either purely refine the reasoning chain in natural language ...
Distribution networks with high penetration of Distributed Energy Resources (DERs) increasingly rely on communication networks to coordinate grid-interactive co...
In [97,99,100], an fl-RDT framework is introduced to characterize statistical computational gaps (SCGs). Studying symmetric binary perceptrons (SBPs), [100] obt...
Vision-Language-Action (VLA) models offer a promising autonomous driving paradigm for leveraging world knowledge and reasoning capabilities, especially in long-...
Kubernetes offers two default paths for scaling Node.js workloads, and both have structural limitations. The Horizontal Pod Autoscaler scales on CPU utilization...
Accurate reconstruction and tracking of dynamic human faces from image sequences is challenging because non-rigid deformations, expression changes, and viewpoin...
The pursuit of truth is central to democratic deliberation and governance, yet political discourse reflects varying epistemic orientations, ranging from evidenc...
The standard Monte Carlo estimator widehat{I}_N^{mathrm{MC}} of int fdω relies on independent samples from ω and has variance of order 1/N. Replacing the sample...
Understanding artworks requires multi-step reasoning over visual content and cultural, historical, and stylistic context. While recent multimodal large language...
Answering open-ended questions remains challenging for AI systems because it requires synthesis, judgment, and exploration beyond factual retrieval, and users o...
Function vectors (FVs) are vector representations of tasks extracted from model activations during in-context learning. While prior work has shown that multilin...
Reinforcement learning-based control policies have been frequently demonstrated to be more effective than analytical techniques for many manipulation tasks. Com...
Effective human-robot teaming is crucial for the practical deployment of robots in human workspaces. However, optimizing joint human-robot plans remains a chall...
At present, executable visual workflows have emerged as a mainstream paradigm in real-world industrial deployments, offering strong reliability and controllabil...
Large language models have achieved remarkable progress on complex reasoning tasks. However, they often implicitly fabricate information when inputs are incompl...
Fine-grained, per-micro-batch load balancing is essential for efficient Mixture-of-Experts (MoE) training, yet every prior dynamic scheduling scheme pays for it...
An earlier paper (Hong, Potteiger, and Zapata 2026) established that an unoptimized GPT 4.1 prompt predicts fan-reported experience ratings within one point 67%...
Edge devices such as smartwatches and smart glasses cannot continuously run even the smallest 100M-1B parameter language models due to power and compute constra...
Multimodal Large Language Models are increasingly adopted as autonomous agents in interactive environments, yet their ability to proactively address safety haza...
Free-association norms provide essential empirical data for investigating linguistic, semantic, and cultural phenomena in the cognitive sciences. Although large...
Automated theorem proving is fundamental to formal methods, and the recent trend is to integrate large language models (LLMs) and proof assistants to form effec...
Cross-site scripting (XSS) remains a persistent web security vulnerability, especially because obfuscation can change the surface form of a malicious payload wh...
Mixture-of-Experts (MoE) architectures are widely used in modern large language models and multimodal models. However, inference efficiency is often limited by ...
Modern distributed file systems rely on uncoordinated, per node page caches that replicate hot data locally across the cluster. While ensuring fast local access...
In this paper, we examine the use of self-stabilizing algorithms, operating in a hierarchical manner, to determine intellectual property risks at a macro level....
As deductive verifiers mature, their potential user base is growing from the initial core developers to other users. To convince external users of the suitabili...
Recent work has demonstrated the promise of orchestrating large language models (LLMs) within evolutionary and agentic optimization systems. However, the mechan...
Pre-trained machine learning models (PTMs) are commonly provided via Model Hubs (e.g., Hugging Face) in standard formats like Pickles to facilitate accessibilit...
Maintaining consistency between code and documentation is a crucial yet frequently overlooked aspect of software development. Even minor mismatches can confuse ...
Federated learning (FL) is a key paradigm for distributed model learning across decentralized data sources. Communication in each FL round typically consists of...
This paper presents a framework to bridge the gap between subjective stakeholder context and formal system architecture. This is achieved using Soft Systems Met...
Green software engineering is emerging as a crucial response to information technology's rising energy impact, especially in continuous development. However, th...
Mobile Crowd Computing (MCdC) leverages the idle computational capacity of consumer smartphones to enable distributed task processing at scale; however, widespr...
Memristive devices present a promising foundation for next-generation information processing by combining memory and computation within a single physical substr...
Particle-in-Cell (PIC) simulations are fundamental to plasma physics but often suffer from limited scalability due to particle-grid interaction bottlenecks and ...
The conformity bias exhibited by large language models (LLMs) can pose a significant challenge to decision-making in LLM-based multi-agent systems (LLM-MAS). Wh...
Matrix-multiply-accumulate (MMA) units, or tensor cores, are now widespread across modern computing architectures. Yet, their use for particle-grid operators re...
We present a simple hierarchical communication scheme for distributed Fast Multipole Methods (FMMs) based on MPI neighborhood collectives and uniform trees. The...
Mathematical problem solving remains a challenging test of reasoning for large language and multimodal models, yet existing benchmarks are limited in size, lang...
Building photorealistic, animatable full-body digital humans remains a longstanding challenge in computer graphics and vision. Recent advances in animatable ava...
Modern sequence models are dominated by Transformers, where self-attention mixes information from the visible context in an input-dependent way. However, when r...
Proximal Policy Optimization (PPO) has become the predominant algorithm for on-policy reinforcement learning due to its scalability and empirical robustness acr...
We present BLF (Bayesian Linguistic Forecaster), an agentic system for binary forecasting that achieves state-of-the-art performance on the ForecastBench benchm...
Story Visualization aims to generate a sequence of images that faithfully depicts a textual narrative that preserve character identity, spatial configuration, a...
Large language models have achieved significant reasoning improvements through reinforcement learning with verifiable rewards (RLVR). Yet as model capabilities ...