[Paper] L4: Low-Latency and Load-Balanced LLM Serving via Length-Aware Scheduling
Efficiently harnessing GPU compute is critical to improving user experience and reducing operational costs in large language model (LLM) services. However, curr...
This article explores the role of unrecognised labour in corporate innovation systems via an analysis of researcher coding and discursive contributions to R, on...
Decentralized federated learning (DFL) enables collaborative model training across edge devices without centralized coordination, offering resilience against si...
Bangla is a low-resource language for code generation, lacking large-scale annotated datasets and tools to transform natural language specifications into execut...
Incorporating over-the-air computations (OAC) into the model training process of federated learning (FL) is an effective approach to alleviating the communicati...
Advancements in large language models (LLMs) are showing promising impact in software development and programming assistance. However, these models struggle whe...
Automated front-end engineering drastically reduces development cycles and minimizes manual coding overhead. While Generative AI has shown promise in translatin...
Planning for an upcoming project iteration (sprint) is one of the key activities in Scrum planning. In this paper, we present our work in progress on exploring ...
Dynamic multimodal multiobjective optimization presents the dual challenge of simultaneously tracking multiple equivalent Pareto optimal sets and maintaining po...
Large Language Models (LLMs) execute complex multi-turn interaction protocols but lack formal specifications to verify execution against designer intent. We int...
Catastrophic forgetting poses a fundamental challenge in continual learning, particularly when models are quantized for deployment efficiency. We systematically...
Vision-Language-Action (VLA) models align vision and language with embodied control, but their object referring ability remains limited when relying solely on t...
Differential privacy (DP) has emerged as the gold standard for protecting user data in recommender systems, but existing privacy-preserving mechanisms face a fu...
Artistic style transfer in generative models remains a significant challenge, as existing methods often introduce style only via model fine-tuning, additional a...
This work puts forward a novel nonlinear optimal filter, namely the Ensemble Schrödinger Bridge nonlinear filter. The proposed filter finds marriage of the sta...
Training on disjoint datasets can serve two primary goals: accelerating data processing and enabling federated learning. It has already been established that Ko...
As computation shifts from the cloud to the edge to reduce processing latency and network traffic, the resulting Computing Continuum (CC) creates a dynamic envi...
Multimodal Large Language Models (MLLMs) combine visual and textual representations to enable rich reasoning capabilities. However, the high computational cost ...
Over the years, automatic MT metrics have hillclimbed benchmarks and presented strong and sometimes human-level agreement with human ratings. Yet they remain bl...
We present Gabliteration, a novel neural weight modification technique that advances beyond traditional abliteration methods by implementing adaptive multi-dire...
Vocabulary-free fine-grained image recognition aims to distinguish visually similar categories within a meta-class without a fixed, human-defined label set. Exi...
High-performance computing (HPC) workloads are becoming increasingly diverse, exhibiting wide variability in job characteristics, yet cluster scheduling has lon...
Deep neural networks often exploit shortcuts. These are spurious cues which are associated with output labels in the training data but are unrelated to task sem...
High Performance Computing (HPC) based simulations are crucial in Astrophysics and Cosmology (A&C), helping scientists investigate and understand complex as...