AI Models Research Survey
Hi everyone 👋 I’m working on a research paper for my college assessment about how people use different AI models in real workflows. I’ve created a quick, anony...
Hi everyone 👋 I’m working on a research paper for my college assessment about how people use different AI models in real workflows. I’ve created a quick, anony...
Data within a specific context gains deeper significance beyond its isolated interpretation. In distributed systems, interdependent data sources reveal hidden r...
Large vision-language models (VLMs) often benefit from intermediate visual cues, either injected via external tools or generated as latent visual tokens during ...
Inversion-based visual editing provides an effective and training-free way to edit an image or a video based on user instructions. Existing methods typically in...
Cloud incidents pose major operational challenges in production, with unresolved production cloud incidents cost on average over $2M per hour. Prior research id...
Neural network pruning is widely used to reduce model size and computational cost. Yet, most existing methods treat sparsity as an externally imposed constraint...
Multi-object tracking aims to maintain object identities over time by associating detections across video frames. Two dominant paradigms exist in literature: tr...
Multimodal regression aims to predict a continuous target from heterogeneous input sources and typically relies on fusion strategies such as early or late fusio...
Automating end-to-end data science pipeline with AI agents still stalls on two gaps: generating insightful, diverse visual evidence and assembling it into a coh...
Evaluating the performance of various model architectures, such as transformers, large language models (LLMs), and other NLP systems, requires comprehensive ben...
Recent approaches have demonstrated the promise of using diffusion models to generate interactive and explorable worlds. However, most of these methods face cri...
The scaling law, a cornerstone of Large Language Model (LLM) development, predicts improvements in model performance with increasing computational resources. Ye...
Agents based on large language models have recently shown strong potential on real-world software engineering (SWE) tasks that require long-horizon interaction ...
We consider the problem of restoring linear conservation laws in data-driven linear dynamical models. Given a learned operator widehat{A} and a full-rank constr...
Projected Gradient Descent (PGD) is a strong and widely used first-order adversarial attack, yet its computational cost scales poorly, as all training samples u...
Energy consumption dictates the cost and environmental impact of deploying Large Language Models. This paper investigates the impact of on-chip SRAM size and op...
Real-time, streaming interactive avatars represent a critical yet challenging goal in digital human research. Although diffusion-based human avatar generation m...
Natural Language Processing (NLP) systems are increasingly used in sensitive domains such as healthcare, finance, and government, where they handle large volume...
Stability analyses of modern learning systems are frequently derived under smoothness assumptions that are violated by ReLU-type nonlinearities. In this note, w...
This volume contains the post-proceedings of the Workshop on Adaptable Cloud Architectures (WACA 2025), held on June 20, 2025, in Lille, France, co-located with...
The development of GUI agents could revolutionize the next generation of human-computer interaction. Motivated by this vision, we present MAI-UI, a family of fo...
Prompt-driven Video Segmentation Foundation Models (VSFMs) such as SAM2 are increasingly deployed in applications like autonomous driving and digital pathology,...
Binary program analysis is still very important in system security. There are many practical achievements in binary code analysis, but fine-grained analysis suc...
Large-scale Mixture-of-Experts (MoE) models rely on expert parallelism for efficient training and inference, which splits experts across devices and necessitate...