[Paper] Controllable Reasoning Models Are Private Thinkers
AI agents powered by reasoning models require access to sensitive user data. However, their reasoning traces are difficult to control, which can result in the u...
Diffusion models achieve state-of-the-art video generation quality, but their inference remains expensive due to the large number of sequential denoising steps....
Despite their capabilities, Multimodal Large Language Models (MLLMs) may produce plausible but erroneous outputs, hindering reliable deployment. Accurate uncert...
We present a scalable methodology for evaluating language models in multi-turn interactions, using a suite of collaborative games that require effective communi...
Small language models (SLMs) have emerged as efficient alternatives to large language models for task-specific applications. However, they are often employed in...
Argumentative LLMs (ArgLLMs) are an existing approach leveraging Large Language Models (LLMs) and computational argumentation for decision-making, with the aim ...
Mobile agents can autonomously execute user instructions, which requires hybrid-capability reasoning, including screen summarization, subtask planning, action decis...
Interpretability in AI: Asking the Right Question. Researchers, practitioners, and even regulators often ask whether a model is interpretable. This framing assum...
This paper proposes CIRCLE, a six-stage, lifecycle-based framework to bridge the reality gap between model-centric performance metrics and AI's materialized out...
Large Language Model (LLM) adapters enable low-cost model specialization, but introduce complex caching and scheduling challenges in distributed serving systems...
The Real Difference Between Junior and Senior Data Scientists. If you spend even five minutes on LinkedIn or X (formerly Twitter), you'll notice a loud debate rag...
Serverless computing simplifies cloud deployment but introduces new challenges in managing service latency and carbon emissions. Reducing cold-start latency req...