[Paper] Training AI Co-Scientists Using Rubric Rewards
AI co-scientists are emerging as a tool to assist human researchers in achieving their research goals. A crucial feature of these AI co-scientists is the abilit...
Transparent objects remain notoriously hard for perception systems: refraction, reflection and transmission break the assumptions behind stereo, ToF and purely ...
Identifying specific and often complex behaviors from large language models (LLMs) in conversational settings is crucial for their evaluation. Recent work propo...
We introduce Iterated Bellman Calibration, a simple, model-agnostic, post-hoc procedure for calibrating off-policy value predictions in infinite-horizon Markov ...
We present a method and dataset for fine-tuning language models with preference supervision using feedback-driven improvement chains. Given a model response, an...
Automatic Speech Recognition (ASR) in professional settings faces challenges that existing benchmarks underplay: dense domain terminology, formal register varia...
Large language models (LLMs) are increasingly considered for use in high-impact workflows, including academic peer review. However, LLMs are vulnerable to docum...
Language agents increasingly require persistent worlds in which they can act, remember, and learn. Existing approaches sit at two extremes: conventional web fra...
We formulate long-context language modeling as a problem in continual learning rather than architecture design. Under this formulation, we only use a standard a...
We present an online method for guaranteeing calibration of quantile forecasts at multiple quantile levels simultaneously. A sequence of α-level quantile foreca...
We introduce a training-efficient framework for time-series learning that combines random features with controlled differential equations (CDEs). In this approa...
Intrinsic image decomposition is fundamental for visual understanding, as RGB images entangle material properties, illumination, and view-dependent effects. Rec...