[Paper] Online Pandora's Box for Contextual LLM Cascading

Published: June 5, 2026 at 11:29 AM EDT

Source: arXiv - 2606.07392v1

Overview

Motivated by Large Language Model (LLM) cascading, we propose an online contextual Pandora’s Box model for adaptively querying and selecting LLM APIs. In each period, a decision-maker observes a request context and faces a two-phase decision problem. In the query phase, the decision-maker sequentially queries APIs; each query reveals a generated output and incurs an output-dependent cost. In the selection phase, the decision-maker selects one of the generated outputs to deploy and observes only the downstream reward of the deployed output.

This output-mediated feedback structure differs from classical online contextual Pandora’s Box models, in which opening a box directly reveals its reward. Rather than estimating the full conditional output and cost distributions of each API, we directly model the reservation index and develop a learning approach for the query phase. Specifically, we impose a parametric structure on the contextual reservation index functions induced by Weitzman’s classical policy.

Our policy combines generalized method of moments (GMM) type estimation of these reservation indices with UCB-style confidence bounds for both the indices and the shared output-level reward evaluator. Under regularity conditions, we prove that the resulting policy achieves dimension-dependent $\widetilde O(\sqrt T)$ cumulative regret over a horizon of $T$ periods.
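For readers unfamiliar with the reservation index mentioned in the abstract, the block below states the textbook definition from the classical, non-contextual Pandora's Box problem. The symbols $V_i$, $c_i$, and $\sigma_i$ are notation assumed here for illustration; the paper's contextual indices with output-dependent costs may be defined differently.

```latex
% Classical (non-contextual) Pandora's Box, textbook form; notation is assumed here.
% Box i has a random reward V_i and a known opening cost c_i > 0.
% The reservation index \sigma_i is the value that makes opening the box break even:
\[
  \mathbb{E}\big[(V_i - \sigma_i)^{+}\big] = c_i ,
  \qquad (x)^{+} = \max\{x, 0\}.
\]
% Weitzman's policy: open boxes in decreasing order of \sigma_i, and stop as soon as
% the best reward observed so far is at least the largest \sigma_i among unopened boxes.
```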

Key Contributions

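According to the abstract, the paper contributes:

An online contextual Pandora’s Box model for adaptively querying and selecting LLM APIs, in which only the downstream reward of the deployed output is observed.
A direct parametric model of the contextual reservation index functions induced by Weitzman’s policy, which avoids estimating the full conditional output and cost distributions of each API.
A policy that combines GMM-type estimation of the reservation indices with UCB-style confidence bounds for the indices and the shared output-level reward evaluator.
A dimension-dependent $\widetilde O(\sqrt T)$ cumulative regret bound over $T$ periods, proved under regularity conditions.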

Methodology

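Each period has two phases: a query phase, in which LLM APIs are queried sequentially at an output-dependent cost, and a selection phase, in which one generated output is deployed and its downstream reward is observed. Please refer to the full paper for the estimator, the confidence-bound construction, and the regularity conditions.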
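As background for the query phase, here is a minimal sketch of Weitzman's classical index policy with known, discrete reward distributions and fixed costs. It is not the paper's learning algorithm: there is no context, no GMM estimation, and no confidence bounds, and opening a box reveals its reward directly. The function names `reservation_index` and `weitzman_policy` and the `draw` callback are hypothetical, chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def reservation_index(values: Sequence[float], probs: Sequence[float], cost: float) -> float:
    """Solve E[(V - sigma)^+] = cost by bisection for a discrete reward V.

    Assumes cost >= 0 and that probs sums to 1 (illustrative setup).
    """
    def expected_surplus(sigma: float) -> float:
        return sum(p * max(v - sigma, 0.0) for v, p in zip(values, probs))

    # expected_surplus is non-increasing in sigma, so the root is bracketed:
    lo = min(values) - cost  # surplus here is E[V] - min(V) + cost >= cost
    hi = max(values)         # surplus here is 0 <= cost
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if expected_surplus(mid) > cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def weitzman_policy(
    indices: Sequence[float],
    costs: Sequence[float],
    draw: Callable[[int], float],
) -> Tuple[float, float, List[int]]:
    """Open boxes in decreasing index order; stop once the best observed
    reward is at least the index of the next unopened box.

    Returns (best reward, total cost paid, list of opened boxes in order).
    """
    order = sorted(range(len(indices)), key=lambda i: indices[i], reverse=True)
    best = float("-inf")
    total_cost = 0.0
    opened: List[int] = []
    for i in order:
        if best >= indices[i]:
            break  # no remaining box is worth its opening cost
        best = max(best, draw(i))  # pay the cost and observe the reward
        total_cost += costs[i]
        opened.append(i)
    return best, total_cost, opened
```

In this simplified setting, `draw(i)` stands in for querying API `i` and scoring its output; in the paper's setting the reward of an output is only observed after deployment, which is what makes the learning problem harder than this sketch suggests.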

Practical Implications

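The model targets LLM cascading, where a system must decide which LLM APIs to query for a given request and which generated output to deploy, weighing output-dependent query costs against the downstream reward of the deployed output.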

Authors

Alexandre Belloni
Yan Chen
Yehua Wei

Paper Information

arXiv ID: 2606.07392v1
Categories: cs.AI, cs.LG, econ.EM, stat.ML
Published: June 5, 2026
