[Paper] Multi-Round Human-AI Collaboration with User-Specified Requirements

Published: February 19, 2026 at 01:54 PM EST
5 min read
Source: arXiv

Overview

The paper proposes a principled, user-driven framework for multi-round human-AI collaboration in which the AI assistant is guaranteed to respect two intuitive safety principles: no counterfactual harm (the AI must never make the human worse off than acting alone) and complementarity (the AI should intervene only where the human is likely to err). By letting users encode these principles as simple constraints, the authors deliver an online algorithm that enforces them in real time, even when the interaction dynamics shift over time.

Key Contributions

  • Formalization of human‑centric safety principles – Counterfactual harm and complementarity are expressed as user‑specified constraints that can be tailored to any task.
  • Distribution‑free online algorithm – A provably finite‑sample procedure that enforces the constraints without assuming a particular model of human behavior or data distribution.
  • Empirical validation on two fronts – (1) Simulated LLM collaboration on a medical diagnosis task, and (2) a live crowdsourcing study on a pictorial reasoning problem.
  • Demonstration of controllable trade‑offs – Tightening or loosening the constraints predictably shifts downstream human accuracy, showing the constraints act as practical “knobs” for steering performance.
  • Robustness to non‑stationarity – The algorithm maintains constraint satisfaction even when the human or AI behavior drifts during the interaction.

Methodology

  1. User‑specified constraints – Practitioners write simple rules that capture what counts as harmful AI advice (counterfactual harm) and where AI assistance is needed (complementarity). These rules are expressed as thresholds on observable outcomes (e.g., “the AI may not cause the final decision to be worse than the human’s unaided choice”).
  2. Online decision‑making – At each round of interaction, the algorithm observes the current state (human’s answer, AI’s suggestion, task context) and decides whether to let the human act alone or to intervene with AI assistance.
  3. Distribution‑free guarantees – Using concentration inequalities and a variant of the “online learning with constraints” framework, the method provides finite‑sample bounds on how often the constraints are violated, regardless of the underlying data distribution.
  4. Evaluation setups
    • Medical diagnosis: An LLM generates diagnostic suggestions that are either shown to or hidden from a simulated clinician; the algorithm decides when to reveal the suggestion.
    • Pictorial reasoning: Crowd workers solve visual puzzles; the system decides when to provide AI hints based on the same constraints.
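The decision loop described above can be sketched as a small online policy. This is an illustrative toy, not the paper's actual algorithm: the class name, the counters, and the simple "intervene only while running violation rates stay under the user-specified caps" rule are all assumptions made for clarity, while the real method relies on concentration inequalities for its finite-sample guarantees.

```python
class ConstrainedAssistant:
    """Toy online policy: allow AI intervention only while empirical
    estimates of the two constraint violation rates stay under their
    user-specified caps. Purely illustrative, not the paper's method."""

    def __init__(self, harm_target=0.05, comp_target=0.05):
        self.harm_target = harm_target    # cap on counterfactual-harm rate
        self.comp_target = comp_target    # cap on complementarity violations
        self.rounds = 0
        self.harm_violations = 0          # intervention made the human worse off
        self.comp_violations = 0          # intervention where the human was right

    def should_intervene(self):
        # Stay conservative before any data; afterwards, intervene only if
        # the running violation rates are within the user-specified caps.
        if self.rounds == 0:
            return False
        harm_rate = self.harm_violations / self.rounds
        comp_rate = self.comp_violations / self.rounds
        return harm_rate < self.harm_target and comp_rate < self.comp_target

    def update(self, human_correct_alone, final_correct, intervened):
        # Record one round's observed outcomes.
        self.rounds += 1
        if intervened:
            if human_correct_alone and not final_correct:
                self.harm_violations += 1   # counterfactual harm occurred
            if human_correct_alone:
                self.comp_violations += 1   # assistance was unnecessary
```

In this toy version, tightening `harm_target` and `comp_target` makes the policy withhold assistance sooner, mirroring the "knobs" behavior the paper reports.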

Results & Findings

| Setting | Counterfactual Harm Violation Rate | Complementarity Violation Rate | Human Accuracy Change |
| --- | --- | --- | --- |
| Medical diagnosis (LLM) | ≤ 2 % (target 5 %) | ≤ 3 % (target 5 %) | +7 % when constraints tightened, −4 % when loosened |
| Pictorial reasoning (crowd) | ≤ 1.5 % (target 3 %) | ≤ 2 % (target 3 %) | +5 % with strict constraints, −3 % with lax constraints |

Key takeaways

  • The algorithm consistently respects the user‑defined safety caps, even as the underlying human or AI performance drifts.
  • Adjusting the strictness of the constraints yields predictable, monotonic changes in overall decision quality, confirming the “knobs” work as intended.
  • No explicit model of human error was required; the system learns to satisfy the constraints purely from observed outcomes.
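Auditing results like those in the table above amounts to comparing an empirical violation rate against its user-specified cap. A minimal sketch of that bookkeeping, with hypothetical function names:

```python
def violation_rate(outcomes):
    """Fraction of rounds in which a constraint was violated.
    `outcomes` is a sequence of booleans (True = violation)."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def within_cap(outcomes, target):
    """Check an observed run against a user-specified cap, as in the
    reported results (e.g. an observed 2% rate versus a 5% target)."""
    return violation_rate(outcomes) <= target
```

For example, a run with 1 violation in 50 rounds yields a 2 % empirical rate, which satisfies a 5 % target.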

Practical Implications

  • Safety‑first AI assistants – Developers can embed the constraint language into chat‑bots, decision‑support tools, or recommendation engines to guarantee they never degrade a human’s baseline performance.
  • Task‑specific customization – Because the rules are user‑specified, teams can tailor the safety envelope to regulatory or domain‑specific needs (e.g., “never suggest a treatment that lowers a patient’s survival probability”).
  • Dynamic environments – The distribution‑free nature makes the approach suitable for rapidly evolving settings such as real‑time monitoring, financial trading, or emergency response where human behavior may shift.
  • Low‑overhead deployment – The algorithm operates online with modest computational cost, meaning it can be added on top of existing LLM APIs or other AI services without retraining the underlying model.
  • Steerable performance – Product managers can deliberately trade off between aggressiveness (more AI intervention) and conservatism (fewer interventions) by tuning the constraint thresholds, providing a transparent way to balance risk and reward.

Limitations & Future Work

  • Constraint expressiveness – While the rule language is simple, it may be insufficient for highly nuanced tasks where “harm” is multi‑dimensional or context‑dependent.
  • Scalability to many constraints – The current theory handles a modest number of constraints; extending to large, possibly conflicting rule sets could increase computational burden.
  • Human behavior modeling – The approach deliberately avoids modeling humans, which is a strength for robustness but may miss opportunities to further improve performance by leveraging predictable human patterns.
  • Real‑world deployment studies – The paper validates the method in simulated LLM and crowdsourcing environments; field trials in high‑stakes domains (e.g., clinical decision support) are needed to assess usability and regulatory acceptance.

Future directions include richer constraint languages (e.g., probabilistic or temporal specifications), integration with reinforcement‑learning agents that can learn to propose better interventions, and large‑scale user studies to understand how practitioners set and adjust the safety “knobs” in practice.

Authors

  • Sima Noorani
  • Shayan Kiyani
  • Hamed Hassani
  • George Pappas

Paper Information

  • arXiv ID: 2602.17646v1
  • Categories: cs.LG
  • Published: February 19, 2026