[Paper] MaxShapley: Towards Incentive-compatible Generative Search with Fair Context Attribution

Published: December 5, 2025 at 01:54 PM EST
4 min read
Source: arXiv - 2512.05958v1

Overview

The paper MaxShapley: Towards Incentive‑compatible Generative Search with Fair Context Attribution tackles a pressing problem in the new generation of search engines powered by large language models (LLMs). When a generative system pulls in external documents to craft an answer, how should each source be credited—and eventually compensated? MaxShapley proposes a fast, near‑optimal way to assign “credit” to every retrieved document, making the ecosystem more sustainable for content providers.

Key Contributions

  • MaxShapley algorithm – a tractable variant of the Shapley value that works with the max‑sum utility common in retrieval‑augmented generation (RAG) pipelines.
  • Linear‑time attribution – reduces the computational cost from exponential (classic Shapley) to O(N) where N is the number of retrieved documents.
  • Empirical validation – demonstrates on HotPotQA, MuSiQue, and MS MARCO that MaxShapley matches exact Shapley quality while using up to 8× fewer tokens than prior state‑of‑the‑art attribution methods.
  • Incentive‑compatible design – the attribution scheme is provably fair, discouraging content providers from gaming the system.

Methodology

  1. Retrieval‑Augmented Generation (RAG) setup – The search engine first retrieves a set of candidate documents, then feeds them (or a summary) into an LLM that generates the final answer.
  2. Utility function – The authors model the “value” of a document set with a max‑sum function: answer quality is a sum over answer components (e.g., hops or facts), and each component is credited with the maximum contribution any single retrieved document provides for it. This structure is common in multi‑hop QA, where each hop draws on a different source.
  3. Shapley value background – The Shapley value distributes total utility fairly among participants, but computing it exactly requires evaluating all 2^N subsets—impractical for real‑time search.
  4. MaxShapley derivation – By exploiting the max‑sum decomposition, the authors show that the marginal contribution of each document can be computed directly from the optimal “assignment” of snippets to answer slots. This yields a closed‑form expression that only needs a single pass over the retrieved set.
  5. Implementation details – MaxShapley is integrated into a standard RAG pipeline: after the LLM generates an answer, a lightweight post‑processor extracts the per‑document contribution scores using the derived formula, without re‑running the LLM.
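To make the max‑sum decomposition concrete, the sketch below implements a per‑slot closed form for Shapley values of a pure max game (sort documents by relevance; share each marginal gap equally among the documents above it) and checks it against brute‑force subset enumeration. The relevance matrix is a toy assumption, and this is an illustration of the general max‑sum structure the paper describes, not a reproduction of the authors' exact formula.

```python
from itertools import combinations
from math import factorial

def utility(R, S):
    """Max-sum utility: sum over answer slots of the best relevance in S."""
    if not S:
        return 0.0
    return sum(max(R[i][k] for i in S) for k in range(len(R[0])))

def exact_shapley(R):
    """Brute-force Shapley: enumerates all subsets not containing each doc."""
    n = len(R)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for S in combinations(others, r):
                w = factorial(r) * factorial(n - r - 1) / factorial(n)
                phi[i] += w * (utility(R, set(S) | {i}) - utility(R, set(S)))
    return phi

def fast_shapley(R):
    """Per slot (assumes nonnegative relevances): sort documents by relevance
    and split each gap (x_j - x_{j+1}) equally among the top j documents."""
    n, m = len(R), len(R[0])
    phi = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: -R[i][k])
        vals = [R[i][k] for i in order] + [0.0]
        suffix = 0.0
        for j in range(n - 1, -1, -1):
            suffix += (vals[j] - vals[j + 1]) / (j + 1)
            phi[order[j]] += suffix
    return phi

# Toy relevance matrix: 4 documents x 3 answer slots.
R = [[0.9, 0.1, 0.0],
     [0.4, 0.8, 0.2],
     [0.0, 0.5, 0.7],
     [0.3, 0.3, 0.3]]
exact, fast = exact_shapley(R), fast_shapley(R)
assert all(abs(a - b) < 1e-9 for a, b in zip(exact, fast))
assert abs(sum(fast) - utility(R, set(range(4)))) < 1e-9  # efficiency axiom
```

Because Shapley values are linear in the game, the per‑slot max‑game credits can simply be summed across slots. Note the sort makes this sketch O(N log N) per slot rather than the strictly linear pass the paper achieves.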

Results & Findings

| Dataset  | Attribution Accuracy (Pearson correlation vs. exact Shapley) | Token Savings vs. Prior SOTA |
|----------|--------------------------------------------------------------|------------------------------|
| HotPotQA | 0.96                                                         | 7.8×                         |
| MuSiQue  | 0.94                                                         | 6.5×                         |
| MS MARCO | 0.97                                                         | 8.2×                         |

  • Quality – MaxShapley’s attributions are statistically indistinguishable from the exact Shapley baseline (p > 0.1).
  • Efficiency – Because it avoids exponential subset enumeration, the algorithm runs in milliseconds on a typical 10‑document retrieval set, cutting LLM token usage dramatically.
  • Robustness – The method remains stable across varying numbers of retrieved documents (5–20) and different LLM back‑ends (GPT‑3.5, LLaMA‑2).
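The Pearson‑correlation comparison used above can be reproduced in miniature: estimate Shapley values by sampling random permutations (a standard sampling baseline for attribution) and correlate the estimates against exact enumeration. The relevance matrix, sample count, and seed below are toy assumptions, not the paper's evaluation harness.

```python
import random
from itertools import combinations
from math import factorial

def utility(R, S):
    # Max-sum utility: sum over slots of the best relevance in S.
    if not S:
        return 0.0
    return sum(max(R[i][k] for i in S) for k in range(len(R[0])))

def exact_shapley(R):
    # Exact Shapley via weighted subset enumeration.
    n = len(R)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for S in combinations(others, r):
                w = factorial(r) * factorial(n - r - 1) / factorial(n)
                phi[i] += w * (utility(R, set(S) | {i}) - utility(R, set(S)))
    return phi

def mc_shapley(R, n_perms=500, seed=0):
    # Monte Carlo Shapley: average marginal contributions over random orderings.
    rng = random.Random(seed)
    n = len(R)
    phi = [0.0] * n
    docs = list(range(n))
    for _ in range(n_perms):
        rng.shuffle(docs)
        S, prev = set(), 0.0
        for i in docs:
            S.add(i)
            cur = utility(R, S)
            phi[i] += cur - prev
            prev = cur
    return [p / n_perms for p in phi]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

# Toy relevance matrix: 5 documents x 3 answer slots.
R = [[0.9, 0.1, 0.0], [0.4, 0.8, 0.2], [0.0, 0.5, 0.7],
     [0.3, 0.3, 0.3], [0.6, 0.0, 0.4]]
assert pearson(exact_shapley(R), mc_shapley(R)) > 0.9
```

At this scale the sampled estimates track the exact values closely; the point of MaxShapley is to get the same agreement without paying for hundreds of utility evaluations per query.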

Practical Implications

  • Monetization for content creators – Search platforms can now compute transparent, fair payouts per document without incurring prohibitive compute costs.
  • Developer‑friendly APIs – MaxShapley can be exposed as a lightweight micro‑service that takes a list of retrieved passages and a generated answer, returning per‑source scores—ideal for integration into existing RAG frameworks (e.g., LangChain, LlamaIndex).
  • Improved trust – By showing users exactly which sources contributed to an answer, platforms can boost credibility and comply with emerging regulations around AI‑generated content attribution.
  • Optimization loops – Developers can feed the attribution scores back into the retrieval model (e.g., reinforce high‑value sources) to improve overall answer quality over time.
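A micro‑service along these lines could be as small as the sketch below: each answer sentence acts as a "slot", a passage's relevance to a slot is token overlap (a crude stand‑in for model‑derived contribution scores), and credit is shared with the same max‑game split described in the methodology. The function name, the sentence splitting, and the overlap heuristic are all illustrative assumptions, not the paper's API.

```python
import re

def attribute(passages, answer):
    """Return one credit score per passage for a generated answer.
    Each answer sentence is a slot; relevance is token overlap; the gap
    between the j-th and (j+1)-th best passages is shared among the top j."""
    tok = lambda s: set(re.findall(r"\w+", s.lower()))
    claims = [c for c in re.split(r"(?<=[.!?])\s+", answer) if c.strip()]
    scores = [0.0] * len(passages)
    for claim in claims:
        cw = tok(claim)
        rel = [len(cw & tok(p)) for p in passages]
        order = sorted(range(len(passages)), key=lambda i: -rel[i])
        vals = [rel[i] for i in order] + [0]
        suffix = 0.0
        for j in range(len(passages) - 1, -1, -1):
            suffix += (vals[j] - vals[j + 1]) / (j + 1)
            scores[order[j]] += suffix
    return scores

passages = ["The Eiffel Tower is in Paris.",
            "Paris is the capital of France."]
answer = "The Eiffel Tower is in Paris. Paris is the capital of France."
print(attribute(passages, answer))  # → [6.0, 6.0]
```

In a real deployment the overlap heuristic would be replaced by the per‑document contribution scores extracted from the RAG pipeline, with the service only performing the cheap credit‑splitting step.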

Limitations & Future Work

  • Assumption of max‑sum utility – The linear‑time guarantee hinges on the answer quality being decomposable as a max‑sum over documents; more complex interactions (e.g., synergistic reasoning across sources) may violate this assumption.
  • Scope to single‑answer generation – The current formulation handles one answer per query; extending to multi‑answer or conversational settings remains open.
  • Real‑world payout experiments – The paper validates fairness mathematically and on benchmark datasets, but field trials with actual content providers are needed to assess economic incentives.
  • Adversarial robustness – Future work could explore how MaxShapley behaves when providers deliberately craft “noisy” documents to inflate their attribution.

Bottom line for developers: MaxShapley gives you a practical, near‑real‑time way to assign fair credit to each piece of retrieved context in LLM‑driven search. It bridges the gap between academic fairness theory and the engineering constraints of production systems, opening the door to sustainable, incentive‑compatible generative search services.

Authors

  • Sara Patel
  • Mingxun Zhou
  • Giulia Fanti

Paper Information

  • arXiv ID: 2512.05958v1
  • Categories: cs.LG, cs.AI
  • Published: December 5, 2025