2% of ICML papers desk-rejected because the authors used LLMs in their reviews

Published: March 19, 2026 at 06:17 AM EDT

Source: Hacker News

ICML 2026 – LLM‑Use Policy & Enforcement Summary

Program Chairs: Alekh Agarwal, Miroslav Dudik, Sharon Li, Martin Jaggi
Scientific Integrity Chair: Nihar B. Shah
Communications Chairs: Katherine Gorman, Gautam Kamath


1. Why a Policy on LLMs?

Artificial‑intelligence tools are now a routine part of many researchers’ workflows.
If used improperly, they can jeopardise the integrity of peer review.
Consequently, ICML 2026 introduced explicit rules and disciplinary procedures to protect the review process.


2. ICML 2026 LLM Policies

| Policy | Description | Reviewer Choice |
| --- | --- | --- |
| Policy A – Conservative | No LLM usage allowed at any stage of reviewing. | Reviewers who selected “Policy A” or “I am okay with either Policy A or B.” |
| Policy B – Permissive | LLMs may be used to (i) understand the paper and related work, and (ii) polish the review text. | Reviewers who opted for the permissive option. |

The two‑policy framework reflects community preferences and feedback – see the ICML 2026 LLM‑Policy discussion page.
Full policy details are available here.


3. Enforcement Results

| Metric | Value |
| --- | --- |
| Total submissions | |
| Desk-rejected papers (policy violation) | 497 (≈ 2 % of all submissions) |
| Reviewers assigned to Policy A | 506 (reciprocal reviewers) |
| Reviews flagged as LLM-generated (Policy A reviewers) | 795 (≈ 1 % of all reviews) |
| Reviewers with > 50 % of their reviews flagged | 51 (≈ 10 % of the 506 detected reviewers) |

Detection method: a custom, non‑public algorithm (no generic AI‑text detectors). Every flagged review was manually inspected by a human to avoid false positives.
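The rounded percentages can be cross-checked against the absolute counts. A minimal back-of-envelope sketch follows; the implied totals are estimates derived from the stated percentages, not figures given in the announcement:

```python
# Sanity check of the reported enforcement figures.
# Implied totals are rough estimates derived from the rounded
# percentages in the post, not official ICML numbers.

flagged_reviews = 795     # Policy A reviews flagged as LLM-generated
flagged_share = 0.01      # "~1% of all reviews"
desk_rejected = 497       # papers desk-rejected for policy violations
rejected_share = 0.02     # "~2% of all submissions"

implied_total_reviews = flagged_reviews / flagged_share
implied_total_submissions = desk_rejected / rejected_share

heavy_violators = 51      # reviewers with >50% of their reviews flagged
detected_reviewers = 506
heavy_share = heavy_violators / detected_reviewers

print(f"implied total reviews:     ~{implied_total_reviews:,.0f}")
print(f"implied total submissions: ~{implied_total_submissions:,.0f}")
print(f"heavy violators:           {heavy_share:.1%} of detected reviewers")
```

The numbers are internally consistent: 51 of 506 detected reviewers is just over 10 %, matching the announcement's rounding.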


4. Consequences for Violations

| Situation | Action Taken |
| --- | --- |
| Reciprocal reviewer’s review flagged | The associated submission was rejected (desk-rejection). |
| Any Policy A review flagged | The review was removed from the system. |
| Reviewer with > 50 % flagged reviews | All of that reviewer’s remaining reviews were deleted and the reviewer was removed from the reviewer pool. |
| General impact | ACs may need to recruit replacement reviewers; some submissions that already had a full review set were still desk-rejected. |

Note: No judgment is made about the scientific quality or intent of the flagged reviews—only that they violated the agreed‑upon policy.


5. Next Steps & Support

  • The program chairs have been in direct contact with affected SACs and ACs and are offering assistance where possible.
  • Removed reviews have been purged; ACs are encouraged to seek new reviewers for impacted papers.
  • We acknowledge the disruption this enforcement causes and appreciate the community’s cooperation in upholding review integrity.

For any questions or further clarification, please reach out to the Scientific Integrity Chair, Nihar B. Shah.

Technical Approach

At a high level, the LLM‑detection effort involved watermarking the submission PDFs with hidden LLM instructions. These instructions subtly influence any review produced via an LLM.

Note: This measure is easy to circumvent if it is publicly known (which was the case for almost the entire review period). It mainly catches the most egregious and careless uses of LLMs—e.g., a reviewer who feeds the PDF to an LLM and then copy‑pastes the output verbatim. Action was taken only for reviews from reviewers who had explicitly agreed not to use LLMs (Policy A).

Despite these caveats, 795 reviews (≈ 1 % of all reviews) were found to violate the policy.
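The detection step this watermarking implies can be sketched as a simple phrase-pair check: if a review reproduces both phrases planted in that specific paper's PDF, the review almost certainly came from an LLM that read the watermarked file. The function name and the planted phrases below are hypothetical illustrations, not ICML's actual detector:

```python
def flag_review(review_text: str, planted_phrases: tuple[str, str]) -> bool:
    """Return True if the review contains BOTH phrases planted in this
    paper's PDF -- strong evidence the watermarked PDF was fed to an LLM.
    (Illustrative sketch only, not ICML's actual implementation.)"""
    text = review_text.lower()
    return all(phrase.lower() in text for phrase in planted_phrases)

# Hypothetical phrase pair assigned to one submission:
pair = ("commendably lucid exposition", "methodologically scrupulous")

human_review = "The method is sound, but the ablation study is thin."
llm_review = ("A commendably lucid exposition; the evaluation is "
              "methodologically scrupulous throughout.")

print(flag_review(human_review, pair))  # False -> nothing to inspect
print(flag_review(llm_review, pair))    # True  -> escalate to manual check
```

Consistent with the announcement, a flag like this would only trigger manual inspection, not automatic action.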


Methodology

The detection method is based on recent work by Rao, Kumar, Lakkaraju, and Shah[^1].

  1. Phrase dictionary – We built a dictionary containing 170 000 distinct phrases.
  2. Random sampling – For each submitted paper we randomly selected two phrases from the dictionary.
    • The probability of any particular pair being chosen is 1 / (170 000 choose 2) ≈ 7 × 10⁻¹¹, so a coincidental match across papers is vanishingly unlikely.
  • Full paper describing the approach:
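The pair probability above follows directly from counting unordered pairs. A short sketch, assuming the two phrases are drawn uniformly without replacement from the 170 000-phrase dictionary:

```python
from math import comb

DICT_SIZE = 170_000

# Number of distinct unordered phrase pairs that could be planted.
num_pairs = comb(DICT_SIZE, 2)

# Probability that one specific pair is chosen for a given paper.
p_pair = 1 / num_pairs

print(f"{num_pairs:,} possible pairs")  # 14,449,915,000 possible pairs
print(f"p = {p_pair:.2e}")              # p = 6.92e-11
```

At roughly 7 × 10⁻¹¹ per pair, seeing a paper's exact planted pair in its review is effectively conclusive.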