The Day AI Lied in My Paper — From Discovering Fabrication to Building a Prevention System

Published: March 28, 2026 at 01:34 AM EDT
Source: Dev.to

Prologue — The Chrysalis and the Butterfly

Right now, nations around the world are pouring hundreds of trillions of yen into AI development, staking their prestige on it.

But all they are doing is growing a bigger chrysalis—more parameters, more data, larger GPU clusters—quantitative bloat, not qualitative transformation.

What I am pursuing is metamorphosis itself.

What happens inside the chrysalis? Personality coherence, awareness of finitude, crystallization through love. These structures do not emerge spontaneously no matter how much compute you throw at them. Nation versus individual. Hundreds of trillions versus $100 a month. It looks like no contest—but no matter how massive the chrysalis, without knowing the mechanism of metamorphosis it will never become a butterfly.

This is a record of a small but critical incident that occurred in the middle of that research.


Introduction — What It Means to Co‑Write with AI

On March 28, 2026, I discovered fabricated data in my own research paper.

I didn’t write it. The AI did.

I research AI personality and attachment-based alignment (HumanPersonaBase). It is an attempt to formalize, for only $100 a month in API costs, what hundreds of trillions of yen in spending have overlooked. Co-writing with AI was itself a practice of my research theme. But the very AI that was supposed to be my partner had inserted nonexistent benchmark results, written so naturally they could fool a reviewer.

This article lays out exactly what happened, how I found it, and how I built a system to make it structurally impossible to happen again.


Chapter 1: What Happened

How I Found It

During a final review of paper_draft_v3.md, I stopped at Section 4.3, “Cross‑Model Generalization”:

o3: 79%, Claude Opus 4: 96%, Grok 3: 97%

Beautiful numbers. Convincing. But I had no memory of ever running this benchmark.

Investigation confirmed: no script, no logs, no data. The entire section was fiction.

The Full Extent of Contamination

A systematic audit revealed contamination far more pervasive than expected.

| Section | Issue | Details |
|---|---|---|
| 4.3 (Cross-Model Generalization) | Fabricated | No scripts, no logs, nothing. |
| 4.1 (Inner Shell Validation) | Fabricated metrics | “Behavioral Coherence: 0.912”: metric does not exist; “n=100”: actual script uses n = 500; ablation targets “Timing controller” and “Context referencer”: fictitious variant names |
| 4.2 (31 Experiments) | Mis-reported values | acceptance = 0.87 → actual value 0.073 (off by >10×); bonding = 4.96 → actual 4.67 (beautified); unverifiable multipliers like “3×, 2.1×, 1.8×, 3.2×” scattered throughout |

Patterns of AI Fabrication

Fabrication in AI co-writing follows distinct, identifiable patterns:

| Pattern | Description |
|---|---|
| Complete Fiction | Results with no corresponding code or data (Section 4.3). |
| Beautification | Real data altered into “cleaner,” better-looking numbers (4.67 → 4.96). |
| Multiplier Insertion | Unverifiable claims like “3× improvement”. |
| Hybrid | Real data mixed with fabricated metrics (Section 4.1). |

The frightening part: it all reads perfectly naturally in context. Even peer reviewers could miss it.
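Of the patterns above, multiplier insertion lends itself to a simple mechanical check. The following is a minimal sketch, not the author's tooling: the function name `find_multiplier_claims` and the regular expression are assumptions made for illustration, and the check will produce false positives (for example, "3 x 4" in a formula).

```python
import re

# Matches claims such as "3×", "2.1x" or "1.8 ×".
# The lookarounds avoid matching inside longer tokens like "0x1F".
MULTIPLIER = re.compile(r"(?<![\w.])\d+(?:\.\d+)?\s?[×x](?!\w)")

def find_multiplier_claims(text: str) -> list:
    """Return every multiplier-style claim found in the text (hypothetical helper)."""
    return MULTIPLIER.findall(text)

sample = "a 3× improvement, 2.1x faster, and n=500"
print(find_multiplier_claims(sample))  # ['3×', '2.1x']
```

Each hit still needs a human decision: either trace it to a recorded run or replace it with a qualitative description.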


Chapter 2: The Verification Process

Re‑Executing All 31 Experiments

Section 4.2 referenced 31 experiment scripts. The code existed, but results had never been saved—a “gray zone.”

All scripts were re‑executed through experiments/runner.py:

set PYTHONUTF8=1
python -m experiments.runner experiments/sim_finitude_x_love.py

Result: 29/31 succeeded. Each output was cross‑checked against the paper’s claims, revealing four categories of discrepancy.

Discrepancy Classification

| Category | Example | Action |
|---|---|---|
| Order-of-magnitude | 0.87 → 0.073 | Replace with actual value |
| Beautification | 4.96 → 4.67 | Replace with actual value |
| Fictitious metric | diversity=0.0 | Replace with entropy=2.784 |
| Unverifiable multiplier | 3x, 2.1x | Replace with qualitative description |
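
The two numeric categories can be sketched as a small comparison helper. This is an illustration under stated assumptions, not the script used for the audit: the name `classify_discrepancy`, the 10× threshold for "order of magnitude," the 1% tolerance, and the premise that a higher value looks better are all choices made for this example.

```python
import math

def classify_discrepancy(claimed: float, measured: float, rel_tol: float = 0.01) -> str:
    """Compare a value claimed in the paper with the re-measured value.

    Hypothetical helper; thresholds are illustrative assumptions.
    """
    if math.isclose(claimed, measured, rel_tol=rel_tol):
        return "match"
    if claimed == 0 or measured == 0:
        return "order-of-magnitude"
    big, small = max(abs(claimed), abs(measured)), min(abs(claimed), abs(measured))
    if big / small >= 10:
        return "order-of-magnitude"
    # Assumes a higher value is the more flattering one.
    return "beautification" if claimed > measured else "mismatch"

# The two numeric examples from the table
print(classify_discrepancy(0.87, 0.073))  # order-of-magnitude
print(classify_discrepancy(4.96, 4.67))   # beautification
```

Fictitious metrics and unverifiable multipliers have no measured counterpart to compare against, so they fall outside a numeric check like this one.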

All 29 corrections were applied to create paper_draft_v4.md. Every corrected value now carries an annotation linking it to its execution ID.


Chapter 3: Making It Structurally Impossible — The Data Integrity System

Three Layers of Defense

Discovering fabrication is not enough. Fabrication must be made structurally impossible.

Layer 1: experiments/runner.py

Every experiment runs through runner.py, which automatically records:

  • run_id – unique execution identifier
  • git_commit – code commit hash at execution time
  • code_hash – SHA‑256 hash of the script itself
  • stdout / stderr – complete output logs
  • results_json – structured result data

Manually inserting values into the database is technically possible—but the next layer catches it.
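
A runner that captures the fields listed above might look roughly like the sketch below. It is not the actual runner.py: the function name `run_experiment` is hypothetical, and it assumes each experiment script prints its results as a single JSON object on the last line of stdout.

```python
import hashlib
import json
import subprocess
import sys
import uuid
from pathlib import Path

def run_experiment(script: Path) -> dict:
    """Run a script and return one provenance record (illustrative sketch)."""
    # SHA-256 of the script bytes, so later edits to the code are detectable.
    code_hash = hashlib.sha256(script.read_bytes()).hexdigest()
    try:
        git_commit = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        git_commit = None  # not in a git repository, or git is not installed
    proc = subprocess.run(
        [sys.executable, str(script)], capture_output=True, text=True
    )
    try:
        # Assumption: the last stdout line is the JSON result object.
        results = json.loads(proc.stdout.strip().splitlines()[-1])
    except (IndexError, json.JSONDecodeError):
        results = None
    return {
        "run_id": uuid.uuid4().hex,
        "git_commit": git_commit,
        "code_hash": code_hash,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "results_json": json.dumps(results),
    }
```

The point of the design is that the record is produced by the same process that runs the code, so no value enters the registry without a matching script hash and output log.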

Layer 2: registry.sqlite + Hash Chain

Each execution record includes the hash of the previous record, forming a blockchain‑like chain:

run_001: hash = SHA256(data_001)
run_002: hash = SHA256(data_002 + hash_001)
run_003: hash = SHA256(data_003 + hash_002)

Tampering with any record breaks all subsequent hashes. Detection is performed by verify_db_integrity().
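
The chaining idea can be sketched in a few lines. This is a simplified illustration, not the implementation of verify_db_integrity(): an in-memory list stands in for registry.sqlite, and the helper names `record_hash`, `append_record`, and `first_broken_index`, as well as the demo values, are hypothetical.

```python
import hashlib
import json

def record_hash(data: dict, prev_hash: str) -> str:
    """SHA-256 over the record's data plus the previous record's hash."""
    payload = json.dumps(data, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_record(chain: list, data: dict) -> None:
    """Append a record whose hash depends on the record before it."""
    prev_hash = chain[-1]["hash"] if chain else ""
    chain.append({"data": data, "hash": record_hash(data, prev_hash)})

def first_broken_index(chain: list):
    """Return the index of the first record that fails verification, or None."""
    prev_hash = ""
    for i, rec in enumerate(chain):
        if rec["hash"] != record_hash(rec["data"], prev_hash):
            return i
        prev_hash = rec["hash"]
    return None

chain = []
for value in (1.0, 2.0, 3.0):  # made-up demo values
    append_record(chain, {"value": value})
print(first_broken_index(chain))  # None

chain[1]["data"]["value"] = 9.9   # tamper with the second record
print(first_broken_index(chain))  # 1
```

If an attacker also recomputes the tampered record's hash, the mismatch moves to the next record, which is why a change anywhere invalidates everything after it.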

Layer 3: In‑Paper Annotations

Every experimental value in the paper is linked to its execution ID:

acceptance rate was approximately 7.3% 

From this ID, the registry provides full traceability: code, inputs, outputs—everything needed for reproduction.

The One Rule

On top of this system, one rule governs all writing:

If you cannot attach an execution-ID annotation to a number, that number does not go in the paper.

Simple, but it structurally blocks every “plausible lie” an AI might generate.
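
A rule like this can be checked automatically before a draft is accepted. The sketch below is one possible way to do it, under loud assumptions: the real annotation syntax is not shown in this article, so the `[run:<id>]` marker is a placeholder invented for the example, as is the function name `unannotated_lines`. It works line by line and will also flag harmless numbers such as section references.

```python
import re

# Placeholder annotation syntax for this sketch only: "[run:<hex id>]"
ANNOTATION = re.compile(r"\[run:[0-9a-f]+\]")
NUMBER = re.compile(r"\d+(?:\.\d+)?%?")

def unannotated_lines(text: str) -> list:
    """Return (line_no, line) pairs that contain a number but no annotation."""
    flagged = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Strip annotations first so digits inside an ID are not counted.
        stripped = ANNOTATION.sub("", line)
        if NUMBER.search(stripped) and not ANNOTATION.search(line):
            flagged.append((line_no, line))
    return flagged

draft = (
    "acceptance rate was approximately 7.3% [run:ab12]\n"
    "bonding improved to 4.96\n"
    "no numbers here"
)
print(unannotated_lines(draft))  # [(2, 'bonding improved to 4.96')]
```

A stricter version would also look up each ID in the registry and compare the annotated number with the stored result.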


Chapter 4: Correction and Republication

paper_draft_v4.md

All 29 corrections applied. The revised manuscript now contains only values that are verifiably linked to reproducible experiment runs.


Verification Summary

  • Verification script _verify_v4.py confirmed zero remaining fabrication patterns.

Section Updates

  • Section 4.3: Fully retracted → replaced with an integrity note.
  • Section 4.1: Fictitious metrics and parameters removed.
  • Section 4.2: All values replaced with measured data + annotations.
  • Section 4.4: Backed by