[Paper] Identifying Models Behind Text-to-Image Leaderboards
Source: arXiv - 2601.09647v1
Overview
The paper Identifying Models Behind Text‑to‑Image Leaderboards exposes a hidden privacy flaw in the way popular text‑to‑image (T2I) model leaderboards are run. While these leaderboards hide the model name to keep the competition fair, the authors demonstrate that the visual “fingerprint” of each model can be recovered automatically, effectively de‑anonymizing the submissions. This finding has immediate consequences for how we evaluate, share, and protect generative AI systems.
Key Contributions
- Model fingerprinting in image space: Shows that outputs from a given T2I model cluster tightly in a high‑dimensional embedding space, creating a distinctive signature.
- Simple, prompt‑agnostic deanonymization: Introduces a centroid‑based classifier that can identify the source model with >90 % accuracy across 22 models and 150 K generated images, without needing to know the prompts or training data.
- Prompt‑level distinguishability metric: Proposes a quantitative measure of how “identifiable” a prompt is, revealing that some prompts make models almost trivially separable.
- Large‑scale empirical analysis: Evaluates the method on a diverse set of models (diffusion, latent diffusion, GLIDE, etc.) and prompts, confirming the robustness of the fingerprinting effect.
- Security recommendations: Highlights the need for stronger anonymization techniques and suggests concrete defenses (e.g., adding noise, style‑transfer post‑processing).
Methodology
- Data collection: The authors generated 150 K images using 22 publicly available T2I models on a shared pool of 280 prompts (covering a wide range of subjects, styles, and complexities).
- Embedding extraction: Each image was passed through a pre‑trained CLIP vision encoder, producing a 512‑dimensional vector that captures semantic content while being relatively model‑agnostic.
- Centroid construction: For every model, the mean (centroid) of all its image embeddings was computed.
- Deanonymization classifier: A new image is assigned to the model whose centroid is closest in cosine distance. No additional training or prompt information is required (a minimal sketch of this pipeline appears after this list).
- Prompt‑level analysis: The authors compute a distinguishability score for each prompt by measuring the separation between model clusters when that prompt is used.
- Evaluation: Accuracy, precision, and recall are reported across multiple splits, and ablation studies test the impact of embedding model, number of prompts, and image resolution.
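Below is a minimal sketch of the pipeline described in this section, assuming an off‑the‑shelf CLIP ViT‑B/32 checkpoint from Hugging Face and plain NumPy for the nearest‑centroid rule. The checkpoint name, helper names, and the distinguishability formulation are illustrative assumptions, not the authors’ exact implementation.

```python
# Sketch of centroid-based model identification from CLIP image embeddings.
# Assumptions (not from the paper): the CLIP checkpoint, the function names, and
# the per-prompt distinguishability formulation at the bottom.
import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
MODEL_ID = "openai/clip-vit-base-patch32"  # yields 512-dimensional image embeddings
processor = CLIPProcessor.from_pretrained(MODEL_ID)
clip = CLIPModel.from_pretrained(MODEL_ID).to(DEVICE).eval()


@torch.no_grad()
def embed(image: Image.Image) -> np.ndarray:
    """Return an L2-normalized 512-d CLIP embedding for one image."""
    inputs = processor(images=image.convert("RGB"), return_tensors="pt").to(DEVICE)
    feats = clip.get_image_features(**inputs)            # shape (1, 512)
    feats = feats / feats.norm(dim=-1, keepdim=True)     # unit length: cosine similarity == dot product
    return feats.squeeze(0).cpu().numpy()


def build_centroids(embeddings_by_model: dict) -> dict:
    """One fingerprint per model: the mean of its image embeddings, re-normalized."""
    centroids = {}
    for name, embs in embeddings_by_model.items():        # embs: array of shape (n_images, 512)
        c = np.asarray(embs).mean(axis=0)
        centroids[name] = c / np.linalg.norm(c)
    return centroids


def identify(image: Image.Image, centroids: dict) -> str:
    """Assign an anonymous image to the model whose centroid is nearest in cosine distance."""
    e = embed(image)
    return max(centroids, key=lambda name: float(e @ centroids[name]))


def prompt_distinguishability(per_prompt_embs: dict, centroids: dict) -> float:
    """One plausible per-prompt score (the paper's exact metric may differ):
    the fraction of this prompt's images whose nearest centroid is their true source model."""
    correct, total = 0, 0
    for true_model, embs in per_prompt_embs.items():
        for e in np.asarray(embs):
            pred = max(centroids, key=lambda name: float(e @ centroids[name]))
            correct += int(pred == true_model)
            total += 1
    return correct / total
```

In practice, each leaderboard image is embedded once, embeddings are grouped by source model to form the centroids, and identify() is then applied to anonymous outputs; the attack needs no prompts, model weights, or extra training.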
Results & Findings
- High deanonymization accuracy: The centroid classifier correctly identified the source model for 92 % of test images (top‑1) and 98 % when allowing a top‑3 guess.
- Distinctive model signatures: Even models that share architecture or training data (e.g., two versions of Stable Diffusion) formed separable clusters, suggesting subtle implementation‑level differences (sampling schedule, tokenizer tweaks, etc.).
- Prompt influence: Certain prompts (e.g., “a photo of a red apple on a wooden table”) yielded near‑perfect distinguishability (>99 % accuracy), while others (abstract scenes) produced much lower scores.
- Robustness to transformations: Simple post‑processing (cropping, JPEG compression) reduced accuracy only modestly (down to ~85 %), indicating that the fingerprint persists through typical image‑hosting pipelines (see the perturbation sketch after this list).
- Scalability: Adding more models only marginally decreased performance, implying the approach scales to larger leaderboards.
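To probe the robustness finding, one can perturb images the way an image‑hosting pipeline might and re‑run identification. The sketch below is illustrative only: the crop ratio and JPEG quality are assumptions rather than the paper’s settings, and identify() refers to the pipeline sketch in the Methodology section.

```python
# Simulate a typical hosting pipeline (mild center crop + lossy JPEG re-encoding)
# and check whether the centroid classifier still recovers the source model.
import io

from PIL import Image


def hosting_pipeline(image: Image.Image, crop_ratio: float = 0.9, jpeg_quality: int = 75) -> Image.Image:
    """Center-crop the image, then round-trip it through lossy JPEG compression."""
    rgb = image.convert("RGB")
    w, h = rgb.size
    cw, ch = int(w * crop_ratio), int(h * crop_ratio)
    left, top = (w - cw) // 2, (h - ch) // 2
    cropped = rgb.crop((left, top, left + cw, top + ch))
    buf = io.BytesIO()
    cropped.save(buf, format="JPEG", quality=jpeg_quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")


# Usage, reusing identify() and the per-model centroids from the earlier sketch:
# still_correct = identify(hosting_pipeline(img), centroids) == true_model
```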
Practical Implications
- Leaderboard design: Organizers must rethink anonymity. Simple shuffling of outputs is insufficient; additional steps such as adding stochastic visual noise, applying style transfer, or using multiple “cover” models may be required (a toy noise‑injection sketch follows this list).
- Model provenance tracking: The fingerprinting technique could be repurposed as a forensic tool to detect unauthorized reuse of proprietary T2I models in the wild.
- Competitive fairness: Developers can no longer rely on blind voting to hide implementation details; prompt choices, whether strategic or incidental, can reveal a model’s identity.
- Privacy & IP concerns: Companies that license T2I models may need to embed protective transformations to prevent competitors from reverse‑engineering their model signatures.
- Benchmark reproducibility: Researchers should disclose the embedding model and clustering method used for any anonymity claim, enabling reproducible security assessments.
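As a concrete starting point for the “stochastic visual noise” defense mentioned under Leaderboard design, here is a toy sketch; the Gaussian noise model and the sigma value are illustrative assumptions, and the paper only evaluates such defenses preliminarily.

```python
# Toy defense: inject stochastic pixel noise before images are displayed,
# aiming to blur the model fingerprint while keeping the image presentable.
import numpy as np
from PIL import Image


def add_pixel_noise(image: Image.Image, sigma: float = 8.0, seed=None) -> Image.Image:
    """Add zero-mean Gaussian noise (std `sigma`, in 0-255 pixel units) to every channel."""
    rng = np.random.default_rng(seed)
    arr = np.asarray(image.convert("RGB"), dtype=np.float32)
    noisy = arr + rng.normal(0.0, sigma, size=arr.shape)
    return Image.fromarray(np.clip(noisy, 0.0, 255.0).astype(np.uint8))
```

Sweeping sigma while tracking both identification accuracy (via identify() from the Methodology sketch) and an image‑quality score would quantify the anonymity‑versus‑quality trade‑off that the Limitations section flags as open.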
Limitations & Future Work
- Dependence on CLIP embeddings: The study uses a single vision encoder; alternative embeddings (e.g., DINO, ViT‑G) might affect fingerprint strength.
- Prompt pool bias: Although 280 prompts are diverse, they may not cover niche domains where models behave more similarly.
- Defensive strategies not fully evaluated: Proposed anonymization tricks (noise injection, style transfer) are only preliminarily tested; systematic evaluation of their trade‑offs (image quality vs. anonymity) remains open.
- Cross‑modal attacks: The paper focuses on image‑only deanonymization; extending the analysis to video or multimodal outputs could reveal further vulnerabilities.
Bottom line: The work shines a light on an overlooked security dimension of generative AI evaluation. For developers, researchers, and platform operators, it’s a call to embed stronger privacy safeguards into the very pipelines that showcase our most impressive AI creations.
Authors
- Ali Naseh
- Yuefeng Peng
- Anshuman Suri
- Harsh Chaudhari
- Alina Oprea
- Amir Houmansadr
Paper Information
- arXiv ID: 2601.09647v1
- Categories: cs.CV, cs.CR, cs.LG
- Published: January 14, 2026