Show HN: We fingerprinted 178 AI models' writing styles and similarity clusters

Published: April 8, 2026 at 10:06 AM EDT
Source: Hacker News

Dataset

  • 3,095 standardized AI responses across 43 prompts.
  • Each response is represented by a 32‑dimension stylometric fingerprint (lexical richness, sentence structure, punctuation habits, formatting patterns, discourse markers).
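As a rough illustration of what such a fingerprint could contain, the sketch below computes one feature for each of the five families named above. The post does not list its 32 features, so the feature names, the discourse-marker word list, and the regular expressions here are all assumptions rather than the authors' extraction code.

```typescript
// Hypothetical five-feature subset; the real fingerprint has 32 dimensions.
interface Fingerprint {
  typeTokenRatio: number;      // lexical richness: unique words / total words
  meanSentenceLength: number;  // sentence structure: words per sentence
  commaRate: number;           // punctuation habits: commas per word
  bulletLineShare: number;     // formatting patterns: share of lines that are list items
  discourseMarkerRate: number; // discourse markers per word
}

// Assumed marker list, chosen only for illustration.
const DISCOURSE_MARKERS = new Set(["however", "moreover", "therefore", "furthermore", "overall"]);

function fingerprint(text: string): Fingerprint {
  const words: string[] = text.toLowerCase().match(/[a-z']+/g) ?? [];
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const lines = text.split("\n").filter((l) => l.trim().length > 0);
  const wordCount = Math.max(words.length, 1); // avoid dividing by zero on empty input
  const commas = (text.match(/,/g) ?? []).length;
  const bullets = lines.filter((l) => /^\s*([-*•]|\d+\.)\s/.test(l)).length;
  const markers = words.filter((w) => DISCOURSE_MARKERS.has(w)).length;
  return {
    typeTokenRatio: new Set(words).size / wordCount,
    meanSentenceLength: words.length / Math.max(sentences.length, 1),
    commaRate: commas / wordCount,
    bulletLineShare: bullets / Math.max(lines.length, 1),
    discourseMarkerRate: markers / wordCount,
  };
}
```

Calling fingerprint() on each response would yield one numeric vector per response, which is the shape the similarity measures described in the post operate on.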

Findings

  • 9 clone clusters, each with > 90 % cosine similarity between z‑normalized feature vectors.
  • Mistral Large 2 and Mistral Large 3 score 84.8 % on the composite clone score, a metric that combines five independent signals.
  • Gemini 2.5 Flash Lite's writing is 78 % similar to that of Claude 3 Opus, at 185× lower cost.
  • Meta exhibits the strongest provider “house style” with a 37.5× distinctiveness ratio.
  • The prompt “Satirical fake news” causes the most writing convergence across all models.
  • The prompt “Count letters” causes the most divergence.
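One way clone clusters could be derived from pairwise similarities is sketched below. It takes a precomputed similarity matrix and groups models whose similarity exceeds the 0.9 threshold. The post does not say how clusters were formed, so the single-linkage (connected components) rule and the function name cloneClusters are assumptions.

```typescript
// Group models into clusters using a union-find over all pairs above the threshold.
// Assumption: single-linkage grouping, so A~B and B~C puts A, B, C in one cluster
// even if A and C are not directly above the threshold.
function cloneClusters(names: string[], sim: number[][], threshold = 0.9): string[][] {
  const parent = names.map((_, i) => i);
  const find = (i: number): number => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path halving keeps the trees shallow
      i = parent[i];
    }
    return i;
  };
  for (let i = 0; i < names.length; i++) {
    for (let j = i + 1; j < names.length; j++) {
      if (sim[i][j] > threshold) parent[find(i)] = find(j);
    }
  }
  const groups = new Map<number, string[]>();
  names.forEach((name, i) => {
    const root = find(i);
    const members = groups.get(root) ?? [];
    members.push(name);
    groups.set(root, members);
  });
  // A model with no close neighbour is not a clone cluster, so drop singletons.
  return Array.from(groups.values()).filter((g) => g.length > 1);
}
```

A stricter rule, such as requiring every pair inside a cluster to exceed the threshold, would produce smaller clusters; which rule the authors used is not stated.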

Composite Clone Score

The composite clone score combines:

  1. Prompt‑controlled head‑to‑head similarity.
  2. Per‑feature Pearson correlation across challenges.
  3. Response length correlation.
  4. Cross‑prompt consistency.
  5. Aggregate cosine similarity.
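A minimal sketch of how five such signals might be folded into one score follows. The post names the signals but not their scaling or weighting, so the [0, 1] scaling, the clamping of negative correlations to 0, the equal default weights, and every identifier below are assumptions.

```typescript
// The five signals named in the post, each assumed to be scaled to [0, 1].
interface CloneSignals {
  headToHead: number;             // prompt-controlled head-to-head similarity
  featureCorrelation: number;     // per-feature Pearson correlation across challenges
  lengthCorrelation: number;      // response length correlation
  crossPromptConsistency: number; // cross-prompt consistency
  aggregateCosine: number;        // aggregate cosine similarity
}

// Weighted mean of the clamped signals. Equal weights are an assumption.
function compositeCloneScore(s: CloneSignals, weights: number[] = [1, 1, 1, 1, 1]): number {
  const values = [
    s.headToHead,
    s.featureCorrelation,
    s.lengthCorrelation,
    s.crossPromptConsistency,
    s.aggregateCosine,
  ];
  const clamp = (x: number): number => Math.min(1, Math.max(0, x));
  let weighted = 0;
  let total = 0;
  values.forEach((v, i) => {
    weighted += clamp(v) * weights[i];
    total += weights[i];
  });
  return total > 0 ? weighted / total : 0;
}
```

Multiplying the result by 100 would give a percentage comparable in form to the scores quoted in the findings, though not necessarily in value.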

Technology

  • Stylometric extraction implemented in Node.js.
  • Z‑score normalization applied to feature vectors.
  • Cosine similarity used for aggregate comparisons.
  • Pearson correlation employed for per‑feature tracking.
  • Analysis script totals approximately 1,400 lines of code.
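The three statistical building blocks listed above are standard and can be sketched in a few lines each. This is not the authors' script: the function names are hypothetical, and the choice of population standard deviation (dividing by n) and the handling of zero-variance inputs are assumptions.

```typescript
// Z-score each feature (column) across all rows: (x - mean) / std.
// Assumes a non-empty matrix whose rows all have the same length.
function zNormalize(rows: number[][]): number[][] {
  const dims = rows[0].length;
  const means: number[] = [];
  const stds: number[] = [];
  for (let d = 0; d < dims; d++) {
    const col = rows.map((r) => r[d]);
    const mean = col.reduce((a, b) => a + b, 0) / col.length;
    const variance = col.reduce((a, b) => a + (b - mean) * (b - mean), 0) / col.length;
    means.push(mean);
    stds.push(Math.sqrt(variance));
  }
  // A constant feature has zero spread; map it to 0 instead of dividing by zero.
  return rows.map((r) => r.map((x, d) => (stds[d] === 0 ? 0 : (x - means[d]) / stds[d])));
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Pearson correlation: covariance divided by the product of standard deviations.
function pearson(x: number[], y: number[]): number {
  const n = x.length;
  const meanX = x.reduce((a, b) => a + b, 0) / n;
  const meanY = y.reduce((a, b) => a + b, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    cov += (x[i] - meanX) * (y[i] - meanY);
    varX += (x[i] - meanX) * (x[i] - meanX);
    varY += (y[i] - meanY) * (y[i] - meanY);
  }
  return varX === 0 || varY === 0 ? 0 : cov / Math.sqrt(varX * varY);
}
```

Z-scoring before taking cosine similarity matters because raw features live on very different scales (a word count versus a ratio), and without it the largest-valued features would dominate the comparison.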

Additional Information

  • Points: 24
  • Comments: 6