Show HN: We fingerprinted 178 AI models' writing styles and similarity clusters
Published: April 8, 2026 at 10:06 AM EDT
Source: Hacker News
Dataset
- 3,095 standardized AI responses across 43 prompts.
- Each response is represented by a 32‑dimension stylometric fingerprint (lexical richness, sentence structure, punctuation habits, formatting patterns, discourse markers).
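As a rough illustration of what such a fingerprint could contain, here is a minimal Node.js sketch that computes one feature per category listed above. The function name `extractFeatures`, the five feature definitions, and the discourse-marker word list are all assumptions made for illustration; the post does not enumerate the actual 32 features.

```javascript
// Hypothetical sketch: one illustrative feature per category named in the post.
// None of these definitions are confirmed by the source.
function extractFeatures(text) {
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  const sentences = text.split(/[.!?]+/).map((s) => s.trim()).filter(Boolean);
  const lines = text.split("\n");
  const markers = ["however", "moreover", "therefore", "furthermore"]; // assumed list

  const wordCount = Math.max(words.length, 1); // guard against empty input
  return {
    // Lexical richness: unique words divided by total words (type-token ratio).
    typeTokenRatio: new Set(words).size / wordCount,
    // Sentence structure: mean number of words per sentence.
    meanSentenceLength: words.length / Math.max(sentences.length, 1),
    // Punctuation habits: commas per word.
    commaRate: (text.match(/,/g) || []).length / wordCount,
    // Formatting patterns: share of lines that begin with a bullet marker.
    bulletLineShare: lines.filter((l) => /^\s*[-*]\s/.test(l)).length / lines.length,
    // Discourse markers: marker words per word.
    discourseMarkerRate: words.filter((w) => markers.includes(w)).length / wordCount,
  };
}

const sample = "However, cats sleep. Cats eat, cats play.";
console.log(extractFeatures(sample));
```

A real fingerprint would flatten an object like this into a fixed-order numeric vector so that responses can be compared dimension by dimension.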
Findings
- 9 clone clusters, each a group of models with > 90 % cosine similarity on z‑normalized feature vectors.
- Mistral Large 2 and Mistral Large 3 score 84.8 % on the composite clone score, which combines five independent signals.
- Gemini 2.5 Flash Lite is 78 % similar in writing style to Claude 3 Opus while costing 185× less.
- Meta exhibits the strongest provider “house style” with a 37.5× distinctiveness ratio.
- The prompt “Satirical fake news” causes the most writing convergence across all models.
- The prompt “Count letters” causes the most divergence.
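One plausible way to turn a pairwise similarity matrix into clone clusters is to link every pair of models above the 90 % threshold and collect the connected groups. The sketch below does this with a small union-find. The linking rule (single linkage) and the name `cloneClusters` are assumptions; the post gives only the threshold, not the clustering method.

```javascript
// Hypothetical sketch: single-linkage grouping over a precomputed similarity matrix.
// names[i] labels row/column i of sim; sim[i][j] is a similarity in [-1, 1].
function cloneClusters(names, sim, threshold = 0.9) {
  const parent = names.map((_, i) => i);
  const find = (i) => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]]; // path halving keeps the trees shallow
      i = parent[i];
    }
    return i;
  };

  // Link every pair whose similarity is strictly above the threshold.
  for (let i = 0; i < names.length; i++) {
    for (let j = i + 1; j < names.length; j++) {
      if (sim[i][j] > threshold) parent[find(i)] = find(j);
    }
  }

  // Collect members by root and keep only groups with at least two models.
  const groups = new Map();
  names.forEach((name, i) => {
    const root = find(i);
    if (!groups.has(root)) groups.set(root, []);
    groups.get(root).push(name);
  });
  return [...groups.values()].filter((g) => g.length > 1);
}
```

Note that single linkage can chain models together: if A is close to B and B is close to C, all three land in one cluster even when A and C fall below the threshold. A stricter rule (every pair above the threshold) would produce smaller clusters.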
Composite Clone Score
The composite clone score combines:
- Prompt‑controlled head‑to‑head similarity.
- Per‑feature Pearson correlation across challenges.
- Response length correlation.
- Cross‑prompt consistency.
- Aggregate cosine similarity.
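The post does not say how the five signals are weighted or scaled. As a minimal sketch, assuming each signal has already been mapped to the range [0, 1] and that the weights default to equal, the combination could be a weighted mean. The key names in `SIGNALS` and the function `compositeCloneScore` are hypothetical.

```javascript
// Hypothetical sketch: weighted mean of the five signals listed above.
// Assumes every signal is pre-scaled to [0, 1]; equal weights unless overridden.
const SIGNALS = [
  "headToHead",
  "featureCorrelation",
  "lengthCorrelation",
  "crossPromptConsistency",
  "aggregateCosine",
];

function compositeCloneScore(signals, weights = {}) {
  let total = 0;
  let weightSum = 0;
  for (const key of SIGNALS) {
    const value = signals[key];
    // Reject missing, NaN, or out-of-range values rather than averaging them in.
    if (typeof value !== "number" || !(value >= 0 && value <= 1)) {
      throw new RangeError(`signal "${key}" must be a number in [0, 1]`);
    }
    const w = weights[key] ?? 1;
    total += w * value;
    weightSum += w;
  }
  return total / weightSum;
}
```

Requiring all five signals means that a pair which matches on one measure alone (for example, response length) cannot score highly by itself, which is presumably the point of combining independent signals.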
Technology
- Stylometric extraction implemented in Node.js.
- Z‑score normalization applied to feature vectors.
- Cosine similarity used for aggregate comparisons.
- Pearson correlation employed for per‑feature tracking.
- Analysis script totals approximately 1,400 lines of code.
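The three numerical building blocks named above are standard and fit in a few lines of Node.js. The following is a sketch under stated assumptions, not the project's actual code: the function names are hypothetical, and the z-score here uses the population standard deviation per feature column, which the post does not specify.

```javascript
// Arithmetic mean of a numeric array.
const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

// Z-score each feature (column) across all rows so that features measured on
// different scales contribute comparably. Assumption: population standard deviation.
function zNormalize(rows) {
  const out = rows.map((r) => r.slice());
  for (let d = 0; d < rows[0].length; d++) {
    const col = rows.map((r) => r[d]);
    const m = mean(col);
    const variance = mean(col.map((v) => (v - m) ** 2));
    const std = Math.sqrt(variance) || 1; // constant feature: avoid division by zero
    out.forEach((r, i) => { r[d] = (rows[i][d] - m) / std; });
  }
  return out;
}

// Cosine similarity: dot product divided by the product of the vector norms.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

// Pearson correlation equals the cosine similarity of the mean-centered vectors.
function pearson(x, y) {
  const mx = mean(x), my = mean(y);
  return cosineSimilarity(x.map((v) => v - mx), y.map((v) => v - my));
}
```

Normalizing before taking the cosine matters: without it, a feature with a large raw range (such as response length) would dominate the similarity and mask differences in small-valued features such as punctuation rates.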
Additional Information
- Comments URL:
- Points: 24
- Comments: 6