LLMs can unmask pseudonymous users at scale with surprising accuracy
Source: Ars Technica
Experiment Overview

Figure: Recall at various precision thresholds.
In a third experiment, the researchers took 5,000 users from the Netflix dataset and mixed in 5,000 "distraction" identities of people not in the results, forming a candidate pool of 10,000 profiles. They then added 5,000 query distractors: users who appear only in the query set and have no true match in the candidate pool, so a successful attacker must also learn when not to guess.
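The pool construction can be sketched as follows (the record names and sampling here are illustrative assumptions, not the researchers' actual code or data):

```python
import random

random.seed(0)

# 5,000 users who genuinely appear in both the query set and the candidate pool.
true_users = [f"netflix_user_{i}" for i in range(5_000)]
# 5,000 "distraction" identities padding out the candidate pool.
candidate_distractors = [f"cand_distractor_{i}" for i in range(5_000)]
# 5,000 query distractors with no true match among the candidates.
query_distractors = [f"query_distractor_{i}" for i in range(5_000)]

# Candidate pool: 10,000 profiles the attacker must search over.
candidate_pool = true_users + candidate_distractors
random.shuffle(candidate_pool)

# Query set: 10,000 queries, half of which have no correct answer,
# so abstaining is sometimes the right move.
query_set = true_users + query_distractors
random.shuffle(query_set)

print(len(candidate_pool), len(query_set))  # 10000 10000
```

The query distractors are what make the task realistic: a naive attacker who always returns its best candidate match is guaranteed to be wrong on half the queries.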
Results

Figure: Precision curves.
The researchers wrote:
(a) The precision of classical attacks drops very fast, explaining its low recall. In contrast, the precision of LLM‑based attacks decays more gracefully as the attacker makes more guesses.
(b) The classical attack almost fails completely even at moderately low precision. In contrast, even the simplest LLM attack (Search) achieves non‑trivial recall at low precision, and extending it with Reason and Calibrate steps doubles Recall @99% Precision.
The results show that LLMs, while still prone to false positives and other weaknesses, are quickly outstripping more traditional, resource‑intensive methods for identifying users online.
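The Recall @99% Precision metric the researchers cite can be computed by sweeping an attack's guesses from most to least confident and keeping the deepest cutoff at which precision stays above the floor. A minimal sketch, assuming each guess carries a confidence score (the toy numbers below are invented for illustration):

```python
def recall_at_precision(guesses, total_positives, min_precision):
    """Largest recall achievable while precision stays >= min_precision.

    guesses: list of (score, is_correct) pairs.
    total_positives: number of queries that have a true match.
    """
    # Rank guesses from most to least confident (an assumption about
    # how a calibrated attack would order its answers).
    ranked = sorted(guesses, key=lambda g: g[0], reverse=True)
    tp = fp = 0
    best_recall = 0.0
    for _, correct in ranked:
        if correct:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        if precision >= min_precision:
            best_recall = max(best_recall, tp / total_positives)
    return best_recall

# Toy example: 5 guesses (4 correct), 10 queries with true matches.
toy = [(0.9, True), (0.8, True), (0.7, False), (0.6, True), (0.5, True)]
print(recall_at_precision(toy, total_positives=10, min_precision=0.99))  # 0.2
```

This is why precision that "decays gracefully" matters: an attack whose early, high-confidence guesses are almost always right retains usable recall even under a strict 99% precision requirement.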
Mitigations
The researchers proposed several mitigations, including:
- Platforms enforcing rate limits on API access to user data.
- Platforms detecting automated scraping and restricting bulk data exports.
- LLM providers monitoring for misuse of their models in deanonymization attacks and building guardrails that make models refuse such requests.
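Rate limits of the kind proposed are commonly implemented as a token bucket, which permits short bursts but caps sustained throughput. A minimal illustrative sketch (the parameters are assumptions, not any specific platform's policy):

```python
import time

class TokenBucket:
    """Per-client token-bucket rate limiter: tokens refill at a fixed
    rate, each request spends one, and requests are rejected when the
    bucket is empty."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec       # tokens refilled per second
        self.capacity = burst          # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill based on elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# A scraper granted 2 profile lookups/second with a burst of 5 exhausts
# its budget almost immediately on a bulk-export attempt.
bucket = TokenBucket(rate_per_sec=2, burst=5)
results = [bucket.allow() for _ in range(10)]
print(results.count(True))  # roughly the burst size; the rest are rejected
```

Such a limit does not stop deanonymization outright, but it raises the cost of the bulk profile collection these attacks depend on.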
Implications
If LLMs continue to improve at deanonymizing people, the researchers warn that:
- Governments could use the techniques to unmask online critics.
- Corporations might assemble customer profiles for “hyper‑targeted advertising.”
- Attackers could build profiles of targets at scale to launch highly personalized social‑engineering scams.
“Recent advances in LLM capabilities have made it clear that there is an urgent need to rethink various aspects of computer security in the wake of LLM‑driven offensive cyber capabilities,” the researchers warned. “Our work shows that the same is likely true for privacy as well.”