[Paper] When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation
Source: arXiv - 2512.08875v1
Overview
Large Language Models (LLMs) are now being used to synthesize realistic tabular datasets—think CSV files for training analytics pipelines or privacy‑preserving data sharing. This paper uncovers a subtle but serious privacy flaw: when LLMs generate rows that contain numeric strings (e.g., credit‑card numbers, IDs, timestamps), they can unintentionally reproduce exact digit sequences they have seen during training. The authors introduce a “no‑box” membership inference attack that detects this leakage using only the synthetic output, and they propose lightweight defenses that keep the data useful while blocking the attack.
Key Contributions
- Identification of a new privacy risk: Demonstrates that LLM‑based tabular generators can memorise and regurgitate numeric digit strings from their training corpora.
- LevAtt attack: A simple, no‑box membership inference method that inspects only the generated numeric strings to decide whether a particular training record was memorised.
- Comprehensive empirical study: Evaluates LevAtt across fine‑tuned small models (e.g., GPT‑Neo, LLaMA‑7B) and prompting‑based large models (e.g., GPT‑4, Claude) on diverse public tabular benchmarks.
- Defensive strategies: Proposes two mitigation techniques, including a novel digit‑perturbation sampling that randomly tweaks digits during generation without breaking the statistical properties of the table.
- Utility‑privacy trade‑off analysis: Shows that the proposed defenses cut attack success rates dramatically (often to random guessing) while preserving downstream model performance (e.g., classification accuracy, regression R²) within a few percentage points.
Methodology
- Threat model – The attacker only sees the synthetic table produced by the LLM. No access to model weights, prompts, or the original training set is assumed.
- Attack pipeline (LevAtt) – three steps (a code sketch follows this list):
- Extraction: Scan every generated row for contiguous numeric substrings (e.g., “12345678”).
- Hash‑lookup: Compare each substring against a public hash of the training dataset’s numeric fields (the hash can be constructed from any leaked snippet or from a known public subset).
- Decision rule: If a substring matches a hash entry, flag the corresponding original record as a member (i.e., the model memorised it).
- Datasets & models – The authors use 12 public tabular corpora (UCI, OpenML, Kaggle) covering finance, health, and IoT domains. Models include:
- Fine‑tuned LLaMA‑7B, GPT‑Neo‑2.7B, and T5‑base on the raw CSV.
- Prompt‑based generation with GPT‑3.5‑Turbo, GPT‑4, Claude‑2, and Gemini‑Pro using few‑shot examples.
- Defenses –
- Differentially private fine‑tuning (DP‑SGD) as a baseline.
- Digit‑perturbation sampling: During token sampling, numeric tokens are replaced with a nearby digit (±1) with a small probability ε, ensuring the overall distribution of numeric fields stays intact (also sketched after this list).
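A minimal Python sketch of the LevAtt pipeline as described above, assuming the attacker holds a set of candidate numeric values (e.g., from a leaked snippet or known public subset) and hashes them for lookup; the 6‑digit threshold, function names, and SHA‑256 choice are illustrative assumptions, not the authors' implementation.

```python
import csv
import hashlib
import re

MIN_DIGITS = 6  # illustrative: only fairly long, high-entropy digit runs are checked


def sha256(value: str) -> str:
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_reference_hashes(candidate_values):
    """Hash the numeric fields of the records the attacker wants to test for membership."""
    return {sha256(v) for v in candidate_values}


def levatt_scan(synthetic_csv_path, reference_hashes):
    """Flag candidate values whose exact digit string reappears in the synthetic table."""
    flagged = set()
    with open(synthetic_csv_path, newline="") as f:
        for row in csv.reader(f):
            for cell in row:
                # Extraction: pull contiguous digit substrings out of every cell.
                for digits in re.findall(rf"\d{{{MIN_DIGITS},}}", cell):
                    # Decision rule: an exact hash match marks the record as a member.
                    if sha256(digits) in reference_hashes:
                        flagged.add(digits)
    return flagged


# Usage with placeholder values:
# refs = build_reference_hashes(["12345678901234", "98765432109876"])
# members = levatt_scan("synthetic.csv", refs)
```

Because the attack only parses the released CSV, no model queries or weights are needed, matching the no‑box threat model.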
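And a correspondingly small sketch of the digit‑perturbation idea, here applied post hoc to an already‑generated numeric string rather than inside the sampler; the function name and default ε are illustrative assumptions.

```python
import random


def perturb_digits(value: str, epsilon: float = 0.05) -> str:
    """With probability epsilon per digit, replace it with a neighbouring digit (±1),
    so exact memorised sequences are broken while each field's magnitude and
    overall distribution stay essentially intact."""
    out = []
    for ch in value:
        if ch.isdigit() and random.random() < epsilon:
            d = int(ch)
            d = min(9, max(0, d + random.choice([-1, 1])))  # clamp to a valid digit
            out.append(str(d))
        else:
            out.append(ch)
    return "".join(out)


# Usage: perturb_digits("4111111111111111", epsilon=0.05)
```

Because each digit moves by at most one and only with small probability, marginal distributions are barely disturbed, which is consistent with the sub‑1 % utility drop the authors report for this defense.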
Results & Findings
| Model / Setting | Attack Success (Precision) | Utility Drop (ΔAccuracy) |
|---|---|---|
| Fine‑tuned LLaMA‑7B | 0.93 (near‑perfect) | –0.4 % |
| GPT‑4 (prompt) | 0.78 | –0.2 % |
| Claude‑2 (prompt) | 0.71 | –0.3 % |
| DP‑SGD (ε=1.0) | 0.45 | –5.1 % |
| Digit‑perturbation (ε=0.05) | 0.12 | –0.6 % |
- Leakage is pervasive: Even state‑of‑the‑art LLMs leak exact numeric strings for up to 90 % of rows that contain high‑entropy identifiers.
- No‑box attack works: LevAtt achieves near‑perfect membership classification without any model queries, simply by parsing the synthetic CSV.
- Defenses are effective: The proposed digit‑perturbation reduces attack success to near‑random guessing while incurring negligible loss in downstream model performance (often <1 %).
- Differential privacy is overkill: Traditional DP‑SGD also suppresses the attack, but at a steep utility cost (>5 % accuracy loss), making the lightweight perturbation a more practical option for many pipelines.
Practical Implications
- Data‑sharing platforms (e.g., OpenAI’s fine‑tuned data marketplace, synthetic data vendors) must audit generated tables for numeric memorisation before releasing them to customers.
- Compliance teams should treat synthetic CSVs with the same caution as raw data when handling regulated identifiers (PCI‑DSS, HIPAA). Simple regex scans for long digit strings can flag risky outputs (see the short example after this list).
- Developers building synthetic data pipelines can integrate the digit‑perturbation sampler as a drop‑in replacement for the default token‑sampling step in libraries like HuggingFace's transformers (one possible wiring is sketched after this list).
- Model‑as‑a‑service providers may expose a "privacy‑mode" flag that automatically activates the perturbation strategy, offering a trade‑off between fidelity and regulatory safety.
- Security auditors now have a concrete, reproducible attack (LevAtt) to benchmark privacy guarantees of any LLM‑based tabular generator, similar to how side‑channel tests are used for cryptographic hardware.
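For the regex audit mentioned above, a few hedged lines that flag table cells containing long contiguous digit runs; the 8‑digit threshold and function name are arbitrary illustrative choices.

```python
import re

LONG_DIGIT_RUN = re.compile(r"\d{8,}")  # illustrative threshold for "risky" identifiers


def flag_risky_cells(rows):
    """Yield (row_index, cell) pairs for cells containing long contiguous digit runs."""
    for i, row in enumerate(rows):
        for cell in row:
            if LONG_DIGIT_RUN.search(str(cell)):
                yield i, cell
```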
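And one hedged way the perturbation sampler could be dropped into HuggingFace transformers, via a custom LogitsProcessor passed to generate(); the class, the digit‑token mapping, and the ε value are assumptions for illustration (not the authors' released code), and it presumes the tokenizer exposes standalone tokens for the digits 0–9.

```python
import random

import torch
from transformers import LogitsProcessor, LogitsProcessorList


class DigitPerturbationProcessor(LogitsProcessor):
    """With probability epsilon, when the most likely next token is a digit,
    boost a neighbouring digit (d-1 or d+1) so it gets sampled instead."""

    def __init__(self, tokenizer, epsilon: float = 0.05):
        self.epsilon = epsilon
        # Assumes the vocabulary has standalone tokens for "0".."9" (true for many BPE vocabularies).
        self.digit_to_id = {d: tokenizer.convert_tokens_to_ids(str(d)) for d in range(10)}
        self.id_to_digit = {i: d for d, i in self.digit_to_id.items()}

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        for b in range(scores.shape[0]):  # one row of next-token logits per sequence in the batch
            top_id = int(torch.argmax(scores[b]).item())
            if top_id in self.id_to_digit and random.random() < self.epsilon:
                d = self.id_to_digit[top_id]
                neighbour = self.digit_to_id[min(9, max(0, d + random.choice([-1, 1])))]
                # Make the neighbouring digit strictly more likely than the original top digit.
                scores[b, neighbour] = scores[b, top_id] + 1.0
        return scores


# Usage (model and tokenizer loading omitted):
# processors = LogitsProcessorList([DigitPerturbationProcessor(tokenizer, epsilon=0.05)])
# output_ids = model.generate(input_ids, logits_processor=processors, do_sample=True)
```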
Limitations & Future Work
- Scope of numeric leakage: The study focuses on pure digit sequences; mixed alphanumeric identifiers (e.g., UUIDs, hashed emails) were not evaluated and may exhibit different memorisation patterns.
- Assumption of hash availability: LevAtt requires a hash of the training numeric fields. In a real‑world scenario, an attacker may need to obtain or approximate this hash, which could be non‑trivial.
- Dataset size bias: Smaller, high‑entropy datasets showed higher leakage rates; scaling the analysis to massive industrial tables (millions of rows) remains an open question.
- Defensive generalisation: The digit‑perturbation strategy is tailored to numeric tokens; extending a similar low‑overhead perturbation to categorical or free‑text fields warrants further research.
- Formal privacy guarantees: Future work could combine the perturbation approach with provable guarantees (e.g., Rényi DP) to provide quantifiable risk metrics for synthetic tabular releases.
Authors
- Joshua Ward
- Bochao Gu
- Chi-Hua Wang
- Guang Cheng
Paper Information
- arXiv ID: 2512.08875v1
- Categories: cs.LG, cs.AI
- Published: December 9, 2025