[Paper] How Do Semantically Equivalent Code Transformations Impact Membership Inference on LLMs for Code?
Source: arXiv - 2512.15468v1
Overview
Large language models (LLMs) for code are trained on massive code corpora that mix publicly available open‑source snippets with proprietary, license‑restricted code. Detecting whether a model has memorized a piece of private code is crucial for intellectual‑property compliance, and membership inference (MI) attacks have emerged as a tool for that purpose. This paper asks a pragmatic question: Can simple, semantics‑preserving code transformations be used to hide code from MI attacks? The authors systematically evaluate several such transformations and uncover a surprisingly effective loophole.
Key Contributions
- Empirical study of semantic transformations (e.g., variable renaming, dead‑code insertion, formatting changes) on MI success rates against state‑of‑the‑art code LLMs.
- Quantitative evidence that most single‑rule transformations degrade model accuracy by ≤ 1.5 % while preserving utility for downstream fine‑tuning.
- Identification of a high‑impact rule: the RenameVariable transformation reduces MI attack success by 10.19 %, the largest drop among all tested rules.
- Causal analysis confirming that variable renaming has the strongest direct effect on weakening MI detection.
- Negative result on composition: stacking multiple transformations does not yield additional MI resistance beyond the best single rule.
- Practical demonstration that transformed code can serve as a drop‑in replacement for original data when fine‑tuning LLMs, without noticeable performance loss.
Methodology
- Dataset Preparation – The authors start from a benchmark of code snippets (both public and private) used to train a popular code LLM.
- Transformation Rules – They implement five widely used, semantics‑preserving transformations (a minimal sketch of the renaming rule appears at the end of this section):
  - RenameVariable (systematic identifier renaming)
  - ReorderImports (shuffle import statements)
  - AddDeadCode (inject unreachable statements)
  - FormatChange (reformat whitespace/indentation)
  - InlineComments (move or duplicate comments)
- Membership Inference Attack – They employ a standard black‑box MI attack that queries the model and measures confidence differences between training and non‑training samples (a minimal illustrative sketch follows this list).
- Evaluation Pipeline – For each rule, they:
- Apply the transformation to the training set.
- Fine‑tune the LLM on the transformed data.
- Run the MI attack on both original and transformed test sets.
- Causal Impact Estimation – Using a structural causal model, they isolate the effect of each transformation on MI success, controlling for confounding factors like code length and token distribution.
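The authors' attack implementation is not reproduced here; the following is a minimal, hypothetical sketch of a loss‑threshold black‑box MI attack in the same spirit, assuming a Hugging Face causal LM. The checkpoint name and the threshold value are placeholders, not details from the paper.

```python
# Minimal sketch (not the authors' implementation) of a loss-threshold
# membership inference attack against a causal code LLM.
# Assumptions: the checkpoint name and threshold are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; swap in the code LLM under audit
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

@torch.no_grad()
def snippet_loss(code: str) -> float:
    """Average token-level cross-entropy the model assigns to a snippet."""
    ids = tokenizer(code, return_tensors="pt", truncation=True).input_ids
    out = model(input_ids=ids, labels=ids)  # labels=input_ids yields the LM loss
    return out.loss.item()

def infer_membership(code: str, threshold: float = 2.5) -> bool:
    """Predict 'member' when the loss falls below a calibrated threshold.
    Lower loss (higher confidence) suggests the snippet was seen in training."""
    return snippet_loss(code) < threshold
```

In practice the threshold would be calibrated on snippets known to be outside the training set; the paper's attack follows the same confidence‑gap principle, comparing scores between training and non‑training samples.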
All steps are designed to be reproducible with publicly available tools (e.g., OpenAI’s Codex, Hugging Face Transformers) and standard Python AST manipulation libraries.
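To make the transformation step concrete, here is a minimal sketch of a RenameVariable‑style rewrite for Python built on the standard ast module. It is an illustrative assumption of how such a rule can be implemented, not the authors' tooling; it only renames identifiers that the snippet itself binds, so builtins and free variables keep their meaning.

```python
# Minimal sketch (not the authors' tool) of a RenameVariable-style,
# semantics-preserving transformation using Python's standard ast module.
import ast
import builtins

class RenameVariables(ast.NodeTransformer):
    """Rename identifiers that the snippet itself binds to v0, v1, ..."""

    def __init__(self, bound_names: set[str]) -> None:
        self.mapping = {name: f"v{i}" for i, name in enumerate(sorted(bound_names))}

    def visit_Name(self, node: ast.Name) -> ast.AST:
        node.id = self.mapping.get(node.id, node.id)
        return node

def rename_variables(source: str) -> str:
    tree = ast.parse(source)
    # Collect only names the code binds itself (assignments, loop targets),
    # so free variables and builtins such as print keep their meaning.
    bound = {
        n.id for n in ast.walk(tree)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
        and n.id not in dir(builtins)
    }
    new_tree = RenameVariables(bound).visit(tree)
    ast.fix_missing_locations(new_tree)
    return ast.unparse(new_tree)  # requires Python 3.9+

if __name__ == "__main__":
    snippet = "total = 0\nfor item in nums:\n    total += item\nprint(total)"
    print(rename_variables(snippet))
    # e.g. -> v1 = 0 / for v0 in nums: v1 += v0 / print(v1)
```

The same ast.NodeTransformer pattern extends naturally to rules such as dead‑code insertion, while formatting changes can be applied at the text level.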
Results & Findings
| Transformation | Drop in Model Accuracy* | MI Success Reduction |
|---|---|---|
| RenameVariable | 1.5 % (worst case) | 10.19 % |
| ReorderImports | ≤ 0.8 % | 3.2 % |
| AddDeadCode | ≤ 1.0 % | 4.5 % |
| FormatChange | ≤ 0.5 % | 2.1 % |
| InlineComments | ≤ 0.7 % | 3.0 % |
*Accuracy measured on a downstream code‑completion benchmark after fine‑tuning.
- Variable renaming is the clear outlier: it both preserves model performance and significantly hampers MI detection.
- Combining rules (e.g., renaming + dead code) does not further lower MI success; the effect plateaus after the strongest single rule.
- Causal analysis shows that the renaming operation directly disrupts the statistical signatures MI attacks rely on (e.g., token‑frequency spikes tied to specific identifiers); a simplified illustration of this kind of adjustment‑based estimate appears below.
Overall, the study demonstrates that a modest, automated transformation pipeline can meaningfully weaken privacy‑risk assessments without sacrificing model utility.
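As a rough illustration of adjustment‑based causal estimation of the kind described in the methodology, the sketch below regresses a synthetic per‑sample MI attack score on a treatment indicator (RenameVariable applied or not) plus confounders such as code length, and reads off the treatment coefficient from an ordinary‑least‑squares fit. All columns and numbers are synthetic assumptions, not the paper's structural causal model or data.

```python
# Minimal sketch (not the authors' SCM) of an adjustment-based estimate of a
# transformation's effect on a per-sample MI attack score, using synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000

# Confounders: code length (tokens) and a token-diversity statistic.
length = rng.normal(200, 50, n)
diversity = rng.normal(0.5, 0.1, n)

# Treatment: whether RenameVariable was applied to the sample.
treated = rng.integers(0, 2, n)

# Outcome: a synthetic MI attack score depending on confounders and treatment.
score = 0.002 * length + 0.3 * diversity - 0.10 * treated + rng.normal(0, 0.05, n)

# Adjusted estimate: regress the score on treatment plus confounders and read
# the treatment coefficient (ordinary least squares via lstsq).
X = np.column_stack([np.ones(n), treated, length, diversity])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
print(f"estimated effect of RenameVariable on MI score: {coef[1]:.3f}")
```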
Practical Implications
- For Companies: Simple preprocessing (e.g., systematic variable renaming) can be integrated into CI pipelines before releasing code to external training services, offering a low‑cost "obfuscation‑as‑privacy" layer (a small sketch of such a step follows this list).
- For Model Vendors: Relying solely on MI attacks to certify that proprietary code has not been memorized may be insufficient; additional provenance tracking or watermarking may be required.
- For Developers of Code LLMs: Training pipelines should consider normalizing identifier names (or deliberately randomizing them) to reduce inadvertent memorization of sensitive code.
- Tooling Opportunities: Open‑source utilities that automatically apply the identified high‑impact transformations could become part of standard code‑sanitization suites, similar to linting or formatting tools.
- Regulatory Angle: The findings highlight a potential loophole in IP‑compliance audits; regulators may need to mandate more robust verification methods beyond MI.
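The snippet below is a minimal sketch, under the assumption that the illustrative rename_variables function from the Methodology sketch lives in a hypothetical module, of how such a sanitization pass could run over a source tree before code leaves the organization. Paths and module names are placeholders, not a recommended or audited tool.

```python
# Minimal sketch of a pre-release sanitization pass over a Python source tree.
from pathlib import Path

# rename_variables is the illustrative function from the Methodology sketch;
# the module name here is hypothetical.
from rename_variables_sketch import rename_variables

def sanitize_tree(root: str, out_dir: str) -> None:
    """Write renamed copies of all .py files under `root` into `out_dir`."""
    for src in Path(root).rglob("*.py"):
        dst = Path(out_dir) / src.relative_to(root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        dst.write_text(rename_variables(src.read_text()))

if __name__ == "__main__":
    sanitize_tree("internal_code", "sanitized_code")  # placeholder paths
```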
Limitations & Future Work
- Scope of Transformations – The study examined only five well‑known transformations; more aggressive obfuscation (e.g., control‑flow flattening) could have different trade‑offs.
- Model Diversity – Experiments focused on a single code LLM architecture; results may vary for encoder‑only models or smaller fine‑tuned variants.
- Attack Variants – Only a standard black‑box MI attack was evaluated; adaptive attacks that account for transformed token distributions could regain effectiveness.
- Utility Trade‑off – While accuracy loss was minimal, the impact on downstream tasks like bug detection or code synthesis was not exhaustively measured.
Future research directions include exploring adaptive MI attacks, formal guarantees for transformation‑based privacy, and cross‑model studies to see whether the observed effects generalize across the rapidly expanding ecosystem of code‑focused LLMs.
Authors
- Hua Yang
- Alejandro Velasco
- Thanh Le‑Cong
- Md Nazmul Haque
- Bowen Xu
- Denys Poshyvanyk
Paper Information
- arXiv ID: 2512.15468v1
- Categories: cs.SE, cs.AI, cs.CR
- Published: December 17, 2025
- PDF: https://arxiv.org/pdf/2512.15468v1