[Paper] Fine-tuned LLM-based Code Migration Framework
Source: arXiv - 2512.13515v1
Overview
The paper introduces a fine‑tuned large language model (LLM)‑driven framework for migrating legacy SQL codebases—particularly from Oracle PL/SQL to PostgreSQL—into modern, cloud‑native data platforms. By marrying classic software‑engineering practices with generative AI, the authors demonstrate a scalable, iterative workflow that dramatically cuts manual rewrite effort while preserving business logic.
Key Contributions
- Hybrid migration pipeline that combines traditional static analysis with an LLM fine‑tuned on SQL translation tasks.
- Iterative, semi‑automated conversion loop: automatic syntax mapping → error detection → expert feedback → model refinement.
- Fine‑tuning strategy that outperforms pure prompt‑engineering, yielding higher precision on complex constructs (stored procedures, triggers, views).
- Automated SQL feature detection and semi‑supervised error analysis to surface mismatches between source and target dialects (see the detection sketch after this list).
- Empirical evaluation showing a 70‑80 % drop in syntax error rates and a 30 % reduction in manual review time across multiple migration cycles.
- Feedback‑in‑the‑loop mechanism that incorporates Subject‑Matter‑Expert (SME) corrections into the training data, enabling continuous improvement.
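The feature‑detection idea lends itself to a concrete illustration. Below is a minimal Python sketch of dialect‑feature detection; the paper does not publish its detector, so the construct list, the regex approach, and the function name `detect_oracle_features` are illustrative assumptions rather than the authors' implementation.

```python
import re

# Hypothetical dialect-feature detector: the constructs tracked here are
# common Oracle-isms, but the paper does not enumerate its own list.
ORACLE_FEATURES = {
    "connect_by": re.compile(r"\bCONNECT\s+BY\b", re.IGNORECASE),
    "rownum": re.compile(r"\bROWNUM\b", re.IGNORECASE),
    "nvl": re.compile(r"\bNVL\s*\(", re.IGNORECASE),
    "bulk_collect": re.compile(r"\bBULK\s+COLLECT\b", re.IGNORECASE),
    "autonomous_txn": re.compile(r"\bPRAGMA\s+AUTONOMOUS_TRANSACTION\b", re.IGNORECASE),
    "outer_join_plus": re.compile(r"\(\+\)"),  # legacy (+) outer-join syntax
}

def detect_oracle_features(plsql_source: str) -> list[str]:
    """Return the Oracle-specific constructs found in a PL/SQL object."""
    return [name for name, pattern in ORACLE_FEATURES.items()
            if pattern.search(plsql_source)]

print(detect_oracle_features(
    "SELECT NVL(salary, 0) FROM emp WHERE ROWNUM <= 10"
))  # -> ['rownum', 'nvl']
```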
Methodology
Data Collection & Pre‑processing
- Extracted a corpus of Oracle PL/SQL objects (procedures, functions, triggers, views) from three real‑world enterprise databases.
- Generated paired PostgreSQL equivalents using a rule‑based baseline to bootstrap the training set.
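As a rough illustration of this bootstrapping step, the sketch below pairs Oracle statements with machine‑generated PostgreSQL, using the open‑source sqlglot transpiler as a stand‑in for the paper's unnamed rule‑based baseline. Note that sqlglot handles individual SQL statements rather than full PL/SQL blocks, and the JSONL layout is an assumption.

```python
import json

import sqlglot
from sqlglot.errors import ParseError

# Sketch of bootstrapping the paired training corpus. sqlglot's
# Oracle -> PostgreSQL transpiler stands in for the paper's rule-based
# converter; the output file format is an assumption.
def bootstrap_pairs(oracle_statements: list[str], out_path: str) -> None:
    with open(out_path, "w") as f:
        for src in oracle_statements:
            try:
                tgt = sqlglot.transpile(src, read="oracle", write="postgres")[0]
            except ParseError:
                continue  # unparseable objects are left for SME review
            f.write(json.dumps({"oracle": src, "postgres": tgt}) + "\n")

bootstrap_pairs(
    ["SELECT NVL(salary, 0) FROM emp WHERE ROWNUM <= 10"],
    "pairs.jsonl",
)
```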
Model Fine‑tuning
- Started from a publicly available code‑oriented LLM (e.g., CodeLlama‑7B).
- Applied supervised fine‑tuning on the paired corpus, emphasizing edge‑case constructs (cursor loops, BULK COLLECT, autonomous transactions).
- Augmented training with SME‑curated corrections to teach the model how to resolve ambiguous mappings.
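A minimal sketch of what such supervised fine‑tuning could look like with the Hugging Face stack follows. The paper names CodeLlama‑7B only as an example base model; the prompt template, the LoRA configuration, and every hyperparameter below are assumptions, not the authors' recipe.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Assumed base model and adapter settings; the paper does not publish these.
base = "codellama/CodeLlama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = get_peft_model(
    AutoModelForCausalLM.from_pretrained(base),
    LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32,
               target_modules=["q_proj", "v_proj"]),
)

def to_features(example):
    # Each training example: PL/SQL source followed by its PostgreSQL target.
    text = ("-- Translate Oracle PL/SQL to PostgreSQL\n"
            f"{example['oracle']}\n-- PostgreSQL:\n{example['postgres']}")
    return tokenizer(text, truncation=True, max_length=2048)

dataset = (load_dataset("json", data_files="pairs.jsonl")["train"]
           .map(to_features, remove_columns=["oracle", "postgres"]))

Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-sql-migrator", num_train_epochs=3,
                           per_device_train_batch_size=1, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```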
Iterative Migration Loop
- Automatic conversion: the fine‑tuned model generates PostgreSQL code for each PL/SQL object.
- Static validation: a syntax checker flags errors; a feature‑alignment analyzer scores semantic fidelity.
- Error triage: high‑confidence fixes are applied automatically; low‑confidence cases are presented to SMEs.
- Feedback ingestion: SME edits are fed back into the fine‑tuning dataset for the next iteration.
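One pass of this loop might look like the following sketch. The `convert` callable (wrapping the fine‑tuned model) and the `confidence` scorer with its 0.9 threshold are hypothetical stand‑ins; sqlglot parsing serves as a lightweight example of the static validation step.

```python
import sqlglot
from sqlglot.errors import ParseError

# One pass of the migration loop. `convert` and `confidence` are hypothetical
# stand-ins for the fine-tuned model's generation call and its scorer.
def migrate_batch(plsql_objects, convert, confidence, threshold=0.9):
    accepted, sme_queue = [], []
    for src in plsql_objects:
        candidate = convert(src)  # model-generated PostgreSQL
        try:
            sqlglot.parse_one(candidate, read="postgres")  # static validation
            valid = True
        except ParseError:
            valid = False
        if valid and confidence(src, candidate) >= threshold:
            accepted.append(candidate)          # high-confidence: auto-apply
        else:
            sme_queue.append((src, candidate))  # low-confidence: expert triage
    # SME edits collected from the queue become new training pairs for the
    # next fine-tuning iteration (feedback ingestion).
    return accepted, sme_queue
```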
Evaluation
- Measured Syntax Error Rate (SER), Feature Alignment Score (FAS), and Manual Review Effort (MRE) across three migration cycles.
- Compared against a pure prompt‑engineering baseline and a traditional rule‑based converter.
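The paper's exact metric formulas are not reproduced here, so the definitions below are plausible readings rather than the authors' computations:

```python
# Assumed metric definitions; the paper's exact formulas may differ.
def syntax_error_rate(results):
    """SER: fraction of migrated objects that fail the target-dialect parser."""
    return sum(1 for r in results if not r["parses"]) / len(results)

def feature_alignment_score(results):
    """FAS: mean fraction of source-dialect features correctly mapped per object."""
    return sum(r["features_mapped"] / r["features_total"]
               for r in results if r["features_total"]) / len(results)

def manual_review_effort(review_hours, objects):
    """MRE: SME review hours normalised per 1,000 migrated objects."""
    return review_hours / objects * 1000
```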
Results & Findings
| Metric | Baseline (Rule‑based) | Prompt‑only LLM | Fine‑tuned LLM (this work) |
|---|---|---|---|
| Syntax Error Rate | 22 % | 12 % | 4 % |
| Feature Alignment Score | 68 % | 78 % | 91 % |
| Manual Review Effort (hrs per 1 k objects) | 15 | 9 | 5 |
- Syntax errors dropped by ≈80 % relative to the rule‑based approach.
- Semantic fidelity (how well the migrated code preserves original behavior) surpassed 90 % after two iterative cycles.
- The feedback loop contributed the most to gains: each SME correction reduced downstream SER by ~2 %.
Practical Implications
- Accelerated Cloud Migration: Enterprises can move legacy Oracle workloads to PostgreSQL or other open‑source platforms with far less manual re‑coding, shortening migration timelines from months to weeks.
- Cost Savings: Reducing manual review effort translates directly into lower consulting and developer hours—potentially saving $200–$500 k on a typical mid‑size migration project.
- Continuous Integration: The framework can be embedded into CI/CD pipelines, automatically flagging newly introduced PL/SQL code and suggesting PostgreSQL equivalents in real time (a minimal hook is sketched after this list).
- Extensibility: While the study focuses on Oracle→PostgreSQL, the same fine‑tuning + feedback paradigm can be adapted to other dialect pairs (e.g., T‑SQL → Snowflake SQL) or even to NoSQL schema migrations.
- Developer Enablement: By surfacing high‑confidence suggestions, developers spend more time on business‑logic validation rather than syntax fiddling, improving overall code quality.
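To give a flavor of that CI/CD integration, here is a hypothetical pre‑merge hook that flags changed .sql files and attaches a machine‑suggested PostgreSQL rendering. The git plumbing and the use of sqlglot as the suggestion engine are assumptions; the paper does not prescribe a CI design.

```python
import subprocess
import sys

import sqlglot
from sqlglot.errors import ParseError

# Hypothetical pre-merge CI hook: inspect SQL files changed on this branch
# and print a suggested PostgreSQL rendering for each one.
def changed_sql_files() -> list[str]:
    out = subprocess.run(["git", "diff", "--name-only", "origin/main...HEAD"],
                         capture_output=True, text=True, check=True).stdout
    return [p for p in out.splitlines() if p.endswith(".sql")]

def main() -> int:
    status = 0
    for path in changed_sql_files():
        with open(path) as f:
            source = f.read()
        try:
            suggestion = sqlglot.transpile(source, read="oracle", write="postgres")
        except ParseError:
            print(f"{path}: could not parse; route to SME review")
            status = 1
            continue
        print(f"{path}: suggested PostgreSQL equivalent:\n" + ";\n".join(suggestion))
    return status

if __name__ == "__main__":
    sys.exit(main())
```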
Limitations & Future Work
- Dataset Scope: The training corpus was limited to three enterprise databases; broader dialect diversity (e.g., DB2, Sybase) remains untested.
- Runtime Semantics: The evaluation focused on syntactic correctness and static feature alignment; full end‑to‑end functional testing (performance, transaction semantics) was outside the paper’s scope.
- Model Size vs. Latency: Fine‑tuning a 7B‑parameter model yields good results, but larger models could further improve edge‑case handling at the cost of higher inference latency.
Future Directions
- Incorporate automated test‑generation to validate functional equivalence post‑migration.
- Explore few‑shot prompting combined with fine‑tuning to reduce the amount of SME‑curated data needed.
- Extend the pipeline to handle schema‑level migrations (data type conversions, indexing strategies) and cloud‑native optimizations (e.g., partitioning, sharding).
Bottom line: By fine‑tuning an LLM on real‑world SQL translation tasks and looping in expert feedback, the authors deliver a practical, repeatable framework that can dramatically streamline database migrations—a win for both developers and business leaders looking to modernize their data stack.
Authors
- Oleg Grynets
- Vasyl Lyashkevych
- Dmytro Baran
- Maksym Orliansky
- Taras Zelenyy
- Markiian Leshchyshyn
Paper Information
- arXiv ID: 2512.13515v1
- Categories: cs.SE, cs.CL, cs.LO
- Published: December 15, 2025