[Paper] Mono2Sls: Automated Monolith-to-Serverless Migration via Multi-Stage Pipeline with Static Analysis
Source: arXiv - 2604.24550v1
Overview
Mono2Sls tackles one of the most painful chores for cloud‑first teams: turning a monolithic web backend into a serverless application that runs on AWS. By combining lightweight static analysis with a chain of specialized LLM agents, the authors built an end‑to‑end pipeline that automatically produces a deployable AWS SAM (Serverless Application Model) project. Their evaluation on real‑world codebases shows that the system can produce deployable serverless services without manual fixes, dramatically lowering the barrier to adopting serverless architectures.
Key Contributions
- Fully automated monolith‑to‑serverless pipeline – converts a traditional codebase into a ready‑to‑deploy SAM application with zero manual intervention.
- Hybrid static‑analysis + LLM workflow – uses call‑graph and async‑behavior analysis to feed four purpose‑built LLM agents (Architect, Code Developer, SAM Engineer, Consistency Validator).
- Explicit intermediate artifacts – each agent produces concrete artifacts (e.g., service decomposition plan, generated code, SAM templates) that are validated before moving to the next stage, improving reliability.
- Curated SAM knowledge base – a lightweight, domain‑specific prompt library that steers the LLMs toward correct AWS‑native patterns (IAM, API Gateway, Lambda async).
- Strong empirical results – on six open‑source backends (10 K+ LOC, 76 endpoints) Mono2Sls attains 100 % deployment success, 66.1 % end‑to‑end functional correctness, and 98.7 % API‑coverage F1, outperforming commercial baselines by 5–12 percentage points.
- Ablation showing static analysis impact – removing the static‑analysis‑driven architecture step drops correctness by 23.4 pts, confirming its central role.
Methodology
- Static Entry‑Point Discovery – a lightweight analyzer scans the monolith to locate HTTP handlers, background workers, and database access points. It builds a call graph and flags asynchronous constructs (e.g., async/await, message queues); a minimal discovery sketch follows this list.
- Service Decomposition (Architect Agent) – the call‑graph data is fed to an LLM that proposes a microservice‑style split, mapping each entry point to a candidate Lambda function or a set of related Lambdas. The output is a service blueprint (function boundaries, required IAM permissions, event sources).
- Code Generation (Code Developer Agent) – using the blueprint, a second LLM rewrites the original source into Lambda‑compatible modules, extracting shared utilities, refactoring stateful code into stateless handlers, and inserting AWS SDK calls where needed; a before/after rewrite is sketched after this list.
- Infrastructure Synthesis (SAM Engineer Agent) – a third LLM consumes the generated code and produces a SAM template (YAML) that declares functions, API Gateway routes, DynamoDB tables, S3 buckets, and IAM roles. The agent consults the curated SAM knowledge base to ensure best‑practice configurations (e.g., provisioned concurrency, async event bridges); a minimal template‑synthesis sketch follows this list.
- Consistency Validation (Validator Agent) – before deployment, the final agent runs a suite of checks: schema validation of the SAM template, static type checks on the generated code, and a diff against the original API surface to guarantee coverage (sketched below). Any mismatch triggers a feedback loop to the earlier agents.
- Deployment & Test – the resulting SAM package is deployed to a sandbox AWS account and exercised with automatically generated API tests derived from the original monolith’s routing table.
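The paper does not include implementation listings, so the following sketches illustrate selected stages under explicitly stated assumptions. First, entry‑point discovery: a minimal approximation using Python's `ast` module, assuming a Flask codebase; `discover_entry_points` and the returned field names are illustrative, not the paper's API.

```python
import ast
from pathlib import Path

def discover_entry_points(source_dir: str) -> list[dict]:
    """Scan Python files for Flask-style route handlers and async constructs."""
    entry_points = []
    for path in Path(source_dir).rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            for dec in node.decorator_list:
                # Match decorators shaped like @app.route("/path", ...).
                if (isinstance(dec, ast.Call)
                        and isinstance(dec.func, ast.Attribute)
                        and dec.func.attr == "route"):
                    route = (dec.args[0].value
                             if dec.args and isinstance(dec.args[0], ast.Constant)
                             else None)
                    entry_points.append({
                        "file": str(path),
                        "handler": node.name,
                        "route": route,
                        # Async handlers are flagged so the decomposition step
                        # can map them to event-driven (non-HTTP) triggers.
                        "is_async": isinstance(node, ast.AsyncFunctionDef),
                    })
    return entry_points
```

A full analyzer would also resolve the call graph reachable from each handler and detect message‑queue clients; this sketch covers only the HTTP surface.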
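The stateless rewrite performed by the Code Developer agent can be pictured as a before/after pair. This is a hand‑written illustration, not the system's actual output; the `Users` table and the route are hypothetical.

```python
import json
import boto3

# Before (monolith), roughly:
#
#     @app.route("/users/<user_id>")
#     def get_user(user_id):
#         return jsonify(USERS_CACHE[user_id])   # in-process state
#
# After (serverless): clients created at module level are reused across warm
# invocations, but all request state lives in the event and in DynamoDB.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("Users")  # hypothetical table declared in the SAM template

def lambda_handler(event, context):
    """API Gateway proxy handler replacing the Flask route GET /users/{user_id}."""
    user_id = event["pathParameters"]["user_id"]
    item = table.get_item(Key={"user_id": user_id}).get("Item")
    if item is None:
        return {"statusCode": 404, "body": json.dumps({"error": "user not found"})}
    return {"statusCode": 200, "body": json.dumps(item)}
```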
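The SAM Engineer agent's output can likewise be approximated by mechanically emitting a template from the discovered entry points. A minimal sketch assuming PyYAML is installed; the runtime, handler layout, and naming scheme are placeholder choices, and a production template would also declare IAM policies, data stores, and async event sources.

```python
import yaml  # PyYAML, assumed available

def synthesize_sam_template(entry_points: list[dict]) -> str:
    """Emit a minimal SAM template covering the discovered HTTP entry points."""
    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Transform": "AWS::Serverless-2016-10-31",
        "Resources": {},
    }
    for ep in entry_points:
        logical_id = ep["handler"].title().replace("_", "") + "Function"
        # Flask-style "<user_id>" parameters become "{user_id}" in API Gateway.
        path = (ep["route"] or "/").replace("<", "{").replace(">", "}")
        template["Resources"][logical_id] = {
            "Type": "AWS::Serverless::Function",
            "Properties": {
                "Handler": f"src.{ep['handler']}.lambda_handler",  # placeholder layout
                "Runtime": "python3.12",
                "Events": {
                    "HttpGet": {
                        "Type": "Api",
                        "Properties": {"Path": path, "Method": "get"},
                    }
                },
            },
        }
    return yaml.safe_dump(template, sort_keys=False)
```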
All agents communicate through explicit artifacts (blueprint, code diff, SAM YAML, validation report) rather than hidden prompt state, which makes the pipeline auditable and easier to debug.
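Because the stages exchange plain artifacts, a check like the Validator's API‑surface diff can operate on ordinary data structures rather than prompt state. A minimal sketch, assuming the original routes are `(METHOD, path)` tuples and the SAM template has been parsed into a dict:

```python
def diff_api_surface(original_routes: set, sam_template: dict) -> set:
    """Return the original (METHOD, path) endpoints missing from the template.

    A non-empty result corresponds to the coverage mismatch that triggers
    the pipeline's feedback loop to the earlier agents.
    """
    declared = set()
    for resource in sam_template.get("Resources", {}).values():
        if resource.get("Type") != "AWS::Serverless::Function":
            continue
        for event in resource.get("Properties", {}).get("Events", {}).values():
            props = event.get("Properties", {})
            if event.get("Type") == "Api" and "Path" in props:
                declared.add((props.get("Method", "any").upper(), props["Path"]))
    return original_routes - declared
```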
Results & Findings
| Metric | Mono2Sls | Baselines (commercial / manual) |
|---|---|---|
| Deployment success (no manual fix) | 100 % | 78–85 % |
| End‑to‑end functional correctness* | 66.1 % | 53.7–61.2 % |
| API‑coverage F1 (detecting all original endpoints) | 98.7 % | 88.4 % |
| Average time to migrate a 10 K‑LOC app | ~45 min (including AWS deployment) | 2–4 h (manual effort) |
*Correctness measured by passing a curated integration test suite that exercises business logic and data‑store interactions.
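(API‑coverage F1 is presumably the standard harmonic mean of precision and recall: with P the fraction of generated endpoints that match an original endpoint and R the fraction of original endpoints recovered, F1 = 2PR / (P + R).)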
Additional observations
- Migrated services consistently used AWS‑native authentication (Cognito, IAM) instead of custom token checks, reducing surface area for security bugs.
- Asynchronous patterns (SQS, EventBridge) were introduced automatically where the original code used background threads, leading to better scalability; a before/after sketch follows this list.
- The ablation study showed that removing static‑analysis‑guided architecture planning caused a 23.4‑point drop in correctness, confirming that the upfront decomposition is the pipeline’s linchpin.
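The thread‑to‑queue conversion can be pictured with a small before/after sketch; the queue URL and the email helper are hypothetical stand‑ins for real business logic.

```python
import json
import boto3

# Before (monolith), roughly:
#
#     threading.Thread(target=send_welcome_email, args=(user_id,)).start()
#
# After: the producer enqueues a durable SQS message, and a separate
# SQS-triggered Lambda consumes it at its own pace.
sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/welcome-emails"  # hypothetical

def enqueue_welcome_email(user_id: str) -> None:
    """Producer side: replaces the fire-and-forget thread with a queue send."""
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps({"user_id": user_id}))

def lambda_handler(event, context):
    """Consumer side: SQS delivers records to this Lambda in batches."""
    for record in event["Records"]:
        send_welcome_email(json.loads(record["body"])["user_id"])

def send_welcome_email(user_id: str) -> None:
    print(f"sending welcome email to {user_id}")  # stand-in for real business logic
```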
Practical Implications
- Speed up cloud migration projects – Teams can spin up a serverless version of an existing monolith in under an hour, freeing engineers to focus on business features rather than plumbing.
- Lower the expertise barrier – Developers unfamiliar with SAM, IAM, or Lambda nuances can rely on the curated knowledge base, reducing the need for specialized cloud architects.
- Cost‑effective scaling – By automatically refactoring long‑running background jobs into event‑driven Lambdas, organizations can benefit from pay‑per‑use pricing and avoid over‑provisioned VMs.
- Security hardening out‑of‑the‑box – The pipeline’s validator enforces least‑privilege IAM roles and replaces ad‑hoc auth checks with managed services, helping compliance teams.
- Blueprint for LLM‑augmented DevOps – The multi‑stage, artifact‑driven design can be adapted to other migration scenarios (e.g., monolith‑to‑Kubernetes, legacy‑to‑microservices) or to generate infrastructure‑as‑code for new projects.
Limitations & Future Work
- Language & framework scope – The current prototype targets Python‑based web backends (Flask/Django). Extending to Java, Node.js, or Go will require additional static‑analysis adapters.
- Functional correctness ceiling – While 66 % end‑to‑end correctness is impressive for a fully automated run, the remaining gaps stem from complex business rules that the LLMs misinterpret; a hybrid human‑in‑the‑loop review could push this higher.
- Runtime performance variance – Serverless execution introduces cold‑start latency and different concurrency limits; the paper does not evaluate performance or cost trade‑offs compared to the original monolith.
- Reliance on LLM quality – The pipeline’s success hinges on the underlying LLM’s ability to follow prompts accurately; model updates or hallucinations could break the artifact chain.
- Security audit depth – The validator checks IAM permissions but does not perform full static security analysis (e.g., secret leakage, injection risks). Future work could integrate dedicated security scanners.
By addressing these points, Mono2Sls could evolve from a research prototype into a production‑grade migration service that cloud‑first enterprises can adopt.
Authors
- Xingyan Chen
- Yuxin Su
- Zishan Su
- Yang Yu
- Zibin Zheng
Paper Information
- arXiv ID: 2604.24550v1
- Categories: cs.SE
- Published: April 27, 2026