[Paper] High-Performance Serverless Computing: A Systematic Literature Review on Serverless for HPC, AI, and Big Data

Published: January 14, 2026 at 05:10 AM EST
4 min read
Source: arXiv

Overview

The paper surveys the fast‑growing intersection of serverless computing with high‑performance computing (HPC), artificial intelligence (AI), and big‑data workloads. By systematically reviewing 122 articles from 2018‑2025, the authors map how the “functions‑as‑a‑service” model is being adapted to run compute‑intensive, parallel jobs on cloud, HPC, and hybrid infrastructures.

Key Contributions

  • Comprehensive systematic literature review (SLR) covering 122 peer‑reviewed papers, providing the most up‑to‑date snapshot of serverless for HPC/AI/Big Data.
  • Taxonomy of research directions: eight primary categories (runtime optimization, data locality, scheduling, security, programming models, resource provisioning, hybrid orchestration, and performance benchmarking).
  • Use‑case taxonomy: nine domains such as scientific simulations, deep‑learning training/inference, graph analytics, stream processing, and large‑scale ETL pipelines.
  • Trend analysis: visualizes publication spikes, emerging sub‑fields, and the rise of cross‑disciplinary collaborations.
  • Collaboration network mapping: identifies key research clusters, influential authors, and institutions driving the field.
  • Practical guidance: distilled best‑practice recommendations for engineers looking to adopt serverless for compute‑heavy workloads.

Methodology

  1. Search Strategy – The authors queried major digital libraries (IEEE Xplore, ACM DL, Scopus, arXiv) using a curated list of keywords (e.g., “serverless”, “FaaS”, “HPC”, “AI”, “big data”).
  2. Inclusion/Exclusion Criteria – Papers had to (a) focus on serverless as the primary execution model, (b) target compute‑intensive workloads, and (c) present empirical results or a solid conceptual framework. Non‑English papers, tutorials, and pure cloud‑only case studies were filtered out.
  3. Data Extraction – For each selected article, the team recorded metadata (year, venue, authors), research objectives, architectural choices, performance metrics, and reported challenges.
  4. Synthesis – Using qualitative coding, the authors clustered the papers into thematic groups, then built the two taxonomies (research directions & use‑case domains). Bibliometric tools (VOSviewer) generated the collaboration graphs and trend plots.
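The screening step of such a pipeline can be sketched as a simple filter. This is a minimal illustration, not the authors' actual extraction schema: the `Paper` fields and category values are made up for the example, while the criteria mirror those listed above.

```python
from dataclasses import dataclass

@dataclass
class Paper:
    title: str
    year: int
    language: str       # ISO code, e.g. "en"
    focus: str          # primary execution model, e.g. "serverless", "vm"
    workload: str       # target workload class, e.g. "hpc", "ai", "bigdata"
    has_evaluation: bool  # empirical results or a solid conceptual framework

def passes_screening(p: Paper) -> bool:
    """Apply the review's inclusion/exclusion criteria (simplified)."""
    return (
        p.language == "en"                          # non-English papers excluded
        and 2018 <= p.year <= 2025                  # review window
        and p.focus == "serverless"                 # (a) serverless as primary model
        and p.workload in {"hpc", "ai", "bigdata"}  # (b) compute-intensive target
        and p.has_evaluation                        # (c) empirical/conceptual rigor
    )

corpus = [
    Paper("FaaS for MPI workloads", 2022, "en", "serverless", "hpc", True),
    Paper("VM autoscaling survey", 2021, "en", "vm", "web", True),
]
selected = [p for p in corpus if passes_screening(p)]
```

In a real SLR the surviving set would then go through qualitative coding and bibliometric analysis, as described in step 4.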

The process follows standard SLR guidelines (Kitchenham & Charters) to ensure reproducibility and minimize bias.

Results & Findings

| Finding | What it means |
| --- | --- |
| **Rapid growth** – Annual publications rose from fewer than 5 in 2018 to more than 30 in 2024. | The community is quickly recognizing serverless as a viable HPC/AI platform. |
| **Hybrid orchestration dominates** – 38% of papers focus on bridging cloud FaaS with traditional HPC schedulers (e.g., Slurm, PBS). | Real‑world deployments need seamless integration with existing HPC clusters. |
| **Performance bottlenecks** – Cold‑start latency and limited GPU/FPGA access remain the top challenges. | Optimizing function initialization and exposing accelerators are critical research fronts. |
| **Data locality matters** – 62% of successful prototypes co‑locate storage and compute (e.g., using object‑store triggers). | Reducing data movement is essential for scaling AI training and big‑data analytics. |
| **Programming model evolution** – New DSLs and extensions to existing frameworks (e.g., PyWren, Cloudburst) are emerging to express parallelism. | Developers can write familiar Python/Scala code while the runtime handles function sharding. |
| **Security & multi‑tenant isolation** – Only 15% of studies address isolation guarantees for HPC workloads. | There is a gap in robust security models for sensitive scientific data. |

Practical Implications

  • For Cloud‑Native AI Engineers – Serverless can offload bursty inference workloads, auto‑scale GPU functions, and reduce operational overhead compared to managing VMs or containers.
  • For HPC Administrators – Hybrid orchestration layers allow existing batch systems to tap into elastic cloud bursts without re‑architecting job scripts.
  • For Data Engineers – Event‑driven pipelines (e.g., Kafka → Lambda → S3) can now incorporate heavy transformations (e.g., map‑reduce, graph processing) by leveraging function parallelism.
  • Cost Optimization – Pay‑per‑use billing aligns well with irregular scientific workloads, potentially lowering total cost of ownership when combined with spot‑instance or pre‑emptible function offerings.
  • Tooling Roadmap – The taxonomy highlights where tooling is mature (e.g., Python‑based FaaS SDKs) and where gaps remain (e.g., GPU‑aware schedulers, secure multi‑tenant data pipelines).
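The function‑parallelism idea behind such event‑driven pipelines can be sketched locally. In the snippet below, threads stand in for concurrent function invocations and the record schema is invented for illustration; in production each record would trigger one function instance via a queue or object‑store event.

```python
from concurrent.futures import ThreadPoolExecutor

def transform(record: dict) -> dict:
    """The per-event transformation a single function invocation would perform."""
    return {"id": record["id"], "value": record["value"] * 2}

def fan_out(records: list) -> list:
    # Threads emulate the platform fanning one invocation out per record;
    # map() preserves input order, as a result-aggregation step typically must.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(transform, records))

results = fan_out([{"id": i, "value": i} for i in range(10)])
```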

Developers can start experimenting with open‑source serverless runtimes (OpenFaaS, Knative) that expose low‑level resource controls. GPU access in mainstream managed FaaS offerings remains limited, though serverless container platforms are beginning to expose GPUs (e.g., Google Cloud Run).
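As a concrete starting point, an OpenFaaS‑style Python handler looks roughly like the sketch below: the Python template passes the request body in as a string and expects a string back. The sum‑of‑squares kernel is a placeholder for real compute, and the payload field name is an assumption for the example.

```python
import json

def handle(req: str) -> str:
    """Minimal OpenFaaS-style handler: parse the request body,
    run a compute kernel, and return a JSON string."""
    payload = json.loads(req) if req else {}
    n = payload.get("n", 0)
    # Placeholder for a compute-heavy kernel (here: sum of squares 0..n-1).
    result = sum(i * i for i in range(n))
    return json.dumps({"sum_of_squares": result})
```

Deploying it is then a matter of the runtime's usual packaging workflow (e.g., `faas-cli up` for OpenFaaS).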

Limitations & Future Work

  • Scope of literature – The review only includes papers up to early 2025; fast‑moving pre‑prints and industry whitepapers may be under‑represented.
  • Empirical depth – While the SLR aggregates reported metrics, it does not re‑run experiments, so cross‑paper performance comparisons may be affected by differing hardware and benchmark setups.
  • Security focus – The authors note a paucity of work on isolation and compliance for HPC data; future research should explore sandboxing, attestation, and confidential computing in serverless contexts.
  • Standardization – The field lacks a unified API for exposing accelerators and high‑speed interconnects; establishing open standards could accelerate adoption.

The authors call for more cross‑disciplinary collaborations, benchmark suites tailored to serverless HPC, and deeper investigations into latency‑critical AI inference at the edge.

Authors

  • Valerio Besozzi
  • Matteo Della Bartola
  • Patrizio Dazzi
  • Marco Danelutto

Paper Information

  • arXiv ID: 2601.09334v1
  • Categories: cs.DC, cs.LG
  • Published: January 14, 2026