[Paper] Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing

Published: February 27, 2026 at 06:35 AM EST
4 min read
Source: arXiv

Overview

The paper tackles a core tension in serverless platforms: keeping function instances “warm” to avoid costly cold‑starts versus releasing them to cut idle power use and associated carbon emissions. By treating the keep‑alive decision as a sequential learning problem, the authors introduce LACE‑RL, a reinforcement‑learning controller that dynamically adjusts keep‑alive durations based on real‑time workload and grid carbon intensity. Their results show substantial reductions in both cold‑start latency and carbon waste, making serverless more sustainable without sacrificing performance.

Key Contributions

  • LACE‑RL framework: a deep reinforcement‑learning (RL) controller that jointly optimizes latency (cold‑start avoidance) and carbon impact (idle emissions).
  • Latency‑aware carbon model: integrates per‑function cold‑start probabilities, function‑specific latency penalties, and real‑time grid carbon intensity into a single reward signal.
  • Dynamic keep‑alive policy: replaces static, one‑size‑fits‑all keep‑alive timers with per‑function, time‑varying decisions learned online.
  • Extensive evaluation: uses the Huawei Public Cloud trace (real‑world workload + regional carbon data) to benchmark against Huawei’s static policy, heuristic baselines, and single‑objective RL approaches.
  • Near‑optimal trade‑off: achieves performance close to an oracle that knows future workload, while cutting cold‑starts by ≈ 52 % and idle carbon by ≈ 77 % relative to the static baseline.
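The per‑function, time‑varying keep‑alive decision can be sketched as greedy selection over a small discrete set of candidate durations, as in a value‑based RL agent. This is a minimal illustration, not the paper's implementation: the action set, state features, and `toy_q` value function below are all assumptions chosen for demonstration.

```python
ACTIONS = (0, 30, 60, 120, 300)  # candidate keep-alive durations in seconds (illustrative)

def keep_alive_seconds(state, q_fn, actions=ACTIONS):
    """Greedily pick a keep-alive duration for one function instance.

    `state` bundles features such as recent request rate and grid carbon
    intensity; `q_fn(state)` returns one learned value estimate per action.
    Names and shapes are hypothetical."""
    qs = q_fn(state)
    best = max(range(len(actions)), key=lambda i: qs[i])
    return actions[best]

def toy_q(state):
    """Stand-in value function: longer timers look better under high demand,
    worse under high carbon intensity."""
    request_rate, carbon_intensity = state
    return [request_rate * a - carbon_intensity * a * 0.01 for a in ACTIONS]
```

With this toy value function, a busy function on a clean grid keeps its pod warm for the full 300 s, while an idle function on a dirty grid releases it immediately — the qualitative behavior the paper's learned policy exhibits.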

Methodology

  1. Problem formulation – The decision of how long to keep a function pod alive after it finishes is modeled as a Markov Decision Process (MDP). The state captures recent request rates, the current carbon intensity of the power grid, and the function’s historical cold‑start latency.
  2. Reward design – The reward penalizes both latency (cold‑start delay weighted by a service‑level objective) and carbon emissions (kWh × grid carbon intensity) incurred while a pod stays idle. A tunable coefficient lets operators prioritize one objective over the other.
  3. Deep RL agent – A dueling DQN (Deep Q‑Network) with experience replay learns a policy that maps states to a keep‑alive duration (discrete action space). The network is trained online using the live trace, allowing it to adapt to diurnal workload and carbon‑intensity patterns.
  4. Baseline comparisons – The authors implement Huawei’s static keep‑alive timer, a heuristic that scales the timer with request rate, and single‑objective RL agents (latency‑only, carbon‑only).
  5. Evaluation metrics – Cold‑start frequency, average request latency, idle‑time carbon emissions, and a combined latency‑carbon trade‑off score are reported.
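The reward described in step 2 can be sketched as a weighted sum of a normalized SLO latency penalty and an idle‑carbon cost (energy × grid carbon intensity). The exact functional form, units, and default constants here are assumptions for illustration; only the ingredients and the tunable trade‑off coefficient come from the paper.

```python
def reward(cold_start_delay_ms, idle_energy_kwh, carbon_intensity_g_per_kwh,
           slo_ms=100.0, beta=0.5):
    """Combined latency-carbon reward (a negative cost to maximize).

    beta is the operator-tunable trade-off: beta=1 penalizes latency only,
    beta=0 penalizes idle carbon only. All constants are illustrative."""
    latency_cost = cold_start_delay_ms / slo_ms                   # SLO-normalized delay
    carbon_cost = idle_energy_kwh * carbon_intensity_g_per_kwh    # grams of CO2 while idle
    return -(beta * latency_cost + (1.0 - beta) * carbon_cost)
```

Sweeping `beta` from 0 to 1 recovers the carbon‑only and latency‑only RL baselines from the evaluation as special cases of the same objective.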

Results & Findings

| Metric | Huawei static policy | Heuristic | Latency‑only RL | Carbon‑only RL | LACE‑RL |
|---|---|---|---|---|---|
| Cold‑starts (↓) | 100 % (baseline) | −22 % | −41 % | −35 % | −52 % |
| Idle carbon (kWh) | 100 % (baseline) | −45 % | −60 % | −71 % | −77 % |
| Avg. latency (ms) | 120 | 115 | 108 | 112 | 104 |
| Combined score (higher is better) | 0.0 | 0.12 | 0.18 | 0.16 | 0.23 |
  • Cold‑starts drop by more than half, directly improving user‑perceived latency.
  • Idle carbon falls by about three‑quarters, showing that the RL agent aggressively releases pods when the grid is dirty or workload is low.
  • The latency‑carbon trade‑off curve of LACE‑RL dominates all baselines and sits within 5 % of an oracle that knows future requests, confirming near‑optimal decision making.

Practical Implications

  • Serverless providers can embed LACE‑RL (or a similar RL controller) into their orchestration layer to automatically adapt keep‑alive timers per function, reducing operational costs and carbon footprints without manual tuning.
  • DevOps teams gain a knob (the latency‑vs‑carbon weight) to align platform behavior with corporate sustainability goals or SLA requirements.
  • Edge and hybrid cloud deployments—where carbon intensity can swing dramatically—benefit especially from dynamic policies that react to real‑time grid data.
  • Cost modeling: lower idle power translates to measurable savings on cloud bills, while fewer cold‑starts improve end‑user experience, potentially increasing adoption of serverless architectures.

Limitations & Future Work

  • Training data dependency: LACE‑RL relies on historical request traces and accurate, timely grid carbon intensity feeds; noisy or delayed data could degrade performance.
  • Scalability of the RL agent: The current implementation uses a single global model; scaling to thousands of functions with heterogeneous characteristics may require hierarchical or federated learning approaches.
  • Policy interpretability: Deep RL policies are opaque, making it hard for operators to audit decisions—future work could explore explainable RL or rule‑extraction techniques.
  • Generalization across clouds: Experiments are limited to Huawei’s public cloud trace; validating on other providers (AWS, Azure, GCP) and on multi‑region workloads is an open step.

Overall, the paper demonstrates that intelligent, data‑driven keep‑alive management can reconcile the competing goals of low latency and low carbon, paving the way for greener serverless computing.

Authors

  • Bowen Sun
  • Christos D. Antonopoulos
  • Evgenia Smirni
  • Bin Ren
  • Nikolaos Bellas
  • Spyros Lalis

Paper Information

  • arXiv ID: 2602.23935v1
  • Categories: cs.DC, cs.AI, cs.PF
  • Published: February 27, 2026