[Paper] Delay-Aware Multi-Stage Edge Server Upgrade with Budget Constraint
Source: arXiv - 2512.16792v1
Overview
The paper introduces Multi‑stage Edge Server Upgrade (M‑ESU), a planning framework that helps operators evolve a Multi‑Access Edge Computing (MEC) infrastructure over several years while staying within a fixed budget per stage. By jointly deciding where to add new edge servers, which existing ones to upgrade, and how to offload tasks, the authors aim to maximize the proportion of user requests that meet their latency requirements as workloads grow and become more demanding.
Key Contributions
- Formal definition of a multi‑stage MEC upgrade problem that captures realistic constraints: per‑stage budget, depreciation of hardware costs, evolving task volumes, larger payloads, and tighter delay bounds.
- Two solution approaches:
- An exact Mixed‑Integer Linear Programming (MILP) model for small‑scale networks.
- A scalable heuristic (M‑ESU/H) that delivers near‑optimal decisions for large deployments.
- Comprehensive evaluation showing that the heuristic stays within 1.25 % of the MILP optimum on small instances and outperforms three baseline heuristics by up to 21.57 % in task‑satisfaction ratio on large networks.
- Practical guidelines on how to balance server deployment vs. capacity upgrades under a rolling‑budget scenario.
Methodology
- System Model – The authors model a set of edge sites, each with an existing compute capacity and a cost to either add a new server or upgrade the current one. User tasks arrive with known size distributions, growth rates, and deadline constraints that become stricter each stage.
- Decision Variables – For every stage the optimizer chooses:
- Deploy (binary) – whether to install a brand‑new server at a site.
- Upgrade (binary) – whether to increase the CPU cores/memory of an existing server.
- Offload (continuous) – the fraction of tasks routed to each server.
- Objective – Maximize the average task‑satisfaction ratio, i.e., the proportion of tasks that finish within their deadline, averaged across all stages (formalized in the first sketch after this list).
- Constraints – Include per‑stage budget, depreciation of capital expenses, server capacity limits, and the evolving demand profile.
- Solution Strategies –
- MILP: Encodes the problem as a mixed‑integer linear program, with binary deploy/upgrade variables and continuous offloading fractions; commercial solvers can handle it only for tiny topologies (≤ 10 sites).
- M‑ESU/H heuristic: A greedy algorithm that iteratively applies the deployment or upgrade action with the largest marginal gain in task satisfaction per dollar spent, while respecting the per‑stage budget; after each hardware decision it re‑optimizes task offloading via a simple linear assignment (a runnable toy version follows the list).
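Since this summary does not reproduce the paper's notation, the following is a minimal, assumed formalization of the objective and the per‑stage budget constraint; all symbol names are illustrative, not the authors':

```latex
% Assumed notation (illustrative): stages t = 1..T, sites s = 1..S;
% x_{s,t}, u_{s,t} \in \{0,1\} are the deploy / upgrade decisions at site s in stage t;
% c^{dep}_{s,t}, c^{upg}_{s,t} are the (depreciated) costs and B_t the stage-t budget;
% N_t is the stage-t task count and sat_t the number finishing within deadline.
\max \; \frac{1}{T} \sum_{t=1}^{T} \frac{\mathrm{sat}_t}{N_t}
\qquad \text{s.t.} \qquad
\sum_{s=1}^{S} \Big( c^{\mathrm{dep}}_{s,t}\, x_{s,t} + c^{\mathrm{upg}}_{s,t}\, u_{s,t} \Big) \le B_t
\quad \forall t .
```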
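To make the greedy loop concrete, here is a self‑contained Python sketch of the "marginal gain per dollar" rule for a single stage. This is not the authors' code: the Site model, the stand‑in satisfaction() estimator (which ignores delay and simply fills capacity), and all numbers are illustrative assumptions; the paper's version evaluates delay‑aware satisfaction and re‑solves offloading after every hardware decision.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    capacity: float        # tasks per stage this site can currently satisfy
    deploy_cost: float     # cost of installing a new server here
    upgrade_cost: float    # cost of boosting the existing server
    deploy_gain: float     # capacity added by a new server
    upgrade_gain: float    # capacity added by an upgrade
    acted: set = field(default_factory=set)  # deploy/upgrade are binary per stage


def satisfaction(sites, demand):
    """Toy stand-in for the paper's delay-aware offloading step:
    serve tasks up to total capacity and report the served fraction."""
    return min(demand, sum(s.capacity for s in sites)) / demand


def plan_stage(sites, demand, budget):
    """One planning stage: repeatedly apply the affordable deploy/upgrade
    action with the highest satisfaction gain per dollar."""
    actions = []
    while True:
        base = satisfaction(sites, demand)
        best = None  # (gain per dollar, site index, kind, cost, gain)
        for i, s in enumerate(sites):
            for kind, cost, gain in (("deploy", s.deploy_cost, s.deploy_gain),
                                     ("upgrade", s.upgrade_cost, s.upgrade_gain)):
                if kind in s.acted or cost > budget:
                    continue
                s.capacity += gain                    # tentatively apply...
                delta = satisfaction(sites, demand) - base
                s.capacity -= gain                    # ...and roll back
                if delta > 0 and (best is None or delta / cost > best[0]):
                    best = (delta / cost, i, kind, cost, gain)
        if best is None:                              # nothing affordable helps
            return actions
        _, i, kind, cost, gain = best
        sites[i].capacity += gain
        sites[i].acted.add(kind)
        budget -= cost
        actions.append((kind, i, cost))


# Three hypothetical sites and one stage: 60 tasks of demand, budget of 100.
sites = [Site(10, 60, 30, 25, 10), Site(5, 80, 20, 25, 8), Site(8, 70, 40, 25, 12)]
print(plan_stage(sites, demand=60.0, budget=100.0))   # [('deploy', 0, 60), ('upgrade', 1, 20)]
print(f"satisfaction: {satisfaction(sites, demand=60.0):.2f}")   # 0.93
```

In this toy run the planner buys the deployment with the best gain per dollar first, then spends the remaining budget on the cheapest effective upgrade, which mirrors the budget‑sensitivity behavior reported below.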
Results & Findings
| Scenario | Comparison | Gap to MILP optimum | Speedup vs. MILP | Task‑satisfaction gain over baseline |
|---|---|---|---|---|
| ≤ 10 edge sites | M‑ESU/H vs. MILP | ≤ 1.25 % | 10³–10⁴× | — |
| 50–200 edge sites | M‑ESU/H vs. Deploy‑Only heuristic | — | — | +12.3 % (up to +21.57 %) |
| 200+ edge sites | M‑ESU/H vs. Upgrade‑First heuristic | — | — | +9.8 % (up to +15.4 %) |
- Budget sensitivity: When per‑stage budgets are tight, the heuristic shifts spending toward upgrades, which buy satisfaction at a lower marginal cost than new deployments, and thereby preserves satisfaction levels.
- Demand growth: As task volume grows at 10 %/yr and deadlines shrink, the combined deployment‑upgrade strategy maintains a higher satisfaction ratio than any single‑focus policy.
- Scalability: The heuristic runs in linear time with respect to the number of sites, making it suitable for city‑wide MEC rollouts.
Practical Implications
- Road‑map planning – Network operators can use M‑ESU/H to generate a multi‑year upgrade calendar that aligns capital expenditures with expected traffic growth, avoiding over‑provisioning.
- Dynamic offloading – The framework’s offloading component can be integrated into existing edge orchestrators (e.g., Kubernetes‑based edge clusters) to automatically rebalance workloads after each hardware change.
- Cost‑effective scaling – By quantifying the trade‑off between adding new nodes vs. boosting existing ones, engineers can make data‑driven decisions that maximize ROI under fiscal constraints.
- Vendor‑agnostic – The model only requires cost, capacity, and depreciation inputs, so it can be applied regardless of hardware vendor or cloud‑edge provider (an illustrative input record follows this list).
- Edge‑AI workloads – For latency‑critical AI inference (AR/VR, autonomous driving), the approach ensures that the edge infrastructure evolves fast enough to meet tightening latency SLAs.
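As an illustration of how small that input surface is, a per‑site record for one planning stage might look like the following; the field names and values are hypothetical, not from the paper:

```python
# Hypothetical input record for one site in one planning stage; only
# costs, capacity, and a depreciation factor are needed by the model.
site_input = {
    "site_id": "cell-017",        # operator's identifier (illustrative)
    "current_capacity": 12.0e9,   # e.g., aggregate CPU cycles per second
    "deploy_cost": 18_000.0,      # USD, installing a new server here
    "upgrade_cost": 7_500.0,      # USD, boosting the existing server
    "depreciation_rate": 0.15,    # fraction of capex value lost per stage
}
```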
Limitations & Future Work
- Static demand forecasts – The current model assumes deterministic growth rates; real‑world demand can be bursty or seasonal. Incorporating stochastic demand or learning‑based forecasts would improve robustness.
- Single‑objective focus – Maximizing task‑satisfaction ignores other QoS dimensions such as energy consumption or reliability; a multi‑objective extension could be valuable.
- Hardware heterogeneity – The heuristic treats all servers as interchangeable upgrades; future work could model diverse hardware generations (GPUs, TPUs) and their specific cost‑performance curves.
- Real‑time re‑optimization – The study evaluates offline, stage‑by‑stage planning. Integrating the algorithm into an online controller that reacts to unexpected load spikes is an open research direction.
Authors
- Endar Suprih Wihidayat
- Sieteng Soh
- Kwan‑Wu Chin
- Duc‑Son Pham
Paper Information
- arXiv ID: 2512.16792v1
- Categories: cs.DC, cs.AI
- Published: December 18, 2025
- PDF: https://arxiv.org/pdf/2512.16792v1