[Paper] KubePACS: Kubernetes Cluster Using Performant, Highly Available, and Cost Efficient Spot Instances
Source: arXiv - 2604.24027v1
Overview
KubePACS is a Kubernetes‑native system that automatically builds node pools from spot (pre‑emptible) instances, striking a balance between low cost, high performance, and strong availability guarantees. By treating instance‑type selection as a multi‑objective optimization problem and plugging into the popular Karpenter autoscaler, the authors show that workloads can deliver up to 81 % more performance per dollar than existing spot‑aware solutions.
Key Contributions
- Multi‑objective formulation: Combines real‑time spot prices, benchmarked performance, and a novel Spot Placement Score (SPS) into a single optimization model.
- Efficient ILP solver: Uses Integer Linear Programming guided by a Golden Section Search to find near‑optimal node‑pool configurations quickly enough for production autoscaling loops.
- Karpenter integration: Extends the open‑source Karpenter autoscaler, enabling joint decisions on what instance types to use and how many nodes to launch.
- Workload‑aware heuristics: Allows developers to bias the optimizer toward specialized instance families (e.g., GPU, high‑memory) by scaling performance metrics.
- Comprehensive evaluation: Benchmarks on synthetic and real‑world workloads demonstrate an average 55 % improvement in performance‑per‑dollar over a price‑only Karpenter baseline, with peaks of 81 %, also surpassing state‑of‑the‑art spot provisioning tools (SpotVerse, SpotKube).
Methodology
1. Data collection – KubePACS continuously scrapes three data streams:
   - Spot market prices (per‑region, per‑instance).
   - Performance benchmarks (CPU, memory, network, and specialized accelerators).
   - Historical interruption rates, aggregated into the Spot Placement Score (higher SPS = more reliable).
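As a rough sketch of how these three streams could be blended into a single per-instance score (the field names, weights, and prices below are invented for illustration, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    spot_price: float   # $/hour, scraped from the spot market
    perf_score: float   # benchmarked performance (arbitrary units)
    sps: float          # Spot Placement Score in [0, 1]; higher = more reliable

def score(c: Candidate, alpha: float = 0.7) -> float:
    """Weighted blend of performance-per-dollar and reliability (illustrative)."""
    perf_per_dollar = c.perf_score / c.spot_price
    return alpha * perf_per_dollar + (1 - alpha) * c.sps

candidates = [
    Candidate("m5.large",  spot_price=0.035, perf_score=1.0, sps=0.9),
    Candidate("c5.xlarge", spot_price=0.068, perf_score=2.1, sps=0.6),
]
best = max(candidates, key=score)
```

With these made-up numbers, the cheaper-but-slower type loses to the type with better performance per dollar, even though its SPS is lower; shifting `alpha` toward 0 would flip that ordering.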
2. Optimization model – The problem is expressed as an Integer Linear Program:
   - Objective: maximize a weighted sum of performance‑per‑dollar and SPS, subject to workload resource constraints (CPU, memory, etc.).
   - Variables: the number of nodes of each candidate instance type.
   - Constraints: minimum required capacity, budget caps, and optional affinity rules (e.g., “prefer GPU nodes”).
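The paper's ILP formulation is not reproduced verbatim here; the following stand-in enumerates a tiny candidate space to show the shape of the objective and constraints (instance data, node limits, and weights are hypothetical, and a real deployment would use an ILP solver rather than brute force):

```python
from itertools import product

# Hypothetical candidates: (name, vCPUs, spot $/hour, SPS in [0, 1]).
CANDIDATES = [("m5.large", 2, 0.035, 0.9), ("c5.xlarge", 4, 0.068, 0.6)]
REQUIRED_VCPUS = 16   # minimum capacity constraint
BUDGET = 0.50         # hourly budget cap
ALPHA = 0.7           # weight between perf-per-dollar and reliability

def objective(counts):
    """Score a node-count vector, or return None if it violates a constraint."""
    vcpus = sum(n * c[1] for n, c in zip(counts, CANDIDATES))
    cost = sum(n * c[2] for n, c in zip(counts, CANDIDATES))
    if vcpus < REQUIRED_VCPUS or cost > BUDGET:
        return None
    avg_sps = sum(n * c[3] for n, c in zip(counts, CANDIDATES)) / sum(counts)
    return ALPHA * (vcpus / cost) + (1 - ALPHA) * avg_sps

# Exhaustive search over up to 10 nodes per type (an ILP solver in practice).
feasible = [c for c in product(range(11), repeat=len(CANDIDATES))
            if objective(c) is not None]
best = max(feasible, key=objective)
```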
3. Solver acceleration – Because solving an ILP exactly can be costly, the authors embed a Golden Section Search that narrows the feasible region for the weighting factor between cost and availability, dramatically reducing solve time while preserving optimality guarantees.
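Golden Section Search is a standard bracketing method; a self-contained sketch follows, with the inner ILP solve replaced by a toy unimodal function so the search logic is visible in isolation:

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618

def golden_section_search(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi]; bracket shrinks by 1/phi per step."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            # Minimum lies in [a, d]; reuse c as the new d.
            b, d = d, c
            c = b - INV_PHI * (b - a)
        else:
            # Minimum lies in [c, b]; reuse d as the new c.
            a, c = c, d
            d = a + INV_PHI * (b - a)
    return (a + b) / 2

# Toy stand-in for "quality of the node pool at weighting factor w":
best_w = golden_section_search(lambda w: (w - 0.3) ** 2, 0.0, 1.0)
```

Because each iteration reuses one previously evaluated point, only one new evaluation (here, one inner ILP solve) is needed per bracket shrink.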
4. Integration with Karpenter – The optimizer runs as a sidecar service. When Karpenter detects a scaling need, KubePACS supplies the optimal mix of instance types; Karpenter then provisions the nodes accordingly.
5. Workload‑specific tuning – Developers can annotate pods with performance preferences; KubePACS scales the corresponding benchmark scores, nudging the optimizer toward the most suitable hardware.
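One way to picture this biasing, assuming a hypothetical annotation key and multiplier table (not the paper's actual API):

```python
# Hypothetical benchmark scores per instance type, before biasing.
benchmarks = {"c5.xlarge": 2.1, "g4dn.xlarge": 1.4, "r5.xlarge": 1.8}

# A pod annotation such as kubepacs.io/prefer: "gpu" (invented key) could
# map to per-type multipliers applied before the optimizer runs.
BIAS = {"gpu": {"g4dn.xlarge": 3.0}, "high-memory": {"r5.xlarge": 2.0}}

def biased_scores(preference):
    """Scale benchmark scores toward the annotated hardware preference."""
    mult = BIAS.get(preference, {})
    return {k: v * mult.get(k, 1.0) for k, v in benchmarks.items()}

scores = biased_scores("gpu")  # the GPU instance now tops the ranking
```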
Results & Findings
| Baseline | Avg. perf‑per‑$ (normalized) | Max. perf‑per‑$ (normalized) | Avg. cost reduction |
|---|---|---|---|
| Karpenter (price‑only) | 1.0× | – | – |
| SpotVerse | 1.32× | 1.58× | 12 % |
| SpotKube | 1.41× | 1.63× | 15 % |
| KubePACS | 1.55× | 1.81× | 23 % |
- Performance per dollar: KubePACS delivers 55 % more performance per dollar than the price‑only Karpenter baseline on average, reaching 81 % in CPU‑intensive batch jobs.
- Availability: The SPS‑aware selection reduces pre‑emptions by ~30 % compared with price‑only strategies, leading to fewer pod evictions and lower restart overhead.
- Solver latency: The ILP + GSS pipeline converges in < 200 ms for typical cluster sizes (≤ 200 node candidates), fitting comfortably within Karpenter’s scaling loop.
- Scalability: Experiments with up to 10 k pods show linear scaling of the optimizer’s runtime, confirming suitability for large production clusters.
Practical Implications
- Cost‑savvy autoscaling: DevOps teams can adopt KubePACS to keep cloud spend low without sacrificing throughput, especially for bursty or heterogeneous workloads.
- Reduced operational toil: By automatically factoring in interruption risk, teams spend less time manually tweaking spot‑instance pools or handling frequent pod restarts.
- Hardware‑aware scheduling: The ability to bias toward specialized instances (e.g., GPUs for ML inference) means developers can let the platform handle the “right‑size” decision, freeing them from low‑level instance selection.
- Vendor‑agnostic: Although evaluated on AWS spot markets, the framework only requires price, performance, and interruption APIs, making it portable to GCP preemptible VMs or Azure low‑priority VMs.
- Open‑source potential: Since KubePACS builds on Karpenter (itself an open‑source CNCF project), integrating it into existing CI/CD pipelines is straightforward, and contributions can be upstreamed to benefit the broader community.
Limitations & Future Work
- Benchmark freshness: The optimizer relies on up‑to‑date performance data; stale benchmarks could mislead selections, especially after hardware refreshes.
- Spot market volatility: Sudden price spikes or changes in interruption patterns may outpace the system’s data collection interval, temporarily degrading optimality.
- Complex workloads: Multi‑tenant clusters with conflicting performance preferences may require more sophisticated multi‑objective weighting or fairness mechanisms.
- Extending beyond spot: The authors suggest exploring hybrid strategies that blend spot, on‑demand, and reserved instances to further smooth cost‑availability trade‑offs.
Overall, KubePACS demonstrates that a principled, data‑driven approach to spot‑instance provisioning can unlock substantial performance‑per‑dollar gains while keeping clusters reliable—an enticing proposition for any organization running Kubernetes at scale.
Authors
- Taeyoon Kim
- Kyumin Kim
- Enrique Molina-Giménez
- Pedro García-López
- Kyungyong Lee
Paper Information
- arXiv ID: 2604.24027v1
- Categories: cs.DC
- Published: April 27, 2026