[Paper] KubePACS: Kubernetes Cluster Using Performant, Highly Available, and Cost Efficient Spot Instances
Source: arXiv - 2604.24027v1
Overview
KubePACS is a Kubernetes‑native system that automatically builds node pools from spot (pre‑emptible) instances, striking a balance between low cost, high performance, and strong availability guarantees. By treating instance‑type selection as a multi‑objective optimization problem and plugging into the popular Karpenter autoscaler, the authors show that workloads can deliver up to 81 % more performance per dollar than existing spot‑aware solutions.
Key Contributions
- Multi‑objective formulation: Combines real‑time spot prices, benchmarked performance, and a novel Spot Placement Score (SPS) into a single optimization model.
- Efficient ILP solver: Uses Integer Linear Programming guided by a Golden Section Search to find near‑optimal node‑pool configurations quickly enough for production autoscaling loops.
- Karpenter integration: Extends the open‑source Karpenter autoscaler, enabling joint decisions on what instance types to use and how many nodes to launch.
- Workload‑aware heuristics: Allows developers to bias the optimizer toward specialized instance families (e.g., GPU, high‑memory) by scaling performance metrics.
- Comprehensive evaluation: Benchmarks on synthetic and real‑world workloads demonstrate an average 55 % improvement in performance‑per‑dollar over a price‑only Karpenter baseline, with peaks of 81 %, also surpassing state‑of‑the‑art spot provisioning tools (SpotVerse, SpotKube).
Methodology
1. Data collection – KubePACS continuously scrapes three data streams:
   - Spot market prices (per‑region, per‑instance).
   - Performance benchmarks (CPU, memory, network, and specialized accelerators).
   - Historical interruption rates, aggregated into the Spot Placement Score (higher SPS = more reliable).
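As a rough sketch of how these three streams could be blended into a single per-instance score (the field names, weights, and prices below are invented for illustration, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    spot_price: float   # $/hour, scraped from the spot market
    perf_score: float   # benchmarked performance (arbitrary units)
    sps: float          # Spot Placement Score in [0, 1]; higher = more reliable

def score(c: Candidate, alpha: float = 0.7) -> float:
    """Weighted blend of performance-per-dollar and reliability (illustrative)."""
    perf_per_dollar = c.perf_score / c.spot_price
    return alpha * perf_per_dollar + (1 - alpha) * c.sps

candidates = [
    Candidate("m5.large",  spot_price=0.035, perf_score=1.0, sps=0.9),
    Candidate("c5.xlarge", spot_price=0.068, perf_score=2.1, sps=0.6),
]
best = max(candidates, key=score)
```

With these made-up numbers, the cheaper-but-slower type loses to the type with better performance per dollar, even though its SPS is lower; shifting `alpha` toward 0 would flip that ordering.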
2. Optimization model – The problem is expressed as an Integer Linear Program:
   - Objective: maximize a weighted sum of performance‑per‑dollar and SPS, subject to workload resource constraints (CPU, memory, etc.).
   - Variables: the number of nodes of each candidate instance type.
   - Constraints: minimum required capacity, budget caps, and optional affinity rules (e.g., “prefer GPU nodes”).
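The paper's ILP formulation is not reproduced verbatim here; the following stand-in enumerates a tiny candidate space to show the shape of the objective and constraints (instance data, node limits, and weights are hypothetical, and a real deployment would use an ILP solver rather than brute force):

```python
from itertools import product

# Hypothetical candidates: (name, vCPUs, spot $/hour, SPS in [0, 1]).
CANDIDATES = [("m5.large", 2, 0.035, 0.9), ("c5.xlarge", 4, 0.068, 0.6)]
REQUIRED_VCPUS = 16   # minimum capacity constraint
BUDGET = 0.50         # hourly budget cap
ALPHA = 0.7           # weight between perf-per-dollar and reliability

def objective(counts):
    """Score a node-count vector, or return None if it violates a constraint."""
    vcpus = sum(n * c[1] for n, c in zip(counts, CANDIDATES))
    cost = sum(n * c[2] for n, c in zip(counts, CANDIDATES))
    if vcpus < REQUIRED_VCPUS or cost > BUDGET:
        return None
    avg_sps = sum(n * c[3] for n, c in zip(counts, CANDIDATES)) / sum(counts)
    return ALPHA * (vcpus / cost) + (1 - ALPHA) * avg_sps

# Exhaustive search over up to 10 nodes per type (an ILP solver in practice).
feasible = [c for c in product(range(11), repeat=len(CANDIDATES))
            if objective(c) is not None]
best = max(feasible, key=objective)
```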
3. Solver acceleration – Because solving an ILP exactly can be costly, the authors embed a Golden Section Search that narrows the feasible region for the weighting factor between cost and availability, dramatically reducing solve time while preserving optimality guarantees.
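Golden Section Search is a standard bracketing method; a self-contained sketch follows, with the inner ILP solve replaced by a toy unimodal function so the search logic is visible in isolation:

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618

def golden_section_search(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function f on [lo, hi]; bracket shrinks by 1/phi per step."""
    a, b = lo, hi
    c = b - INV_PHI * (b - a)
    d = a + INV_PHI * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            # Minimum lies in [a, d]; reuse c as the new d.
            b, d = d, c
            c = b - INV_PHI * (b - a)
        else:
            # Minimum lies in [c, b]; reuse d as the new c.
            a, c = c, d
            d = a + INV_PHI * (b - a)
    return (a + b) / 2

# Toy stand-in for "quality of the node pool at weighting factor w":
best_w = golden_section_search(lambda w: (w - 0.3) ** 2, 0.0, 1.0)
```

Because each iteration reuses one previously evaluated point, only one new evaluation (here, one inner ILP solve) is needed per bracket shrink.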
4. Integration with Karpenter – The optimizer runs as a sidecar service. When Karpenter detects a scaling need, KubePACS supplies the optimal mix of instance types; Karpenter then provisions the nodes accordingly.
5. Workload‑specific tuning – Developers can annotate pods with performance preferences; KubePACS scales the corresponding benchmark scores, nudging the optimizer toward the most suitable hardware.
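One way to picture this biasing, assuming a hypothetical annotation key and multiplier table (not the paper's actual API):

```python
# Hypothetical benchmark scores per instance type, before biasing.
benchmarks = {"c5.xlarge": 2.1, "g4dn.xlarge": 1.4, "r5.xlarge": 1.8}

# A pod annotation such as kubepacs.io/prefer: "gpu" (invented key) could
# map to per-type multipliers applied before the optimizer runs.
BIAS = {"gpu": {"g4dn.xlarge": 3.0}, "high-memory": {"r5.xlarge": 2.0}}

def biased_scores(preference):
    """Scale benchmark scores toward the annotated hardware preference."""
    mult = BIAS.get(preference, {})
    return {k: v * mult.get(k, 1.0) for k, v in benchmarks.items()}

scores = biased_scores("gpu")  # the GPU instance now tops the ranking
```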
Results & Findings
| Baseline | Avg. perf‑per‑$ (normalized) | Max. perf‑per‑$ (normalized) | Avg. cost reduction |
|---|---|---|---|
| Karpenter (price‑only) | 1.0× | – | – |
| SpotVerse | 1.32× | 1.58× | 12 % |
| SpotKube | 1.41× | 1.63× | 15 % |
| KubePACS | 1.55× | 1.81× | 23 % |
- Performance per dollar: KubePACS delivers 55 % more performance per dollar than the price‑only Karpenter baseline on average, reaching 81 % in CPU‑intensive batch jobs.
- Availability: The SPS‑aware selection reduces pre‑emptions by ~30 % compared with price‑only strategies, leading to fewer pod evictions and lower restart overhead.
- Solver latency: The ILP + GSS pipeline converges in < 200 ms for typical cluster sizes (≤ 200 node candidates), fitting comfortably within Karpenter’s scaling loop.
- Scalability: Experiments with up to 10 k pods show linear scaling of the optimizer’s runtime, confirming suitability for large production clusters.
Practical Implications
- Cost‑savvy autoscaling: DevOps teams can adopt KubePACS to keep cloud spend low without sacrificing throughput, especially for bursty or heterogeneous workloads.
- Reduced operational toil: By automatically factoring in interruption risk, teams spend less time manually tweaking spot‑instance pools or handling frequent pod restarts.
- Hardware‑aware scheduling: The ability to bias toward specialized instances (e.g., GPUs for ML inference) means developers can let the platform handle the “right‑size” decision, freeing them from low‑level instance selection.
- Vendor‑agnostic: Although evaluated on AWS spot markets, the framework only requires price, performance, and interruption APIs, making it portable to GCP preemptible VMs or Azure low‑priority VMs.
- Open‑source potential: Since KubePACS builds on Karpenter (itself an open‑source CNCF project), integrating it into existing CI/CD pipelines is straightforward, and contributions can be upstreamed to benefit the broader community.
Limitations & Future Work
- Benchmark freshness: The optimizer relies on up‑to‑date performance data; stale benchmarks could mislead selections, especially after hardware refreshes.
- Spot market volatility: Sudden price spikes or changes in interruption patterns may outpace the system’s data collection interval, temporarily degrading optimality.
- Complex workloads: Multi‑tenant clusters with conflicting performance preferences may require more sophisticated multi‑objective weighting or fairness mechanisms.
- Extending beyond spot: The authors suggest exploring hybrid strategies that blend spot, on‑demand, and reserved instances to further smooth cost‑availability trade‑offs.
Overall, KubePACS demonstrates that a principled, data‑driven approach to spot‑instance provisioning can unlock substantial performance‑per‑dollar gains while keeping clusters reliable—an enticing proposition for any organization running Kubernetes at scale.
Authors
- Taeyoon Kim
- Kyumin Kim
- Enrique Molina-Giménez
- Pedro García-López
- Kyungyong Lee
Paper Information
- arXiv ID: 2604.24027v1
- Categories: cs.DC
- Published: April 27, 2026