[Paper] Evaluation of Dynamic Vector Bin Packing for Virtual Machine Placement

Published: February 16, 2026
Source: arXiv - 2602.14704v1

Overview

The paper tackles one of the most pressing operational problems in cloud data centers: where to place virtual machines (VMs) so that physical servers are used as efficiently as possible. By framing VM placement as a Dynamic Vector Bin Packing (DVBP) problem that seeks to minimize the total active time of servers, the authors evaluate a suite of algorithms—including classic, newly‑designed, and learning‑augmented variants—under three realistic information regimes: unknown lifetimes (non‑clairvoyant), known lifetimes (clairvoyant), and predicted lifetimes (learning‑augmented). Experiments on real Azure traces reveal which algorithmic ideas actually pay off in production‑scale settings.

Key Contributions

  • Comprehensive benchmark of state‑of‑the‑art MinUsageTime DVBP algorithms across three online settings (non‑clairvoyant, clairvoyant, learning‑augmented).
  • Introduction of several novel algorithms and enhancements, such as adaptive packing heuristics and hybrid clairvoyant‑learning strategies.
  • Large‑scale empirical evaluation using Microsoft Azure workload traces, providing the first real‑world performance numbers for DVBP in cloud environments.
  • Insightful analysis of algorithmic structures that consistently outperform others, highlighting practical design patterns (e.g., early‑release packing, duration‑aware sorting).
  • Guidelines for practitioners on selecting and tuning VM placement strategies based on the availability of duration information.

Methodology

  1. Problem Formalization – The authors model VM placement as a Dynamic Vector Bin Packing problem where each VM is a multi‑dimensional item (CPU, memory, etc.) with a start time and a lifetime. The objective, MinUsageTime, is to minimize the sum of active periods of all physical machines.
  2. Algorithmic Landscape – They collect existing algorithms from the literature (e.g., First‑Fit Decreasing, Best‑Fit, Harmonic‑Based) and implement three new variants:
    • Adaptive First‑Fit (AFF) – dynamically adjusts bin‑selection thresholds based on recent packing density.
    • Hybrid Clairvoyant‑Learning (HCL) – combines exact lifetime knowledge (when available) with machine‑learning predictions for unknown jobs.
    • Release‑Aware Packing (RAP) – prioritizes packing VMs that will free up resources soon, reducing fragmentation.
  3. Online Settings
    • Non‑clairvoyant: lifetimes are hidden; decisions must be made without future knowledge.
    • Clairvoyant: true lifetimes are known at placement time (idealized scenario).
    • Learning‑augmented: a predictor (trained on historical Azure data) supplies estimated lifetimes with bounded error.
  4. Experimental Setup – The team runs each algorithm on several weeks of Azure VM request logs, reproducing realistic arrival patterns, resource demands, and heterogeneity of physical hosts. Metrics captured include total server active time, number of active servers, and runtime overhead.
  5. Statistical Analysis – Results are aggregated over multiple random seeds, and significance testing (paired t‑tests) is used to confirm observed performance gaps.
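The MinUsageTime objective and the First-Fit baseline described above can be sketched in a few lines. This is a simplified toy model, not the authors' implementation: capacities are normalized to 1.0 per dimension, and each server's active time is approximated as the span from its first VM arrival to its last VM departure.

```python
from dataclasses import dataclass, field

@dataclass
class VM:
    """A VM request: multi-dimensional demand plus an active interval."""
    cpu: float
    mem: float
    start: float
    lifetime: float

    @property
    def end(self) -> float:
        return self.start + self.lifetime

@dataclass
class Server:
    """A physical machine (bin) with unit capacity in each dimension."""
    vms: list = field(default_factory=list)

    def load_at(self, t: float):
        active = [v for v in self.vms if v.start <= t < v.end]
        return sum(v.cpu for v in active), sum(v.mem for v in active)

    def fits(self, vm: VM) -> bool:
        # Capacity only changes at arrival/departure events, so it suffices
        # to check the event points inside the new VM's interval.
        points = {vm.start} | {v.start for v in self.vms} | {v.end for v in self.vms}
        for t in (p for p in points if vm.start <= p < vm.end):
            cpu, mem = self.load_at(t)
            if cpu + vm.cpu > 1.0 or mem + vm.mem > 1.0:
                return False
        return True

    def usage_time(self) -> float:
        # MinUsageTime counts the union of busy intervals; this sketch
        # assumes each server's VMs form one contiguous busy window.
        if not self.vms:
            return 0.0
        return max(v.end for v in self.vms) - min(v.start for v in self.vms)

def first_fit(requests):
    """Non-clairvoyant First-Fit baseline: place each arriving VM in the
    first server that can host it, opening a new server otherwise."""
    servers = []
    for vm in sorted(requests, key=lambda v: v.start):
        for s in servers:
            if s.fits(vm):
                s.vms.append(vm)
                break
        else:
            servers.append(Server(vms=[vm]))
    return servers

reqs = [VM(0.5, 0.4, 0, 10), VM(0.4, 0.5, 2, 4), VM(0.6, 0.6, 3, 5)]
servers = first_fit(reqs)
total = sum(s.usage_time() for s in servers)  # total server active time
```

Here the third VM does not fit alongside the first two, so a second server is opened; the objective the paper minimizes is exactly this summed active time across servers.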

Results & Findings

Setting             Best Performing Algorithm        Avg. Reduction vs. Baseline (First‑Fit)
Non‑clairvoyant     RAP (Release‑Aware Packing)      12.4 %
Clairvoyant         HCL (clairvoyant branch)         18.7 %
Learning‑augmented  HCL (learning‑augmented branch)  15.3 %
  • Release‑Aware Packing consistently outperforms generic heuristics when lifetimes are unknown, because it proactively groups short‑lived VMs together, freeing bins earlier.
  • In the clairvoyant scenario, Hybrid Clairvoyant‑Learning (which simply uses the exact lifetimes) achieves the largest savings, confirming the theoretical advantage of perfect duration knowledge.
  • When only predictions are available, HCL still beats all non‑clairvoyant baselines as long as the predictor’s mean absolute error stays below ~20 % of true lifetimes.
  • Runtime overhead for the new algorithms is modest (≤ 5 ms per VM placement), making them viable for real‑time schedulers.
  • The experiments also reveal diminishing returns for overly complex heuristics; simple duration‑aware sorting often captures most of the benefit.
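The duration-aware idea behind these findings can be illustrated with a small bin-selection rule. This is a hedged sketch of the general principle, not the paper's RAP algorithm: among feasible servers, prefer the one whose latest scheduled departure is closest to the new VM's departure, so that departures cluster and servers empty out sooner. The dict-based server representation is an assumption made for brevity.

```python
def release_aware_choice(servers, demand, vm_end):
    """Pick the index of a feasible server whose latest departure time is
    closest to vm_end, or None if no server fits.
    servers: list of dicts {"free": spare capacity, "departs": end times}.
    """
    best_i, best_gap = None, float("inf")
    for i, s in enumerate(servers):
        if s["free"] < demand:
            continue  # infeasible: not enough spare capacity
        latest = max(s["departs"], default=vm_end)
        gap = abs(latest - vm_end)
        if gap < best_gap:
            best_i, best_gap = i, gap
    return best_i

servers = [
    {"free": 0.5, "departs": [100.0]},  # hosts long-lived residents
    {"free": 0.5, "departs": [12.0]},   # empties out soon
]
# A short VM departing at t=10 is steered to the server that also frees
# up around t=12, instead of pinning open the long-lived server's window.
choice = release_aware_choice(servers, demand=0.3, vm_end=10.0)
```

Grouping the short VM with other soon-to-depart VMs is what lets bins be released earlier, which is the mechanism the bullet points above credit for RAP's gains.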

Practical Implications

  • Cloud Operators can adopt Release‑Aware Packing in existing schedulers (e.g., OpenStack Nova, Kubernetes with custom scheduler extensions) to shave 10‑15 % off server active time without any extra data collection.
  • Predictive Analytics Teams can integrate lightweight lifetime predictors (e.g., gradient‑boosted trees trained on historical VM logs) to enable the learning‑augmented branch of HCL, gaining most of the clairvoyant gains with only modest prediction error.
  • Cost‑Sensitive Enterprises can translate reduced active time into lower electricity bills and extended hardware lifespan, directly impacting the bottom line.
  • Tooling – The authors release their implementation as an open‑source Python library, complete with Azure trace loaders, allowing developers to benchmark their own workloads or plug the heuristics into custom orchestrators.
  • Edge & Fog Computing – The same DVBP formulation applies to resource‑constrained edge nodes; the insights about early‑release packing are especially valuable where power budgets are tight.
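To make the learning-augmented branch concrete, here is a deliberately lightweight stand-in for the gradient-boosted predictor mentioned above: it predicts a VM's lifetime as the median observed lifetime of historical VMs in the same (cpu, mem) size bucket. The feature choice and bucketing scheme are illustrative assumptions, not the paper's model.

```python
from collections import defaultdict
from statistics import median

class BucketLifetimePredictor:
    """Median-of-bucket lifetime predictor: a minimal baseline that a
    production system would replace with a trained model."""

    def __init__(self):
        self.history = defaultdict(list)

    @staticmethod
    def bucket(cpu, mem):
        # Coarse size bucket; real features would include image, owner, etc.
        return (round(cpu, 1), round(mem, 1))

    def observe(self, cpu, mem, lifetime):
        """Record a completed VM's true lifetime for its size bucket."""
        self.history[self.bucket(cpu, mem)].append(lifetime)

    def predict(self, cpu, mem, default=60.0):
        """Median lifetime of the bucket, or a default for unseen sizes."""
        obs = self.history.get(self.bucket(cpu, mem))
        return median(obs) if obs else default

p = BucketLifetimePredictor()
for life in (5, 7, 100):
    p.observe(0.25, 0.5, life)
pred = p.predict(0.25, 0.5)  # median of {5, 7, 100}
```

The median keeps the estimate robust to a few long-running outliers, which matters given the paper's observation that HCL's advantage survives only while prediction error stays below roughly 20 % of true lifetimes.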

Limitations & Future Work

  • Predictor Quality Dependency – The learning‑augmented gains hinge on reasonably accurate lifetime forecasts; the paper does not explore robust methods for handling highly noisy predictions.
  • Static Resource Profiles – VMs are assumed to have fixed CPU/memory demands; dynamic scaling (e.g., auto‑scaling groups) is not modeled.
  • Single‑Data‑Center Scope – Experiments are limited to Azure’s internal data‑center traces; cross‑region or multi‑cloud scenarios may introduce additional constraints (network latency, placement groups).
  • Future Directions suggested by the authors include:
    1. Extending the model to incorporate network bandwidth and storage I/O as additional dimensions.
    2. Exploring reinforcement‑learning agents that adapt packing policies online.
    3. Evaluating the impact of migration costs when re‑packing VMs after placement decisions.

Authors

  • Zong Yu Lee
  • Xueyan Tang

Paper Information

  • arXiv ID: 2602.14704v1
  • Categories: cs.DC, cs.DS
  • Published: February 16, 2026