[Paper] Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases
Source: arXiv - 2511.21612v1
Overview
Modern cloud databases still treat scaling as a simple “add more nodes or make each node bigger” decision. Abdullah and Zaman show that this one‑dimensional view hides costly inefficiencies. Their paper introduces a Scaling Plane that jointly models the number of nodes and per‑node resources, and they present DIAGONALSCALE, an algorithm that automatically moves along the plane, often diagonally, to find the cheapest configuration that meets latency and throughput SLAs.
Key Contributions
- Scaling Plane model: a two‑dimensional representation (horizontal = node count, vertical = vector of CPU, memory, network, storage) with smooth approximations for latency, throughput, coordination overhead, and monetary cost.
- Analytical insight that optimal scaling paths frequently follow diagonal trajectories, i.e., simultaneous horizontal and vertical adjustments, rather than pure horizontal or vertical moves.
- DIAGONALSCALE algorithm: a discrete local‑search optimizer that evaluates horizontal, vertical, and diagonal moves and selects the configuration minimizing a multi‑objective cost function under SLA constraints.
- Comprehensive evaluation across synthetic surfaces, micro‑benchmarks, and real distributed SQL (e.g., CockroachDB) and key‑value (e.g., TiKV) workloads, showing up to 40 % latency reduction, 37 % lower cost‑per‑query, and 2–5× less data rebalancing versus traditional autoscalers.
- Open‑source prototype (link in the paper) that can be plugged into existing cloud‑native orchestration stacks.
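To make the Scaling Plane idea more concrete, the sketch below represents a configuration as a point (H, V) and attaches four smooth surfaces to it. The functional forms, constants, prices, and all names (`Config`, `throughput_qps`, and so on) are illustrative assumptions, not the fitted models from the paper.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """A point (H, V) on the Scaling Plane (hypothetical field names)."""
    nodes: int        # H: number of nodes
    cpu: float        # V: vCPUs per node
    mem_gb: float     # V: memory per node, GiB
    net_gbps: float   # V: network bandwidth per node, Gbit/s
    iops: float       # V: storage IOPS per node


def coordination_overhead(c: Config) -> float:
    """Fraction of capacity lost to coordination (assumed log growth in H)."""
    return min(0.9, 0.05 * math.log(c.nodes))


def throughput_qps(c: Config) -> float:
    """Cluster throughput: assumed smooth per-node capacity times H, minus overhead."""
    per_node = (400.0 * c.cpu ** 0.5 * c.mem_gb ** 0.3
                * c.net_gbps ** 0.1 * (c.iops / 1000.0) ** 0.1)
    return c.nodes * per_node * (1.0 - coordination_overhead(c))


def latency_ms(c: Config, load_qps: float) -> float:
    """Queueing-style latency (assumed form) that grows as utilization nears 1."""
    util = load_qps / throughput_qps(c)
    return math.inf if util >= 1.0 else 2.0 / (1.0 - util)


def hourly_cost_usd(c: Config) -> float:
    """Linear pricing per resource unit; the prices are made up."""
    per_node = (0.04 * c.cpu + 0.005 * c.mem_gb
                + 0.002 * c.net_gbps + 0.00001 * c.iops)
    return c.nodes * per_node


small = Config(nodes=3, cpu=2, mem_gb=8, net_gbps=10, iops=3000)
large = Config(nodes=6, cpu=2, mem_gb=8, net_gbps=10, iops=3000)
for c in (small, large):
    print(c.nodes, round(throughput_qps(c)),
          round(latency_ms(c, 3000.0), 2), round(hourly_cost_usd(c), 3))
```

Under these assumed surfaces, doubling the node count doubles cost but yields less than double the throughput, because the coordination term grows with H.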
Methodology
- Model Construction – The authors treat each possible cluster configuration as a point (H, V) on the Scaling Plane. They fit smooth functions (using regression on benchmark data) that map any point to expected latency, throughput, coordination overhead, and cloud cost.
- Objective Definition – A weighted multi‑objective function combines latency SLA violation penalty, monetary cost, and rebalancing overhead.
- Local‑Search Algorithm – DIAGONALSCALE starts from the current configuration and explores three neighbor types:
  - Horizontal move: add/remove a node (keeping per‑node resources fixed).
  - Vertical move: increase/decrease a single resource dimension (e.g., CPU) on all nodes.
  - Diagonal move: simultaneously add a node and boost a resource (e.g., add a node and increase per‑node memory).
  The algorithm picks the neighbor with the best objective improvement and repeats until no further gain is possible.
- Evaluation – Experiments were run on a public‑cloud testbed (AWS m5.large, r5.xlarge, etc.) with workloads that stress CPU, memory, network, and storage in different proportions. Baselines were pure horizontal autoscaling (Kubernetes HPA) and pure vertical autoscaling (VPA).
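The objective and local-search steps above can be sketched as a greedy search over horizontal, vertical, and diagonal neighbors. This is a minimal illustration under stated assumptions, not the authors' implementation: the surfaces, weights, step sizes, and names (`objective`, `neighbors`, `diagonal_scale`) are hypothetical, V is reduced to CPU and memory, and diagonal moves are allowed in both directions.

```python
import math
from typing import Iterator, NamedTuple


class Config(NamedTuple):
    nodes: int    # H
    cpu: int      # V: vCPUs per node
    mem_gb: int   # V: memory per node, GiB


def throughput_qps(c: Config) -> float:
    # Assumed smooth surface with log-growing coordination overhead.
    overhead = min(0.9, 0.05 * math.log(c.nodes))
    return c.nodes * 400.0 * c.cpu ** 0.5 * c.mem_gb ** 0.3 * (1.0 - overhead)


def latency_ms(c: Config, load_qps: float) -> float:
    # Queueing curve below 95 % utilization, extended linearly above it so
    # that overloaded configurations still get a finite, increasing penalty.
    util = load_qps / throughput_qps(c)
    if util < 0.95:
        return 2.0 / (1.0 - util)
    return 40.0 + 800.0 * (util - 0.95)


def hourly_cost_usd(c: Config) -> float:
    return c.nodes * (0.04 * c.cpu + 0.005 * c.mem_gb)  # made-up prices


def objective(c: Config, deployed: Config, load_qps: float, sla_ms: float,
              w_sla: float = 10.0, w_cost: float = 1.0,
              w_rebalance: float = 0.05) -> float:
    """Weighted sum of SLA violation, monetary cost, and rebalancing proxy."""
    violation = max(0.0, latency_ms(c, load_qps) - sla_ms)
    rebalance = abs(c.nodes - deployed.nodes)  # proxy: nodes added or removed
    return w_sla * violation + w_cost * hourly_cost_usd(c) + w_rebalance * rebalance


H_STEPS = (-1, 1)
V_STEPS = ((1, 0), (-1, 0), (0, 4), (0, -4))  # (cpu delta, memory delta)


def neighbors(c: Config) -> Iterator[Config]:
    moves = [(dh, 0, 0) for dh in H_STEPS]                           # horizontal
    moves += [(0, dc, dm) for dc, dm in V_STEPS]                     # vertical
    moves += [(dh, dc, dm) for dh in H_STEPS for dc, dm in V_STEPS]  # diagonal
    for dh, dc, dm in moves:
        n = Config(c.nodes + dh, c.cpu + dc, c.mem_gb + dm)
        if n.nodes >= 1 and n.cpu >= 1 and n.mem_gb >= 4:
            yield n


def diagonal_scale(deployed: Config, load_qps: float, sla_ms: float,
                   max_iters: int = 100) -> Config:
    """Greedy descent: take the best neighbor until none improves the objective."""
    best = deployed
    best_score = objective(best, deployed, load_qps, sla_ms)
    for _ in range(max_iters):
        cand = min(neighbors(best),
                   key=lambda n: objective(n, deployed, load_qps, sla_ms))
        score = objective(cand, deployed, load_qps, sla_ms)
        if score >= best_score:
            break
        best, best_score = cand, score
    return best


start = Config(nodes=3, cpu=2, mem_gb=8)
print(diagonal_scale(start, load_qps=5000.0, sla_ms=10.0))
```

The linear extension of the latency curve is a design choice for the sketch: with an infinite penalty, every neighbor of an overloaded configuration would score the same and the greedy step would have no direction to follow.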
Results & Findings
| Metric | Horizontal-only | Vertical-only | DIAGONALSCALE |
|---|---|---|---|
| 95th-percentile latency (change vs. horizontal-only) | baseline | −12 % | −40 % |
| Cost per query (relative to horizontal-only) | 1.00× | 0.85× | 0.63× |
| Data rebalancing volume (relative to horizontal-only) | 1.00× | 0.78× | 0.20–0.50× |
| SLA violation frequency | 8 % | 5 % | 1 % |
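As a quick consistency check, the relative values in the table can be converted into the headline figures quoted under Key Contributions. The helper names below are hypothetical; the input ratios are the ones reported in the table.

```python
def reduction_pct(ratio: float) -> float:
    """Percentage reduction implied by a value relative to the baseline."""
    return (1.0 - ratio) * 100.0


def improvement_factor(ratio: float) -> float:
    """How many times smaller than the baseline a relative value is."""
    return 1.0 / ratio


# Cost per query of 0.63x corresponds to roughly 37 % lower cost.
print(round(reduction_pct(0.63), 1))

# Rebalancing volume of 0.20x to 0.50x corresponds to 5x to 2x less data moved.
print(round(improvement_factor(0.20), 1), round(improvement_factor(0.50), 1))
```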
Key takeaways
- Diagonal moves capture the sweet spot where adding a node reduces coordination overhead while a modest per‑node upgrade lifts per‑node throughput, yielding a multiplicative performance boost.
- The algorithm converges in ≤ 5 iterations on average, making it suitable for real‑time autoscaling loops.
- Workloads dominated by memory pressure benefit most from diagonal scaling, whereas CPU‑bound workloads see modest gains (still better than pure horizontal scaling).
Practical Implications
- Cloud‑native DBaaS providers can embed DIAGONALSCALE into their autoscaling controllers to cut operational costs while delivering tighter latency SLAs.
- DevOps teams gain a single policy input (the multi‑objective weight vector) instead of juggling separate horizontal and vertical policies, simplifying policy management.
- Capacity planning tools can use the Scaling Plane to forecast the impact of workload growth across multiple resource dimensions, enabling more accurate budgeting.
- The reduction in rebalancing traffic translates to lower network egress charges and less disruption for multi‑region deployments.
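The capacity-planning use mentioned above can be illustrated with an exhaustive sweep over a small grid of the plane: for each forecast load, pick the cheapest point that keeps utilization under a cap. This is a sketch only; the surfaces, prices, grid, and the utilization cap standing in for a latency SLA are all assumptions, and V is reduced to CPU.

```python
import math
from itertools import product

NODE_CHOICES = range(1, 33)        # H values to consider (assumed grid)
CPU_CHOICES = (1, 2, 4, 8, 16)     # per-node vCPU sizes to consider


def throughput_qps(nodes: int, cpu: int) -> float:
    # Assumed smooth surface with log-growing coordination overhead.
    overhead = min(0.9, 0.05 * math.log(nodes))
    return nodes * 1000.0 * cpu ** 0.5 * (1.0 - overhead)


def hourly_cost_usd(nodes: int, cpu: int) -> float:
    # Made-up pricing: a fixed per-node fee plus a per-vCPU fee.
    return nodes * (0.05 + 0.04 * cpu)


def cheapest_feasible(load_qps: float, max_util: float = 0.8):
    """Return the cheapest (nodes, cpu) whose utilization stays under max_util."""
    feasible = [(n, c) for n, c in product(NODE_CHOICES, CPU_CHOICES)
                if load_qps <= max_util * throughput_qps(n, c)]
    return min(feasible, key=lambda p: hourly_cost_usd(*p), default=None)


# Forecast: what does the cheapest configuration look like as load grows?
for load in (2_000, 8_000, 32_000):
    plan = cheapest_feasible(load)
    cost = None if plan is None else round(hourly_cost_usd(*plan), 2)
    print(load, plan, cost)
```

A full sweep is only practical for small grids; it is used here because it makes the budget forecast easy to read, not because the paper prescribes it.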
Limitations & Future Work
- The model relies on offline benchmark data to fit latency/throughput surfaces; sudden workload pattern changes may require re‑training.
- DIAGONALSCALE assumes homogeneous nodes; extending the framework to heterogeneous clusters (e.g., mixed instance types) is non‑trivial.
- The current prototype only supports single‑tenant scenarios; multi‑tenant fairness and interference need further study.
- Future research directions include online learning of the Scaling Plane, integration with predictive workload forecasting, and exploration of reinforcement‑learning‑based scaling policies that can handle richer state spaces.
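The first limitation above concerns re-fitting the surfaces when workloads change. As a rough illustration of what such a re-fit might look like, the sketch below fits a one-feature latency model by ordinary least squares. The model form (latency linear in load per vCPU), the benchmark samples, and the function names are all made up for the example; the paper's actual regression setup is not reproduced here.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def load_per_vcpu(nodes, cpu, load_qps):
    """Assumed feature: offered load divided by total vCPUs in the cluster."""
    return load_qps / (nodes * cpu)


# Hypothetical benchmark samples: (nodes, cpu per node, load in QPS, latency in ms).
samples = [
    (3, 2, 3000, 7.00),
    (4, 2, 3000, 5.75),
    (4, 4, 4000, 4.50),
    (6, 4, 4800, 4.00),
]

xs = [load_per_vcpu(n, c, q) for n, c, q, _ in samples]
ys = [lat for _, _, _, lat in samples]
intercept, slope = fit_line(xs, ys)


def predict_latency_ms(nodes, cpu, load_qps):
    """Evaluate the fitted surface at an unseen point on the plane."""
    return intercept + slope * load_per_vcpu(nodes, cpu, load_qps)


print(round(intercept, 3), round(slope, 4))
print(round(predict_latency_ms(5, 4, 4000), 2))
```

Re-running the fit on a sliding window of recent measurements would be one simple way to track workload drift, though that is a suggestion rather than something the paper evaluates.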
Authors
- Shahir Abdullah
- Syed Rohit Zaman
Paper Information
- arXiv ID: 2511.21612v1
- Categories: cs.DC
- Published: November 26, 2025