[Paper] Equilibria: Fair Multi-Tenant CXL Memory Tiering At Scale
Source: arXiv - 2602.08800v1
Overview
Memory is the biggest cost and power driver in modern datacenters, and the emergence of Compute Express Link (CXL) promises cheaper, low‑power memory expansion. However, turning raw CXL capacity into predictable performance for dozens of co‑running services is surprisingly hard. The paper Equilibria: Fair Multi‑Tenant CXL Memory Tiering At Scale introduces an operating‑system framework that lets cloud operators allocate, monitor, and enforce fair‑share policies for tiered CXL memory across many containers, while keeping latency‑sensitive workloads on track.
Key Contributions
- Per‑container fair‑share control – a new OS interface that lets admins specify how much CXL memory each container may use, independent of the host’s global memory manager.
- Fine‑grained observability – lightweight metrics and tracing hooks that expose promotion (slow → fast tier) and demotion (fast → slow tier) activity per tenant, enabling root‑cause analysis at scale.
- Policy‑driven promotion/demotion – a flexible regulator that can enforce arbitrary fairness policies (e.g., proportional share, min‑max) while throttling aggressive thrashing that would otherwise cause noisy‑neighbor effects.
- Production‑grade implementation – patches integrated into the mainline Linux kernel (released to the community) and evaluated on a hyperscaler’s fleet with real workloads.
- Performance gains – up to 52 % improvement over the existing Linux tiering solution, Transparent Page Placement (TPP), on production services and 1.7× on benchmark mixes.
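The fairness policies named above (e.g., proportional share) can be illustrated with a small user-space sketch. The function name and budget units are hypothetical; the paper's actual policy engine lives in the kernel.

```python
def proportional_share(budget_bytes, weights):
    """Split a CXL capacity budget across tenants in proportion to their
    configured weights (illustrative policy sketch, not the paper's code)."""
    total = sum(weights.values())
    return {tenant: budget_bytes * w // total for tenant, w in weights.items()}
```

For example, splitting a 1000-byte budget between a tenant with weight 1 and one with weight 3 yields 250 and 750 bytes respectively; a min-max policy would additionally clamp each share to per-tenant floors and caps.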
Methodology
- Design of a new memory tiering layer – built on top of the Linux page‑fault path, the layer intercepts allocation requests and decides whether a page lives in local DRAM or remote CXL memory.
- Container‑aware accounting – each cgroup gets a “fair‑share quota” that the tiering layer consults before demoting a page to the slower CXL tier.
- Regulated promotion engine – instead of naïvely moving pages whenever DRAM pressure rises, the engine applies a token‑bucket‑style regulator that respects the per‑tenant quota and caps promotion rates.
- Observability hooks – the authors added per‑cgroup counters (promotions, demotions, thrash events) and exposed them via procfs/sysfs and eBPF maps, allowing operators to build dashboards without heavy tracing overhead.
- Evaluation – the system was deployed on a real hyperscaler cluster (hundreds of nodes, each with several terabytes of CXL memory). Workloads included production micro‑services, batch jobs, and standard memory‑intensive benchmarks (e.g., Memcached, Redis, SPEC CPU). Metrics collected: SLO compliance (tail latency), overall throughput, and fairness (Jain’s index).
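The token-bucket-style regulator described above can be sketched in user-space Python. Class, method, and parameter names are illustrative assumptions; the actual engine is implemented inside the kernel's migration path.

```python
class PromotionRegulator:
    """Token-bucket rate limiter for page promotions (illustrative sketch).

    Tokens refill at `rate_pages_per_sec` up to a `burst` ceiling; a
    promotion of `npages` is allowed only if that many tokens are available,
    which caps sustained promotion rates while permitting short bursts.
    """

    def __init__(self, rate_pages_per_sec, burst):
        self.rate = rate_pages_per_sec
        self.capacity = burst
        self.tokens = burst   # start full so a fresh tenant can burst
        self.last = 0.0

    def allow(self, now, npages=1):
        # Refill tokens for the elapsed time, clamped to the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= npages:
            self.tokens -= npages
            return True
        return False
```

A per-tenant instance of such a bucket, sized from the tenant's fair-share quota, is what keeps one aggressive container from saturating the migration bandwidth.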
Results & Findings
| Metric | Linux TPP (baseline) | Equilibria | Improvement |
|---|---|---|---|
| 99th‑percentile latency (prod micro‑service) | 12 ms | 7 ms | 42 % lower |
| Throughput (Redis workload) | 1.2 M ops/s | 1.8 M ops/s | +52 % |
| Memory‑bandwidth‑bound benchmark | 0.9× baseline | 1.7× baseline | ≈1.9× over TPP |
| Fairness (Jain’s index) | 0.71 | 0.94 | +0.23 |
| Promotion thrash events | 3.4 k/h | 0.9 k/h | 73 % fewer |
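Jain's fairness index, reported in the table, ranges from 1/n (one tenant gets everything) to 1 (perfectly equal shares) and is computed as (Σxᵢ)² / (n · Σxᵢ²) over per-tenant allocations xᵢ:

```python
def jain_index(allocations):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 for perfectly equal allocations and 1/n when a single
    tenant monopolizes the resource.
    """
    n = len(allocations)
    total = sum(allocations)
    return total * total / (n * sum(x * x for x in allocations))
```

Under this measure, the move from 0.71 to 0.94 means per-tenant CXL usage under Equilibria is close to the equal-share ideal.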
Key takeaways:
- By preventing a single tenant from monopolizing the CXL tier, overall latency tails shrink dramatically.
- The regulated promotion logic cuts down on “ping‑pong” page migrations that otherwise waste bandwidth and increase power.
- Operators can now pinpoint which container is causing excessive promotions, something that was impossible with the prior kernel implementation.
Practical Implications
- Cloud providers can roll out CXL‑backed memory pools without fearing that a noisy tenant will degrade the entire node’s performance, enabling cheaper hardware refresh cycles.
- DevOps teams gain a programmable API (cgroup extensions) to enforce memory budgets per service, aligning resource usage with business‑level SLAs.
- Application architects can design workloads that deliberately spill to CXL for large, cold data structures, knowing the OS will keep hot paths in DRAM and prevent surprise latency spikes.
- Observability platforms (Prometheus, Grafana, etc.) can ingest the new metrics with minimal changes, providing real‑time dashboards for memory tier health and fairness compliance.
- The open‑source patches mean any Linux‑based stack—from edge servers to hyperscalers—can adopt the framework without waiting for a vendor‑specific fork.
Limitations & Future Work
- Hardware dependency: The current prototype assumes CXL Type 3 (memory‑expander) devices with predictable latency; performance on future CXL device generations or heterogeneous memory (e.g., NVDIMM) remains untested.
- Policy complexity: While the regulator supports proportional‑share policies, more sophisticated QoS models (e.g., deadline‑aware or burstable memory) would require additional kernel extensions.
- Scalability of counters: Per‑cgroup counters scale well up to a few thousand containers per node, but ultra‑dense workloads (tens of thousands) may need hierarchical aggregation to avoid overhead.
- Cross‑node tiering: The work focuses on intra‑node memory tiering; extending fairness guarantees across a cluster of nodes with shared CXL pools is an open research direction.
Overall, Equilibria demonstrates that with the right OS abstractions, CXL memory can be turned into a practical, fair, and observable resource for modern multi‑tenant datacenters.
Authors
- Kaiyang Zhao
- Neha Gholkar
- Hasan Maruf
- Abhishek Dhanotia
- Johannes Weiner
- Gregory Price
- Ning Sun
- Bhavya Dwivedi
- Stuart Clark
- Dimitrios Skarlatos
Paper Information
- arXiv ID: 2602.08800v1
- Categories: cs.OS, cs.DC
- Published: February 9, 2026