[Paper] Lotus: Optimizing Disaggregated Transactions with Disaggregated Locks
Source: arXiv - 2512.16136v1
Overview
The paper introduces Lotus, a new distributed transaction system designed for disaggregated memory (DM) architectures. By moving lock management from memory‑node NICs to compute nodes, Lotus eliminates a major network bottleneck and delivers up to 2.1× higher throughput for OLTP workloads.
Key Contributions
- Lock disaggregation: Locks are stored and processed on compute nodes (CNs) instead of memory nodes (MNs), freeing the MN RDMA NICs from heavy atomic‑operation traffic.
- Application‑aware lock sharding: Lotus partitions locks based on workload locality, achieving balanced load across CNs while preserving cache friendliness.
- Lock‑first transaction protocol: Transactions acquire all required locks before any data access, enabling early conflict detection and proactive aborts.
- Lock‑rebuild‑free recovery: Treats locks as transient; after a CN crash, the system recovers without reconstructing lock state, keeping recovery lightweight.
- Performance gains: Empirical evaluation shows up to 2.1× higher throughput and ≈50 % lower latency versus the best existing DM transaction systems.
Methodology
- System model: The authors assume a typical DM deployment where multiple CNs communicate with a pool of MNs via RDMA. Traditional designs place lock metadata on MNs, so each MN's RNIC must process a flood of one‑sided atomic operations (e.g., compare‑and‑swap); the first sketch after this list illustrates that pattern.
- Lock disaggregation design: Lotus relocates each lock from the MNs to a compute node. A lightweight lock table is kept in that CN's local memory, and lock ownership is advertised to MNs via a small "lock‑owner" directory that can be cached (see the lock‑table sketch after this list).
- Sharding algorithm: Locks are grouped by the primary data items they protect. Using workload statistics (e.g., hot keys), Lotus assigns groups to CNs so that most lock requests stay local, while a simple hash‑based fallback keeps the distribution even when hotspots shift (see the sharding sketch below).
- Transaction flow (sketched end‑to‑end after this list):
- Lock‑first phase: The coordinating CN sends batched lock‑acquire messages to the CNs that own the required locks (possibly including itself).
- Validation: If any acquisition fails, the transaction aborts immediately; no data reads have been issued yet.
- Execution phase: Once all locks are held, the CN performs RDMA reads/writes on the data residing in MNs.
- Commit/Release: Locks are released atomically after the write‑back, using non‑blocking RDMA writes.
- Recovery: Upon a CN failure, the system treats all locks held by that CN as expired. Since locks are not persisted, other CNs simply retry the aborted transactions without a costly lock‑reconstruction step; the lease check in the lock‑table sketch below shows one way to realize this.
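To make the baseline bottleneck concrete, here is a minimal sketch of the conventional DM locking pattern the system model describes. The `rdma` handle and its `compare_and_swap` call are hypothetical stand‑ins for a one‑sided RDMA atomics API, not an interface from the paper.

```python
# Hypothetical one-sided RDMA API; addresses and constants are illustrative.
UNLOCKED, LOCKED = 0, 1

def baseline_acquire(rdma, mn_conn, lock_addr: int) -> bool:
    # Every acquire is a remote atomic executed by the MN's RNIC. Under
    # contention, many CNs spin CAS requests at the same NIC, which must
    # serialize them in hardware -- the bottleneck Lotus removes by moving
    # lock state to the CNs.
    old = rdma.compare_and_swap(mn_conn, addr=lock_addr,
                                expected=UNLOCKED, desired=LOCKED)
    return old == UNLOCKED
```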
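Next, a minimal sketch of a CN‑local lock table, assuming exclusive locks keyed by data‑item ID with lease‑based expiry; the names (`LockTable`, `try_acquire`) and the lease details are illustrative assumptions, not the paper's API.

```python
import threading
import time

class LockEntry:
    __slots__ = ("owner_txn", "lease_expiry")
    def __init__(self, owner_txn: int, lease_s: float):
        self.owner_txn = owner_txn
        self.lease_expiry = time.monotonic() + lease_s

class LockTable:
    """Per-CN table of exclusive locks for the data items this CN manages."""
    def __init__(self):
        self._entries: dict[int, LockEntry] = {}
        self._mu = threading.Lock()  # stand-in for a finer-grained scheme

    def try_acquire(self, item_id: int, txn_id: int,
                    lease_s: float = 0.05) -> bool:
        with self._mu:
            entry = self._entries.get(item_id)
            # An expired lease (e.g., its owner CN crashed) counts as free;
            # this is what makes recovery lock-rebuild-free.
            if entry is None or time.monotonic() >= entry.lease_expiry:
                self._entries[item_id] = LockEntry(txn_id, lease_s)
                return True
            return entry.owner_txn == txn_id  # re-entrant for the same txn

    def release(self, item_id: int, txn_id: int) -> None:
        with self._mu:
            entry = self._entries.get(item_id)
            if entry is not None and entry.owner_txn == txn_id:
                del self._entries[item_id]
```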
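The sharding rule can be sketched as a two‑level lookup: a locality table for hot lock groups, and a hash fallback for everything else. The statistics inputs (`group_of`, `hot_group_owner`) are assumed shapes based on the paper's description, not its data structures.

```python
import hashlib

NUM_CNS = 8  # illustrative cluster size

def shard_for(item_id: int,
              group_of: dict[int, int],
              hot_group_owner: dict[int, int]) -> int:
    """Return the CN that manages the lock for item_id."""
    group = group_of.get(item_id)
    if group is not None and group in hot_group_owner:
        # Locality-aware placement: pin the group's locks to the CN that
        # issues most requests for it, so acquires stay node-local.
        return hot_group_owner[group]
    # Hash fallback spreads cold or unclassified items evenly across CNs,
    # keeping lock load balanced when hotspots shift.
    digest = hashlib.blake2b(item_id.to_bytes(8, "little"), digest_size=8)
    return int.from_bytes(digest.digest(), "little") % NUM_CNS
```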
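Putting the pieces together, here is an end‑to‑end sketch of the lock‑first flow. The transport objects (`net`, `data`) and their methods are hypothetical, and acquiring locks in a fixed item order is one simple deadlock‑avoidance choice for this sketch, not necessarily the paper's mechanism.

```python
def run_transaction(txn_id: int, read_set: set[int], write_set: set[int],
                    net, data, group_of, hot_group_owner) -> bool:
    items = sorted(read_set | write_set)  # fixed order avoids deadlock here

    # Phase 1 -- lock-first: contact the owning CNs (possibly this one)
    # before any data is touched on the MNs.
    acquired = []
    for item in items:
        owner = shard_for(item, group_of, hot_group_owner)
        if not net.send_lock_request(owner, item, txn_id):
            # Early abort: no RDMA data reads were issued, so a conflict
            # costs only small lock messages, not wasted data transfers.
            for held in acquired:
                net.send_unlock(shard_for(held, group_of, hot_group_owner),
                                held, txn_id)
            return False
        acquired.append(item)

    # Phase 2 -- execution: all locks held, so RDMA reads/writes on the
    # MN-resident data cannot race with conflicting transactions.
    values = {item: data.rdma_read(item) for item in read_set}
    for item in write_set:
        data.rdma_write(item, apply_txn_logic(item, values))  # hypothetical

    # Phase 3 -- commit/release: after write-back, drop every lock.
    for item in items:
        net.send_unlock(shard_for(item, group_of, hot_group_owner),
                        item, txn_id)
    return True
```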
Results & Findings
| Metric | Lotus vs. state‑of‑the‑art DM baseline |
|---|---|
| Throughput | Up to 2.1× higher |
| Average latency | 49.4 % lower (nearly half) |
| MN RNIC atomic‑op load | ≈ 70 % lower |
| Scalability | Near‑linear throughput growth as CNs are added, while the baseline plateaus due to NIC saturation |
| Recovery time | ≈ 30 % lower than lock‑rebuild approaches |
The experiments span YCSB‑style workloads and a TPC‑C‑like OLTP benchmark, showing that the lock‑first protocol sharply reduces the network traffic otherwise wasted on transactions that abort after their data reads.
Practical Implications
- For cloud providers: Deploying Lotus on disaggregated‑memory clusters (e.g., RDMA‑based memory‑pooling services) can improve resource utilization without extra hardware investment.
- For database engineers: Existing RDMA‑based transaction engines can adopt the lock‑first protocol and lock sharding logic to gain immediate performance wins, especially for workloads with high contention.
- For developers of micro‑services: When services share a common DM store, moving lock state to the service host (CN) reduces cross‑node latency, making fine‑grained transactional semantics feasible at scale.
- For system architects: The lock‑rebuild‑free recovery model simplifies failure handling, lowering the operational complexity of large‑scale DM deployments.
Limitations & Future Work
- Workload dependence: Lotus relies on locality patterns (e.g., hot keys staying on a few CNs). Highly random access patterns could degrade sharding balance and re‑introduce network hot spots.
- Memory overhead on CNs: Storing lock tables locally consumes additional memory on compute nodes, which may be constrained in some environments.
- Fault‑tolerance scope: The current design handles CN crashes gracefully but assumes MNs remain reliable; extending the model to tolerate MN failures is left for future research.
- Broader protocol integration: The authors plan to explore how Lotus interacts with multi‑version concurrency control (MVCC) and hybrid transaction models to further boost performance under mixed read‑write workloads.
Authors
- Zhisheng Hu
- Pengfei Zuo
- Junliang Hu
- Yizou Chen
- Yingjia Wang
- Ming-Chang Yang
Paper Information
- arXiv ID: 2512.16136v1
- Categories: cs.DC
- Published: December 18, 2025