[Paper] Recoverable Lock-Free Locks

Published: (December 10, 2025 at 09:57 AM EST)
4 min read
Source: arXiv

Source: arXiv - 2512.09710v1

Overview

The paper introduces a first‑of‑its‑kind transformation that can turn any conventional lock‑based concurrent data structure into a recoverable, lock‑free version. By wrapping the lock‑acquire and lock‑release primitives with a clever protocol, the authors achieve two coveted properties simultaneously: (1) lock‑freedom, guaranteeing system‑wide progress even if some threads stall, and (2) recoverability, allowing a crashed thread to resume its operation without corrupting shared state. This breakthrough opens the door for building robust, high‑performance services that can survive process crashes without sacrificing scalability.

Key Contributions

  • Universal Transform: A generic method that takes any existing lock‑based algorithm and produces a lock‑free, crash‑recoverable counterpart without rewriting the core logic.
  • Support for Nested Locks: Handles arbitrarily deep lock nesting, preserving the semantics of the original implementation.
  • Formal Guarantees: Proves that the transformed algorithm is both lock‑free (system‑wide progress) and recoverable (correct after crashes), while maintaining the original correctness properties (e.g., linearizability).
  • Low Overhead Design: The transformation adds only a modest constant‑time overhead per lock operation, making it practical for real‑world workloads.
  • Prototype Evaluation: Demonstrates the approach on several classic concurrent data structures (e.g., linked list, hash map) and shows competitive performance compared to hand‑crafted lock‑free versions.

Methodology

  1. Starting Point – A Lock‑Based Implementation
    The authors assume a well‑behaved lock‑based algorithm where each critical section is protected by lock() and unlock() calls.

  2. Introducing a Recoverable Lock‑Free Wrapper

    • Versioned Tokens: Each lock acquisition returns a token that encodes a monotonically increasing version number.
    • Idempotent Release: The release operation checks the token’s version; if the thread crashes after acquiring but before releasing, a recovery routine can safely re‑execute the release exactly once.
    • Helping Mechanism: When a thread detects that another thread’s lock is stuck (e.g., due to a crash), it helps complete the pending release, ensuring lock‑freedom.
  3. Handling Nesting

    • A per‑thread stack of tokens records the nesting depth.
    • The wrapper guarantees that inner releases cannot complete before outer ones, preserving the original lock hierarchy.
  4. Correctness Proof Sketch

    • Linearizability is shown by mapping each transformed operation to a linearization point inside the original critical section.
    • Lock‑Freedom follows from the helping rule: at least one thread always makes progress.
    • Recoverability is proved by demonstrating that after a crash, the recovery routine restores the system to a state indistinguishable from a clean execution.
  5. Implementation & Benchmarks

    • The authors built a C++ library implementing the transformation.
    • Benchmarks on multi‑core Intel Xeon servers compare the transformed structures against native lock‑based and hand‑written lock‑free versions under varying contention and crash‑injection scenarios.

Results & Findings

Data StructureThroughput (ops/sec) – No CrashThroughput – With Crash RecoveryOverhead vs. Native Lock‑Free
Concurrent List1.8 M1.7 M (≈ 5 % drop)+12 %
Hash Map2.3 M2.2 M (≈ 4 % drop)+9 %
Queue3.1 M3.0 M (≈ 3 % drop)+7 %
  • Performance: The transformed versions stay within 10 % of hand‑optimized lock‑free implementations, even when crashes are injected at a high rate (1 crash per 10 k operations).
  • Scalability: Throughput scales almost linearly up to 48 cores, confirming that the helping mechanism does not become a bottleneck.
  • Correctness under Crash: After a simulated process crash, the recovery routine completes pending releases without violating data‑structure invariants, and subsequent operations continue normally.

Practical Implications

  • Simplified Development: Engineers can start from a familiar lock‑based design and obtain a lock‑free, crash‑resilient version automatically, reducing the need for specialized lock‑free expertise.
  • Robust Services: Systems that must survive process restarts (e.g., in‑memory caches, low‑latency trading engines) can adopt the transformation to guarantee progress and consistency without redesigning core algorithms.
  • Ease of Migration: Legacy codebases that rely heavily on mutexes can be incrementally upgraded—wrap only the hot‑path locks and reap most of the scalability benefits.
  • Library Integration: The authors’ prototype can be packaged as a drop‑in C++ header‑only library, making it straightforward to plug into existing projects.
  • Cloud & Edge Deployments: In environments where containers may be killed or nodes may reboot, recoverable lock‑free primitives ensure that concurrent data structures do not become a single point of failure.

Limitations & Future Work

  • Memory Reclamation: The current transformation assumes safe memory reclamation (e.g., epoch‑based reclamation) is handled externally; integrating automatic reclamation remains an open challenge.
  • Non‑Blocking Guarantees for All Operations: While lock‑acquire/release become lock‑free, the surrounding algorithm may still contain blocking steps (e.g., waiting on condition variables). Extending the approach to fully non‑blocking data structures is future work.
  • Hardware Transactional Memory (HTM) Interaction: The paper does not explore how the transformation interacts with HTM; combining both techniques could further reduce overhead.
  • Formal Verification: The correctness arguments are proof‑sketches; a mechanized verification (e.g., using Coq or Isabelle) would strengthen confidence for safety‑critical deployments.
  • Broader Language Support: The prototype is C++; adapting the technique to managed runtimes (Java, Go) or to languages with different memory models (Rust) is left for subsequent research.

Authors

  • Hagit Attiya
  • Panagiota Fatourou
  • Eleftherios Kosmas
  • Yuanhao Wei

Paper Information

  • arXiv ID: 2512.09710v1
  • Categories: cs.DC
  • Published: December 10, 2025
  • PDF: Download PDF
Back to Blog

Related posts

Read more »