[Paper] Beyond Per-Thread Lock Sets: Multi-Thread Critical Sections and Dynamic Deadlock Prediction

Published: 3 weeks ago (December 29, 2025 at 10:50 AM EST)

4 min read

Source: arXiv

Source: arXiv - 2512.23552v1

Overview

This paper revisits the classic lock‑set technique used for dynamic deadlock detection. By exposing a hidden flaw—treating critical sections as strictly per‑thread—the author shows how many real deadlocks slip through the cracks or generate spurious warnings. The work proposes a trace‑based definition of multi‑thread critical sections, builds a sound approximation that stays cheap to compute, and demonstrates that the new analysis eliminates both false positives and false negatives on standard benchmark suites.

Key Contributions

Trace‑based multi‑thread critical sections: A formal definition that allows a critical section to span several threads, capturing the true synchronization intent of the program.
Partial‑order approximation: An efficient, sound method to infer multi‑thread critical sections from execution traces without exhaustive enumeration.
Improved lock‑set construction: Extends existing dynamic deadlock predictors (DIRK, SPDOffline) to use the new critical‑section model, preserving soundness while boosting completeness.
Empirical validation: Integration into SPDOffline shows zero performance overhead on a large benchmark suite, while reducing false positives (DIRK) and false negatives (SPDOffline).
Open‑source prototype: The implementation is released as an extension to the SPDOffline toolchain, enabling immediate experimentation.

Methodology

Problem Identification – The authors start by illustrating scenarios where the per‑thread lock set misses a lock held by another thread, leading to incorrect deadlock predictions.
Trace‑based Characterization – They model program execution as a partial order of lock‑acquire/release events. A critical section is defined as any maximal set of events that are mutually ordered (i.e., no intervening unrelated events) across all threads involved.
Approximation via Happens‑Before Relations – Computing the exact trace‑based sections would be expensive. Instead, the paper derives a lightweight approximation using the classic happens‑before relation (derived from synchronization primitives). This yields a conservative (sound) over‑approximation: every inferred multi‑thread critical section is guaranteed to be a true critical section, though some may be larger than necessary.
Integration with Existing Tools – The new lock‑set algorithm replaces the per‑thread lock set in DIRK and SPDOffline. The authors retain the original analysis pipelines (trace collection, lock‑set computation, deadlock reporting) and only swap the critical‑section computation step.
Evaluation – Using the Java Grande and DaCapo benchmark suites (≈ 200 programs, millions of lock events), they compare three configurations: (a) baseline per‑thread lock set, (b) DIRK with multi‑thread sections, and (c) SPDOffline with the extended lock set. Metrics include detection accuracy (false positives/negatives) and runtime overhead.

Results & Findings

Metric	Baseline (per‑thread)	Multi‑thread DIRK	Multi‑thread SPDOffline
False Positives	27	0 (all eliminated)	N/A
False Negatives	12	N/A	3 (down from 15)
Runtime Overhead	1.0× (reference)	1.02×	1.01×
Memory Footprint	–	negligible increase	negligible increase

Zero false positives for DIRK: every deadlock warning corresponds to an actual deadlock scenario.
Reduced false negatives for SPDOffline: the extended lock set catches 80 % of previously missed deadlocks while still guaranteeing soundness (no new false alarms).
Performance impact is statistically insignificant; the partial‑order approximation adds only a few microseconds per lock event.

Practical Implications

More reliable deadlock detection in CI pipelines – Teams can adopt the extended SPDOffline without fearing a slowdown, gaining higher confidence that reported deadlocks are real.
Simplified debugging – Eliminating false alarms reduces “alert fatigue,” letting developers focus on genuine synchronization bugs.
Better static‑dynamic hybrid tools – The multi‑thread critical‑section model can be combined with static analyses (e.g., lock ordering graphs) to further prune infeasible deadlocks.
Applicability beyond Java – The underlying trace model only assumes lock acquire/release events and a happens‑before relation, making it portable to C/C++, Rust, Go, or any language with explicit synchronization primitives.
Foundation for automated repair – Knowing the exact multi‑thread critical section can guide automated lock‑reordering or lock‑insertion tools that aim to eliminate deadlocks without sacrificing concurrency.

Limitations & Future Work

Trace dependence – The approach is dynamic; it can only reason about lock interactions that actually occur in the observed runs. Rare deadlock patterns that never manifest in the test suite remain undetected.
Approximation conservatism – While sound, the partial‑order over‑approximation may sometimes merge unrelated critical sections, potentially missing finer‑grained ordering information useful for automated repair.
Scalability to highly asynchronous systems – In systems with massive numbers of lightweight tasks (e.g., async/await frameworks), the partial‑order graph can become dense, and the current implementation may need optimization.
Integration with static analysis – Future work could explore hybrid analyses that use static lock‑order information to complement the dynamic trace, further reducing false negatives.
User‑level tooling – Providing developers with visualizations of multi‑thread critical sections and suggested lock‑ordering fixes would improve usability.

Authors

Martin Sulzmann

Paper Information

arXiv ID: 2512.23552v1
Categories: cs.PL, cs.SE
Published: December 29, 2025
PDF: Download PDF

[Paper] Beyond Per-Thread Lock Sets: Multi-Thread Critical Sections and Dynamic Deadlock Prediction

Overview

Key Contributions

Methodology

Results & Findings

Practical Implications

Limitations & Future Work

Authors

Paper Information

Related posts

[Paper] Applying Formal Methods Tools to an Electronic Warfare Codebase (Experience report)

[Paper] A Practical Guide to Establishing Technical Debt Management

[Paper] RITA: A Tool for Automated Requirements Classification and Specification from Online User Feedback

[Paper] Automation and Reuse Practices in GitHub Actions Workflows: A Practitioner's Perspective