Spinlocks vs. Mutexes: When to Spin and When to Sleep

Published: 2 days ago (December 7, 2025 at 07:38 PM EST)

Source: Hacker News

Introduction

You’re staring at `perf top` showing 60% of CPU time in `pthread_mutex_lock`. Your latency is terrible. Someone suggests “just use a spinlock,” and suddenly your 16‑core server is pegged at 100% doing nothing useful. This is the synchronization‑primitive trap: engineers often choose the wrong primitive because they don’t understand when each makes sense.

Spinlocks vs. Mutexes

Property	Mutex	Spinlock
Behavior	Sleeps when contention occurs.	Busy‑waits (spins) in userspace.
Cost when uncontended	25–50 ns (fast path)	One atomic `LOCK CMPXCHG` (≈40–80 ns).
Cost when contended	Syscall (`futex(FUTEX_WAIT)`) ≈ 500 ns + context switch (3–5 µs).	100 % CPU while looping; each failed attempt bounces the cache line between cores.
Preemptible contexts	Safe – thread can be descheduled.	Dangerous – if the holder is preempted, other threads waste a full timeslice.
Priority inversion	Solved with Priority‑Inheritance (PI) mutexes.	Cannot be solved; high‑priority thread may spin forever while low‑priority holder never runs.
False sharing	Same issue – every atomic invalidates the cache line.	Same issue – extra care needed to align locks.
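
The false‑sharing row can be made concrete with a small sketch. It assumes a 64‑byte cache line, which is typical on x86‑64 but should be checked for your target; `padded_lock_t`, `shard_locks`, and `padded_lock_stride` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines (typical on x86-64; verify for your CPU). */
#define CACHE_LINE_SIZE 64

/* Hypothetical padded lock: the alignment requirement rounds the struct
 * up to one full cache line, so two locks never share a line. */
typedef struct {
    alignas(CACHE_LINE_SIZE) atomic_int lock;  /* 0 = free, 1 = held */
} padded_lock_t;

/* Compile-time check that the padding really took effect. */
static_assert(sizeof(padded_lock_t) == CACHE_LINE_SIZE,
              "each lock should occupy exactly one cache line");

/* One lock per shard; adjacent entries sit on separate cache lines. */
static padded_lock_t shard_locks[8];

/* Byte distance between two adjacent locks in the array. */
static size_t padded_lock_stride(void) {
    return (size_t)((char *)&shard_locks[1] - (char *)&shard_locks[0]);
}
```

The trade‑off is memory: each lock now costs a full cache line instead of four bytes, which is usually acceptable for a handful of hot locks but adds up for per‑object locks.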

When to Use Which Primitive

Critical‑section duration	Contention level	Recommended primitive
> 10 µs or high contention	High	Regular mutex – let the scheduler handle sleeping/waking.
Real‑time requirements	Any	Priority‑Inheritance mutex on a PREEMPT_RT kernel.
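
For the real‑time row, a priority‑inheritance mutex can be requested through the standard POSIX mutex attributes. The sketch below is a minimal illustration, not a full real‑time setup: `pi_mutex_init` is a hypothetical helper name, and it assumes the platform supports `PTHREAD_PRIO_INHERIT` (otherwise `pthread_mutexattr_setprotocol` returns an error code).

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Hypothetical helper: initialise *m as a priority-inheritance mutex.
 * Returns 0 on success or the error code from the failing pthread call. */
static int pi_mutex_init(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for priority inheritance: while a high-priority thread waits,
     * the holder temporarily runs at the waiter's priority. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

After a successful call the mutex is used like any other, with `pthread_mutex_lock` and `pthread_mutex_unlock`. Inheritance only matters when the threads actually run under a real‑time scheduling policy with distinct priorities.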

Profiling Tips

CPU vs. context switches – Run:
```
perf stat -e context-switches,cache-misses 
```
- High context‑switch count with low CPU usage → mutex overhead may dominate.
- High cache‑misses with 100 % CPU → lock contention / false sharing.
Syscall count – Use `strace -c` to count `futex()` calls. Millions per second indicate a hot contended lock that might benefit from sharding or lock‑free techniques.
Voluntary vs. involuntary switches – Check `/proc/<pid>/status` (the `voluntary_ctxt_switches` and `nonvoluntary_ctxt_switches` fields). Involuntary switches while holding a spinlock suggest preemption problems.
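
The same voluntary and involuntary counters can also be read from inside the process with `getrusage`, which is handy for logging them before and after a benchmark run. This is a minimal sketch; `print_context_switches` is a hypothetical helper name.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: print this process's context-switch counters.
 * Returns 0 on success, -1 if getrusage fails. */
static int print_context_switches(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    /* ru_nvcsw:  voluntary switches (the thread blocked, e.g. on a mutex).
     * ru_nivcsw: involuntary switches (the scheduler preempted the thread). */
    printf("voluntary: %ld, involuntary: %ld\n", ru.ru_nvcsw, ru.ru_nivcsw);
    return 0;
}
```

Calling it once before and once after the contended phase and subtracting the two readings gives the number of switches attributable to that phase.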

Real‑World Examples

Redis – Uses spinlocks for its tiny job queue (critical sections).

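```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define NUM_THREADS   4
#define ITERATIONS    1000000
#define HOLD_TIME_NS  100   // loop iterations standing in for ~100 ns of work

typedef struct {
    atomic_int lock;      // 0 = free, 1 = held
    long       counter;   // protected by lock
} spinlock_t;

static void spinlock_acquire(spinlock_t *s) {
    int expected;
    do {
        expected = 0;     // a failed CAS overwrites expected, so reset it
    } while (!atomic_compare_exchange_weak_explicit(
                 &s->lock, &expected, 1,
                 memory_order_acquire, memory_order_relaxed));
}

static void spinlock_release(spinlock_t *s) {
    atomic_store_explicit(&s->lock, 0, memory_order_release);
}

static void *worker_thread(void *arg) {
    spinlock_t *s = (spinlock_t *)arg;
    for (long i = 0; i < ITERATIONS; i++) {
        spinlock_acquire(s);
        s->counter++;
        /* Simulate a short critical section (the real time depends on the CPU) */
        for (volatile int j = 0; j < HOLD_TIME_NS; j++) {
            // busy-wait
        }
        spinlock_release(s);
    }
    return NULL;
}

int main(void) {
    spinlock_t s = { .lock = 0, .counter = 0 };
    pthread_t threads[NUM_THREADS];

    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, worker_thread, &s);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    // Every increment happened under the lock, so the two numbers must match.
    printf("counter = %ld (expected %ld)\n",
           s.counter, (long)NUM_THREADS * ITERATIONS);
    return 0;
}
```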
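
A common refinement of the spinlock above is test‑and‑test‑and‑set: waiters spin on a plain load and only attempt the atomic write once the lock looks free, which reduces the cache‑line bouncing described earlier. The sketch below also yields to the scheduler after a bounded number of spins so that a preempted holder gets a chance to run. `ttas_lock_t`, its functions, and the `SPIN_LIMIT` value are illustrative assumptions, not taken from any particular library.

```c
#include <sched.h>
#include <stdatomic.h>

/* Assumption: arbitrary spin budget; tune it by measurement. */
#define SPIN_LIMIT 1000

typedef struct {
    atomic_int lock;  /* 0 = free, 1 = held */
} ttas_lock_t;

static void ttas_acquire(ttas_lock_t *l) {
    int spins = 0;
    for (;;) {
        /* Read-only wait: the line stays shared in the local cache, so
         * waiting cores do not keep invalidating each other's copies. */
        while (atomic_load_explicit(&l->lock, memory_order_relaxed) != 0) {
            if (++spins >= SPIN_LIMIT) {
                sched_yield();  /* let a possibly preempted holder run */
                spins = 0;
            }
        }
        /* The lock looked free: try to take it with one atomic exchange. */
        if (atomic_exchange_explicit(&l->lock, 1, memory_order_acquire) == 0)
            return;
    }
}

static void ttas_release(ttas_lock_t *l) {
    atomic_store_explicit(&l->lock, 0, memory_order_release);
}
```

Yielding is only a partial answer to preemption: `sched_yield` does not guarantee that the lock holder runs next, so for long or heavily contended critical sections a regular mutex remains the safer choice.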

Illustrations

Figure: Spinlock vs. Mutex performance diagram

Figure: Profiling context switches and cache misses
