The Engine Under the Hood: Go’s GMP, Java’s Locks, and Erlang’s Heaps

Published: January 11, 2026 at 05:09 AM EST
Source: Dev.to

Introduction

As backend engineers, we often treat concurrency as a black box: we write go func(){}() or spawn() and expect magic. Understanding how the runtime actually schedules that work is what separates a senior engineer from an architect.

The GMP Scheduler

Go’s scheduler follows the G‑M‑P model:

| Component | Description |
| --- | --- |
| G (Goroutine) | Lightweight user‑space thread (its stack starts at ~2 KB and grows on demand). Holds the instruction pointer and stack. |
| M (Machine) | An OS thread managed by the kernel. It is the actual worker that executes CPU instructions. |
| P (Processor) | A logical token that owns a local run queue and a per‑P memory allocation cache. An M must hold a P to execute a G. |
  • Rule: An M must have a P to run a G.
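  • P = logical cores: By default GOMAXPROCS equals the number of logical CPUs available to the process. This caps parallelism (how many goroutines run Go code at the same instant), while concurrency (how many goroutines exist) is bounded only by memory.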
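
To make the P limit concrete, here is a minimal sketch that reads both numbers from the runtime. The helper name schedulerLimits is made up for this example; runtime.NumCPU and runtime.GOMAXPROCS are standard library calls, and passing 0 to GOMAXPROCS only queries the value.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerLimits returns the number of logical CPUs the runtime sees and the
// current number of Ps. GOMAXPROCS(0) reports the setting without changing it.
func schedulerLimits() (cpus, procs int) {
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	cpus, procs := schedulerLimits()
	fmt.Printf("logical CPUs: %d, Ps (GOMAXPROCS): %d\n", cpus, procs)
}
```

The two values usually match unless GOMAXPROCS was overridden through the environment variable or a call to runtime.GOMAXPROCS.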

When is a G created?

A goroutine is created whenever a go statement runs, for example go func(){}(). It is allocated in user space by the Go runtime, starts with a ~2 KB stack, and is normally placed on the local run queue of the current P.

When is an M created?

The runtime keeps the M count low, spawning a new OS thread only when:

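  1. A goroutine makes a blocking system call or a cgo call (e.g., some file I/O) that the runtime cannot handle asynchronously.
  2. The current M is then stuck inside the OS kernel, so the runtime hands its P to another M to keep the remaining goroutines running.
  3. A P has runnable work but no idle M is available to run it. An OS thread is far heavier than a goroutine; the ~1–2 MB figure often quoted depends on the platform's thread stack size.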

The Watcher: sysmon and SIGURG

What is sysmon?

sysmon (system monitor) is a special runtime thread that runs on a dedicated M and does not hold a P. It wakes up periodically (its sleep interval starts at 20 µs and backs off to as much as 10 ms when there is nothing to do) to enforce fairness.

How preemption works

Since Go 1.14, the scheduler uses signals to preempt long‑running goroutines asynchronously, even when they never reach a cooperative yield point:

  1. sysmon scans all Ps. If it finds a goroutine that has run on a processor for > 10 ms, it sends a SIGURG to the M executing that goroutine.
  2. Why SIGURG?
    • Out‑of‑band: rarely used by modern apps, so it doesn’t clash with user signals.
    • Non‑destructive: unlike SIGINT, it does not terminate the process.
    • Libc‑safe: safe for programs that use CGO.
  3. The OS interrupts the M; Go’s signal handler injects a call to asyncPreempt onto the goroutine’s stack.
  4. The goroutine yields, is moved to the global run queue, and the P picks a new G to run.

Model Comparison: “Communicate by Sharing Memory” vs. “Share Memory by Communicating”

Go / Java: Shared Heap

All threads share the same heap. Data is passed by mutating shared objects.

Failure Mode (Java Example)

// Java: Explicit Locking (The Bottleneck)
class Counter {
    private int count = 0;

    // synchronized lets only one thread at a time run this method;
    // under contention, waiting threads may be parked by the OS (context switch)
    public synchronized void increment() {
        count++;
    }
}
  • Race conditions: Forgetting synchronized leads to corrupted data.
  • Performance: An uncontended lock is cheap, but under contention waiting threads can be parked by the OS, and each park and wake‑up can cost thousands of cycles.
  • Deadlocks: Circular waiting can freeze the application.

Erlang: Private Heaps

Each process has its own heap, eliminating “noisy neighbor” effects.

Why Erlang Is “Better” (Bank Example)

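-module(bank_server).
-behaviour(gen_server).
-export([start_link/0, trigger_crash/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% 1. The Safe Bank Process
init([]) -> {ok, 100}.   %% Balance is $100

%% Minimal callbacks required by the gen_server behaviour
handle_call(balance, _From, Balance) -> {reply, Balance, Balance}.
handle_cast(_Msg, Balance) -> {noreply, Balance}.

%% 2. The Dangerous Crash Process
trigger_crash() ->
    spawn(fun() ->
        %% A. This allocates roughly 1 GB or more on a PRIVATE heap
        _CrashList = lists:seq(1, 100000000),
        %% B. Crashes immediately (badarith)
        1 / 0
    end).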
  • Allocation: The spawned process allocates roughly a gigabyte on its private heap. In Java/Go the same allocation would land on the shared heap, adding work for the garbage collector that every thread depends on.
  • Crash: The process dies (divide‑by‑zero).
  • Cleanup: The Erlang VM simply discards the private heap.
  • Zero GC cost: No need to scan the memory of other processes.
  • Minimal impact: The bank_server keeps serving its $100 balance from its own heap; the crash of the other process never touches its state.

Final Takeaway

  • Java’s shared‑memory model places a heavy correctness burden on engineers, making large‑scale concurrency harder to reason about.
  • Erlang excels in reliability because private heaps prevent “noisy neighbors” from affecting the whole system.
  • Go offers a pragmatic middle ground: it uses a shared heap for raw speed (no data copying) while encouraging CSP‑style communication (channels) to avoid the complexity of explicit locks.