Writing a tiny PID 1 for containers in pure assembly (x86-64 + ARM64)

Published: December 6, 2025 at 01:43 PM EST

Source: Dev.to

Most of us don’t think much about PID 1 when building Docker images. We just slap a CMD on the Dockerfile, run the container, and move on—until one day:

docker stop hangs until its timeout expires, then kills the container
Ctrl+C doesn’t terminate your container
you discover a pile of zombie processes inside

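All of these symptoms point to the same root cause: your application is running as PID 1 and doesn’t behave like an init process. In Linux, PID 1 has special semantics. Signals whose disposition is left at the default are not delivered to it (apart from SIGKILL and SIGSTOP sent from outside the container), so a SIGTERM without a handler is silently dropped. PID 1 also becomes the parent of every orphaned process in the container and is expected to reap it. Normal apps rarely implement either correctly.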

Tools like Tini solve this brilliantly: a tiny process that runs as PID 1, forwards signals to your app, and reaps zombies. Docker even ships with Tini built‑in via --init.

In this post, I’ll walk through an alternative implementation: mini-init-asm, a small PID 1 designed for containers, written entirely in x86‑64 NASM and ARM64 GAS. It’s not meant to replace Tini everywhere; instead it is:

a PGID‑first init for containers (it always uses a separate session and process group)
a pure‑assembly implementation of the same core ideas
a home for a few extra tricks such as restart‑on‑crash

Design goals

Before writing a single line of assembly, I set a few constraints.

Behave like a responsible PID 1

Forward termination signals to the whole process group
Reap zombies, including grandchildren if needed (sub‑reaper mode)
Exit with a meaningful status (child exit code or 128+signal style)
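
To make the sub‑reaper idea concrete, here is a minimal C sketch (C rather than assembly, for readability). It assumes Linux and the prctl(PR_SET_CHILD_SUBREAPER) call; the function name reap_orphaned_grandchild is hypothetical and not taken from mini-init-asm. The process marks itself as a sub‑reaper, creates a grandchild whose parent exits immediately, and counts how many processes it ends up reaping.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: become a child sub-reaper, orphan a grandchild,
 * and return how many processes this process reaped (or -1 on error). */
int reap_orphaned_grandchild(void)
{
    /* From now on, orphaned descendants are reparented to this process. */
    if (prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        pid_t grandchild = fork();
        if (grandchild == 0) {
            usleep(100 * 1000);   /* outlive the intermediate parent */
            _exit(7);
        }
        _exit(0);                 /* exit without waiting: orphans the grandchild */
    }

    int reaped = 0;
    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);
        if (pid > 0) {
            reaped++;
            continue;
        }
        if (pid < 0 && errno == EINTR)
            continue;
        break;                    /* ECHILD: nothing left to reap */
    }
    return reaped;
}
```

With sub‑reaper mode enabled the function should see two exits: the direct child and the reparented grandchild. A real PID 1 gets this behaviour automatically; the prctl call matters when the init is not PID 1 of its PID namespace.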

Be small and auditable

No libc, no runtime, no hidden magic
A single statically‑linked binary per architecture
Clear, reviewable control flow

Be container‑friendly

Easy to drop into FROM scratch images
Explicit support for graceful shutdown (grace period + SIGKILL escalation)
Optional restart logic, but not a full‑blown process manager

Support amd64 and arm64 from day one

x86‑64 NASM for the “normal” Docker host
ARM64 GAS for modern ARM servers and SBCs

The container PID 1 problem in one picture

When your app runs directly as PID 1, everything inside the container hangs off it:

[Figure: Container PID 1 problem]

If your‑app:

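ignores SIGTERM, SIGINT, etc., docker stop stalls until its timeout (10 seconds by default) and then falls back to SIGKILL; Kubernetes does the same once the pod’s termination grace period ends
never calls wait() / waitpid(), exited children (including orphans reparented to PID 1) stay zombies for as long as the container runs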

An init like Tini or mini-init-asm inserts itself as PID 1 and makes your app “just another process” with a normal parent:

[Figure: Init inserts itself as PID 1]

PID 1 now:

forwards signals to a process group
reaps zombies
decides when to exit and with what status
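
Forwarding to a process group comes down to kill() with a negative PID. The following C sketch illustrates that under a few assumptions: forward_to_pgid and demo_group_forward are hypothetical names, and the demo uses setpgid from both parent and child (a common idiom to avoid a race) rather than the setsid call that mini-init-asm uses.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send sig to every process in the group pgid (equivalent to killpg). */
int forward_to_pgid(pid_t pgid, int sig)
{
    if (pgid <= 1) {          /* kill(-1, ...) would signal almost everything */
        errno = EINVAL;
        return -1;
    }
    return kill(-pgid, sig);
}

/* Hypothetical demo: put a child in its own group, signal the group,
 * and return the signal that terminated the child (or -1). */
int demo_group_forward(int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);        /* child: become leader of a new group */
        for (;;)
            pause();
    }

    /* Set the group from the parent as well, so it is guaranteed to
     * exist before the signal is sent. */
    setpgid(pid, pid);

    if (forward_to_pgid(pid, sig) < 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Signalling the group rather than a single PID means helpers started by the app (shell pipelines, worker processes) receive the same signal, provided they have not moved to another group.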

High‑level architecture of `mini-init-asm`

mini-init-asm follows a PGID‑centric design:

Block signals in PID 1.
Spawn a child under a new session + process group (PGID = child PID).
Create:
- a signalfd listening to HUP, INT, QUIT, TERM, CHLD plus optional extra signals
- a timerfd for the graceful shutdown window
- an epoll instance watching both fds

Run an event loop on epoll_wait:
- Soft signals (TERM/INT/HUP/QUIT): forward to the whole process group and start the grace timer.
- SIGCHLD: reap children with waitpid(-1, WNOHANG) and track the main child.
- Timer expiry: if the child is still alive, send SIGKILL to the process group.

On exit, mini-init-asm returns:

the child’s exit status (normal exit), or
BASE + signal_number if the child died by a signal.

The base is customizable via EP_EXIT_CODE_BASE, defaulting to 128 (POSIX shell convention).
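
As a rough C sketch of that mapping (the function names are hypothetical, and the base is passed in as a parameter instead of being read from EP_EXIT_CODE_BASE):

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Translate a wait status into the exit code the init should return.
 * base corresponds to EP_EXIT_CODE_BASE (128 by default). */
int exit_code_from_status(int status, int base)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);      /* normal exit: pass the code through */
    if (WIFSIGNALED(status))
        return base + WTERMSIG(status);  /* killed by a signal */
    return base;  /* assumption: not reachable without WUNTRACED/WCONTINUED */
}

/* Hypothetical demo: run a child that either exits with exit_code or
 * kills itself with fatal_sig, and return the translated code. */
int run_child_and_translate(int exit_code, int fatal_sig, int base)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (fatal_sig)
            raise(fatal_sig);
        _exit(exit_code);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return exit_code_from_status(status, base);
}
```

With the default base, a child killed by SIGKILL (signal 9) maps to 137 and one killed by SIGTERM (signal 15) maps to 143, which is what shells and most orchestration tooling expect.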

Sequence: from `docker run` to graceful shutdown

Running the init looks like:

mini-init-amd64 -- ./your-app --flag

The flow from docker run to graceful shutdown is illustrated below:

[Figure: Docker run → graceful shutdown]

If the child ignores SIGTERM and is still alive when the timer expires, mini-init-asm escalates:

[Figure: Escalation to SIGKILL]

Pure‑assembly implementation: structure

The repository is organized to keep the assembly readable and reviewable:

src/amd64/   # NASM sources (SysV ABI, x86‑64)
src/arm64/   # GAS sources (AArch64)
include/syscalls_*.inc   # syscall numbers per arch
include/macros*.inc      # helpers for syscalls / logging

Example syscall wrapper (NASM)

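; rax = syscall number
; rdi, rsi, rdx, r10, r8, r9 = args
; Clobbers rcx and r11 (overwritten by the syscall instruction itself).

%macro SYSCALL 0
    syscall
    cmp rax, 0
    jge %%ok
    ; handle -errno in rax if needed...
%%ok:                       ; %% makes the label unique per macro expansion
%endmacro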
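
For readers less used to raw syscalls: without libc there is no errno variable, and the kernel reports failure by returning a value between -4095 and -1. The macro above treats any negative value as an error, which is fine for the syscalls used here. A C sketch of the stricter range check, with hypothetical helper names, looks like this:

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_ERRNO 4095

/* A raw Linux syscall failed if its return value is in [-4095, -1]. */
bool raw_syscall_failed(long ret)
{
    return ret < 0 && ret >= -MAX_ERRNO;
}

/* Recover the errno value from a raw return value (0 if it succeeded). */
int raw_syscall_errno(long ret)
{
    return raw_syscall_failed(ret) ? (int)-ret : 0;
}
```

The range check only matters for syscalls that can legitimately return large values, such as mmap; for file descriptors and PIDs a plain sign test gives the same answer.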

Forking and execing the child (NASM)

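; 1) Fork/clone a child
mov     eax, SYS_clone
mov     rdi, SIGCHLD          ; flags: no CLONE_* bits, so this behaves like fork()
xor     rsi, rsi              ; child_stack = NULL (child runs on a copy of the parent's stack)
xor     rdx, rdx              ; parent_tid = NULL
xor     r10, r10              ; child_tid = NULL
xor     r8,  r8               ; tls = 0
xor     r9,  r9               ; unused by clone
syscall

cmp     rax, 0
je      .in_child
jl      .fork_error

; ----- Parent (PID 1) -----
; rax = child_pid
mov     [child_pid], rax
; continue with signalfd/epoll setup...
jmp     .parent_after_fork

.in_child:
    ; 2) Create new session and PGID
    mov     eax, SYS_setsid
    syscall
    ; setsid() also makes the child the leader of a new process group
    ; (PGID = its own PID). A follow-up setpgid(0, 0) would be redundant
    ; and fails with EPERM for a session leader, so it is omitted.

    ; The signal mask blocked in PID 1 is inherited across clone and
    ; execve, so restore it before step 3. saved_sigmask and SIG_SETMASK
    ; are assumed names: the mask saved before blocking, and the constant 2.
    mov     eax, SYS_rt_sigprocmask
    mov     edi, SIG_SETMASK
    lea     rsi, [rel saved_sigmask]
    xor     edx, edx              ; oldset = NULL
    mov     r10d, 8               ; sizeof(kernel sigset_t) on x86-64
    syscall

    ; 3) execve() target program
    mov     eax, SYS_execve
    mov     rdi, [target_path]
    mov     rsi, [target_argv]
    mov     rdx, [target_envp]
    syscall

    ; If execve returns, it's an error → exit(127)
    mov     edi, 127
    mov     eax, SYS_exit
    syscall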

On the ARM64 side the logic is analogous, using x8 for the syscall number, x0‑x5 for arguments, and svc #0 in place of the syscall instruction.

The `epoll` + `signalfd` + `timerfd` loop

The main event loop is where most of the logic lives. In pseudo‑C:

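for (;;) {
    int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
    if (n < 0) {
        if (errno == EINTR) continue;
        break;                              /* unexpected epoll failure */
    }

    for (int i = 0; i < n; i++) {
        if (events[i].data.fd == signalfd_fd) {
            struct signalfd_siginfo si;
            if (read(signalfd_fd, &si, sizeof(si)) != sizeof(si)) continue;
            int sig = si.ssi_signo;

            if (is_soft_shutdown(sig)) {
                forward_to_pgid(sig);
                if (!grace_timer_armed) {
                    arm_timerfd(grace_seconds);
                    grace_timer_armed = true;
                }
            } else if (sig == SIGCHLD) {
                /* One SIGCHLD can stand for several exits, so this reaps
                   until waitpid(-1, WNOHANG) reports nothing left. */
                reap_children();
                if (main_child_exited) exit(exit_code);  /* restart logic omitted */
            }
        } else if (events[i].data.fd == timerfd_fd) {
            /* Grace period expired: force kill */
            uint64_t expirations;
            read(timerfd_fd, &expirations, sizeof(expirations));  /* clear readiness */
            killpg(pgid, SIGKILL);
        }
    }
}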

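The actual assembly implements the same state machine with raw syscalls and no external libraries: epoll_wait, read, kill, waitpid, and exit. At the syscall level waitpid means wait4, since neither x86‑64 nor ARM64 has a waitpid syscall, and ARM64 provides epoll_pwait rather than epoll_wait.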
