Why Docker Breaks Inside MicroVMs (Part 1): The Linux Assumptions You Didn’t Know You Were Relying On
Source: Dev.to
The Initial Failure
We tried running Docker inside a microVM and it failed before the first container even started:
cgroup mountpoint does not exist
On a normal EC2 instance the same Docker binary works without issue. The problem wasn’t Docker itself or a kernel bug—it was something more subtle: we were relying on parts of Linux that weren’t present in the microVM.
Cgroup Assumptions
Docker’s error mentioned cgroups, so we inspected the filesystem:
ls /sys/fs/cgroup
No output.
mount | grep cgroup
Again, nothing.
On a typical Linux system /sys/fs/cgroup exists automatically because something mounts it during boot. Inside the microVM that step never happened, so Docker tried to create its cgroup hierarchy and the kernel responded with “there’s no interface here”.
We manually mounted the hierarchy:
mount -t cgroup2 none /sys/fs/cgroup
Docker progressed a bit further, but then hit the next wall. This taught us to ask “What is Docker assuming exists right now?” rather than simply “Why is Docker failing?”.
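The check-and-mount step above can be sketched as a small guard script. This is a sketch under the assumption of a cgroup2-only (unified hierarchy) guest; run it as root before starting dockerd.

```shell
#!/bin/sh
# Sketch: make sure the cgroup2 hierarchy Docker expects actually exists.
# Assumes a cgroup2-only guest; must run as root.

mkdir -p /sys/fs/cgroup
if ! mountpoint -q /sys/fs/cgroup; then
    mount -t cgroup2 none /sys/fs/cgroup
fi

# A working cgroup2 mount exposes the controllers the kernel offers:
cat /sys/fs/cgroup/cgroup.controllers

# Delegate the controllers Docker needs to child groups:
echo "+cpu +memory +pids" > /sys/fs/cgroup/cgroup.subtree_control
```

If `cgroup.controllers` comes back empty, the problem is one level deeper: the guest kernel was built without those controllers, and no amount of mounting will conjure them.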
Systemd and the Boot Process
A full Linux distribution (with systemd) performs many tasks before you ever log in:
- mounts /proc, /sys, and /dev
- sets up cgroups
- initializes networking
- prepares the runtime environment
A microVM provides none of these automatically—there is no systemd unless you explicitly include it. Consequently, missing or incomplete /proc or /sys mounts are never fixed later, and you effectively have to recreate the boot process yourself.
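"Recreating the boot process yourself" concretely means a minimal init script that performs the mounts systemd would have done. The following is a sketch, not the article's actual init; the mount list covers what Docker needed in this walkthrough, and it must run as (or very early after) PID 1 in the guest.

```shell
#!/bin/sh
# Minimal init sketch for a microVM guest: recreate the early-boot
# mounts that a systemd-based distribution performs automatically.

mount -t proc     proc     /proc
mount -t sysfs    sysfs    /sys
mount -t devtmpfs devtmpfs /dev

mkdir -p /dev/pts /dev/shm
mount -t devpts devpts /dev/pts
mount -t tmpfs  tmpfs  /dev/shm

# The step that was missing in the failure above:
mount -t cgroup2 none /sys/fs/cgroup

# Bring up loopback so anything binding to localhost works.
ip link set lo up
```

Everything here is idempotent enough to re-run while debugging, which matters when you are iterating on a guest image.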
Networking Stack Inside Containers
After fixing the basic mounts, Docker began initializing containers, but networking broke. Understanding how a container reaches the internet clarified the issue.
Container Network Namespace
- Each container gets its own IP (e.g., 172.17.x.x).
- It does not share the host’s network interface.
- It runs in its own network namespace.
Docker’s Network Setup
- Docker creates a new network namespace for the container.
- It creates a veth pair: one end stays on the host, the other moves into the container.
- The host side of the veth is attached to the docker0 bridge, allowing containers to talk to each other.
- NAT is applied so that packets leaving the container have their source IP rewritten to the host’s IP, enabling external connectivity.
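The steps above can be reproduced by hand, which is a useful way to see exactly which piece is missing. This is a sketch: the namespace name (`ctr0`), veth names, and the 172.17.0.0/16 addressing are illustrative, and it assumes a `docker0` bridge already exists.

```shell
#!/bin/sh
# Sketch: what Docker's bridge networking does for one container,
# done manually. Run as root; names and addresses are illustrative.

# 1. New network namespace for the "container".
ip netns add ctr0

# 2. veth pair: one end stays on the host, the other moves into the namespace.
ip link add veth0 type veth peer name veth1
ip link set veth1 netns ctr0

# 3. Host side: attach to the docker0 bridge.
ip link set veth0 master docker0
ip link set veth0 up

# 4. Container side: address, loopback, default route via the bridge.
ip -n ctr0 addr add 172.17.0.2/16 dev veth1
ip -n ctr0 link set veth1 up
ip -n ctr0 link set lo up
ip -n ctr0 route add default via 172.17.0.1

# 5. NAT: rewrite the source IP of traffic leaving the container network.
iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
```

Step 5 is the one that later collides with the nftables error described below: `iptables` here is just a front end, and which kernel backend it talks to depends on the guest kernel's configuration.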
Additional Layer in a MicroVM
In our setup the container ran inside a microVM, which itself had a virtual NIC backed by a tap device on the host. The full path for outbound traffic became:
container → docker0 bridge (inside the VM) → VM eth0 (virtual NIC) → tap device on the host → host routing/NAT → internet
Two stacked networking environments mean that a missing piece at either level can prevent packets from reaching their destination, and the resulting errors are often opaque.
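The host side of that path has its own setup, which is easy to forget because a hypervisor or orchestrator usually does it for you. A sketch of the host-side plumbing, with illustrative device names and addressing:

```shell
#!/bin/sh
# Sketch: host-side plumbing for the microVM's virtual NIC.
# tap0 and 192.168.100.0/24 are illustrative; run as root on the host.

# Tap device that backs the VM's virtual NIC.
ip tuntap add dev tap0 mode tap
ip addr add 192.168.100.1/24 dev tap0
ip link set tap0 up

# Forward guest traffic out of the host and NAT it.
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -s 192.168.100.0/24 ! -o tap0 -j MASQUERADE
```

With two NAT hops stacked (Docker's inside the VM, this one on the host), a dropped packet can die at either layer, which is why tracing the full path device by device pays off.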
iptables and Packet Filtering
Docker eventually failed with:
iptables: Failed to initialize nft: Protocol not supported
This error is rooted in deeper kernel capabilities:
- How packet filtering is implemented in the kernel.
- How iptables interacts with that implementation.
- Which netfilter backend (legacy x_tables or nftables) the guest kernel was compiled with.
Resolving it required understanding these lower‑level details.
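A quick way to start that investigation is to ask the kernel directly which netfilter backend it offers. This is a sketch; `/proc/config.gz` is only present when the kernel was built with `CONFIG_IKCONFIG_PROC`, and the `update-alternatives` line assumes a Debian-style userspace with both iptables variants installed.

```shell
#!/bin/sh
# Sketch: check which netfilter backend the guest kernel supports.
# "Failed to initialize nft" typically means nf_tables is absent.

# Inspect the kernel config, if it was built to expose it:
zcat /proc/config.gz 2>/dev/null | grep -E 'CONFIG_NF_TABLES|CONFIG_NETFILTER_XTABLES'

# Or check whether the nf_tables module can be loaded at all:
modprobe nf_tables && echo "nf_tables available"

# If only the legacy x_tables backend exists, point iptables at it
# (Debian-style systems):
update-alternatives --set iptables /usr/sbin/iptables-legacy
```

Part 2 goes deeper into this failure, so treat the above as the first diagnostic pass rather than the fix.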
Shifting the Mental Model
The biggest change was not technical but conceptual. We stopped treating the microVM like a normal machine and adopted a new approach:
- Nothing is assumed to exist.
- Every layer must be verified.
- Each fix reveals the next dependency.
Instead of “debugging Docker”, we were discovering what a “working Linux environment” actually consists of.
Part 2 will explore how this mental model pays off, especially when tackling the iptables failure.