2.3 Docker Kernel Primitives
A focused guide to Docker Kernel Primitives, connecting core concepts with practical Docker and container operations.
Docker kernel primitives are the Linux kernel features that Docker relies on, through containerd and runc, to build containers: namespaces for process isolation, cgroups for resource limiting, and union/overlay filesystems for efficient layered image storage. Docker configures these features rather than implementing any of the isolation itself.
Namespaces: Isolating What a Process Can See
Namespaces partition kernel resources so that a process inside one namespace cannot see or affect resources in another. Docker combines several namespace types for a single container, by default PID, network, mount, UTS, and IPC; user namespaces are available but optional.
docker run --rm alpine readlink /proc/self/ns/pid
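This prints the identifier of the PID namespace the container's process belongs to, in the form pid:[<inode number>]. Running readlink /proc/self/ns/pid directly on a Linux host prints a different number, which shows that the container's process and the host shell live in separate PID namespaces.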
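To see all namespace types at once rather than only the PID namespace, the same /proc interface can be listed in a loop. The following is a minimal sketch that assumes a Linux system with /proc mounted; the variable name pid_ns is chosen here for illustration.

```shell
# List every namespace the current process belongs to (Linux only).
# Each entry in /proc/self/ns is a symlink whose target looks like type:[inode].
for ns in /proc/self/ns/*; do
  printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
done

# Keep the PID namespace identifier so it can be compared with another process.
pid_ns=$(readlink /proc/self/ns/pid)
echo "PID namespace: $pid_ns"
```

Running these lines once on the host and once inside a container should show different identifiers for each namespace type that Docker has separated for that container.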
Cgroups: Limiting What a Process Can Use
Control groups (cgroups) limit and account for resource usage, such as CPU, memory, and I/O, per group of processes. Docker uses them to enforce the resource limits specified when a container is started.
docker run --memory=256m --cpus=1 alpine sleep 100
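The limits set by flags such as --memory and --cpus end up as plain files in the cgroup filesystem, which a process can read back. The sketch below assumes the cgroup filesystem is mounted at /sys/fs/cgroup and tries the cgroup v2 path first, then the cgroup v1 path; the helper name read_limit is hypothetical.

```shell
# Print the memory and CPU limits the kernel enforces for the current cgroup.
# File locations differ between cgroup v2 (unified hierarchy) and cgroup v1.
read_limit() {
  # Print the contents of the first readable file among the arguments,
  # or "unavailable" if none of them can be read.
  for f in "$@"; do
    if [ -r "$f" ]; then
      cat "$f"
      return 0
    fi
  done
  echo "unavailable"
}

mem_limit=$(read_limit /sys/fs/cgroup/memory.max /sys/fs/cgroup/memory/memory.limit_in_bytes)
cpu_limit=$(read_limit /sys/fs/cgroup/cpu.max /sys/fs/cgroup/cpu/cpu.cfs_quota_us)

echo "memory limit: $mem_limit"
echo "cpu limit:    $cpu_limit"
```

Inside a container started with --memory=256m --cpus=1 on a cgroup v2 host, the memory limit would be expected to read 268435456 (256 MiB in bytes) and the CPU limit to read as a quota and period pair such as 100000 100000. Outside a container the files may be absent or report no limit.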
Union/Overlay Filesystems: Layering Images Efficiently
Overlay filesystems let multiple read-only layers be combined with a writable layer on top, presenting a single merged filesystem view to the container while keeping the underlying layers unmodified and shareable across multiple images.
docker info --format '{{.Driver}}'
This typically reports overlay2, Docker's long-standing default storage driver on Linux, which implements this layered filesystem behavior. Installations that use the containerd image store may report overlayfs instead.
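The merged view described above is also visible from inside a process, because the kernel records the filesystem type of every mount. As a rough sketch, assuming a Linux system with /proc mounted, the following reads the type of the filesystem mounted at /; the variable name root_fs_type is illustrative.

```shell
# Report the filesystem type mounted at / for the current process.
# /proc/self/mounts lists: device, mount point, filesystem type, options, ...
# If / appears more than once, the last entry is the one currently on top.
root_fs_type=$(awk '$2 == "/" { fstype = $3 } END { print fstype }' /proc/self/mounts)
echo "root filesystem type: ${root_fs_type:-unknown}"
```

Inside a container backed by an overlay driver this would be expected to print overlay, while on the host it usually prints the type of the underlying disk filesystem.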
Capabilities: Fine-Grained Privilege Control
Beyond namespaces and cgroups, Linux capabilities split root's privileges into distinct units, so a container can be granted a specific subset of them rather than all. This limits what a compromised container process can do even if it manages to run as root within its own namespace.
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE myapp:1.0
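The effect of --cap-drop and --cap-add can be checked from inside a process by reading its effective capability mask. The sketch below assumes a Linux system with /proc mounted and relies on CAP_NET_BIND_SERVICE being capability number 10, so its bit in the mask is 0x400; the variable names are illustrative.

```shell
# Read the effective capability mask of the current process (a hex bitmask).
cap_eff=$(awk '/^CapEff:/ { print $2 }' /proc/self/status)
echo "effective capabilities: 0x$cap_eff"

# CAP_NET_BIND_SERVICE is capability number 10, so test bit 1 << 10.
if [ $(( 0x$cap_eff & (1 << 10) )) -ne 0 ]; then
  has_net_bind=yes
else
  has_net_bind=no
fi
echo "CAP_NET_BIND_SERVICE present: $has_net_bind"
```

For a process running as root in a container started with --cap-drop=ALL --cap-add=NET_BIND_SERVICE, the mask would be expected to contain only that one bit. A process running as a non-root user typically shows an empty effective set regardless of which capabilities the container was granted.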
Why These Primitives Matter
Docker's isolation rests on real, independently verifiable kernel features rather than on anything emulated by Docker itself. Understanding these primitives directly is therefore the most reliable way to reason about what a container can and cannot do, beneath whatever abstraction Docker's tooling presents.