20.3.2.2 Resource Limit Practice

A focused guide to Resource Limit Practice, connecting core concepts with practical Docker and container operations.

Resource limit practice applies CPU, memory, and process count constraints to containers, enforcing isolation between workloads on the same host and preventing runaway processes from consuming all available host resources. Without limits, a single container with a memory leak, an infinite loop, or a fork bomb can starve every other container and system process running on the same host. Resource limits are not optional in production — they are the mechanism that makes co-tenancy safe.

How Docker Enforces Resource Limits

Docker uses the Linux kernel's cgroups (control groups) to enforce resource constraints. When a container is created with --memory 512m, Docker creates a cgroup hierarchy for the container and sets the memory limit in the cgroup's configuration. The kernel enforces the limit; Docker is the configuration interface.

This means resource limits are a kernel-enforced boundary, not a software-level suggestion. A process inside the container cannot bypass a cgroup memory limit by any application-level technique.
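Because the limit lives in the kernel's cgroup filesystem, it can be read back from inside the container. The following Python sketch does this under two assumptions: that a cgroup v2 host exposes the limit as memory.max, and that a cgroup v1 host exposes it as memory/memory.limit_in_bytes. The function name read_memory_limit and the default root /sys/fs/cgroup are illustrative choices, and the layout on a particular host may differ.

```python
from pathlib import Path
from typing import Optional


def read_memory_limit(cgroup_root: str = "/sys/fs/cgroup") -> Optional[int]:
    """Return the cgroup memory limit in bytes, or None if no limit is visible.

    Assumes the cgroup v2 file name (memory.max) and falls back to the
    cgroup v1 file name (memory/memory.limit_in_bytes).
    """
    root = Path(cgroup_root)
    candidates = [
        root / "memory.max",                        # cgroup v2
        root / "memory" / "memory.limit_in_bytes",  # cgroup v1
    ]
    for path in candidates:
        if path.is_file():
            value = path.read_text().strip()
            # cgroup v2 stores the literal string "max" when no limit is set.
            return None if value == "max" else int(value)
    return None
```

Called with no arguments inside a container started with --memory 512m, the function should return the configured limit in bytes. A process in the container can read this value but cannot raise it.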

Memory Limits

Setting a Memory Limit

docker run -d \
  --name my-api \
  --memory 512m \
  my-api:latest

The container can use at most 512 MiB of RAM. If its processes try to exceed this limit and the kernel cannot reclaim enough memory, the Linux kernel's OOM (Out of Memory) killer terminates a process inside the container, usually the one using the most memory. When that process is the container's PID 1, the container exits.

Memory units accepted by Docker: b (bytes), k (kilobytes), m (megabytes), g (gigabytes). Docker interprets these as binary multiples, so 512m means 512 × 1024 × 1024 = 536,870,912 bytes.
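The unit arithmetic can be sketched in a few lines of Python. This is not Docker's own parser: it handles only the four suffixes listed above and assumes binary multiples (powers of 1024), which is consistent with the 536870912 bytes that docker inspect reports for 512m later in this section. The name parse_memory is hypothetical.

```python
# Multipliers are binary (powers of 1024), so "512m" is 536870912 bytes.
_UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def parse_memory(value: str) -> int:
    """Convert a size string such as '512m' to a number of bytes.

    Only the suffixes b, k, m, and g are handled; a bare number means bytes.
    """
    text = value.strip().lower()
    if not text:
        raise ValueError("empty size string")
    if text[-1] in _UNITS:
        number, unit = text[:-1], text[-1]
    else:
        number, unit = text, "b"
    return int(float(number) * _UNITS[unit])


print(parse_memory("512m"))  # 536870912
print(parse_memory("1g"))    # 1073741824
```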

Memory + Swap

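--memory limits RAM only. --memory-swap sets the combined total of RAM plus swap, so it must be at least as large as --memory. If --memory is set and --memory-swap is not, the container may use as much swap as its memory limit, provided the host has swap configured. The following example sets both flags to the same value: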

docker run -d \
  --memory 512m \
  --memory-swap 512m \
  my-api:latest

When --memory-swap equals --memory, the container has no swap allowance. Once it reaches its memory limit, it cannot fall back to swapping to disk, so the OOM kill happens sooner, but the container avoids the disk I/O latency spikes that swapping causes.

Setting --memory-swap to -1 gives the container unlimited swap (bounded only by host swap availability).
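Since --memory-swap is a total rather than a swap amount, the swap a container may actually use has to be derived from the two flags. The sketch below encodes the cases described above, plus one assumption taken from Docker's documented default: when --memory-swap is not given, the swap allowance equals --memory. The function name swap_allowance is hypothetical.

```python
from typing import Optional


def swap_allowance(memory: int, memory_swap: Optional[int] = None) -> Optional[int]:
    """Return the swap a container may use, in the same unit as the inputs.

    memory:      value of --memory
    memory_swap: value of --memory-swap; None means the flag was not given
    Returns None when swap is unlimited (--memory-swap -1).
    """
    if memory_swap is None:
        # Assumed default: swap allowance equal to the memory limit.
        return memory
    if memory_swap == -1:
        return None
    if memory_swap < memory:
        raise ValueError("--memory-swap must not be smaller than --memory")
    return memory_swap - memory


print(swap_allowance(512, 512))   # 0
print(swap_allowance(512, 1024))  # 512
print(swap_allowance(512, -1))    # None
```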

Memory Reservation (Soft Limit)

docker run -d \
  --memory 512m \
  --memory-reservation 256m \
  my-api:latest

--memory-reservation is a soft limit. When the host is under memory pressure, the kernel tries to reclaim memory from the container until its usage falls back toward the reservation. Unlike --memory, it does not enforce a hard cap, and it should be set lower than --memory to have any effect. Use both together: the hard limit prevents runaway growth, and the reservation guides memory reclamation under host pressure.

CPU Limits

CPU Core Fraction

docker run -d \
  --cpus 1.5 \
  my-api:latest

--cpus 1.5 allocates at most 1.5 CPU cores worth of processing time. On a 4-core host, this container can use 37.5% of total CPU. The value is a decimal number representing cores, not a percentage.

A container on a 4-core host with --cpus 0.5 gets at most 12.5% of total host CPU time.
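The percentages above follow from dividing the --cpus value by the number of host cores. A minimal Python check of that arithmetic, with a hypothetical helper name:

```python
def host_cpu_percent(cpus: float, host_cores: int) -> float:
    """Percentage of total host CPU time that a --cpus value corresponds to."""
    if host_cores <= 0:
        raise ValueError("host_cores must be positive")
    return cpus / host_cores * 100


print(host_cpu_percent(1.5, 4))  # 37.5
print(host_cpu_percent(0.5, 4))  # 12.5
```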

CPU Period and Quota (Lower-Level Control)

The --cpus flag is a convenient abstraction over cgroup CPU period/quota:

docker run -d \
  --cpu-period 100000 \
  --cpu-quota 50000 \
  my-api:latest

This is equivalent to --cpus 0.5: 50,000 microseconds of CPU time per 100,000 microsecond period. --cpus is preferred for clarity.
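The relationship between the two forms is a single multiplication. The sketch below converts in both directions, assuming the 100,000 microsecond period used in the example; the function names are illustrative.

```python
def cpus_to_quota(cpus: float, period_us: int = 100_000) -> int:
    """CPU quota in microseconds per period for a given --cpus value."""
    return round(cpus * period_us)


def quota_to_cpus(quota_us: int, period_us: int = 100_000) -> float:
    """Inverse conversion: the --cpus value implied by a quota and a period."""
    return quota_us / period_us


print(cpus_to_quota(0.5))             # 50000
print(cpus_to_quota(1.5))             # 150000
print(quota_to_cpus(50_000, 100_000))  # 0.5
```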

CPU Shares (Relative Weight)

docker run -d --cpu-shares 512 my-api:latest
docker run -d --cpu-shares 1024 my-critical-api:latest

--cpu-shares sets a relative weight (default 1024). Under CPU contention, the higher-weight container gets proportionally more CPU time: a container with 512 shares gets half the CPU time of a container with 1024 shares when both are competing. When CPU is not contended, shares have no effect and either container can use all idle CPU.

CPU shares are soft limits and do not create a hard cap.
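To make the proportional split concrete, the following sketch computes each container's fraction of CPU time when every container is CPU-bound at once. It is a simplified model that ignores hard caps and idle containers, and the function name is hypothetical.

```python
from fractions import Fraction
from typing import Dict


def contended_split(shares: Dict[str, int]) -> Dict[str, Fraction]:
    """Fraction of CPU time each container receives when all are CPU-bound."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one container needs a positive share")
    return {name: Fraction(weight, total) for name, weight in shares.items()}


split = contended_split({"my-api": 512, "my-critical-api": 1024})
print(split["my-api"], split["my-critical-api"])  # 1/3 2/3
```

With the two containers from the example, my-api receives one third of the contended CPU time and my-critical-api two thirds, which is the 1:2 ratio of their weights.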

PID Limit

docker run -d \
  --pids-limit 200 \
  my-api:latest

--pids-limit 200 limits the number of processes (and threads) the container can create to 200. This prevents fork bombs (processes that recursively spawn child processes) from consuming all available process table entries on the host.

Without a PID limit, a fork bomb inside a container can create thousands of processes before the kernel's PID table is exhausted, making the host unresponsive.

Verifying Applied Limits

docker inspect my-api --format '{{json .HostConfig}}' | python3 -m json.tool | grep -E "Memory|NanoCpus|PidsLimit"

Or more directly:

docker inspect my-api --format 'Memory={{.HostConfig.Memory}} NanoCpus={{.HostConfig.NanoCpus}} PidsLimit={{.HostConfig.PidsLimit}}'

Memory=536870912 NanoCpus=1000000000 PidsLimit=200

536870912 bytes = 512 MiB (512 × 1024 × 1024). 1000000000 NanoCPUs = 1 CPU core.
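The raw numbers are easier to review once converted back into the units used on the command line. The sketch below takes the HostConfig object (for example, parsed with json.loads from the output of docker inspect --format '{{json .HostConfig}}') and converts the three fields shown above. Treating 0 or a missing value as "no limit configured" is an assumption of this sketch, and the function name is hypothetical.

```python
from typing import Any, Dict, Optional


def summarize_limits(host_config: Dict[str, Any]) -> Dict[str, Optional[float]]:
    """Convert raw HostConfig numbers into MiB, cores, and a process count.

    A value of 0 (or a missing key) is treated here as "no limit configured".
    """
    memory = host_config.get("Memory") or 0
    nano_cpus = host_config.get("NanoCpus") or 0
    pids = host_config.get("PidsLimit") or 0
    return {
        "memory_mib": memory / 1024**2 if memory > 0 else None,
        "cpus": nano_cpus / 1e9 if nano_cpus > 0 else None,
        "pids_limit": pids if pids > 0 else None,
    }


sample = {"Memory": 536870912, "NanoCpus": 1000000000, "PidsLimit": 200}
print(summarize_limits(sample))
# {'memory_mib': 512.0, 'cpus': 1.0, 'pids_limit': 200}
```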

Monitoring Resource Usage

docker stats my-api

CONTAINER ID  NAME    CPU %  MEM USAGE / LIMIT   MEM %  NET I/O      BLOCK I/O
a1b2c3d4e5f6  my-api  0.5%   82MiB / 512MiB      16%    1.2kB/0.8kB  0B/0B

MEM USAGE / LIMIT shows current usage against the configured limit. When MEM % approaches 100%, the container is approaching the OOM kill threshold.

For all containers:

docker stats --no-stream

--no-stream outputs a snapshot without continuously refreshing.
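A snapshot can also be checked by a script instead of by eye. The sketch below assumes the snapshot was produced with docker stats --no-stream --format '{{json .}}', which prints one JSON object per container, and that each object carries "Name" and "MemPerc" keys matching that command's format fields. Both the key names and the 80% threshold are assumptions to verify against the Docker version in use.

```python
import json
from typing import Iterable, List


def near_memory_limit(stats_lines: Iterable[str], threshold: float = 80.0) -> List[str]:
    """Return names of containers whose MEM % is at or above the threshold.

    Each non-empty input line is expected to be one JSON object with
    "Name" and "MemPerc" keys (assumed key names, see the text above).
    """
    flagged = []
    for line in stats_lines:
        line = line.strip()
        if not line:
            continue
        row = json.loads(line)
        percent = float(row["MemPerc"].rstrip("%"))
        if percent >= threshold:
            flagged.append(row["Name"])
    return flagged


# Made-up sample lines for illustration only.
sample = [
    '{"Name": "my-api", "MemPerc": "16.02%"}',
    '{"Name": "worker", "MemPerc": "93.50%"}',
]
print(near_memory_limit(sample))  # ['worker']
```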

Detecting OOM Kills

When a container is killed by the OOM killer, Docker reports it:

docker inspect my-api --format '{{.State.OOMKilled}}'

true

The container's status shows as Exited (137) — exit code 137 indicates a kill signal (SIGKILL = 9, and 128 + 9 = 137):

docker ps -a | grep my-api

a1b2c3d4e5f6   my-api   Exited (137) 2 minutes ago

An OOM kill means the workload needed more memory than the limit allows. Either the limit is too low for normal operation and should be increased, or the application's memory use is growing without bound and should be investigated for a leak.
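The exit code arithmetic generalizes to other signals: an exit code above 128 conventionally means the process was terminated by signal number (exit code minus 128). The following sketch decodes it with Python's standard signal module, assuming a Linux host; the function name is hypothetical.

```python
import signal
from typing import Optional


def signal_from_exit_code(exit_code: int) -> Optional[str]:
    """Name of the signal behind an exit code above 128, else None."""
    if exit_code <= 128:
        return None
    try:
        return signal.Signals(exit_code - 128).name
    except ValueError:
        return None  # not a signal number known on this platform


print(signal_from_exit_code(137))  # SIGKILL
print(signal_from_exit_code(143))  # SIGTERM
```

Exit code 137 on its own only shows that the process received SIGKILL, which docker kill also sends. The OOMKilled field is what ties the kill to the memory limit.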

Resource Limits in Docker Compose

services:
  api:
    image: my-api:latest
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 128M

  db:
    image: postgres:15-alpine
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 1G

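The deploy.resources syntax comes from the Docker Compose v3 file format used by Swarm mode. Current versions of docker compose (Compose V2, which implements the Compose Specification) also apply deploy.resources.limits to standalone containers, whereas the legacy docker-compose v1 tool ignored the deploy section unless run with --compatibility. For standalone docker compose up (without Swarm), the service-level keys are an alternative: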

services:
  api:
    image: my-api:latest
    mem_limit: 512m
    cpus: 1.0

Sizing Guidelines

Setting appropriate limits requires knowing the application's normal resource usage. Under-limiting causes OOM kills; over-limiting wastes host capacity.

The process for sizing limits:

1. Run the container without limits in a staging environment.
2. Apply realistic load and observe peak resource usage with docker stats.
3. Set memory limit at 1.5–2x peak observed usage, leaving headroom for load spikes.
4. Set CPU limit based on the required throughput and latency targets.
5. Monitor in production for OOM kills and CPU throttling events.
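The memory sizing step can be turned into a small calculation. The sketch below multiplies an observed peak by a headroom factor in the 1.5 to 2 range and rounds up to a whole number of MiB, formatted as a --memory value. The 300 MiB peak is a made-up figure, and the function name is hypothetical.

```python
import math


def recommend_memory_limit(peak_bytes: int, headroom: float = 1.5) -> str:
    """Suggest a --memory value: peak usage times headroom, rounded up to whole MiB."""
    if peak_bytes <= 0 or headroom < 1.0:
        raise ValueError("peak_bytes must be positive and headroom at least 1.0")
    mib = math.ceil(peak_bytes * headroom / 1024**2)
    return f"{mib}m"


peak = 300 * 1024**2  # hypothetical peak of 300 MiB observed under load
print(recommend_memory_limit(peak))       # 450m
print(recommend_memory_limit(peak, 2.0))  # 600m
```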

CPU throttling (the container hitting its CPU limit) does not cause an exit — it causes latency. Throttled containers take longer to respond. If latency SLAs are exceeded and CPU is at the limit, increase the CPU allocation.
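Throttling leaves a trace in the container's cgroup: the cpu.stat file counts enforcement periods (nr_periods) and the periods in which the cgroup was throttled (nr_throttled). The sketch below parses that file's "key value" lines and computes the throttled share. The sample values are made up, the function names are hypothetical, and the file's location depends on the host's cgroup layout.

```python
from typing import Dict


def parse_cpu_stat(text: str) -> Dict[str, int]:
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats: Dict[str, int]) -> float:
    """Share of enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


sample = "nr_periods 1000\nnr_throttled 250\nthrottled_usec 4200000\n"  # made-up values
print(throttled_ratio(parse_cpu_stat(sample)))  # 0.25
```

A ratio that stays high while latency targets are missed is the signal described above that the CPU allocation is too small.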