15.2.1.1 Container CPU Usage

A focused guide to Container CPU Usage, connecting core concepts with practical Docker and container operations.

Container CPU usage describes how much processor time a container's processes consume. Docker reports it from the kernel's cgroups CPU accounting. Interpreting it correctly requires relating the raw measurement to the container's configured limits and to the host's total capacity, because the same percentage can mean very different things depending on that context.

Reading basic CPU usage

The most direct way to observe a container's CPU usage is Docker's built-in stats interface, which reports usage as a percentage in which 100% equals one CPU core's worth of capacity:

docker stats my-api --no-stream

CONTAINER ID   NAME     CPU %
3f29a8c1d8e2   my-api   145.20%

A reading above 100% is normal for a multi-threaded or multi-process container on a multi-core host: the figure is the sum of usage across all the cores the container runs on, not a single core's utilization capped at 100%.
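The percentage can be reproduced from the raw counters that Docker's stats API returns. The sketch below follows the calculation described in the Docker Engine API reference for the container stats endpoint. The sample dictionary is invented for illustration and contains only the fields the calculation reads.

```python
def cpu_percent(stats: dict) -> float:
    """Approximate the CPU % column from one stats API sample."""
    cpu = stats["cpu_stats"]
    pre = stats["precpu_stats"]
    # CPU time (ns) the container consumed between the two readings.
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    # CPU time (ns) the whole host accumulated over the same interval.
    system_delta = cpu["system_cpu_usage"] - pre["system_cpu_usage"]
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    # Scaling by the core count is what lets the result exceed 100%.
    return cpu_delta / system_delta * cpu["online_cpus"] * 100.0


# Invented sample: a 4-core host, readings taken one second apart.
sample = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 2_452_000_000},
        "system_cpu_usage": 104_000_000_000,
        "online_cpus": 4,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 1_000_000_000},
        "system_cpu_usage": 100_000_000_000,
    },
}

print(f"{cpu_percent(sample):.2f}%")  # prints 145.20%
```

Because the ratio is multiplied by the number of online cores, a container that keeps about one and a half cores busy reports roughly 145%.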

CPU limits and how they change interpretation

Setting an explicit CPU limit on a container caps how much CPU time it can actually consume, and that limit becomes the denominator against which usage should be judged, rather than the host's total core count:

docker run -d --name my-api --cpus=2 my-api

docker stats my-api --no-stream --format "table {{.Name}}\t{{.CPUPerc}}"

NAME      CPU %
my-api    195.00%

With a --cpus=2 limit, a reading near 200% means the container is using nearly its full allotment of 2 cores' worth of CPU time. Without a limit, the same reading has to be judged against the host's core count: on an 8-core host, 195% is less than a quarter of the available capacity.
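The change of denominator can be made explicit with a small calculation. This is a minimal sketch; the function name and the 8-core host are assumptions chosen for illustration, not part of any Docker tooling.

```python
from typing import Optional


def share_of_ceiling(cpu_percent: float, host_cores: int,
                     cpus_limit: Optional[float] = None) -> float:
    """Return usage as a fraction of the effective CPU ceiling."""
    # With a limit, the ceiling is the limit (never more than the host has);
    # without one, the ceiling is the host's core count.
    ceiling_cores = min(cpus_limit, host_cores) if cpus_limit else host_cores
    return cpu_percent / (ceiling_cores * 100.0)


# The same 195% reading, with and without a --cpus=2 limit on an 8-core host.
print(f"{share_of_ceiling(195.0, host_cores=8, cpus_limit=2):.1%}")  # prints 97.5%
print(f"{share_of_ceiling(195.0, host_cores=8):.1%}")                # prints 24.4%
```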

CPU shares versus hard limits

Docker supports two distinct mechanisms for controlling CPU allocation: --cpus, a hard ceiling the container cannot exceed even if the host has spare capacity, and --cpu-shares, a relative weighting (default 1024) that only matters when multiple containers are actively competing for the same CPU resources:

docker run -d --cpu-shares=512 my-api

docker run -d --cpu-shares=1024 my-worker

With these relative shares configured and no hard limits, either container can use all of the available CPU when the host is otherwise idle. The shares take effect only once the host's CPU is genuinely under contention; at that point CPU time is divided in proportion to the weights, so my-worker receives about two thirds and my-api about one third.
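The proportional split under full contention is simple arithmetic. The sketch below assumes every container wants more CPU than the host can supply, and uses a hypothetical 4-core host; it models the weighting only, not the kernel scheduler itself.

```python
from typing import Dict


def contended_split(shares: Dict[str, int], host_cores: float) -> Dict[str, float]:
    """Cores each container receives when all of them are CPU-hungry."""
    total = sum(shares.values())
    # Each container's slice is its weight divided by the sum of all weights.
    return {name: host_cores * weight / total for name, weight in shares.items()}


split = contended_split({"my-api": 512, "my-worker": 1024}, host_cores=4)
for name, cores in split.items():
    print(f"{name}: {cores:.2f} cores")
# prints:
# my-api: 1.33 cores
# my-worker: 2.67 cores
```

If my-worker goes idle, the constraint disappears and my-api may use all four cores, which is the key difference from a --cpus ceiling.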

CPU throttling under a hard limit

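When a container exhausts its --cpus allowance within an accounting period (100 ms by default), the kernel throttles it and withholds further CPU time until the next period begins. Throttling is a frequently overlooked cause of degraded application performance: it looks like a CPU shortage but is actually an imposed ceiling. The cgroups throttling statistics expose it directly. The path below is the cgroup v1 layout; on cgroup v2 hosts, cpu.stat sits under a different path in the container's cgroup directory and reports throttled_usec instead of throttled_time: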

cat /sys/fs/cgroup/cpu/docker/<container-id>/cpu.stat

nr_throttled 142
throttled_time 8234000000

A high nr_throttled count alongside a CPU usage reading that never quite reaches the configured limit indicates that the container is being throttled frequently. In cgroup v1, throttled_time is cumulative and measured in nanoseconds, so the value above amounts to about 8.2 seconds spent throttled. To the application this shows up as intermittent slowdowns precisely when it most needs burst capacity, even though the host may have plenty of spare CPU that the limit prevents the container from using.
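Raw counters are easier to judge as a ratio of throttled periods to total periods. The sketch below parses the text of a cpu.stat file and accepts either the cgroup v1 field (throttled_time, nanoseconds) or the cgroup v2 field (throttled_usec, microseconds). The nr_periods value in the sample is invented, since the output shown above does not include it.

```python
def parse_cpu_stat(text: str) -> dict:
    """Parse 'key value' lines from a cgroup cpu.stat file into integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_summary(stats: dict) -> dict:
    """Summarise throttling from cgroup v1 or v2 field names."""
    if "throttled_usec" in stats:      # cgroup v2 reports microseconds
        seconds = stats["throttled_usec"] / 1e6
    else:                              # cgroup v1 reports nanoseconds
        seconds = stats.get("throttled_time", 0) / 1e9
    periods = stats.get("nr_periods", 0)
    # Fraction of accounting periods in which the container hit its quota.
    ratio = stats.get("nr_throttled", 0) / periods if periods else None
    return {"throttled_seconds": seconds, "throttled_ratio": ratio}


# nr_periods is an invented value; the other two lines match the output above.
sample = "nr_periods 1000\nnr_throttled 142\nthrottled_time 8234000000\n"
print(throttle_summary(parse_cpu_stat(sample)))
```

With these numbers the container was throttled in about 14% of its accounting periods. Because the counters are cumulative, comparing two readings taken some minutes apart gives a better picture of current behavior than a single reading.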

Distinguishing host saturation from container-specific limits

A container reporting consistently high CPU usage could be saturating its own configured limit, or it could be one of several containers collectively saturating the host's total CPU capacity. The two cases require different responses: raising the container's own limit in the first, and adding host capacity or rebalancing workloads across more hosts in the second:

docker stats --no-stream

top

Comparing per-container CPU usage against overall host CPU utilization at the same moment clarifies which situation is actually occurring: if the host has substantial idle capacity while a specific container is pegged at its own limit, raising that container's limit is the appropriate fix; if the host itself is near full utilization across many containers, the fix is at the host or cluster capacity level instead.
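The comparison can be written down as a simple decision rule. This is a sketch only: the function name, the messages, and the 90% "near full" threshold are assumptions for illustration, and host_busy_pct stands for overall host utilization on a 0 to 100 scale (for example, 100 minus the idle percentage shown by top).

```python
def diagnose(container_pct: float, cpus_limit: float, host_busy_pct: float,
             near: float = 0.9) -> str:
    """Classify a high CPU reading as a host-level or container-level problem."""
    # Container usage is on the docker stats scale, where 100% is one core.
    at_limit = container_pct >= near * cpus_limit * 100.0
    host_full = host_busy_pct >= near * 100.0
    if host_full:
        # Raising one container's limit cannot help when the host has nothing left.
        return "host saturated: add capacity or rebalance workloads"
    if at_limit:
        return "container at its limit: consider raising --cpus"
    return "neither: look for another cause"


print(diagnose(195.0, cpus_limit=2, host_busy_pct=35.0))
print(diagnose(195.0, cpus_limit=2, host_busy_pct=96.0))
```

The host check comes first on purpose: when the host is saturated, a container sitting at its own limit is a secondary concern.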

CPU usage trends over time

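A single point-in-time CPU reading is far less useful than a trend observed over a representative period, since usage often varies significantly with request volume, scheduled jobs, or time-of-day traffic patterns that a single snapshot cannot reveal. One common setup is a Prometheus scrape job for cAdvisor, followed by a query for the container's average CPU rate over five minutes: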

scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

rate(container_cpu_usage_seconds_total{name="my-api"}[5m])

The rate is expressed in CPU-seconds per second, so 1.0 corresponds to one full core (100% in docker stats terms). Exporting CPU usage to a time-series metrics system and reviewing it over days or weeks, rather than relying on point-in-time docker stats checks, is necessary for sizing decisions such as setting an appropriate --cpus limit or deciding when to scale a service horizontally.
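The idea behind taking a rate over a cumulative CPU counter can be shown with two samples. This sketch is a simplification: Prometheus's rate() also handles counter resets and extrapolates to the window boundaries, which is not modeled here, and the sample values are invented.

```python
def average_cores(samples):
    """Average cores used between the first and last (timestamp, counter) sample."""
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    # CPU-seconds consumed divided by wall-clock seconds elapsed.
    return (v1 - v0) / (t1 - t0)


# Invented samples: (unix seconds, cumulative CPU seconds), five minutes apart.
samples = [(1_700_000_000, 5_000.0), (1_700_000_300, 5_435.6)]
cores = average_cores(samples)
print(f"{cores:.3f} cores = {cores * 100:.1f}% in docker stats terms")
```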

Sizing CPU limits based on observed usage

Setting a CPU limit too low causes unnecessary throttling during legitimate, expected load spikes; setting it too high risks one container consuming more host capacity than intended at the expense of others sharing the same host. Reviewing actual peak usage over a representative period, then setting the limit with reasonable headroom above that observed peak, is more reliable than guessing or copying a value used for an unrelated service:

docker stats --no-stream --format "{{.CPUPerc}}" my-api

docker update --cpus=3 my-api
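One way to turn observed peaks into a limit is to add a headroom factor and round up to a convenient step. The sketch below is an assumption-laden example, not a recommended policy: the 25% headroom, the 0.25-core step, and the weekly peak values are all invented for illustration.

```python
import math


def suggest_cpus(peak_percents, headroom=1.25, step=0.25):
    """Suggest a --cpus value: observed peak plus headroom, rounded up to `step`."""
    if not peak_percents:
        raise ValueError("need at least one observation")
    # Convert the docker stats scale (100% = one core) into cores.
    peak_cores = max(peak_percents) / 100.0
    return math.ceil(peak_cores * headroom / step) * step


# Invented daily peaks (docker stats style percentages) over one week.
weekly_peaks = [180.0, 195.0, 210.0, 232.0, 190.0, 175.0, 205.0]
print(f"docker update --cpus={suggest_cpus(weekly_peaks)} my-api")
```

Here the highest observed peak of 2.32 cores, with 25% headroom, rounds up to a limit of 3 cores. Using a high percentile instead of the maximum is a reasonable variation when occasional outliers should not drive the limit.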

Common mistakes

Interpreting a CPU percentage above 100% as an error, rather than understanding it as expected multi-core usage reported relative to a single core's capacity.
Setting a --cpus limit without monitoring for throttling, leaving a self-imposed performance ceiling unnoticed and misattributed to some other cause.
Confusing CPU shares with hard limits, expecting shares to cap usage the way --cpus does, when shares only matter under actual host contention.
Reacting to a single high CPU reading without checking the trend over a representative period, leading to a limit change based on an atypical spike rather than genuine sustained usage.
Raising a container's CPU limit without first checking whether the host has the spare capacity to grant it, which only shifts the bottleneck elsewhere.

Container CPU usage is only meaningful when interpreted alongside the container's configured limits (or lack thereof) and the host's actual available capacity. The most common diagnostic gap is failing to check for throttling: a throttled container can show CPU usage that never appears to be "maxed out" while still suffering real, limit-induced performance degradation.