15.2.1.2 Container Memory Usage

A focused guide to Container Memory Usage, connecting core concepts with practical Docker and container operations.

Container memory usage describes how much physical memory a container's processes are consuming, as accounted by the kernel's cgroups memory controller. Reading it accurately requires understanding what is and is not included in the reported figure: a naive interpretation of "memory usage" can significantly overstate or understate how close a container actually is to genuine memory pressure.

Reading basic memory usage

Docker reports memory usage directly relative to the container's configured limit, when one is set:

docker run -d --name my-api --memory=512m my-api
docker stats my-api --no-stream

MEM USAGE / LIMIT   MEM %
312MiB / 512MiB      60.94%

Without an explicit --memory limit, the "LIMIT" shown is typically the host's total memory, so the percentage reflects usage relative to the entire host's capacity, not any boundary actually being enforced against the container itself.
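To tell whether the LIMIT column reflects an enforced boundary, the configured limit can be read from the container's configuration. The following is a minimal sketch, assuming the HostConfig.Memory field of docker inspect holds the limit in bytes, with 0 meaning that no limit is set. The helper name describe_limit is hypothetical.

```shell
# describe_limit BYTES
# Hypothetical helper: interprets a HostConfig.Memory value, which is the
# configured limit in bytes, or 0 when the container has no memory limit.
describe_limit() {
    bytes="$1"
    if [ "$bytes" -eq 0 ]; then
        echo "no limit set"
    else
        # Integer division: 1048576 bytes per MiB.
        echo "$((bytes / 1048576))MiB"
    fi
}

# Usage against a running container (requires Docker):
#   describe_limit "$(docker inspect --format '{{.HostConfig.Memory}}' my-api)"

# Demonstration with a 512 MiB limit expressed in bytes:
describe_limit 536870912   # prints 512MiB
```

A result of "no limit set" means the percentage shown by docker stats is relative to the host, not to a boundary on the container.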

Page cache inclusion in reported usage

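The raw usage figure the kernel accounts to a container's cgroup includes page cache used for file reads and writes performed by processes in that container, not just memory the application has allocated for its own working data. Recent Docker CLI versions subtract the inactive portion of that file cache before displaying MEM USAGE in docker stats, but recently used (active) file cache is still counted, and tools that read the cgroup counters directly usually report the cache-inclusive total: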

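# Inside the container (cgroup v1 layout; on cgroup v2 the file is /sys/fs/cgroup/memory.stat)
docker exec my-api cat /sys/fs/cgroup/memory/memory.stat

# On the host (cgroup v1 with the cgroupfs driver; the path differs on other setups)
cat /sys/fs/cgroup/memory/docker/<container-id>/memory.stat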

This matters because page cache is largely reclaimable by the kernel under memory pressure, unlike the anonymous memory an application has allocated, which can at best be swapped out. A container whose usage is dominated by page cache from heavy file I/O is generally in a much healthier position than one using the same amount of memory through application allocations, even though both can show an identical total in the cgroup's usage counter.
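Where memory.stat lives depends on the cgroup version in use. The following is a minimal sketch of choosing the path from inside a container, assuming a cgroup v2 hierarchy can be recognized by a cgroup.controllers file at its root. The helper name memory_stat_path is hypothetical, and the root directory is a parameter only so the logic can be exercised outside a container.

```shell
# memory_stat_path [CGROUP_ROOT]
# Hypothetical helper: prints where memory.stat is expected, as seen from
# inside a container. Assumes cgroup v2 exposes a cgroup.controllers file
# at the root of the mounted hierarchy, while cgroup v1 keeps one directory
# per controller (here: memory/).
memory_stat_path() {
    root="${1:-/sys/fs/cgroup}"
    if [ -f "$root/cgroup.controllers" ]; then
        echo "$root/memory.stat"          # cgroup v2 (unified hierarchy)
    else
        echo "$root/memory/memory.stat"   # cgroup v1
    fi
}

# Usage inside a container:
#   cat "$(memory_stat_path)"
memory_stat_path
```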

Distinguishing RSS from cache-inclusive totals

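For a more precise view of what an application is actually holding onto, as opposed to reclaimable cache, the cgroup's memory.stat file breaks usage down into more specific categories. The field names below are those of cgroup v1; cgroup v2 exposes the equivalent counters as anon, file, and file_mapped in /sys/fs/cgroup/memory.stat:

docker exec my-api cat /sys/fs/cgroup/memory/memory.stat | grep -E '^(rss|cache|mapped_file) '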

rss 184320000
cache 98304000
mapped_file 12288000

The rss figure, which in cgroup v1 counts anonymous and swap-cache memory rather than a true per-process resident set size, is generally a closer approximation of genuine application memory pressure than a cache-inclusive total, since it excludes the reclaimable cache portion.
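The split between rss and cache can be summarized with a few lines of awk. The following is a minimal sketch, assuming cgroup v1 field names (rss and cache) on the input. The helper name rss_cache_summary is hypothetical.

```shell
# rss_cache_summary: reads cgroup v1 memory.stat text on stdin and prints
# rss and cache in MiB, plus the share of (rss + cache) that is cache.
# Matching on the exact first field skips entries such as rss_huge.
rss_cache_summary() {
    awk '
        $1 == "rss"   { rss = $2 }
        $1 == "cache" { cache = $2 }
        END {
            total = rss + cache
            share = (total > 0) ? 100 * cache / total : 0
            printf "rss=%.0fMiB cache=%.0fMiB cache_share=%.1f%%\n",
                   rss / 1048576, cache / 1048576, share
        }
    '
}

# Usage against a running container (cgroup v1 path assumed):
#   docker exec my-api cat /sys/fs/cgroup/memory/memory.stat | rss_cache_summary

# Demonstration with the sample values shown above:
printf 'rss 184320000\ncache 98304000\nmapped_file 12288000\n' | rss_cache_summary
# prints: rss=176MiB cache=94MiB cache_share=34.8%
```

In this sample, roughly a third of the accounted memory is cache that the kernel can reclaim before the container comes under real pressure.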

What happens when a container hits its memory limit

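When a container's memory usage reaches its configured --memory limit and the kernel cannot reclaim enough (for example by dropping page cache) to stay under it, the kernel's out-of-memory killer terminates a process inside that container's cgroup. If that process is the container's main process, the container exits, typically with exit code 137 (128 + 9, the number of SIGKILL), and Docker records the event in the container's state: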

docker inspect my-api --format '{{.State.OOMKilled}}'

true

Checking this flag directly after an unexpected container restart is one of the fastest ways to confirm whether memory exhaustion specifically was the cause, rather than a crash for an unrelated reason that happened to also result in the container exiting.
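The check can be combined with the exit code to separate an OOM kill from other causes. The following is a minimal sketch, assuming the State.OOMKilled and State.ExitCode fields of docker inspect. The helper name classify_exit and its output labels are hypothetical.

```shell
# classify_exit OOM_FLAG EXIT_CODE
# Hypothetical helper: turns the OOMKilled flag and the exit code into a
# short label. Exit code 137 is 128 + 9 (SIGKILL); it also appears when a
# container is killed for reasons other than memory, such as docker kill.
classify_exit() {
    oom="$1"
    code="$2"
    if [ "$oom" = "true" ]; then
        echo "oom-killed"
    elif [ "$code" -eq 137 ]; then
        echo "sigkill-not-oom"
    elif [ "$code" -eq 0 ]; then
        echo "clean-exit"
    else
        echo "error-exit-$code"
    fi
}

# Usage against a stopped container (requires Docker):
#   set -- $(docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' my-api)
#   classify_exit "$1" "$2"

classify_exit true 137    # prints oom-killed
classify_exit false 137   # prints sigkill-not-oom
classify_exit false 1     # prints error-exit-1
```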

Setting an appropriate memory limit

A memory limit set too low causes premature OOM kills during legitimate, expected memory usage; set too high, or left unset entirely, a single misbehaving container can consume enough host memory to affect every other container or process sharing that host:

docker stats --no-stream --format "{{.MemUsage}}" my-api

docker update --memory=768m my-api

Reviewing actual peak memory usage over a representative period, accounting for legitimate spikes such as periodic batch processing or cache warming, and setting the limit with reasonable headroom above that observed peak, generally produces a more reliable limit than an arbitrary round number chosen without reference to actual behavior.
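The headroom calculation itself is simple arithmetic. The following is a minimal sketch, assuming the peak has already been observed and is expressed in MiB. The helper name suggest_limit_mib, the 25% headroom, and the 614 MiB peak are illustrative assumptions, not recommendations.

```shell
# suggest_limit_mib PEAK_MIB HEADROOM_PCT
# Hypothetical helper: prints the observed peak plus the given percentage
# of headroom, rounded up to the next whole MiB.
suggest_limit_mib() {
    peak="$1"
    headroom="$2"
    echo $(( (peak * (100 + headroom) + 99) / 100 ))
}

# Illustrative numbers: a 614 MiB observed peak with 25% headroom.
suggest_limit_mib 614 25   # prints 768

# The result could then be applied with, for example:
#   docker update --memory="$(suggest_limit_mib 614 25)m" my-api
```

How much headroom is reasonable depends on how spiky the workload is; a service with periodic batch jobs generally needs more than one with flat usage.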

Memory limits and language runtime awareness

Some language runtimes with their own internal memory management may not automatically detect and respect a container's cgroups memory limit unless explicitly configured to do so. JVM releases from before container support was added (JDK 10, backported to 8u191) are the best-known example. Such a runtime sizes itself for the memory it believes is available on the host, only to be OOM-killed once it exceeds the container's actual, lower limit:

docker run -d --memory=512m -e JAVA_TOOL_OPTIONS="-XX:MaxRAMPercentage=75.0" my-api

Explicitly configuring a runtime to respect container memory limits, rather than assuming modern runtime versions handle this transparently in every case, avoids a class of OOM kill that is confusing to diagnose because the application's own internal memory accounting may show usage well within what it believes its available memory to be.
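It helps to work out what a percentage-based heap setting means in absolute terms. The following is a minimal sketch of the arithmetic, assuming the JVM derives its maximum heap from the container limit and the configured percentage; the JVM may align the real value slightly differently. The helper name heap_budget is hypothetical.

```shell
# heap_budget LIMIT_MIB HEAP_PCT
# Hypothetical helper: prints the approximate maximum heap for a given
# container limit and heap percentage, and what remains for everything
# outside the heap (metaspace, thread stacks, native buffers, and so on).
heap_budget() {
    limit="$1"
    pct="$2"
    heap=$(( limit * pct / 100 ))
    echo "heap=${heap}MiB non_heap=$(( limit - heap ))MiB"
}

# A 512 MiB container limit with 75% given to the heap:
heap_budget 512 75   # prints heap=384MiB non_heap=128MiB

# The value the JVM actually chose can be checked inside the container,
# assuming a java binary is on the PATH there:
#   docker exec my-api java -XX:+PrintFlagsFinal -version | grep MaxHeapSize
```

The non-heap remainder is why the percentage is usually set below 100: the container limit applies to the whole process, not just the heap.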

Monitoring memory trends rather than point-in-time snapshots

A single memory reading reveals little about whether usage is stable, growing toward a limit, or fluctuating with expected workload patterns. Tracking memory usage as a time series, for example through the metrics cAdvisor exports to Prometheus, surfaces the difference between healthy, plateauing usage and the slow, continuous climb characteristic of a memory leak:

container_memory_usage_bytes{name="my-api"}

A memory usage graph that climbs steadily over hours or days without ever plateauing, even under roughly steady request volume, is a strong, early signal of a leak worth investigating well before it results in an actual OOM kill, provided the growth is in application memory rather than in page cache that is simply filling up.
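A trend can be quantified from a handful of samples without a full monitoring stack. The following is a minimal sketch that fits a least-squares line through evenly spaced readings and prints its slope. The helper name growth_per_sample is hypothetical, and the input is assumed to be one numeric reading per line, in any consistent unit.

```shell
# growth_per_sample: reads one memory reading per line on stdin and prints
# the least-squares slope, that is, the average change per sample interval.
# Samples are assumed to be evenly spaced in time.
growth_per_sample() {
    awk '
        { n++; sx += n; sy += $1; sxy += n * $1; sxx += n * n }
        END {
            if (n < 2) { print "0.0"; exit }   # not enough data for a slope
            printf "%.1f\n", (n * sxy - sx * sy) / (n * sxx - sx * sx)
        }
    '
}

# Readings in MiB taken at a fixed interval, climbing steadily:
printf '300\n310\n320\n330\n' | growth_per_sample   # prints 10.0

# A plateau yields a slope of zero:
printf '300\n300\n300\n' | growth_per_sample        # prints 0.0
```

A slope that stays clearly positive across a long window under steady load matches the leak pattern described above; a slope near zero suggests a stable working set.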

Common mistakes

Interpreting a total memory figure, whether from docker stats or from the raw cgroup usage counter, as entirely application-allocated memory, without accounting for the page cache that can make up a substantial part of that total.
Setting no explicit memory limit at all, leaving a container free to consume host memory without any boundary until it affects every other process on the host.
Not checking the OOMKilled flag after an unexpected restart, missing a clear, direct signal that memory exhaustion specifically was the cause.
Running a language runtime that does not automatically respect container memory limits without explicitly configuring it to do so, leading to confusing OOM kills that contradict the application's own internal memory reporting.
Reacting to a single memory snapshot rather than reviewing the trend over time, missing the distinction between a legitimate, plateauing working set and a continuously growing leak.

Container memory usage is accurately understood only by distinguishing reclaimable page cache from genuine application allocation, checking the OOMKilled flag directly when investigating unexpected restarts, and tracking trends over time rather than relying on isolated snapshots, since each of these distinctions changes what the right response to a given memory reading actually is.