2.3.3.3 Shared Layer Storage
A focused guide to Shared Layer Storage, connecting core concepts with practical Docker and container operations.
Shared layer storage is the practice of storing each unique image layer exactly once on a host, no matter how many images or containers reference it. Each layer is identified by the hash of its content, not by the image it belongs to.
Layers Are Identified by Content, Not by Image
Two images built on the same version of a base image, for example debian:bookworm-slim, reference the base layer by the same content digest. Because the digest is identical, Docker recognizes it as the same layer and stores it only once.
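The idea of keying storage by content digest can be sketched without Docker at all. The following is a toy illustration, not Docker's actual on-disk layout: store_blob is a hypothetical helper, the file names are made up, and sha256sum from GNU coreutils is assumed to be available.

```shell
#!/bin/sh
# Toy content-addressed store. This is NOT Docker's real storage layout.
# store_blob FILE STORE_DIR
# Copies FILE into STORE_DIR under its SHA-256 digest and prints "sha256:<hex>".
# If a blob with that digest is already present, nothing new is written.
store_blob() {
    file=$1
    store=$2
    digest=$(sha256sum "$file" | cut -d ' ' -f 1)
    mkdir -p "$store"
    if [ ! -e "$store/$digest" ]; then
        cp "$file" "$store/$digest"
    fi
    printf 'sha256:%s\n' "$digest"
}

# Demo: two "images" ship byte-identical base layers under different file names.
demo_dir=$(mktemp -d)
printf 'base layer contents\n' > "$demo_dir/image-a-base.tar"
printf 'base layer contents\n' > "$demo_dir/image-b-base.tar"

digest_a=$(store_blob "$demo_dir/image-a-base.tar" "$demo_dir/store")
digest_b=$(store_blob "$demo_dir/image-b-base.tar" "$demo_dir/store")

echo "image A base: $digest_a"
echo "image B base: $digest_b"
echo "blobs stored: $(ls "$demo_dir/store" | wc -l | tr -d ' ')"
```

Both files produce the same digest, so the second call finds the blob already in place and the store ends up holding a single copy.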
docker pull myapp-a:1.0
docker pull myapp-b:1.0
docker system df -v
If both images share several base layers, the verbose disk usage report lists a SHARED SIZE and a UNIQUE SIZE for each image. The shared layers count toward total disk usage only once, not once per image.
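To see exactly which layers two local images have in common, their layer digests can be compared directly. This is a minimal sketch: layer_ids and shared_layers are hypothetical helper names, and the digests listed under RootFS.Layers by docker image inspect are taken as the layer identifiers.

```shell
#!/bin/sh
# layer_ids IMAGE
# Prints one layer digest per line, sorted, for an image present locally.
layer_ids() {
    docker image inspect \
        --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1" \
        | sed '/^$/d' | LC_ALL=C sort
}

# shared_layers FILE_A FILE_B
# Prints the digests that appear in both sorted lists.
shared_layers() {
    LC_ALL=C comm -12 "$1" "$2"
}

# Example usage (both images must already be pulled):
#   layer_ids myapp-a:1.0 > a.layers
#   layer_ids myapp-b:1.0 > b.layers
#   shared_layers a.layers b.layers
```

Every digest printed by shared_layers corresponds to a layer that the host stores once and both images reference.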
Why This Matters for Pull Efficiency
When pulling an image that shares layers with one already present locally, Docker skips the layers it already has (the pull output typically reports them as "Already exists") and transfers only the missing ones.
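docker pull python:3.12-slim-bookworm
docker pull python:3.11-slim-bookworm
Both tags are built on the Debian bookworm slim base image. If they were built from the same version of that base, they share its layer, so the second pull skips it and finishes faster than it would on a host with no cached layers. Tags built on different Debian releases, such as a bookworm variant and a bullseye variant, do not share base layers.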
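One way to quantify the reuse is to count the status lines that a pull prints. The sketch below assumes the "Already exists" and "Pull complete" status strings printed by the Docker CLI; the exact wording may vary with the Docker version and image store, and summarize_pull is a hypothetical helper name.

```shell
#!/bin/sh
# summarize_pull
# Reads `docker pull` output on stdin and counts layers reported as
# "Already exists" (reused locally) versus "Pull complete" (downloaded).
summarize_pull() {
    awk '
        /: Already exists/ { reused++ }
        /: Pull complete/  { downloaded++ }
        END { printf "reused=%d downloaded=%d\n", reused, downloaded }
    '
}

# Example usage:
#   docker pull myapp-b:1.0 | summarize_pull
```

A high reused count relative to downloaded indicates that most of the image was already stored locally.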
Why This Matters for Disk Usage at Scale
A host running many application images built on a small number of common base images benefits significantly from shared layer storage. The marginal disk cost of each additional image is limited to the layers unique to it, not the full size of its base layers.
docker images
docker system df
The SIZE that docker images reports for each image includes its shared layers, so adding those figures together counts the shared layers once per image. Comparing that sum against the total image disk usage reported by docker system df often reveals substantial savings from layer sharing.
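The gap between the two figures is simple arithmetic. The numbers below are hypothetical and chosen only for illustration: three images that each add 30 MB of unique layers on top of the same 120 MB base.

```shell
#!/bin/sh
# Hypothetical sizes, for illustration only.
base_mb=120      # base layers shared by every image
unique_mb=30     # layers unique to each image
image_count=3

# Summing per-image sizes counts the base once per image.
naive_total=$(( image_count * (base_mb + unique_mb) ))
# With shared layer storage, the base is stored once.
actual_total=$(( base_mb + image_count * unique_mb ))
saved=$(( naive_total - actual_total ))

echo "sum of per-image sizes: ${naive_total} MB"
echo "stored on disk:         ${actual_total} MB"
echo "saved by sharing:       ${saved} MB"
```

With these numbers, the per-image sizes add up to 450 MB while only 210 MB is stored, and each further image on the same base would add just its 30 MB of unique layers.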
Why Shared Layer Storage Matters
Shared layer storage is a direct consequence of content-addressable layer identification. It is one of the most significant practical efficiency benefits of the layered image model, and it is most valuable in environments that run many related images built from common bases.