3.2.3.1 Image Content Addressing

A focused guide to Image Content Addressing, connecting core concepts with practical Docker and container operations.

Image content addressing is the principle that every image and every individual layer is identified by a cryptographic hash of its own content (in Docker, a SHA-256 digest written as sha256:<hex>), rather than by an arbitrary name assigned to it. Identical content therefore always produces the identical identifier, no matter where or when it was created.
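
The principle can be sketched in a few lines of Python. This is a minimal illustration, not Docker's implementation: the helper name content_id is hypothetical, and the only detail borrowed from Docker is the sha256:<hex> identifier format.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return an identifier derived only from the bytes themselves (hypothetical helper)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


# The same bytes always yield the same identifier, wherever they are hashed.
first = content_id(b"layer contents")
second = content_id(b"layer contents")

# A single added byte yields a different identifier.
changed = content_id(b"layer contents!")

print(first == second)   # True
print(first == changed)  # False
```

Nothing about a name, a host, or a timestamp enters the computation, which is why the identifier is reproducible anywhere.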

Content Addressing at the Layer Level

Each layer's identifier is computed from the layer's actual data. Two otherwise unrelated images that contain byte-for-byte identical layers (for example, because both are built on the same version of the same base image) are therefore recognized as sharing that exact layer.

docker inspect myapp-a:1.0 --format '{{json .RootFS.Layers}}'
docker inspect myapp-b:1.0 --format '{{json .RootFS.Layers}}'

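If both images share a base layer, the same digest appears in both outputs, confirming that both images reference the same underlying data rather than two separate, coincidentally similar copies. The values listed under RootFS.Layers are digests of the uncompressed layer archives (diff IDs); the digests of the compressed layer blobs recorded in a registry manifest are different values for the same layers.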
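
The comparison of the two outputs can be mimicked with a small sketch. The layer contents below are stand-in byte strings (real layers are tar archives of filesystem changes), and layer_id is a hypothetical helper, not a Docker API.

```python
import hashlib


def layer_id(layer_bytes: bytes) -> str:
    """Identifier computed from the layer's data alone (hypothetical helper)."""
    return "sha256:" + hashlib.sha256(layer_bytes).hexdigest()


# Two unrelated images that happen to start from the same base layer.
base_layer = b"base image filesystem"
image_a_layers = [base_layer, b"application A files"]
image_b_layers = [base_layer, b"application B files"]

ids_a = [layer_id(layer) for layer in image_a_layers]
ids_b = [layer_id(layer) for layer in image_b_layers]

# Layers with equal identifiers are the same layer, not similar copies.
shared = [i for i in ids_a if i in ids_b]
print(len(shared))  # 1
```

Neither image declares that it shares anything; the overlap is discovered purely by comparing identifiers.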

Content Addressing at the Image Level

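An image's own digest is computed from its manifest, which in turn references the image's configuration and its layers by their own digests. Any change anywhere in an image's content therefore propagates upward into a different overall image digest. Note that the Id field is not always the manifest digest: with Docker's classic image store it is the digest of the image configuration, while with the containerd image store it is the digest of the manifest or image index. For an image that has been pulled from or pushed to a registry, the registry digest is listed under RepoDigests.

docker inspect myapp:1.0 --format '{{.Id}}'
docker inspect myapp:1.0 --format '{{json .RepoDigests}}'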
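
How a change propagates upward can be sketched with a deliberately simplified manifest. The field names, the image_digest helper, and the use of sorted-key JSON for a stable serialization are all assumptions made for illustration; a real manifest is a stored JSON document that is hashed byte-for-byte and carries more fields.

```python
import hashlib
import json
from typing import Sequence


def digest(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def image_digest(config: dict, layers: Sequence[bytes]) -> str:
    """Digest of a simplified manifest that lists its config and layers by digest."""
    config_bytes = json.dumps(config, sort_keys=True).encode()
    manifest = {
        "config": digest(config_bytes),
        "layers": [digest(layer) for layer in layers],
    }
    return digest(json.dumps(manifest, sort_keys=True).encode())


config = {"Cmd": ["/app"], "Env": ["MODE=prod"]}

original = image_digest(config, [b"base", b"app v1"])
rebuilt = image_digest(config, [b"base", b"app v1"])       # identical inputs
patched = image_digest(config, [b"base", b"app v2"])       # one layer changed
reconfigured = image_digest({**config, "Env": ["MODE=dev"]}, [b"base", b"app v1"])

print(original == rebuilt)       # True
print(original == patched)       # False
print(original == reconfigured)  # False
```

Because the manifest contains only digests, changing a layer or the configuration changes the manifest's bytes, and with them the top-level digest.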

Why Content Addressing Enables Verification

Because an identifier is derived directly from content, verifying that a downloaded image matches what was expected is straightforward: recompute the digest from the received content and compare it against the expected digest, with any mismatch indicating corruption or tampering.

docker pull myapp@sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

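If the content retrieved did not match this digest, the pull would fail verification rather than silently succeeding with incorrect content. The digest shown here is only a placeholder (it is the SHA-256 of empty input); substitute the digest of a real image.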
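
The recompute-and-compare step can be sketched as follows. The verify function is a hypothetical helper rather than anything Docker exposes, and the expected digest used below is the SHA-256 of empty input, so only empty content verifies against it.

```python
import hashlib


def verify(content: bytes, expected: str) -> None:
    """Raise ValueError unless content hashes to the expected 'sha256:<hex>' digest."""
    algorithm, _, expected_hex = expected.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    actual_hex = hashlib.sha256(content).hexdigest()
    if actual_hex != expected_hex:
        raise ValueError(f"digest mismatch: expected {expected_hex}, got {actual_hex}")


EXPECTED = "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

verify(b"", EXPECTED)  # matching content passes silently

try:
    verify(b"tampered content", EXPECTED)
except ValueError as err:
    print("rejected:", err)
```

The check needs no trusted channel for the content itself: only the expected digest has to come from a trusted source.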

Why Content Addressing Underlies Layer Sharing and Caching

Every efficiency benefit discussed elsewhere (shared layer storage, build caching, deduplicated transfers) ultimately traces back to this one principle: because identity is derived from content, identical content is always recognized as identical, regardless of which image, build, or host it originated from.

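docker system df -v

In the verbose output, the SHARED SIZE column of the image table reports how much of each image's data is also used by other local images, and UNIQUE SIZE reports what belongs to that image alone.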
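
The deduplication behind shared storage can be sketched with a toy content-addressable store. BlobStore and its methods are hypothetical names for illustration and do not correspond to Docker's storage code.

```python
import hashlib


class BlobStore:
    """Toy content-addressable store: blobs are keyed by their own digest."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = "sha256:" + hashlib.sha256(data).hexdigest()
        # Content that is already present is not stored again.
        self._blobs.setdefault(key, data)
        return key

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self._blobs.values())


store = BlobStore()
base = b"x" * 1000

# Both images add the same base layer plus one layer of their own.
image_a = [store.put(base), store.put(b"a" * 10)]
image_b = [store.put(base), store.put(b"b" * 20)]

print(store.stored_bytes())  # 1030
```

The base layer is counted once even though two images reference it, which is the same effect that shared size figures describe for real images.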

Why Content Addressing Matters

Content addressing is the foundational mechanism beneath nearly every other property discussed in relation to Docker images. Immutability, reproducibility, efficient storage, and verifiable integrity all follow directly from identifying content by what it actually is, rather than by an arbitrary name attached to it.