16.1.3.5 BuildKit Cache Surprise

A focused guide to BuildKit Cache Surprise, connecting core concepts with practical Docker and container operations.

A BuildKit cache surprise is caching behavior specific to the modern BuildKit build backend that differs from what the legacy builder's simpler, strictly sequential layer cache would do. It is most often encountered when sharing cache across machines or CI runs through explicit cache export and import, or when BuildKit's parallel execution model produces behavior that does not match the linear, top-to-bottom intuition the legacy builder encourages.

BuildKit's content-addressable cache model

Where the legacy builder walks a Dockerfile one instruction at a time and looks in local image history for a matching child of the previous layer, BuildKit first converts the build into a graph of operations and identifies cache entries by each operation's definition and the content of its inputs. Identical operations with identical inputs resolve to the same cache entry regardless of which stage they appear in:

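FROM node:20 AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

# Same base image, same instructions, and same input files as the deps stage
FROM node:20 AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build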

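BuildKit can recognize that the COPY and RUN npm ci steps in the build stage have the same definition and inputs as those in the deps stage and resolve them to a single cached result, even though the instructions appear in different stages. BuildKit also skips any stage the requested target does not depend on, so in this Dockerfile the deps stage is only built when it is targeted explicitly (for example with --target deps). Both behaviors follow from treating the build as a graph of operations rather than a linear script, and both can surprise anyone expecting every stage to run top to bottom.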
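A toy model can help build intuition for content-based keys. The sketch below is not BuildKit's actual algorithm (real keys are derived from digests of its internal build operations); digest, content_key, and npm_ci_key are hypothetical helpers, and sha256sum is assumed to be installed. It shows that a key derived only from the parent key, the instruction text, and the input content is the same in both stages and changes when the lockfile changes.

```shell
# Toy model only: real BuildKit cache keys come from LLB operation digests.
# digest, content_key, and npm_ci_key are hypothetical helpers for illustration.

# Reduce text on stdin to a hex digest (sha256sum is assumed to be installed).
digest() { sha256sum | cut -d' ' -f1; }

# Key for one step: parent key + instruction text + digest of the files read.
# The stage name is deliberately not an input.
content_key() {
  printf '%s\n%s\n%s\n' "$1" "$2" "$3" | digest
}

base=$(printf 'node:20' | digest)
lock_v1=$(printf 'lockfile contents, version 1' | digest)
lock_v2=$(printf 'lockfile contents, version 2' | digest)

# npm_ci_key LOCK_DIGEST: key of "RUN npm ci" after copying the manifests.
npm_ci_key() {
  copy_key=$(content_key "$base" 'COPY package*.json ./' "$1")
  content_key "$copy_key" 'RUN npm ci' ''
}

deps_key=$(npm_ci_key "$lock_v1")     # as computed for the deps stage
build_key=$(npm_ci_key "$lock_v1")    # as computed for the build stage
changed_key=$(npm_ci_key "$lock_v2")  # same step after the lockfile changed

[ "$deps_key" = "$build_key" ] && echo "same inputs: one cache entry for both stages"
[ "$deps_key" != "$changed_key" ] && echo "changed lockfile: new cache entry"
```

Because the stage name never enters the key, nothing in this model distinguishes the step in deps from the step in build, which is the property the paragraph above describes.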

Exporting and importing cache explicitly

BuildKit supports exporting build cache to a registry or local directory and importing it in a separate, later build invocation, often on entirely different infrastructure. This is commonly used in CI to share cache between ephemeral build runners that would otherwise start every run with an empty cache. Registry cache export generally requires a builder that supports it, such as one using the docker-container driver or a Docker Engine with the containerd image store enabled:

docker buildx build --cache-to=type=registry,ref=registry.example.com/my-api:cache --push -t registry.example.com/my-api:1.4.0 .
docker buildx build --cache-from=type=registry,ref=registry.example.com/my-api:cache -t registry.example.com/my-api:1.4.0 .

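The surprise here usually takes one of two forms. First, the import can succeed and report cache hits even though the build context has meaningfully changed, because a step is invalidated only by changes to the inputs it actually consumes: edits to files that a given COPY does not include leave that step cached, and a RUN step that fetches from the network (a package index, an unpinned dependency) is cached on its definition and parent state, not on what it would download today. Second, a stale exported cache can be imported by builds separated by enough time that the underlying base images or dependencies have since drifted.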
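One way to limit the stale-cache case is to scope the cache reference to the dependency lockfile, so that a cache exported for older dependencies is not imported silently. The following is a minimal sketch under stated assumptions: cache_ref is a hypothetical helper (not a Docker feature), the lockfile name is illustrative, and sha256sum is assumed to be available.

```shell
# Sketch: derive a cache ref whose tag changes whenever the lockfile changes.
# cache_ref and its arguments are hypothetical names, not Docker features.
cache_ref() {
  repo=$1
  lockfile=$2
  # sha256sum is GNU coreutils; on macOS, `shasum -a 256` is the usual equivalent.
  hash=$(sha256sum "$lockfile" | cut -c1-12)
  printf '%s:cache-%s\n' "$repo" "$hash"
}

# Example use (requires Docker and a builder that supports registry cache):
# ref=$(cache_ref registry.example.com/my-api package-lock.json)
# docker buildx build --pull \
#   --cache-from=type=registry,ref="$ref" \
#   --cache-to=type=registry,ref="$ref",mode=max \
#   -t registry.example.com/my-api:1.4.0 --push .
```

In the commented usage, --pull asks the builder to re-resolve base images instead of trusting a local copy, and mode=max exports cache for intermediate stages as well as the final image. Whether either is appropriate depends on the pipeline.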

Inline cache versus a separate cache image

BuildKit also supports embedding cache metadata directly within the pushed image itself (inline cache), as an alternative to exporting a fully separate cache reference, which has different trade-offs around how much cache history is retained and how cache import behaves on a fresh pull:

docker buildx build --build-arg BUILDKIT_INLINE_CACHE=1 -t registry.example.com/my-api:1.4.0 --push .

Inline cache is simpler to set up but supports only the min cache mode, so it records cache for the layers of the exported image and not for the intermediate stages of a multi-stage build. This can produce a surprise where cache hits are less frequent than expected compared to a separately exported cache reference, particularly one exported with mode=max.
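The import side of inline cache is easy to overlook: a later build has to name the previously pushed image as a cache source. A minimal sketch, assuming the tags shown and a hypothetical run helper that prints the command instead of executing it unless DRY_RUN=0 is set:

```shell
# Sketch: import inline cache by pointing --cache-from at the image that was
# pushed earlier. `run` and build_with_inline_cache are hypothetical helpers.
run() {
  # Print the command by default; execute it only when DRY_RUN=0.
  if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else printf '%s\n' "$*"; fi
}

image=registry.example.com/my-api

build_with_inline_cache() {
  prev_tag=$1
  new_tag=$2
  run docker buildx build \
    --cache-from="type=registry,ref=$image:$prev_tag" \
    --cache-to=type=inline \
    -t "$image:$new_tag" --push .
}

build_with_inline_cache 1.4.0 1.4.1
```

Here --cache-to=type=inline plays the same role as the BUILDKIT_INLINE_CACHE=1 build argument shown above; the version numbers are placeholders.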

Parallel execution and unexpected ordering in build output

BuildKit executes independent build steps in parallel where dependencies allow, which produces interleaved, non-sequential build output that can be surprising to read for anyone expecting the strictly linear, top-to-bottom output the legacy builder always produced:

docker build --progress=plain -t my-api .
#3 [build 2/4] COPY package.json .
#4 [assets 2/3] COPY assets/ .
#5 [build 3/4] RUN npm ci

This interleaving is purely a display characteristic of BuildKit's parallel execution and does not itself indicate any caching problem, but it is worth recognizing explicitly so that interleaved output is not mistaken for an indication that steps executed out of their logical dependency order.
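When interleaved output is hard to follow, the lines can be regrouped by their step id after the fact. The helper below is a hypothetical sketch (group_by_step is not part of Docker): it only reorders text, keeps steps in order of first appearance, and drops lines that do not start with a #<number> prefix.

```shell
# Sketch: regroup a --progress=plain log so each step's lines are contiguous.
# group_by_step is a hypothetical helper; it reads the log on stdin.
group_by_step() {
  awk '
    /^#[0-9]+ / {
      id = $1
      # Remember the order in which step ids first appear.
      if (!(id in seen)) { seen[id] = 1; order[++n] = id }
      lines[id] = lines[id] $0 "\n"
    }
    END { for (i = 1; i <= n; i++) printf "%s", lines[order[i]] }
  '
}

# Example use (BuildKit writes progress to stderr, hence 2>&1):
# docker build --progress=plain -t my-api . 2>&1 | group_by_step
```

Saving the raw log first and regrouping a copy keeps the original ordering available if timing between steps matters.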

Cache mounts persisting state in ways that surprise reproducibility expectations

BuildKit's cache mount feature, useful for speeding up package manager downloads across builds, persists state outside of the standard layer cache entirely, in a location that is not part of the final image and not invalidated by the same rules as ordinary layers:

RUN --mount=type=cache,target=/root/.npm npm install

Because the contents of a cache mount are not part of any layer and do not feed into cache keys, a build relying on one can behave differently on a fresh build machine with an empty mount than on a machine that has accumulated mount cache from many previous builds. This is a source of "works on this machine, slower (or occasionally subtly different) on a fresh one" surprises that are specific to this BuildKit feature rather than to standard layer caching.
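To check how a build behaves without accumulated mount cache, one option is to clear cache-mount data before a verification build. The sketch below rests on assumptions: run and cold_mount_build are hypothetical helpers, the tag is a placeholder, and the type=exec.cachemount prune filter is assumed to be supported by the installed Docker and buildx versions. It prints the commands unless DRY_RUN=0 is set.

```shell
# Sketch: approximate a fresh machine by clearing cache-mount data first.
# `run` and cold_mount_build are hypothetical helpers for illustration.
run() {
  # Print the command by default; execute it only when DRY_RUN=0.
  if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else printf '%s\n' "$*"; fi
}

cold_mount_build() {
  tag=$1
  # Assumed filter: removes cache-mount records and leaves layer cache alone.
  run docker buildx prune --force --filter type=exec.cachemount &&
    run docker buildx build -t "$tag" .
}

cold_mount_build my-api:cold-check
```

If the cold build produces a different result rather than just a slower one, the step is probably depending on state in the mount that the Dockerfile does not pin.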

Cache invalidation differences between BuildKit and the legacy builder

Because BuildKit's caching model is more sophisticated, certain edge cases that would have reliably invalidated a layer under the legacy builder's simpler, strictly sequential model may behave differently under BuildKit's content-based model. This is occasionally a source of confusion when migrating a Dockerfile or build pipeline that was originally tuned around the legacy builder's caching characteristics. BuildKit is the default builder in current Docker Engine releases and the legacy builder is deprecated, so DOCKER_BUILDKIT=0 is best treated as a diagnostic aid:

DOCKER_BUILDKIT=0 docker build -t my-api .
DOCKER_BUILDKIT=1 docker build -t my-api .

When a caching discrepancy is suspected to be backend-specific rather than a genuine Dockerfile issue, running the same build under both backends and comparing the results isolates whether the surprise is attributable to this difference.
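A rough way to compare the two runs is to count how many steps each backend reported as cached. The helper below is a hypothetical sketch: it relies on BuildKit's plain progress output marking reused steps with "#<id> CACHED" and on the legacy builder printing "---> Using cache", and it reads a captured log on stdin.

```shell
# Sketch: count cache hits in a captured build log from either backend.
# count_cache_hits is a hypothetical helper, not a Docker command.
count_cache_hits() {
  # BuildKit (--progress=plain): "#7 CACHED"
  # Legacy builder:              " ---> Using cache"
  grep -c -E -e '^#[0-9]+ CACHED|---> Using cache'
}

# Example use (requires Docker; run each build twice and compare the second runs):
# DOCKER_BUILDKIT=0 docker build -t my-api . 2>&1 | count_cache_hits
# DOCKER_BUILDKIT=1 docker build --progress=plain -t my-api . 2>&1 | count_cache_hits
```

The two backends count steps differently, so the numbers are only a starting point for finding which instruction behaves differently.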

Common mistakes

  • Assuming BuildKit's caching model is simply a faster version of the legacy builder's, rather than a genuinely different, content-addressable model with different invalidation characteristics.
  • Importing cache from a registry-exported reference without considering how stale that cache might be relative to current base images and dependencies.
  • Choosing inline cache for convenience without recognizing it generally retains less cache history than an explicit, separately exported cache reference.
  • Misreading BuildKit's interleaved, parallel build output as evidence of out-of-order execution or a caching problem, when it is simply a display characteristic of parallel execution.
  • Not accounting for cache mounts persisting state outside the standard layer cache, leading to inconsistent build behavior between a fresh build machine and one with significant accumulated mount cache.

BuildKit cache surprises generally stem from its more sophisticated, content-addressable caching model behaving differently from the legacy builder's simpler, strictly sequential one. Understanding the specific mechanisms (content-based reuse across stages, explicit cache export and import, inline cache, and cache mounts) resolves most of the confusion that arises when a BuildKit-based build does not match the linear intuition the legacy builder encourages.