16.1.1.4 Context Transfer Bloat

A focused guide to Context Transfer Bloat, connecting core concepts with practical Docker and container operations.

Context transfer bloat is the performance cost of sending an oversized build context from the client to wherever the build actually runs: a local daemon over a Unix socket, a remote daemon over the network, or a remote builder in CI. A .dockerignore file addresses the root cause, but understanding how the transfer cost varies across build setups explains why the same oversized context is a minor inconvenience in one setup and a significant bottleneck in another.

Why transfer cost varies by connection type

When building against a local daemon over a Unix socket, context transfer is essentially a fast local file copy, and even a fairly large context transfers quickly enough that the overhead is barely noticeable. The output below is from the legacy builder; BuildKit reports the same step as "transferring context":

docker build -t my-api .
Sending build context to Docker daemon  1.2GB

The same context sent to a remote daemon over a network connection, or to a remote builder in a CI pipeline on separate infrastructure, incurs real network transfer time that grows with context size and shrinks with available bandwidth. A context-size problem that was invisible in local development can become a noticeable bottleneck once builds move to CI or a remote build environment:

docker -H tcp://remote-builder:2375 build -t my-api .

A context that transfers almost instantly over a local socket might take tens of seconds or more over a slow or constrained network link to a remote daemon. Measure this directly rather than assuming local build performance generalizes to every build environment.
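As a rough back-of-the-envelope check, transfer time can be estimated from context size and link bandwidth. The sketch below is a simplification: the function name and the bandwidth figures are hypothetical, and it ignores latency, compression, and protocol overhead.

```shell
#!/bin/sh
# Rough transfer-time estimate: context size divided by link bandwidth.
# Inputs are whole numbers: size in megabytes, bandwidth in megabits per second.
# Ignores latency, compression, and protocol overhead (simplifying assumption).
estimate_transfer_seconds() {
  size_mb=$1
  bandwidth_mbit=$2
  # 1 byte = 8 bits; shell integer division rounds down.
  echo $(( size_mb * 8 / bandwidth_mbit ))
}

# 1200 MB (roughly the 1.2GB context above) on a hypothetical 100 Mbit/s link
estimate_transfer_seconds 1200 100    # prints 96
# The same context on a hypothetical 1000 Mbit/s link
estimate_transfer_seconds 1200 1000   # prints 9
```

An estimate like this only sets expectations; the measured time in the actual build environment is what matters.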

Measuring transfer time specifically

Isolating how much of total build time is attributable specifically to context transfer, as opposed to actual instruction execution, clarifies whether context size reduction would meaningfully help:

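time docker build -t my-api . 2>&1 | tee build-output.log
# The legacy builder prints "Sending build context"; BuildKit prints "transferring context"
grep -E "Sending build context|transferring context" build-output.log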

If the "Sending build context" line (or "transferring context" under BuildKit) reports a large size, and there is a noticeable pause right after it before any build instruction output appears, then context transfer is a real, measurable contributor to total build time, separate from any slowness in the instructions themselves.
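To track the reported size across builds, the value can be pulled out of the saved log. This is a minimal sketch that assumes the legacy builder's line format shown earlier; the function name is hypothetical, and BuildKit logs would need a different pattern.

```shell
#!/bin/sh
# Print the context size reported in a saved build log.
# Assumes the legacy builder's "Sending build context to Docker daemon <size>" line.
context_size_from_log() {
  log_file=$1
  # Progress updates may be separated by carriage returns; split them into
  # lines and keep only the last (final) report.
  tr '\r' '\n' < "$log_file" \
    | sed -n 's/^Sending build context to Docker daemon[[:space:]]*\([0-9.]*[A-Za-z]*\).*$/\1/p' \
    | tail -n 1
}

# Usage against the log written by the tee command above, if it exists
if [ -f build-output.log ]; then
  context_size_from_log build-output.log
fi
```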

CI and remote builder scenarios

CI pipelines often check out a full repository, including its complete history in .git and, on runners that reuse a workspace, artifacts left over from earlier jobs, before invoking a build. The context seen in CI can therefore be considerably larger than what a clean local checkout would show:

git clean -fdx
du -sh .

Running an explicit clean step before a build removes untracked and ignored files, but it leaves .git in place. History is reduced by configuring the CI checkout for a shallow clone, or kept out of the context by listing .git in .dockerignore. Both address a source of context bloat that is specific to CI environments and would not necessarily be visible when testing build performance only on a local development machine.
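Before deciding what to clean or exclude, it helps to see which top-level entries dominate the directory. The sketch below uses a hypothetical helper name and reports raw disk usage, so it does not apply any .dockerignore rules.

```shell
#!/bin/sh
# List the largest top-level entries in a would-be build context, sizes in KB.
# Hidden entries such as .git are included; .dockerignore rules are NOT applied.
largest_context_entries() {
  dir=${1:-.}
  count=${2:-5}
  (
    cd "$dir" || exit 1
    find . -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn | head -n "$count"
  )
}

# Show the five largest entries in the current directory
largest_context_entries . 5
```

If .git or a build output directory tops the list, that entry is usually the first candidate for .dockerignore.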

Build cache mounts as an alternative to copying large directories

Sometimes a large directory genuinely needs to be available during the build but does not need to be part of the final image or of the transferred context. In that case, BuildKit's cache mount feature provides a directory that persists across builds without ever being part of the transferred context:

RUN --mount=type=cache,target=/root/.npm npm install

This is a more targeted solution than including a large dependency cache directory in the build context directly, since the cache mount persists between builds at the BuildKit level without ever needing to be packaged and transferred as part of the context itself.
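For orientation, the cache mount line could sit in a Dockerfile along the following lines. This is only a sketch: the base image, file names, and helper function are assumptions chosen for illustration, not taken from a real project.

```shell
#!/bin/sh
# Write an illustrative Dockerfile that uses a BuildKit cache mount.
# The base image (node:20) and the package*.json layout are assumptions.
write_example_dockerfile() {
  target_dir=$1
  cat > "$target_dir/Dockerfile" <<'EOF'
# syntax=docker/dockerfile:1
FROM node:20
WORKDIR /app
# Copy only the manifests first so the install layer caches independently
COPY package.json package-lock.json ./
# npm's download cache lives in a BuildKit-managed mount, not in the context
RUN --mount=type=cache,target=/root/.npm npm install
COPY . .
EOF
}

workdir=$(mktemp -d)
write_example_dockerfile "$workdir"
echo "Wrote $workdir/Dockerfile"
# To build (requires Docker with BuildKit enabled):
#   docker build -t my-api "$workdir"
```

With this layout, node_modules and the npm cache can both stay out of the build context, since dependencies are installed inside the build.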

Splitting a large monorepo context

In a large monorepo where a service's build needs only a fraction of the repository, scoping the build context to that service's subdirectory, rather than the repository root, directly reduces what is transferred for that build:

docker build -t service-a ./service-a

This requires the relevant Dockerfile to only reference files within that narrower context, which sometimes requires restructuring shared code into something referenced via a package registry or a BuildKit named context instead of a direct relative path reaching outside the service's own directory.
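One way to wire in shared code without widening the main context is a named context passed with the `--build-context` flag of `docker buildx build`. The dry-run helper below only prints the command; the helper name, the `shared` context name, and the `./libs/shared` path are hypothetical.

```shell
#!/bin/sh
# Print (but do not run) a build command that keeps the main context narrow
# and exposes shared code as a separate named context called "shared".
# In the service's Dockerfile, that context would be read with something like:
#   COPY --from=shared . /src/shared      (destination path is illustrative)
service_build_command() {
  service=$1
  shared_dir=$2
  printf 'docker buildx build --build-context shared=%s -t %s ./%s\n' \
    "$shared_dir" "$service" "$service"
}

service_build_command service-a ./libs/shared
# prints: docker buildx build --build-context shared=./libs/shared -t service-a ./service-a
```

Printing the command first makes it easy to review in CI logs before executing it.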

Remote Git context as a bloat-avoidance strategy

Specifying a remote Git repository URL directly as the build context bypasses local context packaging and transfer entirely, since the build environment clones the repository itself rather than receiving a context package from the local client:

docker build https://github.com/example/my-api.git#main

This is most useful in CI or automation scenarios building directly from a repository reference, where it avoids the local checkout, packaging, and transfer steps entirely in favor of having the build environment fetch exactly what it needs directly.
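A Git context URL can also carry a subdirectory after the ref, in the form `#ref:subdir`, which combines this approach with the narrower monorepo context. The helper below is a hypothetical sketch that just assembles such a URL; the `service-a` subdirectory is illustrative.

```shell
#!/bin/sh
# Build a Git context URL of the form <repo>#<ref> or <repo>#<ref>:<subdir>.
git_context_url() {
  repo_url=$1
  ref=$2
  subdir=${3:-}
  if [ -n "$subdir" ]; then
    printf '%s#%s:%s\n' "$repo_url" "$ref" "$subdir"
  else
    printf '%s#%s\n' "$repo_url" "$ref"
  fi
}

git_context_url https://github.com/example/my-api.git main
# prints: https://github.com/example/my-api.git#main
git_context_url https://github.com/example/my-api.git main service-a
# prints: https://github.com/example/my-api.git#main:service-a
```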

Layer caching versus context transfer

It is worth distinguishing context transfer cost from build layer caching. Even when layer caching reuses most build steps from a previous build, the context step still runs on every build invocation, because the builder needs the current files to determine which layers can be reused. The legacy builder resends the whole context each time. BuildKit transfers only new or changed files that the build uses, but a builder with no previous state, such as a fresh CI runner, still receives them in full. An oversized context is therefore a recurring cost, not something layer caching mitigates on its own.

docker build -t my-api . # context transfer happens even if every subsequent layer is cached
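Because the cost recurs on every invocation, one option is a pre-build guard that fails fast when the context directory exceeds a size budget. This is a sketch under assumptions: the function name and the budget values are hypothetical, and `du` measures the raw directory without applying .dockerignore, so it overestimates when exclusions are in place.

```shell
#!/bin/sh
# Fail when a directory exceeds a size budget given in kilobytes.
# Measures raw disk usage; .dockerignore exclusions are NOT taken into account.
check_context_budget() {
  dir=$1
  max_kb=$2
  actual_kb=$(du -sk "$dir" | cut -f1)
  if [ "$actual_kb" -gt "$max_kb" ]; then
    echo "context $dir is ${actual_kb} KB, over the ${max_kb} KB budget" >&2
    return 1
  fi
  echo "context $dir is ${actual_kb} KB, within the ${max_kb} KB budget"
}

# Example use in a CI script, with a hypothetical 500 MB budget:
#   check_context_budget . 512000 && docker build -t my-api .
```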

Common mistakes

  • Assuming a build's local performance generalizes directly to a remote daemon or CI environment, missing a context transfer cost that only becomes noticeable once network transfer is actually involved.
  • Not cleaning a CI checkout of accumulated history or artifacts before a build, carrying CI-specific context bloat that would not appear in a typical local development checkout.
  • Copying a large, frequently-changing dependency cache directory into the build context directly instead of using a BuildKit cache mount that persists outside the transferred context entirely.
  • Assuming effective layer caching eliminates the recurring cost of context transfer, when the daemon still needs the current context on every build invocation regardless of how many layers are reused.
  • Not measuring context transfer time directly, relying on a general impression of "the build feels slow" rather than identifying specifically how much of that slowness is attributable to context size.

Context transfer bloat has the same root cause as any other build context size issue, but its performance impact depends heavily on the build environment. The right mitigation (a more thorough .dockerignore, a narrower context for a monorepo service, a cache mount, or a remote Git context) depends on which scenario (local, remote daemon, or CI) is actually experiencing the cost.