4.2.5.5 RUN Layer Cleanup
A focused guide to RUN Layer Cleanup, connecting core concepts with practical Docker and container operations.
RUN layer cleanup is the practice of removing temporary files, package manager caches, and other unnecessary data within the same RUN instruction that created them. Cleanup performed in a separate, later instruction cannot reduce the size already committed by an earlier layer.
Why Cleanup Must Happen in the Same Layer
Each layer permanently captures the filesystem changes made by the instruction that produced it, so deleting a file in a later RUN instruction does not shrink the image. The file still exists, unmodified, in the earlier layer beneath it; the later layer only hides it from view in the combined filesystem.
RUN curl -O https://example.com/large-file.tar.gz
RUN rm large-file.tar.gz
The resulting image still carries the full size of large-file.tar.gz even though the file is absent from the final filesystem view: the first RUN committed it to a layer, and the second RUN could only hide it in a separate, later one.
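The effect can be modeled without Docker. The following is a minimal sketch, assuming that layers are stored as tar archives of filesystem changes and that a deletion is recorded as a whiteout marker named .wh.<file>, as in the OCI image layer format. The temporary directories and the 1 MiB stand-in file are illustrative assumptions, not part of any real image.

```shell
#!/bin/sh
# Toy model of two image layers using plain tar archives (no Docker needed).
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Layer 1 models "RUN curl -O ...": it adds a 1 MiB file.
mkdir "$work/layer1"
dd if=/dev/zero of="$work/layer1/large-file.tar.gz" bs=1024 count=1024 2>/dev/null
tar -cf "$work/layer1.tar" -C "$work/layer1" .

# Layer 2 models "RUN rm ...": it records only a whiteout marker for the
# deleted path and cannot alter layer 1.
mkdir "$work/layer2"
: > "$work/layer2/.wh.large-file.tar.gz"
tar -cf "$work/layer2.tar" -C "$work/layer2" .

# The image ships every layer, so its size is the sum of both archives.
layer1_bytes=$(( $(wc -c < "$work/layer1.tar") ))
layer2_bytes=$(( $(wc -c < "$work/layer2.tar") ))
total_bytes=$((layer1_bytes + layer2_bytes))

echo "layer 1 (download): $layer1_bytes bytes"
echo "layer 2 (rm):       $layer2_bytes bytes"
echo "image total:        $total_bytes bytes"
```

The first archive comes out slightly over 1 MiB and the second is small, yet the total is larger than the first layer alone: the rm step added a layer instead of removing data.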
The Correct Pattern: Cleanup Within the Same Instruction
Combining the operation that creates temporary data with its own cleanup, within a single RUN instruction, ensures the resulting layer reflects only the final, post-cleanup state.
RUN curl -O https://example.com/large-file.tar.gz \
    && tar -xzf large-file.tar.gz \
    && rm large-file.tar.gz
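The same idea can be sketched outside Docker by treating one RUN instruction as a sequence of commands followed by a single snapshot. This is a toy model under that assumption, not how the Docker builder is implemented; the payload file app.txt and the directory names are hypothetical stand-ins.

```shell
#!/bin/sh
# Toy model of one layer snapshotted after download, extraction, and cleanup.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
mkdir "$work/payload" "$work/rootfs"

# Stand-in for the downloaded archive (hypothetical contents).
echo "application data" > "$work/payload/app.txt"
tar -czf "$work/rootfs/large-file.tar.gz" -C "$work/payload" app.txt

# Extract and clean up before the snapshot, like a single RUN instruction.
tar -xzf "$work/rootfs/large-file.tar.gz" -C "$work/rootfs"
rm "$work/rootfs/large-file.tar.gz"

# The snapshot is taken once, after cleanup, so it never sees the archive.
tar -cf "$work/layer.tar" -C "$work/rootfs" .
layer_contents=$(tar -tf "$work/layer.tar")
echo "$layer_contents"
```

The listing includes the extracted app.txt but no entry for large-file.tar.gz, because the archive was gone before the layer was captured.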
Common Cleanup Targets
Package manager caches, downloaded archive files no longer needed after extraction, and temporary build artifacts are all common candidates for this kind of same-layer cleanup.
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir -r requirements.txt
The second example avoids the issue altogether by instructing pip not to cache downloaded packages in the first place, rather than needing a separate cleanup step.
Verifying the Effect of Cleanup
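Comparing image size before and after restructuring cleanup to occur within the same layer is a direct way to confirm the technique is working as intended. The docker history command lists the size contributed by each instruction's layer, so a download layer that stays large despite a later rm is easy to spot.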
docker history myapp:1.0
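For images with many layers, it can help to rank layers by size. The helper below is a small sketch: largest_layers is a hypothetical name, and it assumes its input is one layer per line in the form "BYTES COMMAND", which docker history can produce with --human=false and a custom --format.

```shell
#!/bin/sh
# largest_layers N: read "BYTES COMMAND" lines on stdin and print the N
# largest entries, biggest first (N defaults to 5).
largest_layers() {
    sort -rn | head -n "${1:-5}"
}

# Intended use (needs a Docker daemon and the image from the text):
#   docker history --no-trunc --human=false \
#       --format '{{.Size}} {{.CreatedBy}}' myapp:1.0 | largest_layers 3
```

Running the pipeline against builds of the image before and after the restructuring, together with the total reported by docker image ls, shows whether the large download layer has actually disappeared.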
Why RUN Layer Cleanup Matters
Combining the creation and cleanup of temporary data within the same RUN instruction is a broadly applicable way to reduce image bloat, since it applies to any instruction that downloads, unpacks, or builds something it does not need to keep. Overlooking it is a common reason images end up larger than they need to be.