20.2.2 Image Optimization Step

A focused guide to the image optimization step, connecting core concepts with practical Docker and container operations.

Image optimization reduces the size, build time, and attack surface of Docker images. An unoptimized image built from a naive Dockerfile can easily reach 1GB or more for a typical application; an optimized image of the same application may be under 100MB. Smaller images pull faster from registries, let containers start sooner on hosts that do not already have the image cached, consume less disk space across deployments, and present fewer packages for vulnerability scanners to flag. The image optimization step in the intermediate track introduces the techniques that achieve these results systematically.

Why Images Grow Large

Unoptimized images accumulate size from several sources:

  • Build tools and compilers installed in the image that are needed to build the application but not to run it (gcc, make, node-gyp, webpack, TypeScript compiler).
  • Package manager caches left behind after apt-get install or npm install — these files are downloaded during the build but never used at runtime.
  • Source code and test files copied into the image when only the compiled output or production dependencies are needed at runtime.
  • Intermediate artifacts from build steps that are no longer needed once the final binary or bundle is produced.
  • Oversized base images — using FROM ubuntu:22.04 or FROM node:20 when a minimal variant like node:20-alpine or node:20-slim would suffice.

Key Optimization Techniques

The optimization step covers five main techniques, each targeting a different source of image bloat or inefficiency:

1. Choosing a minimal base image

The base image is usually the largest single contributor to image size. Switching from node:20 (about 1.1GB) to node:20-alpine (about 135MB) reduces the starting point by roughly 88% before adding a single line of application code. A minimal base also ships fewer packages, which means fewer CVEs and a smaller attack surface.

2. Cleaning up in the same RUN layer

Package manager installs create temporary cache files that inflate the image if not removed in the same layer:

Unoptimized (the cache is stored in the first layer and only hidden by the second):

RUN apt-get update && apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

Optimized (cache never enters the stored layer):

RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

Because Docker stores each layer as a diff, files removed in a later RUN instruction are hidden but not actually deleted from the image — they still occupy space in the earlier layer. Only removing them within the same RUN prevents them from being stored at all.
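
The same behavior can be imitated without Docker. The sketch below is a simulation, not Docker's actual storage driver: it models two layers as plain directories (layer1, layer2, and cache.bin are hypothetical names) and records the deletion as an empty whiteout-style marker, which is broadly how OCI image layers represent removed paths.

```shell
#!/bin/sh
# Simulation only: two "layers" modeled as plain directories.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir "$workdir/layer1" "$workdir/layer2"

# Layer 1: a RUN step that leaves a 1 MiB package cache behind.
dd if=/dev/zero of="$workdir/layer1/cache.bin" bs=1024 count=1024 2>/dev/null

# Layer 2: a later RUN step deletes the file. The layer records only an
# empty whiteout marker; the bytes already stored in layer 1 are untouched.
: > "$workdir/layer2/.wh.cache.bin"

# Sum the content bytes of every regular file under a directory.
layer_bytes() {
    find "$1" -type f -exec wc -c {} + |
        awk '$2 != "total" { sum += $1 } END { print sum + 0 }'
}

layer1_bytes=$(layer_bytes "$workdir/layer1")
layer2_bytes=$(layer_bytes "$workdir/layer2")
total_bytes=$((layer1_bytes + layer2_bytes))

echo "layer1: $layer1_bytes bytes"
echo "layer2: $layer2_bytes bytes"
echo "stored: $total_bytes bytes"
```

Running it reports 1048576 bytes for the first layer and 0 for the second, so the total stored size is unchanged by the later deletion.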

3. Layer ordering for cache efficiency

Dockerfile instructions that change less frequently should appear before instructions that change more frequently. Docker reuses cached layers until it reaches an instruction whose text or input files have changed, then re-executes everything from that point on.

Slow (changes to source code force reinstalling all dependencies):

COPY . .
RUN npm install

Fast (dependencies are reinstalled only when package.json or package-lock.json changes):

COPY package.json package-lock.json ./
RUN npm install
COPY . .

A well-ordered Dockerfile can turn a 30-second full build into a cache hit of about a second for typical source-code-only changes.
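
The reason the second ordering helps can be sketched without Docker. The script below is a simulation under stated assumptions: cksum over file contents is only a rough stand-in for the check Docker performs on copied files, and the project layout and the names manifest_key and tree_key are hypothetical.

```shell
#!/bin/sh
# Simulation only: cksum over file contents stands in for Docker's COPY cache check.
set -eu

project=$(mktemp -d)
trap 'rm -rf "$project"' EXIT
mkdir "$project/src"
echo '{"name":"demo","dependencies":{}}' > "$project/package.json"
echo '{"lockfileVersion":3}' > "$project/package-lock.json"
echo 'console.log("v1")' > "$project/src/server.js"

# Key for "COPY package.json package-lock.json ./": the two manifests only.
manifest_key() {
    cat "$1/package.json" "$1/package-lock.json" | cksum
}

# Key for "COPY . .": every file in the project, in a stable order.
tree_key() {
    (cd "$1" && find . -type f | LC_ALL=C sort | xargs cat | cksum)
}

manifest_before=$(manifest_key "$project")
tree_before=$(tree_key "$project")

# A typical edit: application source changes, dependencies do not.
echo 'console.log("v2")' > "$project/src/server.js"

manifest_after=$(manifest_key "$project")
tree_after=$(tree_key "$project")

if [ "$manifest_before" = "$manifest_after" ]; then
    echo "manifest copy: unchanged, so the npm install layer can be reused"
fi
if [ "$tree_before" != "$tree_after" ]; then
    echo "full copy: changed, so every later step runs again"
fi
```

Both messages are printed: the source edit leaves the manifest key untouched while the whole-tree key changes, which is why copying the manifests first keeps the dependency layer cached.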

4. The .dockerignore file

Without .dockerignore, COPY . . sends the entire project directory into the image — including node_modules (hundreds of megabytes), .git history (tens of megabytes), test fixtures, and log files. A .dockerignore file excludes these:

node_modules
.git
*.log
.env
dist
coverage
__tests__

The .dockerignore file reduces both the build context size (speeding up context transfer to the daemon) and the image size (preventing unnecessary files from being included in layers).
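
To get a feel for how much an ignore file saves, the sizes of the excluded paths can be measured directly. The helper below is a rough sketch, not a reimplementation of Docker's matcher: ignored_sizes is a hypothetical name, and it handles only plain names and simple top-level globs (no "**" patterns, no "!" exceptions). The demo project it creates is made up for illustration.

```shell
#!/bin/sh
# Rough estimate only: plain names and simple top-level globs are supported,
# not the full .dockerignore syntax.
set -eu

# Print the size in KiB of each existing path matched by an ignore file.
# $1 = project directory, $2 = ignore file (defaults to $1/.dockerignore)
ignored_sizes() {
    ignore_file=${2:-$1/.dockerignore}
    (
        cd "$1" || exit 1
        while IFS= read -r pattern || [ -n "$pattern" ]; do
            case $pattern in ''|'#'*) continue ;; esac
            for path in $pattern; do    # unquoted on purpose so globs expand
                if [ -e "$path" ]; then
                    du -sk "$path"
                fi
            done
        done
    ) < "$ignore_file"
}

# Throwaway demo project so the sketch runs anywhere.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/node_modules/pkg" "$demo/src"
dd if=/dev/zero of="$demo/node_modules/pkg/blob" bs=1024 count=512 2>/dev/null
echo "debug output" > "$demo/app.log"
echo 'console.log("hi")' > "$demo/src/server.js"
printf '%s\n' node_modules .git '*.log' > "$demo/.dockerignore"

ignored_sizes "$demo"
```

The demo lists node_modules and app.log with their sizes; src is not listed because no pattern matches it, and .git is skipped because it does not exist in the demo project. Exact sizes depend on the filesystem.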

5. Multi-stage builds

Multi-stage builds are usually the most effective optimization technique. They allow separate build and runtime stages in a single Dockerfile, copying only the final artifacts into the runtime image:

# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
# Install all dependencies, including the dev dependencies the build needs
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies so only production packages remain in node_modules
RUN npm prune --omit=dev

# Stage 2: Runtime
FROM node:20-alpine
WORKDIR /app
# Copy only the build output and the pruned node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
EXPOSE 3000
CMD ["node", "dist/server.js"]

The builder stage installs all dependencies (including dev dependencies), compiles the application, and then removes the dev dependencies with npm prune --omit=dev. The runtime stage starts fresh from the same base image and copies only the compiled output and the pruned, production-only node_modules from the builder. The final image contains no TypeScript compiler, no webpack, and no test libraries; it holds only what is needed to run the application.

Measuring the Impact

Before optimization:

docker images my-app:unoptimized
REPOSITORY   TAG           SIZE
my-app       unoptimized   1.14GB

After applying all five techniques:

docker images my-app:optimized
REPOSITORY   TAG        SIZE
my-app       optimized  98MB

A size reduction of around 90% is common for Node.js applications when moving from an unoptimized to a fully optimized Dockerfile.
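
The reduction for the two listings above can be worked out with a small helper. This is a minimal sketch: percent_reduction is a hypothetical name, and 1.14GB is taken as 1140MB for the arithmetic.

```shell
#!/bin/sh
# Percentage saved when an image shrinks from $1 to $2 (same unit for both).
percent_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

# Sizes from the listings above, in MB.
percent_reduction 1140 98    # prints 91.4

# With Docker available, exact byte counts can be read from the image metadata:
#   before=$(docker image inspect --format '{{.Size}}' my-app:unoptimized)
#   after=$(docker image inspect --format '{{.Size}}' my-app:optimized)
#   percent_reduction "$before" "$after"
```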

Inspecting What Takes Space

docker history my-app:optimized
IMAGE          CREATED BY                         SIZE
a1b2c3d4e5f6   /bin/sh -c #(nop) CMD ...         0B
b2c3d4e5f6a1   /bin/sh -c #(nop) EXPOSE 3000     0B
c3d4e5f6a1b2   /bin/sh -c #(nop) COPY ...        45.2MB
d3e4f5a6b7c8   /bin/sh -c #(nop) COPY ...        52.1MB

docker history shows the size contribution of each layer. Large layers from COPY instructions point to large files being included; large layers from RUN instructions point to packages being installed without cache cleanup.
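
For images with many layers, it can help to sort the history by size. The sketch below assumes that docker history prints raw byte counts when run with --human=false; largest_layers is a hypothetical helper, and the sample data fed to it is made up so the script runs without Docker.

```shell
#!/bin/sh
# Read "size_in_bytes created_by..." lines on stdin and print the N largest
# (N defaults to 5).
largest_layers() {
    sort -k1,1nr | head -n "${1:-5}"
}

# With Docker available:
#   docker history --human=false --no-trunc \
#       --format '{{.Size}} {{.CreatedBy}}' my-app:optimized | largest_layers 3

# Stand-in data (values are invented for illustration):
printf '%s\n' \
    '0 CMD ["node","dist/server.js"]' \
    '47395635 COPY /app/node_modules ./node_modules' \
    '54630318 COPY /app/dist ./dist' \
    '0 EXPOSE 3000' | largest_layers 2
```

With the stand-in data, the two COPY lines are printed, largest first.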

To list the digests of the layers that make up the image:

docker image inspect my-app:optimized --format '{{json .RootFS.Layers}}'

Security Benefit

Every package in the base image is a potential source of CVEs. Minimal images reduce the installed software inventory. A Distroless image for a Java application has no shell, no package manager, and no OS utilities, so the attack surface of a compromised container is drastically smaller: an attacker has far fewer tools to work with.

docker scout cves my-app:unoptimized
docker scout cves my-app:optimized

The optimized image typically shows significantly fewer known vulnerabilities because the reduced package count eliminates many of the CVEs present in unnecessary OS packages and development tools.
