20.1.2.1 First Dockerfile Creation

A focused guide to First Dockerfile Creation, connecting core concepts with practical Docker and container operations.

A Dockerfile is a plain text file that describes, step by step, how to construct a Docker image. Writing one for the first time requires understanding its structure, the role of each instruction, and the relationship between the Dockerfile, the build context, and the resulting image.

Setting Up the Project Directory

A Dockerfile lives in a directory that also contains all the files the build will need: source code, scripts, and configuration. This directory is called the build context. When docker build runs, its contents (minus anything excluded by a .dockerignore file) are sent to the Docker daemon, so it should contain only what the build needs.

mkdir my-project
cd my-project

Create the Dockerfile:

touch Dockerfile

On Windows (PowerShell):

New-Item Dockerfile -ItemType File
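
Because the whole directory becomes the build context, it can help to check how much it holds before building. The following is a minimal sketch, assuming a scratch directory created with mktemp stands in for my-project; the server.js placeholder is hypothetical.

```shell
# Hypothetical scratch directory standing in for my-project.
ctx="$(mktemp -d)"
printf 'console.log("hi")\n' > "$ctx/server.js"   # placeholder source file
touch "$ctx/Dockerfile"

# Count the files and report the total size the builder would be offered.
file_count="$(find "$ctx" -type f | wc -l | tr -d ' ')"
echo "files in context: $file_count"
du -sh "$ctx"
```

A context that is unexpectedly large usually points to dependency folders, version-control data, or logs that do not belong in the image.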

The FROM Instruction

A Dockerfile's first instruction must be FROM; only comments, parser directives, and ARG instructions may appear before it. FROM names the base image, the starting point whose filesystem and configuration the new image builds upon.

FROM node:20-alpine

This line selects the official Node.js 20 image built on Alpine Linux, a minimal distribution that keeps image sizes small. The base image is pulled from Docker Hub if it is not already present locally.

The choice of base image matters. A minimal base (Alpine, Distroless, Debian Slim) produces smaller images. A full OS base (ubuntu, debian) includes more preinstalled tools. For production images, smaller is generally better. For development or debugging, a fuller base is sometimes more convenient.

The WORKDIR Instruction

WORKDIR sets the working directory for the instructions that follow it and for the container's default process at run time. If the directory does not exist, Docker creates it.

FROM node:20-alpine
WORKDIR /app

All COPY, RUN, and CMD instructions that follow use /app as their working directory. This avoids having to specify absolute paths in every instruction.

The COPY Instruction

COPY transfers files and directories from the build context on the host into the image:

FROM node:20-alpine
WORKDIR /app
COPY package.json .

The first argument is the source path in the build context; the second is the destination path in the image. The . means "the current working directory in the image", which is /app because of the WORKDIR instruction.

To copy the entire build context:

COPY . .

This copies everything in the build context, except paths excluded by a .dockerignore file, into /app in the image.
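
COPY also accepts several sources in one instruction. The sketch below writes a small Dockerfile to a scratch directory to show the syntax; package-lock.json and the src/ directory are hypothetical and do not appear elsewhere in this guide.

```shell
# Hypothetical scratch directory for the example Dockerfile.
ctx="$(mktemp -d)"

cat > "$ctx/Dockerfile" <<'EOF'
FROM node:20-alpine
WORKDIR /app
# Several sources: the destination must be a directory and end with a slash.
COPY package.json package-lock.json ./
# A directory source copies the directory's contents, not the directory itself,
# so the destination names the folder to create inside /app.
COPY src/ ./src/
EOF

cat "$ctx/Dockerfile"
```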

The RUN Instruction

RUN executes a shell command inside the image during the build. Each RUN instruction creates a new image layer:

FROM node:20-alpine
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .

The npm install command runs inside the image and installs the Node.js dependencies listed in package.json. The resulting node_modules directory becomes part of the image layer created by this step.

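It is common to chain multiple commands in a single RUN instruction using && to reduce the number of layers. On a Debian- or Ubuntu-based image (not the Alpine image used above, which has no apt-get), that looks like: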

RUN apt-get update && apt-get install -y curl git && rm -rf /var/lib/apt/lists/*

The rm -rf /var/lib/apt/lists/* at the end deletes the package index files that apt-get update downloaded. Because the deletion happens in the same layer that created them, they never add to the image size.
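
The node:20-alpine base used throughout this guide ships the apk package manager rather than apt-get. A rough equivalent for that base might look like the following sketch, which writes the Dockerfile to a hypothetical scratch directory; the package names are carried over from the example above.

```shell
# Hypothetical scratch directory for the example Dockerfile.
ctx="$(mktemp -d)"

cat > "$ctx/Dockerfile" <<'EOF'
FROM node:20-alpine
# --no-cache fetches the package index on the fly and does not store it,
# so no separate cleanup command is needed in this layer.
RUN apk add --no-cache curl git
EOF

cat "$ctx/Dockerfile"
```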

The EXPOSE Instruction

EXPOSE documents which port the application inside the container listens on:

EXPOSE 3000

EXPOSE only records metadata; it does not publish the port to the host. Publishing requires the -p flag at docker run time, or -P, which publishes every exposed port on a random host port. EXPOSE is still worth including because it tells users and tools which port the application is meant to listen on.

The CMD Instruction

CMD defines the default command to execute when a container is started from the image. It is not executed during the build — only when a container is run:

CMD ["node", "server.js"]

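The exec form (a JSON array of strings) is preferred over the shell form (CMD node server.js). In exec form the process is started directly and runs as PID 1, so it receives the SIGTERM that docker stop sends. In shell form Docker runs the command through /bin/sh -c, and the signal may reach the shell instead of the application, which delays or prevents a clean shutdown.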
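
A quick way to tell the two forms apart in an existing Dockerfile is to look at what follows CMD. The sketch below is only a heuristic, written against a Dockerfile in a hypothetical scratch directory: it checks for the opening bracket and does not validate the JSON array.

```shell
# Hypothetical scratch directory for the example Dockerfile.
ctx="$(mktemp -d)"

cat > "$ctx/Dockerfile" <<'EOF'
FROM node:20-alpine
WORKDIR /app
CMD ["node", "server.js"]
EOF

# Exec form is a JSON array, so the argument starts with "[".
cmd_line="$(grep '^CMD ' "$ctx/Dockerfile")"
case "$cmd_line" in
  'CMD ['*) cmd_form="exec" ;;
  *)        cmd_form="shell" ;;
esac
echo "CMD uses the $cmd_form form"
```

Note that the array must use double quotes; single-quoted elements are not valid JSON, and Docker then treats the line as shell form.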

A Complete Dockerfile for a Node.js Application

FROM node:20-alpine
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]

The order of instructions matters for build performance. package.json is copied before the rest of the source code, and npm install runs before copying the rest. This means Docker can cache the npm install layer: if only source files (not package.json) change between builds, Docker reuses the cached layer for the dependency install step and only re-runs COPY . . and later instructions.
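
The Dockerfile refers to package.json and server.js, which this guide does not define. To have something buildable, the sketch below scaffolds a scratch project; the contents of package.json and server.js are hypothetical placeholders, not part of the original application.

```shell
# Scratch directory standing in for my-project (hypothetical location).
ctx="$(mktemp -d)"

# Hypothetical manifest with no dependencies, so npm install has nothing to fetch.
cat > "$ctx/package.json" <<'EOF'
{
  "name": "my-node-app",
  "version": "1.0.0",
  "private": true
}
EOF

# Hypothetical server.js: answers every request on port 3000.
cat > "$ctx/server.js" <<'EOF'
const http = require("http");

http
  .createServer((req, res) => res.end("hello from the container\n"))
  .listen(3000, () => console.log("listening on 3000"));
EOF

# The Dockerfile from this section, unchanged.
cat > "$ctx/Dockerfile" <<'EOF'
FROM node:20-alpine
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
EOF

ls "$ctx"
echo "build with: docker build -t my-node-app $ctx"
```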

Building the Image

docker build -t my-node-app .

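The -t flag tags the resulting image as my-node-app, and the trailing . names the current directory as the build context. Docker reads the Dockerfile from that directory, sends the build context, and executes each instruction in sequence. EXPOSE and CMD only record metadata, so they do not appear as numbered steps in the output: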

[+] Building 18.4s (9/9) FINISHED
 => [internal] load build definition from Dockerfile
 => [internal] load metadata for docker.io/library/node:20-alpine
 => [1/5] FROM node:20-alpine
 => [2/5] WORKDIR /app
 => [3/5] COPY package.json .
 => [4/5] RUN npm install
 => [5/5] COPY . .
 => exporting to image
 => => naming to docker.io/library/my-node-app:latest

Verifying the Build

docker images my-node-app

REPOSITORY     TAG       IMAGE ID       CREATED         SIZE
my-node-app    latest    c1d2e3f4a5b6   2 minutes ago   128MB

Running the Image

docker run -d -p 3000:3000 my-node-app

The -d flag runs the container in the background (detached). The application inside the container listens on port 3000, and -p 3000:3000 maps host port 3000 (left) to container port 3000 (right), making the application accessible at http://localhost:3000.

Instruction Execution Order and Caching

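Instructions execute top to bottom, and Docker caches the result of each one. Changing an early instruction invalidates its layer and every layer below it, forcing Docker to re-execute those steps. A change that first affects a later instruction only invalidates layers from that point forward, keeping earlier cached layers intact. For COPY, a change means a change in the copied files' contents, not just in the instruction text, which is why editing a source file invalidates the final COPY . . but nothing above it.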
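
As a simplified model of that rule, the sketch below compares file checksums before and after a source edit. It is an illustration only: cksum stands in for Docker's own content hashing, and the file contents are hypothetical.

```shell
# Hypothetical scratch project with a manifest and one source file.
ctx="$(mktemp -d)"
printf '{"name":"my-node-app","version":"1.0.0"}\n' > "$ctx/package.json"
printf 'console.log("v1")\n' > "$ctx/server.js"

# Checksums as they were at the previous build.
manifest_before="$(cksum < "$ctx/package.json")"
source_before="$(cksum < "$ctx/server.js")"

# Edit only the application source, as happens in day-to-day development.
printf 'console.log("v2")\n' > "$ctx/server.js"

manifest_after="$(cksum < "$ctx/package.json")"
source_after="$(cksum < "$ctx/server.js")"

# An unchanged manifest means COPY package.json . and RUN npm install can be reused.
if [ "$manifest_before" = "$manifest_after" ]; then
  echo "COPY package.json . : cache hit, npm install layer reused"
fi

# Changed source means COPY . . and everything after it runs again.
if [ "$source_before" != "$source_after" ]; then
  echo "COPY . .            : cache miss, rebuilt from here down"
fi
```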

Common Mistakes in a First Dockerfile

Copying everything before installing dependencies: If COPY . . appears before RUN npm install, any source file change forces a reinstall of all packages, even if package.json did not change. Always copy dependency manifests first, install, then copy the rest of the source.

Cleaning up in a separate RUN instruction: If you install packages in one RUN layer and delete the cache in a later RUN layer, the cache data still exists in the first layer and still counts toward the image size. Combine install and cleanup in a single RUN using &&.

Not using .dockerignore: Without a .dockerignore file, COPY . . can accidentally include node_modules, .git, log files, and other large or sensitive items from the build context. Create a .dockerignore with at minimum:

node_modules
.git
*.log
.env
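
To see the effect of these entries, the sketch below builds a hypothetical scratch project, writes the same .dockerignore, and lists which top-level entries would still be sent. It only approximates Docker's matching by applying each pattern as a shell glob to top-level names; real .dockerignore rules also cover nested paths and negation with !.

```shell
# Hypothetical scratch project containing files that should and should not ship.
ctx="$(mktemp -d)"
mkdir "$ctx/node_modules" "$ctx/.git"
touch "$ctx/server.js" "$ctx/package.json" "$ctx/debug.log" "$ctx/.env"

cat > "$ctx/.dockerignore" <<'EOF'
node_modules
.git
*.log
.env
EOF

# Approximation: test each top-level name against every pattern as a shell glob.
kept=""
for entry in "$ctx"/* "$ctx"/.[!.]*; do
  name="$(basename "$entry")"
  ignored=no
  while IFS= read -r pattern; do
    if [ -z "$pattern" ]; then
      continue
    fi
    case "$name" in
      $pattern) ignored=yes ;;
    esac
  done < "$ctx/.dockerignore"
  if [ "$ignored" = no ]; then
    kept="$kept $name"
  fi
done

echo "sent to the builder:$kept"
```

The .dockerignore file itself remains in the list because none of the patterns exclude it.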