4.2.4.1 ADD Archive Extraction
A focused guide to ADD Archive Extraction, connecting core concepts with practical Docker and container operations.
ADD archive extraction is the automatic behavior ADD performs when its source is a local tar archive in a recognized compression format: rather than copying the archive file itself into the image, ADD unpacks its contents as a directory at the specified destination. Archives fetched from a remote URL are not extracted.
Recognized Archive Formats
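ADD recognizes tar archives that are uncompressed or compressed with gzip, bzip2, or xz (typically named .tar, .tar.gz, .tar.bz2, or .tar.xz) and unpacks them automatically rather than copying them as opaque files. Recognition is based on the file's contents, not its name, and other formats such as .zip are copied as-is.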
ADD release.tar.gz /app/
If release.tar.gz contains files at its root, those files end up directly inside /app/, not as a release.tar.gz file sitting inside that directory.
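Because the layout inside the archive determines the layout under the destination, it can help to list the archive's entries before building. The following is a minimal sketch that uses only tar, not Docker; the file names (README.txt, bin/run.sh) and the temporary directory are hypothetical and exist only for illustration.

```shell
#!/bin/sh
set -eu

# Build a small sample archive in a temporary directory (hypothetical contents).
workdir=$(mktemp -d)
mkdir -p "$workdir/src/bin"
echo "hello" > "$workdir/src/README.txt"
printf '#!/bin/sh\n' > "$workdir/src/bin/run.sh"

# -C switches into src/ first, so entries are stored relative to the archive root.
tar -czf "$workdir/release.tar.gz" -C "$workdir/src" README.txt bin

# List the entries without extracting them. With "ADD release.tar.gz /app/",
# each listed path would be expected to appear under /app/.
listing=$(tar -tzf "$workdir/release.tar.gz")
echo "$listing"
```

If the listing instead showed a single top-level directory, that directory would appear under the destination, with the files one level deeper.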
Why This Is Useful
For distributing pre-built software as a compressed archive, ADD's automatic extraction saves an explicit extraction step that would otherwise require a separate RUN tar -xzf instruction.
ADD app-release.tar.gz /opt/app/
COPY app-release.tar.gz /tmp/
RUN mkdir -p /opt/app/ && tar -xzf /tmp/app-release.tar.gz -C /opt/app/ && rm /tmp/app-release.tar.gz
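The first line, using ADD, achieves the same practical result as the COPY and RUN pair that follows it, in a single instruction and without an explicit extraction and cleanup step. The COPY approach also leaves the archive stored in the COPY layer, so the later rm does not reduce the image size.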
Risks of Unintended Extraction Behavior
If a Dockerfile author intends to copy an archive file as-is (for example, to distribute it rather than extract it), using ADD instead of COPY unexpectedly extracts it. This is a common source of confusion for anyone unaware of this behavioral difference.
ADD backup.tar.gz /backups/
If the intent here was simply to store the archive file inside the image unmodified, this instruction does not achieve that, since ADD extracts it automatically. Writing COPY backup.tar.gz /backups/ instead stores the archive unmodified.
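One way to catch this before building is to scan a Dockerfile for ADD instructions whose source looks like a tar archive. The sketch below is a name-based heuristic only: find_archive_adds is a hypothetical helper, the extension list is an assumption, and because ADD decides by file contents rather than by name, the check can both miss cases and flag harmless ones.

```shell
#!/bin/sh
# find_archive_adds FILE: print "line:instruction" for each ADD instruction whose
# source name ends in a common tar extension (heuristic; the helper name is hypothetical).
find_archive_adds() {
    grep -niE '^[[:space:]]*ADD[[:space:]]+.*\.(tar|tgz|tar\.gz|tar\.bz2|tar\.xz)([[:space:]]|")' "$1"
}

# Demo on a throwaway Dockerfile (contents are illustrative).
demo=$(mktemp)
cat > "$demo" <<'EOF'
FROM alpine
ADD backup.tar.gz /backups/
COPY notes.txt /backups/
EOF

find_archive_adds "$demo"
```

Each reported line is a candidate for review: keep ADD if extraction is intended, or switch to COPY if the archive should be stored as a file.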
Verifying Extraction Behavior
After a build, listing the destination shows whether it contains the archive file or its extracted contents, which confirms which behavior occurred.
docker run --rm myapp ls /app/
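The check can also be scripted rather than read by eye. The following sketch assumes the listing arrives on standard input, for example from the docker run command above; classify_listing is a hypothetical helper, and the image name myapp and archive name release.tar.gz are taken from the earlier examples.

```shell
#!/bin/sh
# classify_listing ARCHIVE_NAME: read a directory listing on stdin and report
# whether the archive file itself appears in it (hypothetical helper).
classify_listing() {
    if grep -qxF -- "$1"; then
        echo "copied as-is: $1 is present"
    else
        echo "extracted or missing: $1 is not present"
    fi
}

# Intended use against a built image (requires Docker and an image named myapp):
#   docker run --rm myapp ls /app/ | classify_listing release.tar.gz

# Stand-in listing so the sketch runs without Docker.
printf 'README.txt\nbin\n' | classify_listing release.tar.gz
```

An absent archive name does not by itself prove extraction happened, so the remaining entries in the listing should still be compared against the archive's expected contents.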
Why Understanding Archive Extraction Matters
ADD's automatic extraction is genuinely useful when it is intended, but it is also one of the most common sources of confusion between ADD and COPY. Knowing exactly when it triggers (a local tar archive in a recognized format) and when it does not is essential for writing a Dockerfile that behaves as intended.