19.2.3.1 Stop Graceful Shutdown
A focused guide to graceful shutdown with docker stop, connecting core concepts with practical Docker and container operations.
Graceful shutdown is the process by which a container's primary process receives a termination signal, completes any in-progress work, and exits cleanly rather than being killed abruptly. docker stop implements graceful shutdown by sending SIGTERM to the container's main process, waiting for it to exit, and escalating to SIGKILL only if it is still running when the configured timeout expires.
Why Graceful Shutdown Matters
Abrupt termination (via SIGKILL) interrupts whatever the process was doing at that instant:
- Active HTTP requests are dropped mid-response.
- Database transactions are rolled back or left in an incomplete state.
- Write buffers are not flushed, risking data loss or file corruption.
- Temporary files and locks are not cleaned up.
- Connections to peer services are not closed in an orderly way, so clients may be left waiting until their own timeouts fire.
Graceful shutdown avoids all of this by giving the process control over its own exit sequence.
The Two-Phase Shutdown in docker stop
docker stop my-container
Phase 1: SIGTERM
Docker sends SIGTERM to PID 1 inside the container. This is the standard Unix signal for "please shut down." A well-behaved process catches this signal and begins its shutdown procedure:
- Stops accepting new incoming connections or requests.
- Waits for in-flight operations to complete (with an internal timeout).
- Flushes buffers and syncs data to storage.
- Releases locks and deletes temporary files.
- Closes database connections and connection pools.
- Exits with a meaningful exit code.
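The steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the names stop_event, accept_one, in_flight, and cleanup_steps are hypothetical, and returning 1 when work could not be drained in time is an assumed convention.

```python
import signal
import threading
import time

# Set by the signal handler; polled by the main loop.
stop_event = threading.Event()


def handle_sigterm(signum, frame):
    # Do as little as possible in the handler: only record the request.
    stop_event.set()


def shutdown(in_flight, cleanup_steps, drain_timeout=5.0):
    """Drain in-flight work, run cleanup, and return the exit code to use.

    in_flight: threading.Thread objects still doing work (hypothetical).
    cleanup_steps: callables that flush buffers, release locks, close pools.
    """
    deadline = time.monotonic() + drain_timeout
    for worker in in_flight:
        # Wait for each operation, but never past the shared deadline.
        worker.join(timeout=max(0.0, deadline - time.monotonic()))
    drained = not any(worker.is_alive() for worker in in_flight)
    for step in cleanup_steps:
        step()
    # Assumed convention: 0 if everything finished, 1 if work was cut short.
    return 0 if drained else 1


def main_loop(accept_one, in_flight, cleanup_steps):
    """Accept work until SIGTERM arrives, then run the shutdown sequence."""
    signal.signal(signal.SIGTERM, handle_sigterm)
    while not stop_event.is_set():
        accept_one()  # no longer called once SIGTERM has been received
    return shutdown(in_flight, cleanup_steps)
```

A program would typically pass the return value of main_loop to sys.exit. The internal drain_timeout should stay below the docker stop timeout so that cleanup finishes before SIGKILL is sent.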
Phase 2: SIGKILL (if needed)
If the process has not exited after the timeout (default 10 seconds), Docker sends SIGKILL, which cannot be caught or ignored. The process is immediately terminated by the kernel.
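The same two-phase pattern can be reproduced against an ordinary child process. The sketch below is an analogy under stated assumptions, not Docker's implementation: stop_process is a hypothetical helper, and it relies on Popen.terminate and Popen.kill sending SIGTERM and SIGKILL on POSIX systems.

```python
import subprocess


def stop_process(proc, timeout=10.0):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    proc.terminate()  # Phase 1: SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # Phase 2: SIGKILL on POSIX, cannot be caught or ignored
        proc.wait()
    # On POSIX, a negative return code is the number of the fatal signal.
    return proc.returncode
```

A child that dies from the first signal yields -15, while one that ignores SIGTERM yields -9 once the timeout has expired.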
Handling SIGTERM in Applications
For graceful shutdown to work, the application must explicitly handle SIGTERM. Here are examples in common languages:
Go
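// Requires imports: context, os, os/signal, syscall, time
c := make(chan os.Signal, 1)
signal.Notify(c, syscall.SIGTERM)
go func() {
	<-c // block until SIGTERM arrives
	// Bound the drain so it finishes before the default 10-second stop timeout.
	ctx, cancel := context.WithTimeout(context.Background(), 8*time.Second)
	defer cancel()
	server.Shutdown(ctx) // stop accepting connections, wait for in-flight requests
}()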
Node.js
process.on('SIGTERM', () => {
  // Stop accepting new connections; the callback runs once existing ones have closed.
  server.close(() => {
    process.exit(0);
  });
});
Python
import signal
import sys

def handle_sigterm(signum, frame):
    # cleanup code here
    sys.exit(0)  # raises SystemExit, so finally blocks and atexit handlers still run

signal.signal(signal.SIGTERM, handle_sigterm)
PID 1 and Signal Propagation
The process that runs as PID 1 inside a container is treated specially by the kernel. For an ordinary process, a signal with no registered handler triggers its default action, which for SIGTERM is to terminate the process. For PID 1 of a PID namespace, the kernel skips that default action: if PID 1 has not registered a handler for SIGTERM, the signal is silently ignored, and the container will not stop until SIGKILL is sent after the timeout.
This is a common issue when containers run shell scripts as their entrypoint. The shell script becomes PID 1, but it may not forward SIGTERM to child processes.
Example of the problem:
CMD ["./start.sh"]
When docker stop sends SIGTERM, the shell running start.sh may not forward the signal to the actual server process it launched. The server keeps running, the timeout expires, and SIGKILL terminates everything abruptly.
Solutions for PID 1 Signal Handling
Use exec form instead of shell form
# Shell form — shell becomes PID 1, server is a child
CMD ./server
# Exec form — server becomes PID 1 directly
CMD ["./server"]
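The exec form runs the binary directly as PID 1 without a shell wrapper, so it receives SIGTERM directly. When a wrapper script such as start.sh is unavoidable, launch the server from it with exec ./server so that the server replaces the shell and becomes PID 1.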
Use an init process
Docker provides a built-in --init flag that runs tini (a minimal init process) as PID 1:
docker run --init myapp:1.0.0
tini properly forwards SIGTERM to child processes and reaps zombie processes. This is the simplest solution when modifying the application is not an option.
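To make the forwarding behavior concrete, here is a minimal Python sketch of a wrapper that relays signals to its child. It is not how tini is implemented and it does not reap zombie processes; run_and_forward is a hypothetical name, and mapping a signal death to 128 plus the signal number is the shell convention assumed here.

```python
import os
import signal
import subprocess


def run_and_forward(cmd):
    """Start cmd as a child, relay SIGTERM/SIGINT to it, return an exit code."""
    child = subprocess.Popen(cmd)

    def forward(signum, frame):
        # Relay the signal to the child instead of acting on it ourselves.
        try:
            os.kill(child.pid, signum)
        except ProcessLookupError:
            pass  # the child has already exited

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, forward)

    status = child.wait()
    # Popen reports death-by-signal as a negative number; convert it to
    # 128 + signal number (for example 143 for SIGTERM).
    return 128 - status if status < 0 else status
```

An entrypoint written this way would end with sys.exit(run_and_forward(["./server"])), where ./server stands for the real workload. In practice, --init or exec form is simpler than maintaining such a wrapper.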
Custom STOPSIGNAL
If the application does not handle SIGTERM but does handle another signal (e.g., SIGINT), specify this in the Dockerfile:
STOPSIGNAL SIGINT
Docker will send SIGINT instead of SIGTERM when docker stop is called.
Adjusting the Timeout
For applications that need more time to shut down gracefully:
docker stop --time 60 my-database
A 60-second timeout gives the database time to flush all pending writes to disk. If the timeout is too short, SIGKILL can interrupt a flush mid-operation, potentially corrupting data.
With Docker Compose, the timeout is set per service with stop_grace_period:
services:
  database:
    image: postgres:15
    stop_grace_period: 60s
Verifying Graceful Shutdown Occurred
After stopping, the exit code reveals how the process ended:
docker inspect --format '{{.State.ExitCode}}' my-container
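- 0: The process handled SIGTERM and exited cleanly with code 0.
- 143: The process was terminated by SIGTERM, or chose to exit with 143 (128 + 15, SIGTERM's number), before the timeout expired. Docker did not have to escalate, so this also counts as a graceful stop, although the code alone does not show whether cleanup ran.
- 137: The process was killed with SIGKILL (128 + 9). Graceful shutdown did not complete before the timeout.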
An exit code of 137 after docker stop indicates that the application either ignored SIGTERM or needed more time than the configured timeout.
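The arithmetic behind these codes can be captured in a small helper. This is an illustrative sketch: describe_exit_code is a hypothetical function, and it assumes the 128 + signal-number convention, which an application can also imitate by exiting with such a code on its own.

```python
import signal


def describe_exit_code(code):
    """Explain an exit code using the 128 + signal-number convention."""
    if code == 0:
        return "exited cleanly"
    if code > 128:
        try:
            name = signal.Signals(code - 128).name
        except ValueError:
            # No signal with that number on this platform.
            return f"exited with status {code}"
        return f"terminated by {name} (128 + {code - 128})"
    return f"exited with error status {code}"
```

On Linux, describe_exit_code(143) returns "terminated by SIGTERM (128 + 15)" and describe_exit_code(137) returns "terminated by SIGKILL (128 + 9)". The input would be the value printed by the docker inspect command above.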
Graceful Shutdown in Orchestrated Environments
In Docker Swarm and Kubernetes, graceful shutdown must be coordinated with load balancing so that traffic stops arriving before the process exits. The sequence for a zero-downtime deployment is:
1. Stop routing new requests to the container (the orchestrator removes it from the load balancer).
2. Send SIGTERM to let in-flight requests complete.
3. Wait for the grace period.
4. Remove the container.
The stop_grace_period in Docker Compose or terminationGracePeriodSeconds in Kubernetes bounds how long the grace-period wait (step 3) may take. Setting it too low lets SIGKILL cut off in-flight requests; setting it too high delays deployments whenever a container is slow to exit.