14.1.2.1 Production Restart Policy
A focused guide to Production Restart Policy, connecting core concepts with practical Docker and container operations.
A restart policy tells Docker how to respond when a container's main process exits. In production, it lets a crashed container recover automatically without manual intervention, which makes it a basic resilience measure for any deployment.
Configuring an Appropriate Restart Policy
The on-failure policy restarts a container only when it exits with a non-zero (error) status, leaving a deliberate, clean exit alone.
services:
  app:
    restart: on-failure
With docker run, the same policy can also carry a retry limit:
docker run -d --restart on-failure:3 myapp:1.0
The :3 here limits automatic restart attempts to three, avoiding an infinite restart loop if the container is genuinely, persistently broken.
Why always Differs From on-failure in an Important Way
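The always policy restarts a container whenever it exits, including after a clean exit with status 0. A manual docker stop still stops the container, but it starts again when the Docker daemon restarts, for example after a host reboot. That is often more aggressive than a production service needs; on-failure or unless-stopped usually reflects the intended behavior better.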
restart: unless-stopped
Like always, this restarts the container after any exit, including a crash. The difference is that a container stopped with docker stop stays stopped, even after the Docker daemon restarts.
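The differences between the policies can be summarized as a small decision function. This is a simplified model for illustration, not Docker's implementation: the function name will_restart is hypothetical, it ignores retry limits, and it only describes what happens while the Docker daemon keeps running.

```shell
#!/bin/sh
# Simplified model of restart policies (illustrative, not Docker's own code).
# will_restart POLICY EXIT_CODE MANUALLY_STOPPED
#   POLICY:           no | on-failure | always | unless-stopped
#   EXIT_CODE:        exit status of the container's main process
#   MANUALLY_STOPPED: yes if the container was stopped with `docker stop`
will_restart() {
    policy="$1"
    exit_code="$2"
    stopped="$3"

    # A manual stop suppresses the immediate restart under every policy.
    if [ "$stopped" = "yes" ]; then
        echo "no"
        return 0
    fi

    case "$policy" in
        no)
            echo "no" ;;
        on-failure)
            # Only a non-zero exit status counts as a failure.
            if [ "$exit_code" -ne 0 ]; then echo "yes"; else echo "no"; fi ;;
        always|unless-stopped)
            # Both restart after any exit, clean or not.
            echo "yes" ;;
        *)
            echo "unknown"
            return 1 ;;
    esac
}
```

In this model always and unless-stopped behave the same; they only diverge after a daemon restart, where always brings a manually stopped container back and unless-stopped leaves it stopped.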
Why an Unbounded Restart Policy Can Mask an Underlying Problem
A container that restarts without limit can hide a persistent fault. It keeps crash-looping, and because it keeps coming back up, nobody may notice.
docker run -d --restart on-failure:5 myapp:1.0
Bounding the restart attempts, combined with appropriate alerting on repeated failures, ensures a persistent issue still gets surfaced rather than silently looping indefinitely.
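One way to surface a container that has used up its retries is to check its state against the configured limit. The following is a minimal sketch under assumptions: the helper names retries_exhausted and container_state are hypothetical, myapp is the example container from above, and how the result is turned into an alert is left open.

```shell
#!/bin/sh
# Sketch: detect a container that has exhausted its on-failure retries.
# The container name "myapp" is an illustrative assumption.

# retries_exhausted STATUS RESTART_COUNT MAX_RETRIES
# Succeeds (exit status 0) when the container has exited and has restarted
# at least MAX_RETRIES times. A MAX_RETRIES of 0 means no limit was set.
retries_exhausted() {
    status="$1"
    count="$2"
    max="$3"
    [ "$status" = "exited" ] && [ "$max" -gt 0 ] && [ "$count" -ge "$max" ]
}

# container_state NAME
# Prints "STATUS RESTART_COUNT MAX_RETRIES" for the named container.
container_state() {
    docker inspect "$1" --format \
        '{{.State.Status}} {{.RestartCount}} {{.HostConfig.RestartPolicy.MaximumRetryCount}}'
}

# Example usage (requires a running Docker daemon):
#   set -- $(container_state myapp)
#   if retries_exhausted "$1" "$2" "$3"; then
#       echo "myapp gave up after $2 restarts" >&2
#   fi
```

Keeping the decision in a function that takes plain values makes it easy to check without Docker, while container_state supplies the live values.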
Monitoring Restart Frequency as a Signal of an Underlying Issue
Tracking how often a container actually restarts provides an important signal that something may be wrong, even if each individual restart succeeds.
docker inspect myapp --format '{{.RestartCount}}'
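A single reading of RestartCount says little about frequency, so a monitoring check might compare two samples taken some time apart. The sketch below assumes a 60-second interval and a threshold of 2; both values and the helper names restart_delta and restart_status are hypothetical.

```shell
#!/bin/sh
# Sketch: compare two RestartCount samples to estimate restart frequency.
# "myapp", the 60-second interval, and the threshold of 2 are assumptions.

# restart_delta PREVIOUS CURRENT
# Prints how many restarts happened between the two samples.
restart_delta() {
    echo $(( $2 - $1 ))
}

# restart_status DELTA THRESHOLD
# Prints "alert" when DELTA reaches THRESHOLD, otherwise "ok".
restart_status() {
    if [ "$1" -ge "$2" ]; then
        echo "alert"
    else
        echo "ok"
    fi
}

# Example usage (requires a running Docker daemon):
#   before=$(docker inspect myapp --format '{{.RestartCount}}')
#   sleep 60
#   after=$(docker inspect myapp --format '{{.RestartCount}}')
#   restart_status "$(restart_delta "$before" "$after")" 2
```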
Why Production Restart Policy Matters
A properly configured, appropriately bounded restart policy provides essential automatic resilience against transient failures, while still surfacing a genuinely persistent issue rather than masking it behind an endless, unmonitored restart loop.