6.3.3.2 On Failure Policy
A focused guide to the on-failure restart policy, connecting core concepts with practical Docker and container operations.
The on-failure restart policy restarts a container only when its main process exits with a non-zero (failure) status. A container that exits cleanly (status 0) is left stopped, so the policy distinguishes an intentional, successful completion from an actual crash.
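The rule can be sketched as a small shell function. This is a model of the behavior described here, not Docker's own code; the function name should_restart and its three arguments are hypothetical, and the sketch assumes that a maximum of 0 stands for "no limit configured".

```shell
# Model of the on-failure decision, NOT Docker's actual implementation.
# $1 = exit code of the main process
# $2 = restarts performed so far
# $3 = maximum retries (assumption: 0 means no limit was configured)
should_restart() {
    exit_code=$1
    restarts_so_far=$2
    max_retries=$3

    # A clean exit (status 0) never triggers a restart under on-failure.
    if [ "$exit_code" -eq 0 ]; then
        echo "stay-stopped"
        return 0
    fi

    # A failed exit triggers a restart unless the retry limit is used up.
    if [ "$max_retries" -gt 0 ] && [ "$restarts_so_far" -ge "$max_retries" ]; then
        echo "stay-stopped"
    else
        echo "restart"
    fi
}

should_restart 0 0 5      # prints: stay-stopped (clean exit)
should_restart 1 2 5      # prints: restart (failure, retries remaining)
should_restart 1 5 5      # prints: stay-stopped (failure, limit reached)
should_restart 137 40 0   # prints: restart (failure, no limit configured)
```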
Configuring on-failure
The policy can optionally specify a maximum number of restart attempts, preventing an indefinitely repeating restart loop for a container that fails consistently.
docker run -d --restart=on-failure:5 myapp:1.0
This attempts to restart the container up to five times if it continues exiting with a failure status, after which it gives up and leaves the container stopped.
Why Distinguishing Success From Failure Matters Here
A container performing a finite task that completes successfully (exiting with status 0) should not be restarted, since there is nothing further for it to do. A container that crashed unexpectedly (a non-zero exit) is often worth retrying, since the cause of the crash may have been transient.
docker run -d --restart=on-failure:3 myapp:1.0 run-batch-job.sh
If run-batch-job.sh completes successfully, the container stays stopped; if it crashes, Docker attempts up to three automatic restarts before giving up.
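One way to see the difference is to run two throwaway containers, one that exits cleanly and one that fails. The sketch below is illustrative only: the helper names run_job and job_result, the container names, and the use of the alpine image with a one-line command standing in for a batch job are all assumptions, not part of the example above.

```shell
# Illustrative helpers; names and the alpine image are assumptions.

# Start a stand-in "job" that exits immediately with a chosen status.
# $1 = container name, $2 = exit status the stand-in job should return
run_job() {
    docker run -d --name "$1" --restart=on-failure:3 alpine sh -c "exit $2"
}

# Report how many times Docker restarted the container and its last exit code.
# $1 = container name
job_result() {
    docker inspect --format 'restarts={{.RestartCount}} exit={{.State.ExitCode}}' "$1"
}
```

Calling run_job job-ok 0 and run_job job-fail 1, waiting a few seconds, and then calling job_result on each name should show no restarts for the clean exit, while the restart count of the failing container should stop growing once the limit is reached.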
Why an Unlimited Retry Count Can Be Risky
Without a maximum retry count, a container that fails consistently (for example because of a persistent configuration or environment problem) is restarted indefinitely. The endless retries can obscure an underlying issue that needs actual attention rather than another automatic restart. The following form sets no limit:
docker run -d --restart=on-failure myapp:1.0
Specifying an explicit limit, rather than leaving it unbounded, is often the more prudent choice.
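For a container that is already running with an unbounded policy, the limit can be added afterwards with docker update. The wrapper below is a minimal sketch; the function name cap_retries and its argument check are hypothetical conveniences, while docker update --restart is the underlying command.

```shell
# Sketch: replace an unbounded on-failure policy with a capped one.
# $1 = container name or ID, $2 = maximum number of restart attempts
cap_retries() {
    # Reject an empty or non-numeric retry count before calling Docker.
    case "$2" in
        ''|*[!0-9]*)
            echo "usage: cap_retries CONTAINER MAX_RETRIES" >&2
            return 1
            ;;
    esac
    docker update --restart="on-failure:$2" "$1"
}
```

For example, cap_retries myapp 5 would apply on-failure:5 to a container named myapp without recreating it.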
Observing Restart Attempts
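The number of times a container has actually been restarted under this policy can be inspected directly. The command takes a container name or ID; here myapp is assumed to be the name given to the container with --name myapp when it was started.
docker inspect --format '{{.RestartCount}}' myapp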
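The restart count is easier to interpret next to the policy that produced it. The sketch below prints both in one line; the helper name restart_summary is hypothetical, and the template fields are the ones docker inspect reports for a container.

```shell
# Sketch: show the configured policy next to the observed restart count.
# $1 = container name or ID (for example one started with --name myapp)
restart_summary() {
    docker inspect --format \
        'policy={{.HostConfig.RestartPolicy.Name}} max={{.HostConfig.RestartPolicy.MaximumRetryCount}} restarts={{.RestartCount}} last_exit={{.State.ExitCode}}' \
        "$1"
}
```

A restart count equal to the configured maximum, together with a non-zero last exit code, suggests Docker has given up and the container needs manual attention.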
Why the on-failure Policy Matters
This policy is well suited to containers that perform finite, potentially retryable tasks. For such containers a clean exit means the work is genuinely complete, while a failed exit may be worth a limited number of automatic retries before someone has to step in manually.