14.1.2.1 Production Restart Policy

A focused guide to Production Restart Policy, connecting core concepts with practical Docker and container operations.

Production restart policy configures how Docker should respond when a container's process exits unexpectedly, allowing automatic recovery from a crash without requiring manual intervention, an essential resilience measure for any production deployment.

Configuring an Appropriate Restart Policy

The on-failure policy restarts a container only when it exits with a non-zero (error) status, leaving a deliberate, clean exit alone.

services:
  app:
    restart: on-failure

docker run -d --restart on-failure:3 myapp:1.0

The :3 here limits automatic restart attempts to three, avoiding an infinite restart loop if the container is genuinely, persistently broken.

Why always Differs From on-failure in an Important Way

The always policy restarts a container regardless of how it exited, even after a deliberate, intentional stop — generally too aggressive for most production services, where on-failure or unless-stopped better reflects intended behavior.

restart: unless-stopped

This restarts automatically after a crash, but respects an intentional docker stop, not restarting a container someone deliberately stopped.

Why an Unbounded Restart Policy Can Mask an Underlying Problem

A container restarting indefinitely without any limit can hide a persistent, unaddressed issue, continuously crash-looping without anyone necessarily noticing.

docker run -d --restart on-failure:5 myapp:1.0

Bounding the restart attempts, combined with appropriate alerting on repeated failures, ensures a persistent issue still gets surfaced rather than silently looping indefinitely.

Monitoring Restart Frequency as a Signal of an Underlying Issue

Tracking how often a container actually restarts provides an important signal that something may be wrong, even if each individual restart succeeds.

docker inspect myapp --format '{{.RestartCount}}'

Why Production Restart Policy Matters

A properly configured, appropriately bounded restart policy provides essential automatic resilience against transient failures, while still surfacing a genuinely persistent issue rather than masking it behind an endless, unmonitored restart loop.