4.2.13.4 HEALTHCHECK Retries
A focused guide to HEALTHCHECK Retries, connecting core concepts with practical Docker and container operations.
The HEALTHCHECK retries setting controls how many consecutive health check failures must occur before Docker marks a container as unhealthy, so that a single transient failure does not immediately trigger an unhealthy status.
Setting the Retry Threshold
The --retries option specifies the number of consecutive failures required before the container's status changes to unhealthy. If the option is omitted, Docker uses a default of 3.
HEALTHCHECK --interval=30s --retries=3 CMD curl -f http://localhost:8080/health || exit 1
With this configuration, the container is marked unhealthy only after three consecutive failed checks. A single isolated failure does not change its reported status, and any successful check resets the count.
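The counting rule can be sketched in a few lines of Python. This is a simplified model, not Docker's implementation: the function name health_status is hypothetical, and the model ignores the initial "starting" state and any start period.

```python
def health_status(results, retries=3):
    """Return the modeled status after each check (True = pass, False = fail).

    Simplified model: ignores Docker's "starting" state and start period.
    """
    status = "healthy"
    failing_streak = 0
    statuses = []
    for passed in results:
        if passed:
            # Any successful check resets the streak and restores "healthy".
            failing_streak = 0
            status = "healthy"
        else:
            failing_streak += 1
            # Only a run of `retries` consecutive failures flips the status.
            if failing_streak >= retries:
                status = "unhealthy"
        statuses.append(status)
    return statuses


# One isolated failure, then three failures in a row.
print(health_status([True, False, True, False, False, False]))
# -> ['healthy', 'healthy', 'healthy', 'healthy', 'healthy', 'unhealthy']
```

In this model the isolated failure at the second check has no visible effect; only the final run of three failures changes the status.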
Why Tolerating Some Failures Is Useful
Transient issues, such as a brief network blip or a momentary load spike that pushes a response past the check's timeout, can cause an occasional health check failure even when the application is otherwise fine. Requiring multiple consecutive failures before declaring the container unhealthy avoids overreacting to this kind of temporary noise.
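# Option A: a single failed check marks the container unhealthy
HEALTHCHECK --retries=1 CMD curl -f http://localhost:8080/health || exit 1
# Option B: three consecutive failures are required (use one option only; the last HEALTHCHECK in a Dockerfile takes effect)
HEALTHCHECK --retries=3 CMD curl -f http://localhost:8080/health || exit 1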
The first configuration reacts immediately to any single failure; the second requires a more sustained pattern of failure before reaching the same conclusion.
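To make the difference concrete, the following sketch replays two made-up result sequences against both thresholds. The helper first_unhealthy_check and the sample sequences are illustrative assumptions, not Docker output.

```python
def first_unhealthy_check(results, retries):
    """Return the 1-based check number at which the status would turn
    unhealthy, or None if the failure streak never reaches `retries`."""
    streak = 0
    for number, passed in enumerate(results, start=1):
        streak = 0 if passed else streak + 1
        if streak >= retries:
            return number
    return None


# Hypothetical check results: True = pass, False = fail.
noisy = [True, False, True, True, False, True]   # isolated blips only
outage = [True, False, False, False, False]      # sustained failure

for retries in (1, 3):
    print(retries,
          first_unhealthy_check(noisy, retries),
          first_unhealthy_check(outage, retries))
# -> 1 2 2
# -> 3 None 4
```

With a threshold of 1, the noisy sequence is flagged at its first blip. With a threshold of 3, the blips are ignored and the outage is still caught, two checks later than it would be with a threshold of 1.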
Choosing an Appropriate Retry Count
The right number of retries depends on how much transient failure an application should tolerate, balanced against how quickly a genuine, sustained problem needs to be detected and acted upon.
HEALTHCHECK --interval=10s --retries=5 CMD curl -f http://localhost:8080/health || exit 1
A shorter interval combined with a higher retry count can still detect a sustained problem fairly quickly, here after roughly 50 seconds of continuous failure (five checks at 10-second intervals), while tolerating brief, isolated blips.
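The trade-off can be estimated with simple arithmetic. The sketch below is an approximation under stated assumptions: each check starts one interval after the previous check finishes, the problem begins just after a passing check, and no start period applies. The function name and the check_duration parameter are hypothetical.

```python
def worst_case_detection_seconds(interval, retries, check_duration=0.0):
    """Approximate time from the onset of a sustained problem until the
    container is marked unhealthy.

    Each failing cycle costs one interval plus the time the check itself
    takes, and `retries` such cycles are needed in a row.
    """
    return retries * (interval + check_duration)


# The configuration above: 10-second interval, 5 retries.
print(worst_case_detection_seconds(10, 5))        # -> 50.0
# Docker's defaults: 30-second interval, 3 retries.
print(worst_case_detection_seconds(30, 3))        # -> 90.0
# Same as the first case, assuming each check takes 2 seconds to fail.
print(worst_case_detection_seconds(10, 5, 2.0))   # -> 60.0
```

Under these assumptions, the 10-second interval with 5 retries reacts faster than the defaults despite tolerating more consecutive failures.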
Observing the Effect of Retries in Practice
Docker records the container's most recent health check results, including the exit code and output of each attempt. Inspecting this history shows how the configured retry threshold behaves under real conditions.
docker inspect myapp --format '{{json .State.Health.Log}}'
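The JSON printed by this command can be post-processed to see how close the container is to the retry threshold. The sketch below assumes the log is a list of entries ordered oldest first, each with an ExitCode field where 0 means the check passed. The sample data, including its timestamps and output strings, is invented for illustration, and trailing_failures is a hypothetical helper.

```python
import json

# Invented sample in the shape of `docker inspect` health log output.
sample_log = """[
  {"Start": "2024-01-01T10:00:00Z", "End": "2024-01-01T10:00:01Z", "ExitCode": 0, "Output": "ok"},
  {"Start": "2024-01-01T10:00:31Z", "End": "2024-01-01T10:00:32Z", "ExitCode": 1, "Output": "curl: (22) error"},
  {"Start": "2024-01-01T10:01:02Z", "End": "2024-01-01T10:01:03Z", "ExitCode": 1, "Output": "curl: (22) error"}
]"""


def trailing_failures(log_json):
    """Count consecutive failed checks at the end of the log (newest last)."""
    entries = json.loads(log_json) or []   # "null" means no health log exists
    streak = 0
    for entry in reversed(entries):
        if entry["ExitCode"] == 0:
            break
        streak += 1
    return streak


print(trailing_failures(sample_log))   # -> 2
```

Because the log holds only recent entries, a count derived this way is bounded by the log length. Docker also reports its own counter in the FailingStreak field of .State.Health, which can be compared against the configured retry count.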
Why the Retries Setting Matters
A well-tuned retry count keeps a health check from being either too trigger-happy (reacting to harmless, transient noise) or too tolerant (failing to detect a genuine, ongoing problem within a reasonable time). That balance is what makes the resulting status a useful and trustworthy signal.