14.1.1.3 Production Healthcheck

A focused guide to Production Healthcheck, connecting core concepts with practical Docker and container operations.

Production healthcheck configures Docker (or an orchestrator) to actively, periodically verify a container is genuinely functioning correctly, not just that its process happens to still be running, allowing automatic detection and remediation of a container that's technically alive but actually unable to serve traffic correctly.

Defining a Meaningful Healthcheck

A healthcheck should verify the application can genuinely fulfill its purpose, not just that the process exists.

HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:8080/health || exit 1

services:
  app:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3

This periodically calls the application's own health endpoint, marking the container unhealthy if that endpoint fails to respond successfully.

Why a Health Endpoint Should Verify Genuine Functionality

A health endpoint that simply returns success unconditionally provides no real signal — a meaningful health check verifies something the application actually depends on, like database connectivity.

app.get('/health', async (req, res) => {
  await db.query('SELECT 1');
  res.status(200).send('OK');
});

This endpoint fails (and correctly marks the container unhealthy) if the database connection it depends on is actually broken, providing a genuinely useful health signal.

Why Orchestrators Act on Healthcheck Results

An orchestrator can automatically restart or stop routing traffic to a container reported as unhealthy, providing automatic remediation without requiring manual intervention.

docker inspect myapp --format '{{.State.Health.Status}}'

Avoiding an Overly Aggressive Healthcheck Configuration

A healthcheck interval or timeout set too aggressively risks falsely marking a genuinely healthy but momentarily slow container as unhealthy, triggering unnecessary restarts.

healthcheck:
  timeout: 10s
  retries: 5

Why Production Healthcheck Matters

A properly configured, genuinely meaningful health check is essential for an orchestrator to automatically detect and respond to a container that's running but not actually functioning correctly, a capability that simple process-liveness monitoring alone cannot provide.