15.3.1.2 HTTP Health Endpoint

A focused guide to the HTTP health endpoint, connecting core concepts with practical Docker and container operations.

An HTTP health endpoint is the conventional web route an application exposes for external health verification. Beyond the question of what it checks internally, its HTTP-specific details (status codes, response format, path naming, caching behavior, and authentication) determine how reliably and conveniently it can be consumed by the tools that typically query it: container runtimes, load balancers, and monitoring systems.

Path naming conventions

While no single standard is universally enforced, several path conventions are common enough that following one improves consistency across services and reduces friction when configuring external tooling against many different applications:

/healthz
/health
/status
/ping

The z suffix convention (/healthz) originated in internal Google practice and has been widely adopted, partly because it avoids colliding with an application's own legitimate /health resource, if one exists for unrelated business purposes. Choosing one convention and applying it consistently across every service in an organization matters more than which convention is chosen.

Status code conventions

The HTTP status code returned should map clearly and consistently to the health state, since many consumers (container runtime health checks, load balancers, uptime monitors) key their behavior off the status code rather than parsing a response body:

app.get('/healthz', async (req, res) => {
  const healthy = await checkDependencies();
  res.status(healthy ? 200 : 503).end();
});

200 OK for healthy and 503 Service Unavailable for unhealthy is the most widely understood pairing; 503 specifically communicates "temporarily unavailable, may recover," which is semantically appropriate for most health check failure scenarios, as opposed to a code implying a permanent or client-caused error.
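
On the consuming side, a container health check only needs to translate the status code into an exit code. The following is a minimal sketch of such a probe script, assuming Node.js is available inside the image; the file name, the port, and the helper names are illustrative assumptions rather than part of any particular setup. Docker's HEALTHCHECK treats exit code 0 as healthy and 1 as unhealthy.

```javascript
// healthcheck.js (hypothetical file name): a probe a container health check could run.
const http = require('node:http');

// Map an HTTP status code to a health check exit code:
// 0 = healthy, 1 = unhealthy. Only 2xx responses count as healthy.
function exitCodeForStatus(statusCode) {
  return statusCode >= 200 && statusCode < 300 ? 0 : 1;
}

// Request the endpoint and resolve to an exit code. The body is never parsed.
// Connection errors and timeouts both count as unhealthy.
function probe(url, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const req = http.get(url, { timeout: timeoutMs }, (res) => {
      res.resume(); // discard the body so the socket is released
      resolve(exitCodeForStatus(res.statusCode));
    });
    req.on('timeout', () => req.destroy(new Error('timeout')));
    req.on('error', () => resolve(1));
  });
}

// Usage (assumed URL and port):
// probe('http://localhost:3000/healthz').then((code) => process.exit(code));
```

With the last line uncommented, a Dockerfile could invoke the script with something like HEALTHCHECK --timeout=3s CMD node healthcheck.js, keeping the probe's own timeout below the --timeout value configured there.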

Response body conventions

A minimal response body, or none at all, is sufficient for most consumers that only check the status code, but including a small amount of structured detail is valuable for human operators and for monitoring systems capable of using it:

app.get('/healthz', async (req, res) => {
  const checks = await runDependencyChecks();
  const healthy = Object.values(checks).every(Boolean);
  res.status(healthy ? 200 : 503).json({ status: healthy ? 'ok' : 'degraded', checks });
});

{ "status": "degraded", "checks": { "database": true, "cache": false } }

This kind of structured detail is particularly useful when a single endpoint checks multiple dependencies, since it lets a human or a more detailed monitoring integration immediately see which specific dependency is responsible for an unhealthy result, rather than only knowing that something, unspecified, is wrong.
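
The handler above leaves runDependencyChecks undefined. One way it could be written is sketched below, assuming each dependency is represented by a hypothetical async function that resolves when the dependency is reachable and rejects (or resolves to false) otherwise. The signature, which takes a map of named check functions, is an assumption made for illustration.

```javascript
// Run every named check in parallel and reduce each outcome to a boolean.
// `checkFns` maps a dependency name to an async function (hypothetical checks).
async function runDependencyChecks(checkFns) {
  const names = Object.keys(checkFns);
  // Wrap each call so a synchronous throw is captured as a rejection too.
  const outcomes = await Promise.allSettled(
    names.map((name) => Promise.resolve().then(() => checkFns[name]()))
  );
  const checks = {};
  names.forEach((name, i) => {
    // A rejection or an explicit `false` both count as a failed check.
    checks[name] = outcomes[i].status === 'fulfilled' && outcomes[i].value !== false;
  });
  return checks;
}

// Turn the boolean map into the status code and body used by the handler.
function summarize(checks) {
  const healthy = Object.values(checks).every(Boolean);
  return {
    statusCode: healthy ? 200 : 503,
    body: { status: healthy ? 'ok' : 'degraded', checks },
  };
}
```

A handler could then call runDependencyChecks({ database: pingDatabase, cache: pingCache }), where pingDatabase and pingCache stand in for whatever connectivity checks the application already has, and pass the result to summarize.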

Response time expectations

A health endpoint should respond quickly and predictably, since slow health check responses can themselves cause timeouts that are then misinterpreted as the application being unhealthy, when the actual problem might be something else slowing down just this specific endpoint:

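app.get('/healthz', async (req, res) => {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), 2000);
  });
  try {
    // checkDependencies() resolves to a boolean, so its result decides the status
    const healthy = await Promise.race([checkDependencies(), timeout]);
    res.status(healthy ? 200 : 503).send(healthy ? 'ok' : 'degraded');
  } catch {
    // The timeout fired or a check threw: fail fast with 503
    res.status(503).send('degraded');
  } finally {
    clearTimeout(timer); // do not leave the timer pending after responding
  }
});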

Setting an internal timeout on the health check's own dependency calls, shorter than whatever timeout the consuming health check configuration uses, ensures the endpoint itself fails fast and predictably rather than hanging until an external timeout eventually intervenes.
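
When several dependencies are checked, the same idea can be applied per dependency so that one slow call cannot consume the entire budget. A minimal sketch, assuming a hypothetical helper named checkWithTimeout; the name and the millisecond values a caller would pass are illustrative only:

```javascript
// Resolve to true if `checkFn` succeeds within `ms`, otherwise false.
// A timeout, a rejection, or an explicit `false` all count as a failure.
function checkWithTimeout(checkFn, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(false), ms);
  });
  const attempt = Promise.resolve()
    .then(() => checkFn())
    .then((result) => result !== false, () => false);
  // Whichever settles first wins. The timer is always cleared afterwards
  // so it cannot keep the process alive or fire after the response is sent.
  return Promise.race([attempt, timeout]).finally(() => clearTimeout(timer));
}
```

Because the helper never rejects, a handler can await several of these in parallel and still report which dependency was slow, provided each per-check budget stays below the timeout configured in the consuming health check.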

Avoiding caching interference

Because health status needs to reflect current, real-time state, response headers should explicitly prevent caching at any intermediary that might otherwise serve a stale cached response instead of querying the application directly:

app.get('/healthz', (req, res) => {
  res.set('Cache-Control', 'no-store');
  res.status(200).send('ok');
});

A health endpoint accidentally cached by a CDN or reverse proxy in front of the application can produce a persistently stale, misleading health status, continuing to report healthy long after the application itself has actually become unhealthy, which defeats the entire purpose of having the check.

Authentication considerations

Whether a health endpoint requires authentication depends on what it reveals and who needs to consume it; many setups deliberately leave it unauthenticated specifically because the container runtime's own internal health check, and often an external load balancer, need to reach it without managing credentials:

app.get('/healthz', (req, res) => res.status(200).send('ok')); // intentionally unauthenticated

If a more detailed response body reveals internal architecture details (specific dependency names, version numbers, internal hostnames) that should not be exposed publicly, a reasonable middle ground is a minimal, unauthenticated public response for basic routing and runtime checks, paired with a separate, authenticated, more detailed status endpoint for internal monitoring tooling that needs the additional detail.

app.get('/healthz', (req, res) => res.status(200).send('ok'));
app.get('/internal/status', requireAuth, async (req, res) => {
  res.json(await getDetailedStatus());
});
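
The requireAuth middleware is not defined above. The sketch below shows one shape it could take, assuming a shared bearer token supplied through a hypothetical STATUS_TOKEN environment variable. This is only an illustration; a real deployment might instead rely on mutual TLS, network policy, or an existing authentication layer.

```javascript
const crypto = require('node:crypto');

// Compare two strings without leaking length or content through timing:
// hash both to fixed-length digests, then use a constant-time comparison.
function safeEqual(a, b) {
  const digest = (s) => crypto.createHash('sha256').update(String(s)).digest();
  return crypto.timingSafeEqual(digest(a), digest(b));
}

// Build Express-style middleware that accepts only `Authorization: Bearer <token>`.
// If no expected token is configured, every request is rejected (fail closed).
function makeRequireAuth(expectedToken) {
  return function requireAuth(req, res, next) {
    const header = req.headers.authorization || '';
    const presented = header.startsWith('Bearer ') ? header.slice(7) : '';
    if (expectedToken && presented && safeEqual(presented, expectedToken)) {
      return next();
    }
    res.statusCode = 401;
    res.setHeader('WWW-Authenticate', 'Bearer');
    res.end();
  };
}

// Assumption: the token comes from an environment variable named STATUS_TOKEN.
const requireAuth = makeRequireAuth(process.env.STATUS_TOKEN);
```

The middleware touches only req.headers, res.statusCode, res.setHeader, and res.end, so it behaves the same under Express and under a plain Node.js HTTP server.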

Separating liveness and readiness endpoints

For applications where the distinction matters, exposing separate endpoints for liveness (is the process fundamentally alive) and readiness (is it currently able to serve traffic) lets each consumer ask the question relevant to its own decision, whether to restart the container or whether to route traffic to it:

app.get('/livez', (req, res) => res.status(200).send('alive'));
app.get('/readyz', async (req, res) => {
  const ready = await checkDependencies();
  res.status(ready ? 200 : 503).send(ready ? 'ready' : 'not ready');
});
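
A readiness answer often depends on more than dependency reachability, for example on whether startup has finished. The following sketch keeps that state in a small object and derives both answers from it. The state shape and the function names are assumptions made for illustration, not a prescribed design.

```javascript
// Process-level state a service might track (hypothetical shape).
const state = { started: false, shuttingDown: false };

// Liveness: the process can answer at all. It deliberately ignores
// dependencies, so a database outage does not trigger a restart.
function livenessStatus() {
  return 200;
}

// Readiness: accept traffic only once startup has finished, shutdown has
// not begun, and dependencies are currently reachable.
function readinessStatus(currentState, dependenciesOk) {
  const ready = currentState.started && !currentState.shuttingDown && dependenciesOk;
  return ready ? 200 : 503;
}
```

A /readyz handler could respond with readinessStatus(state, await checkDependencies()), with the application setting state.started to true once initialization completes.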

HEAD versus GET for lightweight checks

For extremely high-frequency external health polling (an uptime monitor checking every few seconds, for instance), supporting a lightweight HEAD request in addition to GET can reduce unnecessary response body generation overhead for consumers that only care about the status code:

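// Register before app.get('/healthz', ...); Express otherwise answers HEAD by running the GET handler.
app.head('/healthz', async (req, res) => res.status((await checkDependencies()) ? 200 : 503).end());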

This is a minor optimization relevant mainly at high check frequency or for resource-constrained environments, and is not a necessary requirement for most typical health check consumption patterns.

Common mistakes

Using a non-standard or inconsistent status code mapping across different services, making it harder to configure shared tooling uniformly against many applications.
Allowing the health endpoint's response to be cached by an intermediate proxy or CDN, producing a stale health status that no longer reflects current reality.
Exposing detailed internal architecture information through an unauthenticated health endpoint that should have been kept to a minimal public response with a separate, authenticated detailed endpoint instead.
Not setting an internal timeout on the health check's own dependency calls, allowing a slow dependency to make the health check itself unpredictably slow.
Conflating liveness and readiness into a single endpoint when the consuming systems would benefit from being able to query each question separately.

An HTTP health endpoint's value depends as much on these HTTP-specific implementation details (status codes, caching headers, response time, and authentication boundaries) as on what the check verifies internally: a health endpoint that is slow, cacheable, or inconsistent in its status code conventions undermines the reliability of every system built to consume it.