16.2.1.5 App Startup Crash

A focused guide to App Startup Crash, connecting core concepts with practical Docker and container operations.

An app startup crash is a failure that occurs after Docker has successfully invoked the configured command but before the application has finished its own initialization sequence. That timing distinguishes it from a runtime command error, where the command never launches at all. Diagnosing it effectively depends on recognizing that the cause lies in the application's startup logic, the configuration it was given, and its immediate dependencies, not in how Docker invoked the command.

Confirming the command was actually invoked successfully

The first useful distinction is verifying that the application process did start executing at all, even briefly, before crashing, which separates this category from a runtime command error where the command itself never successfully launched:

docker logs my-api
Starting server...
Connecting to database...
Error: connect ECONNREFUSED 127.0.0.1:5432

Seeing the application's own log output, even just a few lines, before the crash confirms the command itself was invoked correctly and the failure is happening within the application's own startup logic, which redirects the investigation toward configuration and dependencies rather than the command invocation itself.

Missing or invalid configuration

A startup crash caused by missing or malformed required configuration is extremely common and, when the application is well designed, produces a clear, explicit error naming exactly what is missing:

Error: Missing required environment variable: JWT_SECRET
docker inspect my-api --format '{{.Config.Env}}'

Comparing what the application requires against what the container actually received confirms whether a value was never passed at all or was passed under the wrong key name. The latter is a common and easy mistake: a typo in an environment variable name, on either the application side or the deployment configuration side, leaves the variable the application reads unset.
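One way to make this comparison explicit is to have the application check its required variables up front and report every missing name at once. The following is a minimal sketch, not a prescribed method: the helper name findMissingEnv and the REQUIRED list are hypothetical, with JWT_SECRET borrowed from the error above and DATABASE_URL added purely for illustration.

```javascript
// Hypothetical helper: returns the names of required variables that are
// unset or blank in the given environment object.
function findMissingEnv(required, env) {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example list only; a real application would declare its own names.
const REQUIRED = ["JWT_SECRET", "DATABASE_URL"];

// A plain object keeps the example deterministic; at startup the
// application would pass process.env instead.
const missing = findMissingEnv(REQUIRED, {
  DATABASE_URL: "postgres://database:5432/app",
});

if (missing.length > 0) {
  console.error(`Missing required environment variables: ${missing.join(", ")}`);
}
```

Reporting all missing names in one pass avoids the cycle of fixing one variable, restarting, and crashing on the next.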

Dependency not yet ready

A startup crash that occurs while connecting to a database, cache, or other dependency, particularly in a freshly started multi-container stack, often means the dependency was not yet ready to accept connections when the application tried. That is a startup ordering problem rather than a configuration problem:

Error: connect ECONNREFUSED database:5432
services:
  api:
    depends_on:
      database:
        condition: service_healthy

A bare depends_on only guarantees container start order, not actual readiness. Adding the service_healthy condition addresses this category of startup crash directly by delaying the dependent container's start until the dependency reports itself healthy. The condition relies on the dependency defining a healthcheck; without one, Compose has no health status to wait for.

Retry logic as a complementary safeguard

Even with correctly ordered startup, retry logic within the application itself provides resilience against a dependency that takes slightly longer than expected to become ready, or against a transient, momentary connection failure that would otherwise cause an unnecessary crash and restart cycle:

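// `db` stands for whatever database client the application uses.
async function connectWithRetry(retries = 5, delay = 2000) {
  for (let i = 0; i < retries; i++) {
    try {
      return await db.connect();
    } catch (err) {
      // Out of attempts: rethrow so the failure is visible in the logs.
      if (i === retries - 1) throw err;
      // Wait a fixed delay (in ms) before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}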

This kind of retry logic is a sensible complement to, not a replacement for, correct startup ordering at the orchestration level; relying on retries alone to paper over a fundamentally unordered startup sequence works but introduces unnecessary delay and noisy, repeated failure logging during every single startup.
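Where the fixed delay in connectWithRetry produces too much repeated logging, one common variation is exponential backoff, which spaces later attempts further apart. The following is a sketch under stated assumptions rather than a definitive implementation: backoffDelays and connectWithBackoff are hypothetical names, the base and cap values are arbitrary, and connect stands for any function that returns a promise.

```javascript
// Hypothetical helper: the delay (in ms) to wait before each retry,
// doubling from baseMs and never exceeding capMs.
function backoffDelays(retries, baseMs, capMs) {
  const delays = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(Math.min(baseMs * 2 ** attempt, capMs));
  }
  return delays;
}

// Makes one initial attempt plus one retry per entry in `delays`.
// `sleep` is injected so the loop can be exercised without real waiting.
async function connectWithBackoff(connect, delays, sleep) {
  let lastError;
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      // Only wait if another attempt is still to come.
      if (attempt < delays.length) await sleep(delays[attempt]);
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}

// Example schedule with assumed values: five retries, 500 ms base, 4 s cap.
console.log(backoffDelays(5, 500, 4000));
```

In an application, sleep would typically be a setTimeout-based promise like the one in connectWithRetry.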

Port already in use

A startup crash on binding a network port means the port is already occupied, usually by another process inside the same container or, less commonly, by a process elsewhere in a shared network namespace (for example, when the container runs with host networking). The error is clear and specific:

Error: listen EADDRINUSE: address already in use :::3000
docker exec my-api netstat -tlnp

This is relatively uncommon for a simple single-process container, but it can occur if a multi-process container inadvertently starts two instances of the same listener, or if an earlier instance inside the container failed to exit cleanly and still holds the port. The docker exec check requires the container to still be running, and minimal images may not include netstat.

File or directory not found during initialization

An application that expects a configuration file, certificate, or other resource at a specific path, but does not find it because of a build-time COPY mistake or a runtime volume mount that was not actually attached, crashes during startup with an error naming the specific missing path:

Error: ENOENT: no such file or directory, open '/app/config/production.json'
docker exec my-api ls -la /app/config/

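Checking the actual filesystem state directly, rather than assuming the file is present because the Dockerfile or Compose file appears to specify it, confirms or rules out this cause quickly. For a container that has already exited, the same listing can be run in a fresh container started from the same image with the same mounts.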

Crash loops and the restart policy interaction

A startup crash that recurs identically on every restart, rather than resolving itself, produces a restart loop, and the restart count combined with consistent log output across restarts confirms this is a deterministic, repeatable failure rather than a transient one:

docker inspect my-api --format '{{.RestartCount}}'
docker logs my-api --tail 20

A genuinely deterministic startup crash will not resolve itself through repeated restarts. The underlying cause, such as a missing configuration value or a dependency that is itself persistently down, has to be fixed; the restart policy will not eventually succeed through repetition.
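To check whether the same failure really repeats on every restart, the log output can be tallied by error line. The sketch below is illustrative only: tallyErrorLines is a hypothetical helper, it assumes error lines begin with "Error:" as in the examples in this guide, and the sample log text is made up from those examples.

```javascript
// Hypothetical helper: counts how often each distinct error line appears
// in the combined text printed by `docker logs`.
// Assumes error lines start with "Error:"; real applications may differ.
function tallyErrorLines(logText) {
  const counts = {};
  for (const line of logText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.startsWith("Error:")) {
      counts[trimmed] = (counts[trimmed] || 0) + 1;
    }
  }
  return counts;
}

// Made-up sample: three startup attempts that all fail the same way.
const sampleLogs = [
  "Starting server...",
  "Error: Missing required environment variable: JWT_SECRET",
  "Starting server...",
  "Error: Missing required environment variable: JWT_SECRET",
  "Starting server...",
  "Error: Missing required environment variable: JWT_SECRET",
].join("\n");

console.log(tallyErrorLines(sampleLogs));
```

A single error line whose count tracks the number of startup attempts suggests a deterministic cause, while several different errors suggest something transient or environmental.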

Distinguishing a startup crash from a later runtime crash

It is worth being precise about timing. A crash within the first few seconds of a container's life is almost always a startup crash, with a cause rooted in initialization, configuration, or immediate dependency availability. A crash after the container has been running successfully for an extended period points toward a different category of cause entirely, such as resource exhaustion, a specific request pattern, or a time-based condition, which an investigation focused purely on startup logic would not surface.

docker inspect my-api --format '{{.State.StartedAt}} {{.State.FinishedAt}}'

Comparing these two timestamps directly confirms how long the container actually ran before failing, which is a useful, objective anchor for deciding which category of investigation is actually appropriate.
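The arithmetic on those two timestamps can be sketched as follows. This is a minimal illustration with assumptions: runDurationSeconds and looksLikeStartupCrash are hypothetical names, the 10-second threshold is an arbitrary choice rather than anything Docker defines, and the sample timestamps are made up.

```javascript
// Hypothetical helper: seconds between two Docker timestamps.
// Docker reports nanosecond precision, so fractional seconds are trimmed
// to milliseconds before parsing.
function runDurationSeconds(startedAt, finishedAt) {
  const toMillis = (ts) => Date.parse(ts.replace(/(\.\d{3})\d+/, "$1"));
  return (toMillis(finishedAt) - toMillis(startedAt)) / 1000;
}

// The default threshold is an assumption, not a Docker rule.
function looksLikeStartupCrash(startedAt, finishedAt, thresholdSeconds = 10) {
  return runDurationSeconds(startedAt, finishedAt) <= thresholdSeconds;
}

// Made-up sample values in the format `docker inspect` prints.
const started = "2024-05-01T10:00:00.120000000Z";
const finished = "2024-05-01T10:00:02.620000000Z";

console.log(runDurationSeconds(started, finished)); // 2.5
console.log(looksLikeStartupCrash(started, finished)); // true
```

A suitable threshold depends on how long the application normally takes to initialize, so it is worth choosing per service.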

Common mistakes

  • Not checking whether the application's own log output appeared at all before the crash, missing the distinction between a runtime command error and a genuine startup crash.
  • Relying on depends_on without a health condition, allowing the application to attempt connecting to a dependency before it is genuinely ready.
  • Adding retry logic as the only safeguard against startup ordering issues, rather than also fixing the underlying orchestration-level ordering directly.
  • Assuming a configuration value was supplied correctly without directly comparing what the application expects against what the container was actually given.
  • Treating a deterministic, repeatedly failing startup crash as something a restart policy will eventually resolve through repetition alone.

An app startup crash is diagnosed by first confirming the application genuinely began executing, then narrowing the cause through its own log output. That output typically points directly at a missing configuration value, an unready dependency, or a missing expected file, each of which has a specific, recognizable fix distinct from the others.
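As a rough illustration of that narrowing step, a log line can be matched against the error signatures used in this guide. The following is a sketch, not a complete classifier: STARTUP_CAUSES and classifyStartupError are hypothetical names, and the patterns come only from the sample errors shown earlier, so real applications may word their errors differently.

```javascript
// Hypothetical lookup from recognizable log text to the categories
// discussed in this guide.
const STARTUP_CAUSES = [
  { pattern: /Missing required environment variable/, cause: "missing configuration" },
  { pattern: /ECONNREFUSED/, cause: "dependency not ready or unreachable" },
  { pattern: /EADDRINUSE/, cause: "port already in use" },
  { pattern: /ENOENT/, cause: "missing file or directory" },
];

// Returns the first matching category, or "unrecognized" if none match.
function classifyStartupError(logLine) {
  const match = STARTUP_CAUSES.find(({ pattern }) => pattern.test(logLine));
  return match ? match.cause : "unrecognized";
}

console.log(classifyStartupError("Error: connect ECONNREFUSED 127.0.0.1:5432"));
console.log(classifyStartupError("Error: listen EADDRINUSE: address already in use :::3000"));
```

An "unrecognized" result is a prompt to read the surrounding log lines directly rather than a sign that nothing is wrong.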