18.2.1.2 Swarm Manager Nodes

A focused guide to Swarm Manager Nodes, connecting core concepts with practical Docker and container operations.

Swarm manager nodes deserve operational treatment distinct from worker nodes because they hold the cluster's Raft consensus state and make its scheduling decisions. That raises practical questions: whether managers should also run application workloads, how their Raft state is backed up, and how a failed manager is replaced without risking the cluster's quorum.

Whether managers should run application workloads

A manager node can run ordinary application containers alongside its cluster management duties. Draining managers so that they run no application tasks isolates the cluster's control plane from the resource contention those workloads might otherwise introduce:

docker node update --availability drain manager-1
docker node ls

Draining managers, so that they handle only cluster management while application services run entirely on dedicated worker nodes, is the generally recommended production pattern: a manager under heavy resource pressure from its own application workload risks degrading its ability to participate reliably in cluster consensus. Draining affects only the service tasks Swarm schedules; standalone containers started with docker run on that node are not removed.
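As a minimal sketch of applying this pattern to every manager at once, the function below lists manager hostnames and drains each one. The name drain_all_managers is hypothetical, and the sketch assumes it is run on a manager node; it uses only the docker node ls and docker node update commands already shown.

```shell
# Hypothetical helper: drain every manager so it runs no service tasks.
# Assumes it runs on a manager node with access to the Docker CLI.
drain_all_managers() {
  docker node ls --filter role=manager --format '{{.Hostname}}' |
  while IFS= read -r host; do
    # Skip blank lines defensively.
    [ -n "$host" ] || continue
    # Redirect stdin so the command cannot consume the hostname list.
    docker node update --availability drain "$host" </dev/null
  done
}

# Usage (on a manager node):
#   drain_all_managers
#   docker node ls    # AVAILABILITY should read "Drain" for each manager
```

Draining moves existing service tasks to other nodes, so this is best run only once enough worker capacity exists to absorb them.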

Sizing manager count appropriately

The standard guidance is an odd number of managers: three for most clusters, five for unusually large or critical ones that need additional fault tolerance. An odd number gives a clear majority-based quorum, whereas rounding up to an even number adds a node without adding any fault tolerance:

docker node ls --filter role=manager

Three managers tolerate one failure while maintaining quorum; five tolerate two; an even number, four for instance, provides no additional quorum tolerance over three while requiring an additional node to maintain, which is why even manager counts are not a recommended configuration regardless of overall cluster size.
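The arithmetic behind these numbers can be sketched in a few lines of shell. Quorum is a strict majority of the managers, so a cluster of n managers needs floor(n/2) + 1 of them and tolerates floor((n-1)/2) failures. The function names here are illustrative and not part of Docker.

```shell
# Majority needed for quorum among n managers: floor(n/2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Manager failures tolerated while keeping quorum: floor((n-1)/2).
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

# Print one row per manager count from 1 to 7.
print_quorum_table() {
  n=1
  while [ "$n" -le 7 ]; do
    echo "managers=$n quorum=$(quorum_size "$n") tolerated_failures=$(fault_tolerance "$n")"
    n=$((n + 1))
  done
}
```

Calling print_quorum_table shows the rows for three and four managers both ending in tolerated_failures=1, which is the reason the fourth manager buys nothing.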

Backing up manager Raft state

The Raft consensus data each manager maintains can, and should, be backed up directly, providing a recovery path distinct from relying purely on quorum-based fault tolerance among currently running managers:

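# If autolock is enabled, record the unlock key first; it is needed to restore from this backup
docker swarm unlock-key
# Stop the daemon so the Raft files are not being written during the copy
systemctl stop docker
tar czf swarm-backup-$(date +%Y%m%d).tar.gz /var/lib/docker/swarm
systemctl start docker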

Performing this backup while the Docker daemon is stopped on that manager avoids capturing an inconsistent, mid-write snapshot of the Raft state, which is why a brief, planned daemon stop is recommended over a live backup. A stopped manager does not count toward quorum while it is down, so back up one manager at a time, and only when the remaining managers still form a majority.
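One weakness of running the four commands by hand is that a failed tar can leave the daemon stopped. The sketch below wraps the same steps so that the daemon is restarted whether or not the archive succeeds. The function name backup_swarm_state and its two optional arguments are hypothetical; it assumes a systemd host and root privileges.

```shell
# Hypothetical helper: archive the swarm directory with the daemon stopped.
# $1 = swarm state directory (default /var/lib/docker/swarm)
# $2 = directory to write the archive into (default: current directory)
backup_swarm_state() {
  swarm_dir=${1:-/var/lib/docker/swarm}
  out_dir=${2:-.}
  archive="$out_dir/swarm-backup-$(date +%Y%m%d).tar.gz"

  systemctl stop docker || return 1

  # Archive relative to the parent directory so paths start with "swarm/".
  tar czf "$archive" -C "$(dirname "$swarm_dir")" "$(basename "$swarm_dir")"
  status=$?

  # Restart the daemon even if tar failed.
  systemctl start docker || return 1

  if [ "$status" -eq 0 ]; then
    echo "$archive"
  fi
  return "$status"
}

# Usage (as root, on one manager at a time):
#   backup_swarm_state /var/lib/docker/swarm /srv/backups
```

On success the function prints the archive path, which makes it easy to hand the file to whatever copies backups off the host.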

Replacing a failed manager node

When a manager node fails permanently, removing it cleanly from the cluster's manager list and promoting a new node to replace it maintains the cluster's intended manager count and quorum tolerance, rather than leaving a permanently failed manager counted toward quorum calculations indefinitely:

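# Run from a healthy manager: take the failed node out of the manager set
docker node demote failed-manager-1
# Remove its record from the cluster
docker node rm failed-manager-1
# Promote a worker that has already joined the swarm
docker node promote new-manager-1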

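If the failed manager is genuinely unreachable, run these commands from a still-healthy manager, which can update the cluster's records without the failed node's cooperation. If docker node rm refuses because the node is not yet marked as down, docker node rm --force against its node ID overrides that check; it does not replace the demote step, since a manager must be demoted before it can be removed. If so many managers have failed that quorum is already lost, these commands cannot complete, and recovery instead means running docker swarm init --force-new-cluster on a surviving manager.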
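The replacement sequence can be sketched as a single function that stops at the first failing step, so that a failed demote (for example, because quorum is lost) never leads to a promote on a cluster in an unknown state. The function name replace_failed_manager is hypothetical, and the sketch assumes the replacement node has already joined the swarm as a worker.

```shell
# Hypothetical helper: replace a failed manager, stopping at the first error.
# $1 = failed manager's hostname or node ID
# $2 = worker node to promote in its place
replace_failed_manager() {
  failed=$1
  replacement=$2
  docker node demote "$failed" &&
  docker node rm "$failed" &&
  docker node promote "$replacement"
}

# Usage (from a healthy manager):
#   replace_failed_manager failed-manager-1 new-manager-1
```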

Monitoring manager-specific health

Beyond general container and resource monitoring, manager nodes specifically benefit from monitoring their participation in cluster consensus, since a manager that is technically running but struggling to keep up with Raft consensus traffic represents a distinct, more subtle failure mode than an outright node failure:

docker node ls
docker info --format '{{.Swarm.Managers}} {{.Swarm.Nodes}}'

In the docker node ls output, the MANAGER STATUS column reports each manager as Leader, Reachable, or Unreachable. Reviewing it periodically, rather than only when a problem is already suspected, catches a manager that has silently become unreachable before a second failure costs the cluster its quorum.
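A periodic check along these lines can be sketched as a small filter over docker node ls output. The function name summarize_managers is hypothetical, and it assumes input lines of the form "hostname status", as produced by the format string in the usage comment. It exits non-zero whenever any manager is not Leader or Reachable, so it can drive an alert.

```shell
# Hypothetical helper: summarize manager health from "hostname status" lines.
# Exits 0 only when every listed manager is Leader or Reachable.
summarize_managers() {
  awk '
    NF >= 1 { total++ }
    $2 == "Leader" || $2 == "Reachable" { healthy++ }
    END {
      quorum = int(total / 2) + 1
      printf "managers=%d healthy=%d quorum=%d\n", total, healthy, quorum
      exit ((total > 0 && healthy == total) ? 0 : 1)
    }'
}

# Usage (on a manager node):
#   docker node ls --filter role=manager \
#     --format '{{.Hostname}} {{.ManagerStatus}}' | summarize_managers
```

Comparing healthy against quorum in the printed line shows how many further failures the cluster can absorb at that moment.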

Geographic and failure-domain distribution

For a cluster spanning multiple physical locations or availability zones, distributing manager nodes across genuinely separate failure domains, rather than concentrating them within a single rack, availability zone, or physical location, protects against a single, localized failure event taking out enough managers simultaneously to lose quorum entirely:

manager-1: availability-zone-a
manager-2: availability-zone-b
manager-3: availability-zone-c

This kind of deliberate distribution is a meaningful additional resilience consideration beyond simply having an odd, sufficient number of managers, since the actual physical or logical placement of those managers determines what kinds of correlated failures the cluster can genuinely tolerate.
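Whether a given placement survives the loss of one failure domain can be checked mechanically: the managers left after the largest domain disappears must still form a majority. The sketch below assumes input lines of the form "manager zone", as in the layout above; the function name check_zone_spread is hypothetical, and where the zone names come from (for example, an inventory file or node labels you maintain yourself) is left to the reader.

```shell
# Hypothetical helper: check that losing the most populated zone keeps quorum.
# Reads "manager zone" lines on stdin; exits 0 if the placement is safe.
check_zone_spread() {
  awk '
    NF >= 2 { total++; per_zone[$2]++ }
    END {
      quorum = int(total / 2) + 1
      worst = 0
      for (z in per_zone) if (per_zone[z] > worst) worst = per_zone[z]
      survivors = total - worst
      if (total > 0 && survivors >= quorum) {
        printf "ok: losing any one zone leaves at least %d of %d managers (quorum %d)\n", survivors, total, quorum
        exit 0
      }
      printf "at risk: losing the largest zone leaves %d of %d managers (quorum %d)\n", survivors, total, quorum
      exit 1
    }'
}

# Usage:
#   printf '%s\n' \
#     'manager-1 availability-zone-a' \
#     'manager-2 availability-zone-b' \
#     'manager-3 availability-zone-c' | check_zone_spread
```

With three managers, placing two of them in the same zone fails this check, because losing that zone leaves one manager where two are needed.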

Common mistakes

  • Running significant application workload directly on manager nodes rather than draining them to focus exclusively on cluster management responsibilities.
  • Configuring an even number of manager nodes, providing no additional quorum tolerance benefit over the next lower odd number while requiring extra infrastructure to maintain.
  • Never backing up manager Raft state directly, relying entirely on currently running managers' own quorum-based fault tolerance with no separate recovery path.
  • Leaving a permanently failed manager counted toward the cluster's manager set indefinitely rather than cleanly removing it and promoting a replacement.
  • Concentrating all manager nodes within a single physical location or failure domain, leaving the cluster vulnerable to a single, localized event taking out enough managers to lose quorum.

Swarm manager nodes warrant dedicated operational attention distinct from worker nodes: draining them of application workload, sizing their count deliberately with an odd number suited to the cluster's fault tolerance needs, backing up their Raft state directly, and distributing them across genuinely separate failure domains. Each of these protects the cluster's control plane, rather than leaving it as an unmanaged consequence of whichever nodes happened to be designated managers at initialization.