18.3.2.5 Kubernetes Persistent Volumes
A focused guide to Kubernetes Persistent Volumes, connecting core concepts with practical Docker and container operations.
Kubernetes persistent volumes extend the idea behind a Docker named volume into a more elaborate model: separate cluster-level and namespace-level resources, dynamic provisioning, access modes that govern how many nodes (and therefore pods) can use the same storage at once, and explicit reclaim policies. Each of these needs to be understood on its own terms rather than assumed to map one-to-one onto Docker's much simpler volume model.
PersistentVolume versus PersistentVolumeClaim
A PersistentVolume (PV) represents an actual piece of provisioned storage at the cluster level, while a PersistentVolumeClaim (PVC) is a namespaced request for storage that gets bound to a matching PV. This two-resource model has no direct equivalent in Docker's single-resource named volume. Compare a claim with the Docker command that creates a volume directly:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgdata-claim
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi

docker volume create pgdata
The PVC here is a request, not the storage itself. A cluster-level PV that satisfies the request is either provisioned dynamically or selected from a pool of statically created PVs, a layer of indirection that Docker's directly created named volume does not have.
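To show how a claim is consumed once it exists, here is a minimal sketch of a pod that mounts pgdata-claim. Only the claim name comes from the example above; the pod name, image tag, password placeholder, and mount path are assumptions chosen for illustration.

```yaml
# Sketch: a pod mounting the pgdata-claim PVC.
# Pod name, image, env value, and mountPath are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: pg-demo
spec:
  containers:
    - name: postgres
      image: postgres:16
      env:
        - name: POSTGRES_PASSWORD
          value: example            # placeholder; use a Secret in practice
      volumeMounts:
        - name: pgdata              # must match the volume name below
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: pgdata
      persistentVolumeClaim:
        claimName: pgdata-claim     # the claim must be in the pod's namespace
```

The pod refers only to the claim, never to the PV behind it, which is what lets the same manifest work whether the storage was provisioned statically or dynamically.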
Static versus dynamic provisioning
Static provisioning requires a cluster administrator to create PVs manually ahead of time. Dynamic provisioning, the more common pattern in modern clusters, creates a matching PV automatically when a PVC requests storage from a configured StorageClass:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pgdata-claim
spec:
  storageClassName: gp3
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
Most production clusters rely on this dynamic model. The underlying cloud storage resource (an EBS volume or a Persistent Disk, for example) is provisioned automatically once a PVC referencing a suitable storage class is created or, depending on the class's volume binding mode, once the first pod using the claim is scheduled. This removes the need to pre-create volumes by hand.
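The storage class that a claim names has to exist in the cluster. As a rough sketch, a class called gp3 might look like the following; it assumes the AWS EBS CSI driver is installed, and the binding mode and expansion settings are illustrative choices rather than requirements.

```yaml
# Sketch: a StorageClass that a PVC with storageClassName: gp3 could reference.
# Assumes the AWS EBS CSI driver is installed in the cluster.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                                # EBS volume type
reclaimPolicy: Delete                      # the default when the field is omitted
volumeBindingMode: WaitForFirstConsumer    # delay provisioning until a pod is scheduled
allowVolumeExpansion: true
```

With WaitForFirstConsumer, a claim is expected to stay Pending until a pod that uses it is scheduled, so the volume can be created in the same zone as that pod.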
Access modes and their implications for replicated services
Access modes determine whether a volume can support more than one pod at the same time, which matters directly for any service running multiple replicas. ReadWriteOnce allows read-write mounting by a single node, ReadOnlyMany allows read-only mounting by many nodes, and ReadWriteMany allows read-write mounting by many nodes simultaneously:
accessModes: ["ReadWriteOnce"]
A ReadWriteOnce volume, the most common and widely supported mode across storage backends, can be mounted read-write by pods on only one node at a time. A multi-replica service whose replicas all need write access to the same shared volume therefore requires ReadWriteMany, a mode that not every storage backend offers and that is worth confirming rather than assuming.
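For comparison, a claim for shared read-write storage differs only in its access mode and in the class it asks for. In this sketch the claim name and the storage class name shared-fs are hypothetical; the class would need to be backed by a provisioner that supports ReadWriteMany, typically file storage such as NFS.

```yaml
# Sketch: a claim for storage that pods on several nodes can write to at once.
# "shared-uploads" and "shared-fs" are hypothetical names for illustration.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-uploads
spec:
  storageClassName: shared-fs        # must map to a backend supporting ReadWriteMany
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 10Gi
```

If the backend cannot satisfy the requested access mode, the claim will generally not bind, which shows up as a PVC that stays Pending.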
StatefulSets and per-replica volumes
For a stateful, multi-replica service where each replica needs its own persistent storage rather than one shared volume, a StatefulSet with a volumeClaimTemplates entry automatically provisions a separate PVC for each replica, and each PVC persists independently across that replica's restarts:
apiVersion: apps/v1
kind: StatefulSet
spec:
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
This pattern has no direct Docker equivalent. With Docker's named volumes you either share one volume or manually create and attach a separately named volume for each replica; a StatefulSet automates that per-replica provisioning.
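A fuller sketch shows how the claim template connects to the pod template. The StatefulSet name, labels, image, password placeholder, and mount path are assumptions for illustration, and the headless Service named in serviceName is assumed to be defined separately.

```yaml
# Sketch: a StatefulSet whose replicas each get their own PVC.
# Names, image, and mountPath are illustrative assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: pg
spec:
  serviceName: pg                  # headless Service, assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: pg
  template:
    metadata:
      labels:
        app: pg                    # must match the selector above
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: example       # placeholder; use a Secret in practice
          volumeMounts:
            - name: data           # refers to the claim template below
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Each generated claim is named after the template and the pod, so this sketch would yield data-pg-0, data-pg-1, and data-pg-2. It covers only the storage wiring; making the three database instances replicate to one another is a separate concern.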
Reclaim policies
A PV's reclaim policy determines what happens to the underlying storage once its bound PVC is deleted. Retain preserves the data for manual recovery or reattachment, while Delete (the more common default for dynamically provisioned volumes) destroys the underlying storage resource:
apiVersion: v1
kind: PersistentVolume
spec:
  persistentVolumeReclaimPolicy: Retain
Confirm a volume's reclaim policy before deleting its PVC, particularly for anything holding important data. Doing so avoids permanent, unintended data loss when the policy is set to Delete and Retain was intended.
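For dynamically provisioned volumes, the policy is inherited from the storage class, so one way to make Retain the norm for important data is a dedicated class. The sketch below uses a hypothetical class name and assumes the AWS EBS CSI driver; any other provisioner would work the same way.

```yaml
# Sketch: PVs provisioned from this class get Retain instead of Delete.
# "gp3-retain" is a hypothetical name; the provisioner is an assumption.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-retain
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
reclaimPolicy: Retain      # keep the underlying volume after the PVC is deleted
```

The trade-off is manual cleanup: after its claim is deleted, a retained PV moves to the Released state and both the PV and the underlying volume stay around until someone removes or reuses them.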
The binding lifecycle
A PVC remains in the Pending state until a matching PV is found or dynamically provisioned. Knowing this lifecycle helps when diagnosing a pod stuck waiting on storage that never becomes available:
kubectl get pvc
NAME           STATUS    VOLUME
pgdata-claim   Pending
A PVC stuck in Pending typically means either that no storage class is correctly configured to satisfy the request or that the cloud provider's storage provisioning is failing for a specific, investigable reason. Both are worth checking through kubectl describe pvc, whose events usually name the cause, rather than assuming the storage will eventually become available on its own.
Common mistakes
- Assuming a PersistentVolumeClaim directly creates storage the way docker volume create does, rather than understanding it as a request that still needs to be satisfied by a matching PersistentVolume, either pre-existing or dynamically provisioned.
- Using ReadWriteOnce for a multi-replica service that needs simultaneous, multi-node write access, without confirming whether ReadWriteMany is available from the underlying storage backend.
- Not using a StatefulSet's volumeClaimTemplates for a service that needs separate, per-replica persistent storage, and attempting to replicate this pattern manually instead.
- Deleting a PVC without first confirming its bound PV's reclaim policy, risking permanent, unintended data loss if the policy is set to Delete rather than Retain.
- Not investigating a PVC stuck in Pending through kubectl describe, and assuming the underlying storage provisioning will eventually resolve on its own.
In short, Kubernetes persistent volumes replace Docker's single named-volume concept with separate PV and PVC resources, dynamic provisioning through storage classes, access modes that constrain multi-replica services, and explicit reclaim policies. Each deserves deliberate attention rather than an assumed straightforward translation from Docker's simpler volume abstraction.