
An Introduction To Kubernetes Volume Snapshots

Last Updated : 15 Mar, 2024

Kubernetes is an open-source container orchestration platform designed to automate the deployment, scaling, and management of containerized applications. It scales workloads to meet fluctuating demand and maintains high availability through automated self-healing.

Kubernetes is written in the Go programming language and originated from Google's internal "Borg" project. It was released as an open-source platform and later donated to the Cloud Native Computing Foundation (CNCF). Since then, Kubernetes has seen exponential growth and adoption, becoming the industry standard for container orchestration.

Kubernetes Volume

One crucial aspect of Kubernetes is managing data persistence, which is where volumes and snapshots come into play. One issue arises when a container crashes or stops, resulting in the loss of the container state and any files created or modified during its lifetime. Upon a crash, kubelet restarts the container with a clean slate, leading to data loss. Additionally, managing shared files among multiple containers within a Pod can be complex. Kubernetes volume abstraction addresses these challenges by providing a solution for persistent storage and shared filesystems across containers.

In Kubernetes, a volume acts like a shared folder accessible to all containers within a pod. It is designed to persist data beyond the lifetime of individual containers, so data remains available even if containers are restarted or replaced. Volumes provide a mechanism for storing and sharing data between containers in the same pod, enhancing application reliability and resilience. Just as you might have folders on your computer to store files, volumes in Kubernetes give applications a place to store their data.

Kubernetes Persistent Volume

A Kubernetes persistent volume (PV) is a piece of storage provisioned either statically by an administrator or dynamically by a storage class. A PV is not tied to the lifecycle of any pod. PVs provide a way for pods to access durable, persistent storage that survives pod restarts or rescheduling. PVs are especially useful for stateful applications such as databases, where data persistence is critical.

Kubernetes persistent volumes provide a way for applications running in Kubernetes clusters to store and access data persistently. PVs abstract the underlying storage infrastructure and provide a unified interface for managing storage resources within the cluster. They decouple storage provisioning from pod lifecycle management, allowing administrators to manage storage independently of the application workload.

Persistent volumes are typically created using storage plugins provided by the underlying infrastructure (e.g., AWS EBS, GCP Persistent Disk, NFS, etc.) or dynamically provisioned by Kubernetes using StorageClasses.

Kubernetes Persistent Volume Specification

A PV spec mainly consists of the following components.

1. Capacity:

Specifies the total amount of storage available in the Persistent Volume, typically expressed using Kubernetes quantity suffixes such as Mi (mebibytes) or Gi (gibibytes).

spec:
  capacity:
    storage: 1Gi

storage: 1Gi: Indicates that the storage capacity is 1 GiB (Gi stands for gibibyte).

2. Access Modes:

These modes define how the storage can be accessed by pods. Common access modes include ReadWriteOnce (RWO), ReadOnlyMany (ROX), and ReadWriteMany (RWX).

accessModes:
  - ReadWriteOnce

ReadWriteOnce indicates that the volume can be mounted as read-write by a single node at a time.

3. Volume Mode:

Indicates the type of volume being used, such as Filesystem or Block.

spec:
  volumeMode: Filesystem

With volumeMode: Filesystem, the volume is mounted into pods as a filesystem (a directory). This is the default when volumeMode is not specified.
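If the volume should instead be exposed to pods as a raw block device, volumeMode can be set to Block. A minimal sketch (support for block mode depends on the underlying storage driver):

spec:
  volumeMode: Block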

Persistent Volume (PV) Manifest File

apiVersion: v1
kind: PersistentVolume
metadata:
  name: flask-volume
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/flask

You need to create the /data/flask directory on your host machine before using it as a hostPath volume in Kubernetes. The Kubernetes pod will then be able to read and write data in this directory on the host machine.
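If you are following along on Minikube, the "host" here is the Minikube node itself, so the directory has to be created inside it. A minimal sketch, assuming a single-node Minikube cluster:

# open a shell on the Minikube node and create the hostPath directory
minikube ssh -- sudo mkdir -p /data/flask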

Apply the Persistent Volume manifest to create the volume. Running the command below creates a Persistent Volume named "flask-volume".

kubectl apply -f pv.yml

You can check whether the PV was created using kubectl. The following command lists the Persistent Volumes (PVs) in the Kubernetes cluster.

kubectl get pv


Kubernetes Persistent Volume Claim

A Kubernetes Persistent Volume Claim (PVC) is like a request for storage space within a Kubernetes cluster. It is a resource that allows a pod to request specific storage resources, such as size and access mode, without needing to know the details of the underlying storage implementation. In other words, the pod asks for storage without knowing the technical details of how that storage is provided. Once a PVC is created, Kubernetes automatically binds it to an available Persistent Volume (PV) that satisfies the request.

This makes it easy for users to request and use storage resources dynamically without having to manage the underlying storage infrastructure themselves. PVCs are especially helpful for applications that need to keep their data even if the pod restarts or is rescheduled elsewhere in the cluster.

Kubernetes Persistent Volume Claim (PVC) Specification

A PVC spec mainly consists of the following components.

1. Resources:

Specifies the storage resources requested by the Persistent Volume Claim, typically expressed using quantity suffixes such as Mi or Gi.

spec:
  resources:
    requests:
      storage: 1Gi

storage: 1Gi: Indicates that the requested storage capacity is 1 GiB (Gi stands for gibibyte).

2. Access Modes:

These modes define how the storage can be accessed by pods using the Persistent Volume Claim. Common access modes include ReadWriteOnce (RWO), ReadOnlyMany (ROX), and ReadWriteMany (RWX).

accessModes:
  - ReadWriteOnce

ReadWriteOnce indicates that the volume can be mounted as read-write by a single node at a time.

3. Volume Name:

Specifies the name of the Persistent Volume to bind to the Persistent Volume Claim. If not specified, Kubernetes automatically binds the Persistent Volume Claim to an available Persistent Volume that satisfies the request.

volumeName: my-pv


Persistent Volume Claim (PVC) Manifest File

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flask-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi


This YAML/Manifest file defines a PersistentVolumeClaim named “flask-claim” requesting 1Gi of storage with ReadWriteOnce access mode.

Apply the PersistentVolumeClaim manifest to create the claim. The following command applies the pvc.yml file and creates a Persistent Volume Claim named "flask-claim".

kubectl apply -f pvc.yml

To check whether the Persistent Volume Claim was created, use the following kubectl command.

kubectl get pvc


Mount the Volume in a Pod

Now, why do we need to create a Pod manifest? We have already created manifests for the Persistent Volume (PV) and Persistent Volume Claim (PVC); these resources provision and claim storage within the Kubernetes cluster.

The Pod is what actually uses that storage. In the Pod manifest, you specify which containers should run, which images they should use, and how they should interact with the storage provided by PVs through PVCs. This includes defining volume mounts in the pod spec, which tell Kubernetes where to mount the storage inside the containers.

Pod Manifest File

apiVersion: v1
kind: Pod
metadata:
  name: flask-pod
  labels:
    app: flask-app
spec:
  containers:
    - name: flask-container
      image: DockerHubUsername/your-image-name:tag
      volumeMounts:
        - name: flask-volume
          mountPath: /volume/data
  volumes:
    - name: flask-volume
      persistentVolumeClaim:
        claimName: flask-claim


Adjust the image field with your application's Docker image and tag. Also, modify the mountPath to the directory where your application stores data.

The mountPath /volume/data is the directory inside the container where the contents of the PersistentVolumeClaim named "flask-claim" will be mounted. You do not need to create this directory manually; when you specify a mountPath in a pod manifest, Kubernetes automatically creates /volume/data inside the container.

Apply the Pod Manifest File to create the Pod and mount the Volume.

kubectl apply -f pod.yml


PV and PVC are created and Mounted in a Pod

To check whether the Persistent Volume and Persistent Volume Claim are created and mounted in the pod, we can use the following command.

kubectl describe pod <pod_name>
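To additionally confirm that the container can write to the mounted path, a quick check can be run inside the pod. This is a hedged sketch that assumes the container image ships a shell (sh):

# write a test file into the mounted volume and read it back
kubectl exec flask-pod -- sh -c 'echo hello > /volume/data/test.txt'
kubectl exec flask-pod -- cat /volume/data/test.txt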


How To Access Application Running Inside Pod?

To access the application running inside a Kubernetes pod, we need to expose it either through port forwarding or by creating a Kubernetes Service.

1. Port Forwarding

Port forwarding is used to forward traffic from the local machine to the pod running the application.

kubectl port-forward your-pod-name 8080:8080

You can also use different local and pod ports if you prefer.
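While the port-forward command is running, the application is reachable from the local machine. A minimal usage sketch, assuming the application listens on port 8080 inside the pod:

kubectl port-forward flask-pod 8080:8080
# in a second terminal
curl http://localhost:8080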

2. Kubernetes Service

To make the application accessible externally or to other pods within the Kubernetes cluster, a Service is the resource to use.

Kubernetes Service Manifest File

apiVersion: v1
kind: Service
metadata:
  name: flask-service
spec:
  selector:
    app: flask-app
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
  type: NodePort


Apply the Service manifest file to create the service.

kubectl apply -f service.yml

This creates a Service named "flask-service" that forwards traffic on port 8080 to pods with the label app: flask-app.

To access the application, we need to find out the NodePort assigned to the service.

kubectl get svc


The output shows the NodePort under the PORT(S) column. Next, find the Minikube IP, which you can retrieve using the following command.

minikube ip

Once you have the Minikube IP and the NodePort, you can access the application.

curl <minikube_ip>:<NodePort>


Kubernetes Snapshots

Kubernetes volume snapshots extend the capabilities of volumes by enabling the creation of snapshots that capture the exact state of volume data at a specific point in time. These snapshots provide a reliable means for backing up, recovering, and cloning data, ensuring data integrity and enabling robust backup solutions for Kubernetes applications.

A snapshot in Kubernetes is like taking a picture with your phone: it captures the state of a volume at a specific moment in time, including all the data stored in the volume at that exact moment.

This helps ensure that our applications always have access to the data they need, even if something unexpected happens.

Benefits of Kubernetes Volume Snapshots

1. Robust Data Security

Snapshots provide a robust backup mechanism, ensuring the safety of critical data against system failures or inadvertent errors.

2. Streamlined Cloning Process

Snapshots enable rapid creation of volume clones, optimizing resource usage and expediting deployment processes.

3. Seamless Point-in-Time Data Recovery

With snapshots, users can effortlessly restore volumes to specific points in time, facilitating smooth rollback operations and maintaining data integrity during application updates.

Volume vs. Snapshot

Definition

Volume: A volume in Kubernetes is a storage abstraction that allows containers within a pod to access shared or persistent data. It provides a way to store and persist data that can be shared among multiple containers within the same pod.

Snapshot: A snapshot is a point-in-time copy of the data stored in a volume. It captures the state of the data at a specific moment, allowing users to create backups or clones of their storage resources.

Purpose

Volume: Volumes are used to provide storage to containers within pods. They allow containers to access and store data persistently or temporarily during the lifetime of a pod.

Snapshot: Snapshots are used to create backups or clones of data stored in volumes. They provide a way to capture the state of the data at a specific moment, enabling data protection, disaster recovery, and application cloning.

Lifecycle

Volume: Volumes have a lifecycle tied to the pod they are attached to. They are created when the pod is created and destroyed when the pod is deleted.

Snapshot: Snapshots have a separate lifecycle independent of the volume or pod. They are created at a specific moment in time and can exist even after the original volume or pod is deleted.

Mutability

Volume: Volumes are mutable, meaning they can be read from and written to by containers within the pod. Changes made to the volume are reflected in real time.

Snapshot: Snapshots are immutable, meaning they cannot be modified after creation. They provide a consistent copy of the data at the time of the snapshot and cannot be altered.

Usage

Volume: Volumes are used to provide persistent storage to applications running in Kubernetes pods. They can be used for storing application data, configuration files, and other persistent resources.

Snapshot: Snapshots are used for data backup, disaster recovery, and cloning purposes. They enable users to create point-in-time copies of data for archiving, testing, or restoring.

Step-by-Step Implementation of Kubernetes Volume Snapshots

Make sure the VolumeSnapshot API is installed in your Kubernetes cluster. This API is required to create and manage volume snapshots.

To install the VolumeSnapshot API in a Minikube Kubernetes cluster, follow the steps below.

First, make sure that the VolumeSnapshot API addon is enabled in Minikube. You can enable it using the Minikube command line interface (CLI)

minikube addons enable volumesnapshots

After enabling the addon, verify that it’s running correctly.

minikube addons list


Apply Custom Resource Definitions (CRDs)

The VolumeSnapshot API requires Custom Resource Definitions (CRDs) to be installed in the cluster.

Note:

Installation of the CRDs is the responsibility of the Kubernetes distribution. Without the required CRDs present, creating a VolumeSnapshotClass fails. You also need to ensure that the other required components are installed, including the external snapshot controller (snapshot-controller) and a CSI driver with its StorageClass.
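If your distribution does not ship them, the CRDs and the snapshot controller can be installed from the kubernetes-csi/external-snapshotter project. The following is a hedged sketch assuming you clone that repository; exact paths and the target namespace can differ between releases, so check the project's README first.

git clone https://github.com/kubernetes-csi/external-snapshotter.git
cd external-snapshotter
# install the VolumeSnapshot CRDs
kubectl kustomize client/config/crd | kubectl create -f -
# deploy the snapshot controller
kubectl -n kube-system kustomize deploy/kubernetes/snapshot-controller | kubectl create -f -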

Here is an example StorageClass.yml file for reference.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-hostpath-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
provisioner: hostpath.csi.k8s.io
reclaimPolicy: Retain # default value is Delete
allowVolumeExpansion: true
mountOptions:
  - discard # this might enable UNMAP / TRIM at the block storage layer
volumeBindingMode: WaitForFirstConsumer
parameters:
  guaranteedReadWriteLatency: "true" # provider-specific

Once the StorageClass is created, you can include a storageClassName field in the PersistentVolumeClaim manifest to request storage from that StorageClass.

Here’s an example PVC YAML manifest.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: csi-hostpath-sc
  resources:
    requests:
      storage: 1Gi

Step 1: Create VolumeSnapshotClass

To define custom behaviors for volume snapshots, you can create a VolumeSnapshotClass object. VolumeSnapshotClass provides a way to describe the “classes” of storage when provisioning a volume snapshot.

The VolumeSnapshotClass includes the following attributes:

1. Driver

Each VolumeSnapshotClass is associated with a driver, determining the CSI (Container Storage Interface) volume plugin used for provisioning VolumeSnapshots. This attribute is mandatory.

2. Deletion Policy

Volume snapshot classes feature a deletionPolicy, allowing configuration of actions taken when a VolumeSnapshotContent linked to it is deleted. The deletionPolicy options are either Retain or Delete. If set to Delete, the associated storage snapshot and VolumeSnapshotContent are removed. If set to Retain, both the storage snapshot and VolumeSnapshotContent persist.

3. Parameters

Volume snapshot classes comprise parameters that detail volume snapshots within the class. The accepted parameters may vary based on the driver in use.

VolumeSnapshotClass YAML File

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: snapshot-class
driver: hostpath.csi.k8s.io
deletionPolicy: Delete
parameters: {} # no driver-specific parameters needed for this example

Apply the manifest to create the VolumeSnapshotClass

kubectl apply -f <snapshot_class_manifest.yaml>

You can verify the creation by listing the VolumeSnapshotClass objects in the cluster.

kubectl get volumesnapshotclass

Step 2: Create VolumeSnapshotContent

A VolumeSnapshotContent object represents the actual snapshot data. In Kubernetes, VolumeSnapshotContent is a cluster-scoped resource representing a snapshot of a volume; it is either provisioned in advance by an administrator or created dynamically by the snapshot controller. It acts as a reference point encapsulating the data and metadata of the snapshot, similar to how a PersistentVolume serves as a cluster resource for storage.

The VolumeSnapshotContent includes the following fields:

1. snapshotHandle

The snapshotHandle serves as a distinct identifier for the volume snapshot established within the storage backend. This attribute is essential for pre-provisioned snapshots, delineating the CSI snapshot ID on the storage platform corresponding to the VolumeSnapshotContent.

2. sourceVolumeMode

The sourceVolumeMode specifies the mode of the volume from which the snapshot is captured. It can be either Filesystem or Block. If not specified, Kubernetes regards the snapshot as if the mode of the source volume is unknown.

3. Driver

Each VolumeSnapshotContent is associated with a driver, defining the CSI volume plugin utilized for creating the snapshot. This field is mandatory.

4. Deletion Policy

VolumeSnapshotContent specifies a deletionPolicy, determining actions upon deletion of the snapshot. Options include Retain or Delete. If set to Delete, both the snapshot and VolumeSnapshotContent are removed. If set to Retain, they persist.

5. volumeSnapshotRef

volumeSnapshotRef serves as a pointer to the associated VolumeSnapshot. It’s important to note that when creating a pre-provisioned snapshot, the referenced VolumeSnapshot in volumeSnapshotRef might not be created yet.

VolumeSnapshotContent YAML File

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  name: snapshot-content-72d9a349-aacd-42d2-a240-d775650d2455
spec:
  deletionPolicy: Delete
  driver: hostpath.csi.k8s.io
  volumeSnapshotClassName: snapshot-class
  source:
    volumeHandle: ee0cfb94-f8d4-11e9-b2d8-0242ac110002
  volumeSnapshotRef:
    name: volume-snapshot
    namespace: default # namespace of the VolumeSnapshot this content binds to

Apply the manifest to create the VolumeSnapshotContent.

kubectl apply -f <snapshot_content_manifest.yaml>

You can verify the creation by listing the VolumeSnapshotContent objects in the cluster.

kubectl get volumesnapshotcontent

Step 3: Create VolumeSnapshot

In Kubernetes, a VolumeSnapshot denotes a snapshot of a volume within a storage system.

A VolumeSnapshot in Kubernetes is a user-requested snapshot of a volume, akin to a PersistentVolumeClaim. It captures a point-in-time copy of the volume’s data, empowering users to create and oversee snapshots for data protection and recovery within the cluster.

The VolumeSnapshot includes the following attributes:

1. PersistentVolumeClaimName

persistentVolumeClaimName identifies the name of the PersistentVolumeClaim serving as the data source for the snapshot. It’s essential for dynamically provisioning a snapshot.

2. VolumeSnapshotClassName

Additionally, a volume snapshot can specify a particular class by referencing a VolumeSnapshotClass through the volumeSnapshotClassName attribute. If left unset, the default class is utilized if available.
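If you want a class to be picked automatically whenever volumeSnapshotClassName is omitted, it can be marked as the default class through an annotation. A minimal sketch based on the snapshot-class used in this article:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: snapshot-class
  annotations:
    # marks this class as the default for VolumeSnapshots that omit volumeSnapshotClassName
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: hostpath.csi.k8s.io
deletionPolicy: Delete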

VolumeSnapshot YAML File

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: volume-snapshot
spec:
  volumeSnapshotClassName: snapshot-class
  source:
    persistentVolumeClaimName: flask-claim

Apply the manifest to create the VolumeSnapshot.

kubectl apply -f volume-snapshot.yml

You can verify the creation by listing the VolumeSnapshot objects in the cluster.

kubectl get volumesnapshot

To determine whether a VolumeSnapshot is associated with a PersistentVolumeClaim (PVC), we can use the following command.

kubectl describe volumesnapshot <snapshot-name>

Conclusion

In this article we successfully created Kubernetes volume snapshots. We followed the steps to create a PersistentVolume (PV), PersistentVolumeClaim (PVC), VolumeSnapshotClass, VolumeSnapshot, and VolumeSnapshotContent. These snapshots provide a point-in-time copy of the data stored in the PersistentVolume, which can be used for backup, cloning, or other data management purposes.

Kubernetes Volume Snapshots – FAQs

What is a Kubernetes volume snapshot?

A Kubernetes volume snapshot is a point-in-time copy of the data stored in a PersistentVolume (PV) within a Kubernetes cluster. It captures the state of the data at a specific moment, allowing users to create backups or clones of their storage resources.

Why do we need volume snapshots?

Volume snapshots provide a convenient way to back up and restore data in Kubernetes environments. They enable data protection, disaster recovery, and application cloning without disrupting ongoing operations.

How do I create a Kubernetes volume snapshot?

To create a volume snapshot, define a VolumeSnapshotClass, then create a VolumeSnapshot object specifying the source PVC (persistentVolumeClaimName) and the volumeSnapshotClassName. Kubernetes then automatically creates a VolumeSnapshotContent object to represent the actual snapshot.

Can I create volume snapshots for any PersistentVolume?

Volume snapshots can only be created for PersistentVolumes that support snapshotting. This capability depends on the storage provider and the underlying storage system.
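As a quick, hedged check, you can list the CSI drivers and snapshot classes registered in the cluster and then consult the driver's documentation for snapshot support:

kubectl get csidrivers
kubectl get volumesnapshotclass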

Are volume snapshots persistent across cluster restarts?

Volume snapshots are typically persistent across cluster restarts as they are stored in the underlying storage system. However, it’s essential to verify the behavior with your specific storage provider.

How do I restore data from a volume snapshot?

To restore data from a volume snapshot, you can create a new PersistentVolumeClaim (PVC) using the snapshot as the data source. Kubernetes will automatically provision a new PersistentVolume (PV) based on the snapshot, allowing you to access the data.
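A minimal restore sketch is shown below. The claim name flask-claim-restore is hypothetical, and the storageClassName must match a snapshot-capable StorageClass in your cluster (here the csi-hostpath-sc class from the example above); the dataSource points at the VolumeSnapshot created earlier.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flask-claim-restore
spec:
  storageClassName: csi-hostpath-sc
  dataSource:
    # restore from the VolumeSnapshot created in this article
    name: volume-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi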

What are some best practices for managing volume snapshots?

Some best practices include regularly backing up critical data, testing snapshot restores, using appropriate retention policies, and monitoring snapshot usage and performance.


