Using Volume Snapshot and Clone in Longhorn

by Emmanuel COLUSSI

Sep 23, 2021

After my previous post about Using HELM Chart to Deploying Longhorn to an Amazon Kubernetes Cluster using Terraform already up, I wanted to test Snapshot and Clone in Longhorn.

We are going to take the configuration that was set up in the previous post (I invite you to look at it if you did not follow it ) We had :

3 nodes Kubernetes cluster
2 nodes Attributes Longhorn storage cluster installed and configured
Storage Class defined : longhorn-demo
PVC (pvc-mysql-data01) used by a MySQL Server instance.

In this post you will see :

Creating a Snapshot
Creating a Clone
Creating a Storage Class
Provisioning a Persistent Volume Claim
Deploying a MySQL Server instance on an Longhorn storage

Prerequisites

Before you get started, you’ll need to have these things:

Kubernetes 1.13+ with RBAC enabled
iSCSI PV support in the underlying infrastructure
Longhorn installed and configured
a Storage Class created
a PVC created
a MySQL Server instance

Infra

infra, the Kubernetes infra

Create volumes Snapshot Persistent Volume Claim (PVC)

Snapshots are typically useful for purposes like audits and reporting. Another use for snapshot backups is that multiple snapshots can be created for a database, and these can be taken at different points in time. This helps with period-over-period analyses.

It is important to understand that database snapshots are directly dependent on the source database. Therefore, snapshots can never substitute your regular backup and restore strategy. For instance, if an entire database is lost, it would mean its source files are inconsistent. If the source files are unavailable, snapshots cannot refer to them, and so, snapshot restoration would be impossible. Snap, Snap

Definition of some terms used :

Snapshot: A storage snapshot is a set of reference markers for data at a particular point in time. A snapshot acts like a detailed table of contents, providing the user with accessible copies of data that they can roll back to.

VolumeSnapshot: A snapshot request object in kubernetes. When this object is created then the snapshot controller would act on the information and try to create a VolumeSnapshotData.

VolumeSnapshotData: It is a kubernetes object that represents an actual volume snapshot. It contains information about the snapshot e.g. the snapshot name mapped by the actual storage vendor, creation time etc.

Install the Snapshot Beta CRDs

Get the latest definitions of the CRDs

%:>  cd snapclone
%:> curl -o snapshot.storage.k8s.io_volumesnapshotclasses.yaml https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
%:>
%:> curl -o snapshot.storage.k8s.io_volumesnapshotcontents.yaml https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
%:>
%:>curl -o snapshot.storage.k8s.io_volumesnapshots.yaml https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
%:>

Install Snapshot CRDs

%:> kubectl apply -f snapshot.storage.k8s.io_volumesnapshotclasses.yaml
%:> kubectl apply -f snapshot.storage.k8s.io_volumesnapshotcontents.yaml
%:> kubectl apply -f snapshot.storage.k8s.io_volumesnapshots.yaml
%:>
%:>

Install the Common Snapshot Controller:

Get the last “Snapshot Controller” deployment definitions

%:> curl -o rbac-snapshot-controller.yaml https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
%:> curl -o deployment-snapshot-controller.yaml https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
%:>

Install Common Snapshot Controller:

%:> kubectl apply -f rbac-snapshot-controller.yaml
%:> kubectl apply -f deployment-snapshot-controller.yaml
%:>

Creating a Volume Snapshot Class

Once the CRDs and Snapshot Controller have been installed, it remains to define an object of type VolumeSnapshotClass to make the link between Snapshot Controller and Longhorn.

Copy the following YAML specification into a file called sc_snapshot_class.yaml. (or modify the existing one)

%:> kubectl apply -f sc_snapshot_class.yaml
volumesnapshotclass.snapshot.storage.k8s.io/longhorn-demo-snap created
%:>

Verify VolumeSnapshotClass :

%:> kubectl get VolumeSnapshotClass
NAME                 DRIVER               DELETIONPOLICY   AGE
longhorn-demo-snap   driver.longhorn.io   Delete           3m48s

%:>

Creating a cStor Volume Snapshot

The following steps will help you to create a snapshot of a cStor volume. For creating the snapshot, you need to create a YAML specification and provide the required PVC name into it. The only prerequisite check is to be performed is to ensure that there is no stale entries of snapshot and snapshot data before creating a new snapshot.

Copy the following YAML specification into a file called vol_snapshot_student1.yaml.

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: vol-snapshot-mysql-data01
  namespace: student1
spec:
  volumeSnapshotClassName: longhorn-demo-snap
  source:
    persistentVolumeClaimName: pvc-mysql-data01

Run the following command to create the snapshot of the provided PVC.

%:> kubectl apply -f vol_snapshot_student1.yaml -n student1
volumesnapshot.snapshot.storage.k8s.io/vol-snapshot-mysql-data01 created

%:>

The above command creates a snapshot. The snapshot takes some time to be ready. We can see the status of it with the command kubectl get volumesnapshot :

%:> kubectl get volumesnapshot -n student1
NAME                        READYTOUSE   SOURCEPVC          SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS        SNAPSHOTCONTENT                                    CREATIONTIME   AGE
vol-snapshot-mysql-data01   True        pvc-mysql-data01                                         longhorn-demo-snap   snapcontent-7f89a90c-8fab-44db-9a24-1f6193f94dba                  56s

%:>

The important information is the value true in the READYTOUSE column.It must be True to use the snapshot

Now we can create a Clone from the VolumeSnapshot

We could have created a clone directly from the PVC source, but I wanted to try from the snapshot.

If you want to create the clone from the PVC.create a file pvc_mysqldata01_clone_student1.yaml and insert the following lines :

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-mysql-data01-clone
spec:
  storageClassName: longhorn-demo
  dataSource:
    name: pvc-mysql-data01
    kind: PersistentVolumeClaim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 15Gi

Then run the following command :

%:> kubectl apply -f pvc_mysqldata01_clone_student1.yaml

Go directly to the section: Create a new MySQL Server instance

Cloning is the process of copying an online database onto another server. The copy is independent of the existing database and is preserved as a point-in-time snapshot.

You can use a cloned database for various purposes without putting a load on the production server or risking the integrity of production data. Some of these purposes include the following:

Performing analytical queries
Load testing or integration testing of your apps
Data extraction for populating data warehouses
Running experiments on the data
Migration …..
We will demonstrate the ability to clone directly from a PVC as declared in the dataSource.It’s also possible to create forks of - the SQL Server database to create even more sophisticated workflows.

It’s now possible to transform the data of the “test” deployment without disturbing the data of the “prod” deployment. This opens up the possibility to create advanced testing and development workflows that uses an exact representation of production data. Whether this dataset is a few bytes or a handful of terabytes, the operation will only take a few seconds to execute as the clones are not making any copies of the source data. We will create a Clone directly from an existing PVC without creating a snapshot.

The clone is a writable physical copy Clone, Clone

To create a clone out of a snapshot is very simple. All you need to do is create a pvc manifest that refers to the volumesnapshot object.

Copy the following YAML specification into a file called pvc_mysqldata01_clone_student1.yaml.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-mysqldata01-clone-student1
  namespace: student1
spec:
  storageClassName: longhorn-demo
  dataSource:
    name: vol-snapshot-mysql-data01
    kind: VolumeSnapshot
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 15Gi

Run the following command to create the new PVC (clone) from VolumeSnapshot.

terraform-longhorn-k8s-aws/snapclone $> kubectl apply -f pvc_mysqldata01_snap_student1.yaml -n student1
persistentvolumeclaim/pvc-mysqldata01-snap-student1 created
terraform-longhorn-k8s-aws/snapclone $>

show a PVC is created : pvc-mysqldata01-snap-student1

%:> kubectl get pvc -n student1
NAME                            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
pvc-mysql-data01                Bound    pvc-95f6861d-018d-4ee0-a4a0-8752cdb9860e   15Gi       RWO            longhorn-demo   7d3h
pvc-mysqldata01-clone-student1  Bound    pvc-ff44c35f-4ef8-4fa4-8fc1-5bc67371dd6f   15Gi       RWO            longhorn-demo   92s
%:>

* Now we can create a new MySQL Server deployment using PVC : pvc-mysqldata01-clone-student1*

We will use our Terraform deployment for MySQL used in the previous post.The creation of the PVC will be deleted and we will pass in parameters the name of the PVC to use or setup the name of PVC in the variables.tf file.

Create a new MySQL Server instance:

%:> cd mysqlsnapclone
%:> terraform init
%:> terrafor apply

kubernetes_secret.sqlsecret2: Creating...
kubernetes_service.mysql-deployment-student2: Creating...
kubernetes_deployment.mysql-deployment-student2: Creating...
kubernetes_secret.sqlsecret2: Creation complete after 1s [id=student1/sqlsecret2]
kubernetes_service.mysql-deployment-student2: Creation complete after 1s [id=student1/mysql-deployment-student2-service]
kubernetes_deployment.mysql-deployment-student2: Still creating... [10s elapsed]
kubernetes_deployment.mysql-deployment-student2: Still creating... [20s elapsed]
kubernetes_deployment.mysql-deployment-student2: Creation complete after 27s [id=student1/mysql-deployment-student2]

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.

%:>

Check if your MySQL Server instance works:

%:> kubectl get pods -n student1

NAME                                         READY   STATUS    RESTARTS   AGE
mysql-deployment-student1-5b84f6bfc9-mk6wm   1/1     Running   0          7d4h
mysql-deployment-student2-54665b8dbf-s44t4   1/1     Running   0          69s
%:>

We have two MySQL Server instances running.

Check database employee on two MySQL Server instances :

%:> export MYSQLPOD01=`kubectl -n student1 get pods -l app=mysql | grep Running | grep student1 | awk '{print $1}'`
%:> export MYSQLPOD02=`kubectl -n student1 get pods -l app=mysql | grep Running | grep student2 | awk '{print $1}'`
%:>
%:>
%:> kubectl -n student1 exec -it $MYSQLPOD01 -- mysql -u root -pBench123 employees -e "select count(*) from employees"
+----------+
| count(*) |
+----------+
|   300024 |
+----------+
%:>
%:> kubectl -n student1 exec -it $MYSQLPOD02 -- mysql -u root -pBench123 employees -e "select count(*) from employees"
+----------+
| count(*) |
+----------+
|   300024 |
+----------+
%:>

We have two MySQL Server instances with exactly the same data, goal achieved .

Next Step

Backup Longhorn volumes to S3 bucket

Conclusion

The volume clone is similar to dynamic volume provisioning. The only difference is the fact that the clone volume is provisioned through snapshot provisioner instead of volume provisioner. The advantage of Longhorn is that. Snapshots are fast because they use Copy-On-Write. Inshort when we take snapshots we are not actually copying the volume anywhere, instead internally leverages the fs to save/record indexes. Thus, both actual volume and snapshot would refer to the same block of data unless one of them updates it. Longhorn extends the benefits of software-defined storage to cloud native through the container attached storage approach. It represents a modern, contemporary way of dealing with storage in the context of microservices and cloud native applications.

Resources :

Longhorn Documentation
CSI snapshotter