Exercise

In this exercise (from the official Kubernetes documentation), we will use a StatefulSet to set up a MySQL cluster. This will create 3 Pods, one master and two slaves:

  • the master will be configured to accept writes (and reads)
  • the slaves will be configured as read-only replicas

Prerequisites

This exercise can be performed on a cluster deployed with a cloud provider or simply on a local cluster.

Also ensure you have a default StorageClass in your cluster. If not, you might consider installing Longhorn.
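
To check whether a default StorageClass is present, list the StorageClasses; the default one (if any) is flagged with (default) next to its name:

kubectl get storageclass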

Master and Slave Configuration

Let’s start by creating a ConfigMap resource with the following specification:

cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql
  labels:
    app: mysql
data:
  master.cnf: |
    # Apply this config only on the master.
    [mysqld]
    log-bin
  slave.cnf: |
    # Apply this config only on slaves.
    [mysqld]
    super-read-only

This contains:

  • the configuration to use for the MySQL master (key master.cnf)
  • the configuration to use for MySQL slaves (key slave.cnf)

Save this specification in the cm.yaml file and create the resource with the following command:

kubectl apply -f cm.yaml
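
You can check that the ConfigMap was created and contains both keys with:

kubectl describe configmap mysql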

Setting up Services

We will now create 2 services:

  • the first, called a Headless service, will be used to give a stable network identity to each Pod that will be created by the StatefulSet
  • the second, of type ClusterIP, will load balance read requests across the cluster’s Pods

Creating the Headless Service

The following specification defines the mysql Service, which gives each Pod a stable network identity: each Pod will be reachable through a DNS name of the form <pod-name>.mysql (e.g. mysql-0.mysql). Setting the .spec.clusterIP property to None makes this a headless Service, meaning it will not load balance between Pods.

headless-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  clusterIP: None
  selector:
    app: mysql

Copy this specification into the headless-svc.yaml file and create this Service with the following command:

kubectl apply -f headless-svc.yaml

Creating the ClusterIP Service

The following specification defines the Service that will load balance read operations across the cluster’s Pods.

Note: write operations must be performed by connecting to the master.

mysql-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
  labels:
    app: mysql
spec:
  type: ClusterIP
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql

Copy this specification into the mysql-svc.yaml file and create this Service with the following command:

kubectl apply -f mysql-svc.yaml
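
Both Services carry the app: mysql label, so you can list them together; the headless Service should show None in the CLUSTER-IP column:

kubectl get services -l app=mysql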

Creating the MySQL cluster

We will now create the MySQL cluster from the StatefulSet defined in the following specification:

mysql-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: mysql
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
      - name: init-mysql
        image: mysql:5.7.36
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Generate mysql server-id from pod ordinal index.
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          echo [mysqld] > /mnt/conf.d/server-id.cnf
          # Add an offset to avoid reserved server-id=0 value.
          echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
          # Copy appropriate conf.d files from config-map to emptyDir.
          if [[ $ordinal -eq 0 ]]; then
            cp /mnt/config-map/master.cnf /mnt/conf.d/
          else
            cp /mnt/config-map/slave.cnf /mnt/conf.d/
          fi
        volumeMounts:
        - name: conf
          mountPath: /mnt/conf.d
        - name: config-map
          mountPath: /mnt/config-map
      - name: clone-mysql
        image: gcr.io/google-samples/xtrabackup:1.0
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Skip the clone if data already exists.
          [[ -d /var/lib/mysql/mysql ]] && exit 0
          # Skip the clone on master (ordinal index 0).
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          [[ $ordinal -eq 0 ]] && exit 0
          # Clone data from previous peer.
          ncat --recv-only mysql-$(($ordinal-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
          # Prepare the backup.
          xtrabackup --prepare --target-dir=/var/lib/mysql
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
      containers:
      - name: mysql
        image: mysql:5.7.36
        env:
        - name: MYSQL_ALLOW_EMPTY_PASSWORD
          value: "1"
        ports:
        - name: mysql
          containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
        livenessProbe:
          exec:
            command: ["mysqladmin", "ping"]
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          exec:
            # Check we can execute queries over TCP (skip-networking is off).
            command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
          initialDelaySeconds: 5
          periodSeconds: 2
          timeoutSeconds: 1
      - name: xtrabackup
        image: gcr.io/google-samples/xtrabackup:1.0
        ports:
        - name: xtrabackup
          containerPort: 3307
        command:
        - bash
        - "-c"
        - |
          set -ex
          cd /var/lib/mysql

          # Determine binlog position of cloned data, if any.
          if [[ -f xtrabackup_slave_info && "x$(<xtrabackup_slave_info)" != "x" ]]; then
            # XtraBackup already generated a partial "CHANGE MASTER TO" query
            # because we're cloning from an existing slave. (Need to remove the trailing semicolon!)
            cat xtrabackup_slave_info | sed -E 's/;$//g' > change_master_to.sql.in
            # Ignore xtrabackup_binlog_info in this case (it's useless).
            rm -f xtrabackup_slave_info xtrabackup_binlog_info
          elif [[ -f xtrabackup_binlog_info ]]; then
            # We're cloning directly from master. Parse binlog position.
            [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
            rm -f xtrabackup_binlog_info xtrabackup_slave_info
            echo "CHANGE MASTER TO MASTER_LOG_FILE='${BASH_REMATCH[1]}',\
                  MASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
          fi

          # Check if we need to complete a clone by starting replication.
          if [[ -f change_master_to.sql.in ]]; then
            echo "Waiting for mysqld to be ready (accepting connections)"
            until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done

            echo "Initializing replication from clone position"
            mysql -h 127.0.0.1 \
                  -e "$(<change_master_to.sql.in), \
                          MASTER_HOST='mysql-0.mysql', \
                          MASTER_USER='root', \
                          MASTER_PASSWORD='', \
                          MASTER_CONNECT_RETRY=10; \
                        START SLAVE;" || exit 1
            # In case of container restart, attempt this at-most-once.
            mv change_master_to.sql.in change_master_to.sql.orig
          fi

          # Start a server to send backups when requested by peers.
          exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
            "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root"
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
      volumes:
      - name: conf
        emptyDir: {}
      - name: config-map
        configMap:
          name: mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 2Gi

This specification is quite dense, mainly due to the shell scripts it embeds, which can seem opaque at first glance. Let’s go through the elements it contains:

  • the value of the .spec.replicas key indicates that 3 Pods will be launched. Because this is a StatefulSet, the Pods are created sequentially: each Pod is launched only once the previous one is up and ready. Each Pod is named after the StatefulSet, followed by an ordinal index incremented for each new Pod:

    • the first Pod mysql-0 will be the master Pod
    • the following ones, mysql-1 and mysql-2, will be slave Pods
  • the .spec.volumeClaimTemplates key defines the storage requested for each Pod, here 2Gi, which will be provisioned using the default StorageClass

  • the Pod specification is defined under the .spec.template.spec key

  • 2 volumes are defined for each Pod:

    • the first is based on the ConfigMap created earlier, which contains both master and slave configurations
    • the second is an emptyDir volume, an empty directory created on the node hosting the Pod, into which the appropriate configuration (master or slave) is copied and then read by the MySQL process at startup
  • the Pod specification contains 2 container lists:

    • .spec.template.spec.initContainers
    • .spec.template.spec.containers
  • initContainers are containers responsible for preparing the environment. This list contains 2 containers:

    • init-mysql determines, based on the Pod’s ordinal index (0, 1, or 2), which configuration should be used from the ConfigMap and copies it to the emptyDir volume. If the index is 0 (master case), the master.cnf config is copied; if the index is 1 or 2 (slave case), the slave.cnf config is copied

    • clone-mysql clones the data from the Pod that was created before the current Pod. This action will only be performed for slave-type Pods

  • containers contains the list of containers that are launched once the initContainers are complete. This list contains 2 containers:

    • mysql launches one of the cluster’s MySQL processes. It reads its configuration from the volume named conf, populated by the init-mysql container

    • xtrabackup is a sidecar container that initializes replication on the slave Pods (using the binlog position found in the cloned data) and then listens on port 3307 to stream a backup of its data to the next Pod that needs to clone it

Copy this specification into the mysql-statefulset.yaml file, then create the resource with the following command:

kubectl apply -f mysql-statefulset.yaml
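
If you simply want to block until the 3 replicas are up, you can also wait for the rollout to complete:

kubectl rollout status statefulset mysql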

Verification

We can see that, after a few minutes, the 3 Pods mysql-0, mysql-1, and mysql-2 are all created and running:

Note: the --watch option allows you to observe the creation of these Pods interactively:

$ kubectl get pods -l app=mysql --watch
NAME      READY   STATUS           RESTARTS   AGE
mysql-0   1/2     Running           0          59s
mysql-0   2/2     Running           0          76s
mysql-1   0/2     Pending           0          0s
mysql-1   0/2     Pending           0          0s
mysql-1   0/2     Pending           0          5s
mysql-1   0/2     Init:0/2          0          5s
mysql-1   0/2     Init:1/2          0          29s
mysql-1   0/2     Init:1/2          0          44s
mysql-1   0/2     PodInitializing   0          62s
mysql-1   1/2     Running           0          63s
mysql-1   2/2     Running           0          67s
mysql-2   0/2     Pending           0          0s
mysql-2   0/2     Pending           0          0s
mysql-2   0/2     Pending           0          4s
mysql-2   0/2     Init:0/2          0          4s
mysql-2   0/2     Init:1/2          0          28s
mysql-2   0/2     Init:1/2          0          38s
mysql-2   0/2     PodInitializing   0          58s
mysql-2   1/2     Running           0          59s
mysql-2   2/2     Running           0          64s
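
Once all the Pods report 2/2 containers ready, a quick optional check is to look at what the init-mysql container copied into the conf volume of each Pod: mysql-0 should contain master.cnf (plus the generated server-id.cnf), while mysql-1 and mysql-2 should contain slave.cnf:

kubectl exec mysql-0 -c mysql -- ls /etc/mysql/conf.d
kubectl exec mysql-1 -c mysql -- ls /etc/mysql/conf.d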

Write Test

We will now launch a Pod running a container based on the mysql image and, from this Pod, perform the following actions:

  • connection to the master Pod
  • creation of the test database
  • creation of the messages table
  • adding a record to this table

Use the following command:

$ kubectl run mysql-client-write --image=mysql:5.7 -i --rm --restart=Never -- \
  mysql -h mysql-0.mysql <<EOF
CREATE DATABASE test;
CREATE TABLE test.messages (message VARCHAR(250));
INSERT INTO test.messages VALUES ('hello');
EOF

Read Test

To verify that the previous record was written successfully, we will proceed in the same way: launch a Pod running a container based on the mysql image and, from this Pod, perform the following actions:

  • connection to the mysql-read Service (which load balances read requests across the cluster’s Pods)
  • retrieving the rows of the messages table in the test database

$ kubectl run mysql-client-read --image=mysql:5.7 -i -t --rm --restart=Never -- \
  mysql -h mysql-read -e "SELECT * FROM test.messages"
+---------+
| message |
+---------+
| hello   |
+---------+

The command result shows that the data written on the master is available for reading through the mysql-read Service.
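
To observe the load balancing performed by the mysql-read Service, you can run the same kind of query in a loop and display the server id; the value returned (100, 101, or 102, matching the offset set by init-mysql) should vary between iterations. A possible sketch (the Pod name mysql-client-loop is arbitrary), to be stopped with Ctrl+C:

kubectl run mysql-client-loop --image=mysql:5.7 -i -t --rm --restart=Never -- \
  bash -ic "while sleep 1; do mysql -h mysql-read -e 'SELECT @@server_id,NOW()'; done"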

Storage

If we list the PersistentVolumeClaims and PersistentVolumes, we can see that a PVC was created for each Pod and a PV was dynamically provisioned and associated with this PVC.

Note: the StorageClass used here is named standard; it is the default StorageClass on a Minikube cluster. The default StorageClass in your cluster may have a different name.

$ kubectl get pvc,pv
NAME                                 STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/data-mysql-0   Bound    pvc-abded038-c28d-4fe6-8d8e-340e5cc745f7   2Gi       RWO            standard       8m11s
persistentvolumeclaim/data-mysql-1   Bound    pvc-97587c8e-dfdf-448a-af12-75d6e1e5d770   2Gi       RWO            standard       7m23s
persistentvolumeclaim/data-mysql-2   Bound    pvc-93aa73c9-a396-41ad-bb91-ec690bd00674   2Gi       RWO            standard       7m5s

NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                  STORAGECLASS   REASON   AGE
persistentvolume/pvc-93aa73c9-a396-41ad-bb91-ec690bd00674   2Gi       RWO            Delete           Bound    default/data-mysql-2   standard                7m5s
persistentvolume/pvc-97587c8e-dfdf-448a-af12-75d6e1e5d770   2Gi       RWO            Delete           Bound    default/data-mysql-1   standard                7m23s
persistentvolume/pvc-abded038-c28d-4fe6-8d8e-340e5cc745f7   2Gi       RWO            Delete           Bound    default/data-mysql-0   standard                8m11s

Cleanup

kubectl delete statefulset/mysql
kubectl delete -f headless-svc.yaml -f mysql-svc.yaml -f cm.yaml
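
Note that deleting the StatefulSet does not delete the PersistentVolumeClaims created from its volumeClaimTemplates. To release the associated storage, delete them explicitly:

kubectl delete pvc data-mysql-0 data-mysql-1 data-mysql-2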

Summary

Here we’ve seen how to set up a MySQL cluster using a StatefulSet. As you can see, this resource is more complex than those used to manage stateless workloads, i.e. workloads that do not need to persist data. In practice, the management of database clusters, and of other complex applications, is often handled by Operators: controllers in which all of the application management logic is encoded.