Exercise
In this exercise (from the official Kubernetes documentation), we will use a StatefulSet to set up a MySQL cluster. This will create 3 Pods, one master and two slaves:
- the master will be configured to allow writing
- the slaves will be configured to allow reading
Prerequisites
This exercise can be performed on a cluster deployed with a cloud provider or simply on a local cluster.
Also ensure you have a default StorageClass in your cluster. If not, you might consider installing Longhorn.
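To check whether a default StorageClass is present, list the StorageClasses; the default one is flagged "(default)" next to its name:

```shell
# The default StorageClass is marked "(default)" in the NAME column.
kubectl get storageclass
```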
Master and Slave Configuration
Let’s start by creating a ConfigMap resource with the following specification:
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql
  labels:
    app: mysql
data:
  master.cnf: |
    # Apply this config only on the master.
    [mysqld]
    log-bin
  slave.cnf: |
    # Apply this config only on slaves.
    [mysqld]
    super-read-only
This contains:
- the configuration to use for the MySQL master (key master.cnf)
- the configuration to use for MySQL slaves (key slave.cnf)
Save this specification in the cm.yaml file and create the resource with the following command:
kubectl apply -f cm.yaml
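You can make sure both configuration keys were stored correctly by inspecting the resource; master.cnf and slave.cnf should appear under .data:

```shell
# Dump the ConfigMap to verify its master.cnf and slave.cnf keys.
kubectl get configmap mysql -o yaml
```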
Setting up Services
We will now create 2 services:
- the first, called a Headless service, will be used to give a stable network identity to each Pod that will be created by the StatefulSet
- the second, of type ClusterIP, will provide load balancing for read requests; note that its selector matches all the mysql Pods, so reads can be served by the slaves as well as by the master
Creating the Headless Service
The following specification defines the mysql Service that gives each Pod a stable identity. Setting the .spec.clusterIP property to None means this Service will not load balance between Pods.
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
  - name: mysql
    port: 3306
  clusterIP: None
  selector:
    app: mysql
Copy this specification into the headless-svc.yaml file and create this Service with the following command:
kubectl apply -f headless-svc.yaml
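Once the StatefulSet's Pods are running (a later step), the headless Service gives each of them a DNS record of the form <pod-name>.mysql. One way to verify this, sketched here with a throwaway busybox Pod, is:

```shell
# Resolve the per-Pod DNS entry created by the headless Service.
# This only works once the mysql-0 Pod exists and is ready.
kubectl run dns-test --image=busybox:1.36 -i --rm --restart=Never -- \
  nslookup mysql-0.mysql
```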
Creating the ClusterIP Service
The following specification defines the Service that will handle load balancing of read operations. Since its selector matches every mysql Pod, reads can land on any Ready member of the cluster, master included.
Note: write operations must be performed by connecting to the master
apiVersion: v1
kind: Service
metadata:
  name: mysql-read
  labels:
    app: mysql
spec:
  type: ClusterIP
  ports:
  - name: mysql
    port: 3306
  selector:
    app: mysql
Copy this specification into the mysql-svc.yaml file and create this Service with the following command:
kubectl apply -f mysql-svc.yaml
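Both Services select the same Pods, but only mysql-read gets a ClusterIP; comparing them makes the difference visible:

```shell
# mysql shows CLUSTER-IP "None" (headless); mysql-read has a virtual IP.
kubectl get svc mysql mysql-read
# Both list the same Pod endpoints once the Pods are running.
kubectl get endpoints mysql mysql-read
```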
Creating the MySQL cluster
We will now create the MySQL cluster from the StatefulSet defined in the following specification:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: mysql
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
      - name: init-mysql
        image: mysql:5.7.36
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Generate mysql server-id from pod ordinal index.
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          echo [mysqld] > /mnt/conf.d/server-id.cnf
          # Add an offset to avoid reserved server-id=0 value.
          echo server-id=$((100 + $ordinal)) >> /mnt/conf.d/server-id.cnf
          # Copy appropriate conf.d files from config-map to emptyDir.
          if [[ $ordinal -eq 0 ]]; then
            cp /mnt/config-map/master.cnf /mnt/conf.d/
          else
            cp /mnt/config-map/slave.cnf /mnt/conf.d/
          fi
        volumeMounts:
        - name: conf
          mountPath: /mnt/conf.d
        - name: config-map
          mountPath: /mnt/config-map
      - name: clone-mysql
        image: gcr.io/google-samples/xtrabackup:1.0
        command:
        - bash
        - "-c"
        - |
          set -ex
          # Skip the clone if data already exists.
          [[ -d /var/lib/mysql/mysql ]] && exit 0
          # Skip the clone on master (ordinal index 0).
          [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
          ordinal=${BASH_REMATCH[1]}
          [[ $ordinal -eq 0 ]] && exit 0
          # Clone data from previous peer.
          ncat --recv-only mysql-$(($ordinal-1)).mysql 3307 | xbstream -x -C /var/lib/mysql
          # Prepare the backup.
          xtrabackup --prepare --target-dir=/var/lib/mysql
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
      containers:
      - name: mysql
        image: mysql:5.7.36
        env:
        - name: MYSQL_ALLOW_EMPTY_PASSWORD
          value: "1"
        ports:
        - name: mysql
          containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 500m
            memory: 1Gi
        livenessProbe:
          exec:
            command: ["mysqladmin", "ping"]
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          exec:
            # Check we can execute queries over TCP (skip-networking is off).
            command: ["mysql", "-h", "127.0.0.1", "-e", "SELECT 1"]
          initialDelaySeconds: 5
          periodSeconds: 2
          timeoutSeconds: 1
      - name: xtrabackup
        image: gcr.io/google-samples/xtrabackup:1.0
        ports:
        - name: xtrabackup
          containerPort: 3307
        command:
        - bash
        - "-c"
        - |
          set -ex
          cd /var/lib/mysql
          # Determine binlog position of cloned data, if any.
          if [[ -f xtrabackup_slave_info && "x$(<xtrabackup_slave_info)" != "x" ]]; then
            # XtraBackup already generated a partial "CHANGE MASTER TO" query
            # because we're cloning from an existing slave. (Need to remove the trailing semicolon!)
            cat xtrabackup_slave_info | sed -E 's/;$//g' > change_master_to.sql.in
            # Ignore xtrabackup_binlog_info in this case (it's useless).
            rm -f xtrabackup_slave_info xtrabackup_binlog_info
          elif [[ -f xtrabackup_binlog_info ]]; then
            # We're cloning directly from master. Parse binlog position.
            [[ `cat xtrabackup_binlog_info` =~ ^(.*?)[[:space:]]+(.*?)$ ]] || exit 1
            rm -f xtrabackup_binlog_info xtrabackup_slave_info
            echo "CHANGE MASTER TO MASTER_LOG_FILE='${BASH_REMATCH[1]}',\
                  MASTER_LOG_POS=${BASH_REMATCH[2]}" > change_master_to.sql.in
          fi
          # Check if we need to complete a clone by starting replication.
          if [[ -f change_master_to.sql.in ]]; then
            echo "Waiting for mysqld to be ready (accepting connections)"
            until mysql -h 127.0.0.1 -e "SELECT 1"; do sleep 1; done
            echo "Initializing replication from clone position"
            mysql -h 127.0.0.1 \
                  -e "$(<change_master_to.sql.in), \
                      MASTER_HOST='mysql-0.mysql', \
                      MASTER_USER='root', \
                      MASTER_PASSWORD='', \
                      MASTER_CONNECT_RETRY=10; \
                      START SLAVE;" || exit 1
            # In case of container restart, attempt this at-most-once.
            mv change_master_to.sql.in change_master_to.sql.orig
          fi
          # Start a server to send backups when requested by peers.
          exec ncat --listen --keep-open --send-only --max-conns=1 3307 -c \
            "xtrabackup --backup --slave-info --stream=xbstream --host=127.0.0.1 --user=root"
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
          subPath: mysql
        - name: conf
          mountPath: /etc/mysql/conf.d
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
      volumes:
      - name: conf
        emptyDir: {}
      - name: config-map
        configMap:
          name: mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 2Gi
This specification is quite impressive, mainly because of the shell scripts it embeds, which are opaque at first glance. Let's go through its main elements:
- the value of the .spec.replicas key indicates that 3 Pods will be launched. Because of the nature of a StatefulSet, these Pods are launched sequentially, each one starting only once the previous Pod is active. Each Pod gets a name made of the StatefulSet name followed by an integer incremented for each new Pod:
  - the first Pod, mysql-0, will be the master Pod
  - the following ones, mysql-1 and mysql-2, will be slave Pods
- the .spec.volumeClaimTemplates key defines the storage requested for each Pod, here 2Gi, which will be provisioned using the default StorageClass
- the Pod specification is defined under the .spec.template.spec key
- 2 volumes are defined for each Pod:
  - the first is based on the ConfigMap created earlier, which contains both the master and slave configurations
  - the second is an emptyDir volume (an empty directory created on the host machine) into which the relevant configuration (master or slave) is copied, then read when the MySQL process starts
- the Pod specification contains 2 container lists: .spec.template.spec.initContainers and .spec.template.spec.containers
- the initContainers are responsible for preparing the environment. This list contains 2 containers:
  - init-mysql determines, from the Pod's ordinal (0, 1, or 2), which configuration to take from the ConfigMap and copies it to the emptyDir volume. If the ordinal is 0 (master case), master.cnf is copied; if it is 1 or 2 (slave case), slave.cnf is copied
  - clone-mysql clones the data from the Pod created just before the current one. This is only done for slave Pods
- the containers list holds the containers launched once the initContainers have completed. It also contains 2 containers:
  - mysql runs one of the cluster's MySQL processes, using the configuration copied into the conf volume by the init-mysql container
  - xtrabackup sets up the backup and replication machinery
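The server-id derivation performed by init-mysql can be tried locally with bash, using a hypothetical Pod name in place of the hostname a real Pod would report:

```shell
# Simulate the hostname a Pod would report inside the cluster (hypothetical).
hostname=mysql-2
# Extract the ordinal suffix, exactly as init-mysql does.
[[ $hostname =~ -([0-9]+)$ ]] || exit 1
ordinal=${BASH_REMATCH[1]}
# Offset by 100 to avoid the reserved server-id=0 value.
echo "server-id=$((100 + $ordinal))"   # prints "server-id=102"
```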
Copy this specification into the mysql-statefulset.yaml file, then create the resource with the following command:
kubectl apply -f mysql-statefulset.yaml
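You can wait until all 3 replicas are ready before moving on:

```shell
# Blocks until the StatefulSet reports all of its replicas as ready.
kubectl rollout status statefulset mysql
```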
Verification
We can see that within a few tens of seconds the 3 Pods, mysql-0, mysql-1, and mysql-2, are created.
Note: the --watch option allows you to observe the creation of these Pods interactively:
$ kubectl get pods -l app=mysql --watch
NAME READY STATUS RESTARTS AGE
mysql-0 1/2 Running 0 59s
mysql-0 2/2 Running 0 76s
mysql-1 0/2 Pending 0 0s
mysql-1 0/2 Pending 0 0s
mysql-1 0/2 Pending 0 5s
mysql-1 0/2 Init:0/2 0 5s
mysql-1 0/2 Init:1/2 0 29s
mysql-1 0/2 Init:1/2 0 44s
mysql-1 0/2 PodInitializing 0 62s
mysql-1 1/2 Running 0 63s
mysql-1 2/2 Running 0 67s
mysql-2 0/2 Pending 0 0s
mysql-2 0/2 Pending 0 0s
mysql-2 0/2 Pending 0 4s
mysql-2 0/2 Init:0/2 0 4s
mysql-2 0/2 Init:1/2 0 28s
mysql-2 0/2 Init:1/2 0 38s
mysql-2 0/2 PodInitializing 0 58s
mysql-2 1/2 Running 0 59s
mysql-2 2/2 Running 0 64s
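To confirm that a slave is actually replicating from mysql-0, you can inspect its replication status (the grep pattern here is just one way of narrowing the output):

```shell
# Slave_IO_Running and Slave_SQL_Running should both report "Yes".
kubectl exec mysql-1 -c mysql -- \
  mysql -h 127.0.0.1 -e "SHOW SLAVE STATUS\G" | grep -E "Running|Master_Host"
```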
Write Test
We will now launch a Pod running a container based on the mysql image and from this Pod perform the following actions:
- connection to the master Pod
- creation of the test database
- creation of the messages table
- adding a record to this table
Use the following command:
$ kubectl run mysql-client-write --image=mysql:5.7 -i --rm --restart=Never -- \
mysql -h mysql-0.mysql <<EOF
CREATE DATABASE test;
CREATE TABLE test.messages (message VARCHAR(250));
INSERT INTO test.messages VALUES ('hello');
EOF
Read Test
To verify that the previous record was created successfully, we will proceed in the same way and launch a Pod running a container based on the mysql image and from this Pod perform the following actions:
- connection to the mysql-read Service (which load balances read requests across the cluster's Ready Pods)
- retrieving the records from the messages table in the test database
$ kubectl run mysql-client-read --image=mysql:5.7 -i -t --rm --restart=Never -- \
mysql -h mysql-read -e "SELECT * FROM test.messages"
+---------+
| message |
+---------+
| hello |
+---------+
The command result shows that the data written from the master is indeed available for reading from a slave.
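To observe the load balancing performed by mysql-read, a convenient trick (taken from the official tutorial) is to query @@server_id in a loop; successive lines may show different server IDs (101, 102, or 103):

```shell
# Each iteration may hit a different backend; stop with Ctrl+C.
kubectl run mysql-client-loop --image=mysql:5.7 -i -t --rm --restart=Never -- \
  bash -ic "while sleep 1; do mysql -h mysql-read -e 'SELECT @@server_id,NOW()'; done"
```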
Storage
If we list the PersistentVolumeClaims and PersistentVolumes, we can see that a PVC was created for each Pod and a PV was dynamically provisioned and associated with this PVC.
Note: the StorageClass used here is named standard, it’s the default StorageClass on a Minikube cluster. However, the default storage class in your cluster may have a different name.
$ kubectl get pvc,pv
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/data-mysql-0 Bound pvc-abded038-c28d-4fe6-8d8e-340e5cc745f7 2Gi RWO standard 8m11s
persistentvolumeclaim/data-mysql-1 Bound pvc-97587c8e-dfdf-448a-af12-75d6e1e5d770 2Gi RWO standard 7m23s
persistentvolumeclaim/data-mysql-2 Bound pvc-93aa73c9-a396-41ad-bb91-ec690bd00674 2Gi RWO standard 7m5s
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/pvc-93aa73c9-a396-41ad-bb91-ec690bd00674 2Gi RWO Delete Bound default/data-mysql-2 standard 7m5s
persistentvolume/pvc-97587c8e-dfdf-448a-af12-75d6e1e5d770 2Gi RWO Delete Bound default/data-mysql-1 standard 7m23s
persistentvolume/pvc-abded038-c28d-4fe6-8d8e-340e5cc745f7 2Gi RWO Delete Bound default/data-mysql-0 standard 8m11s
Cleanup
kubectl delete statefulset/mysql
kubectl delete -f headless-svc.yaml -f mysql-svc.yaml -f cm.yaml
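Note that deleting the StatefulSet leaves the PersistentVolumeClaims (and the data they hold) behind; delete them explicitly if you want to reclaim the storage:

```shell
# PVCs created from volumeClaimTemplates are not garbage collected
# with the StatefulSet; remove them by name.
kubectl delete pvc data-mysql-0 data-mysql-1 data-mysql-2
```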
Summary
Here we’ve seen how to set up a MySQL cluster using a StatefulSet. As you can see, this resource is more complex than those used to manage stateless workloads, i.e. workloads that have no data to manage. In practice, database clusters and other complex applications are often managed with Operators, controllers in which all of the application's management logic is encoded.