Exercise
In this exercise, you will create a Job to dump a MongoDB database. You will then create a CronJob to perform dumps at regular intervals.
- Creating a MongoDB Pod
In a file named mongo-pod.yaml, define the specification for a Pod named db based on the mongo:4.0 image, then create this Pod.
Note: you can also create this Pod using the imperative command kubectl run
- Exposing the MongoDB Database
In a file named mongo-svc.yaml, define the specification for a Service named db of type ClusterIP to expose the previous Pod inside the cluster. Then create this Service.
Note: MongoDB listens by default on port 27017
Note: you can also create this Service using the imperative command kubectl expose
- Adding a Label to One of the Cluster Nodes
In the following questions, you will run a Job to dump the previously created database and a CronJob to perform this action at regular intervals. To ensure that the different dumps are created on the same node’s filesystem, add the label app=dump to one of your cluster’s nodes:
kubectl label node NODE_NAME app=dump
Note: in a production context, we would ensure the dump is sent directly to external storage (NFS, S3, …).
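You can check that the label was applied correctly by listing the nodes that carry it:
kubectl get nodes -l app=dump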
- Defining a Job to Dump the Database
In a file named mongo-dump-job.yaml, define the specification for a Job that launches a Pod based on mongo:4.0.
Use the nodeSelector property to deploy the Pod on the previously labeled node (you can use the kubectl explain ... command to learn how to define this property).
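For example, the following command describes the nodeSelector field of a Pod specification:
kubectl explain pod.spec.nodeSelector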
The Pod launched by the Job must also define a volume to persist data in the /dump directory of the node it runs on. You will use the volumes field in the Pod specification:
volumes:
- name: dump
  hostPath:
    path: /dump
The mongo container of this Pod must mount this volume in its /dump directory. You will use the volumeMounts field in the mongo container specification:
volumeMounts:
- name: dump
  mountPath: /dump
Additionally, ensure that the Pod’s container runs the following command to create the /dump/db.gz file containing the database dump.
/bin/bash -c 'mongodump --gzip --host db --archive=/dump/db.gz'
Note: this command uses the mongodump binary present in the mongo:4.0 image. The container will connect to the db service you launched previously.
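Note: if you want to check beforehand that the db Service is reachable from within the cluster, one possible (optional) test is to run a throwaway Pod, here arbitrarily named mongo-test, that pings the database using the mongo shell shipped in the mongo:4.0 image:
kubectl run -it --rm mongo-test --image=mongo:4.0 --restart=Never -- mongo --host db --eval "db.runCommand({ ping: 1 })"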
Then launch this Job and verify that the Pod launched by this Job ran correctly.
- Defining a CronJob to Perform Database Dumps at Regular Intervals
In a file named mongo-dump-cronjob.yaml, define the specification for a CronJob that performs a mongo dump every minute.
Use the nodeSelector property to deploy the Pod on the previously labeled node (you can use the kubectl explain ... command to learn how to define this property).
To preserve the different dumps, make the Pod’s container run the following command (this adds a timestamp to the generated dump filename):
/bin/bash -c 'mongodump --gzip --host db --archive=/dump/$(date +"%Y%m%dT%H%M%S")-db.gz'
Then launch this CronJob.
- Verifying the Dumps
Launch a test Pod, ensuring it is scheduled on the node with the label app=dump and has access to the /dump directory of this node.
From a shell in this Pod, verify that the dumps have been created.
- Verifying the Dumps (Alternative Method)
Use the kubectl debug command to launch an alpine Pod on one of your cluster nodes.
- Finally, delete the resources created in this exercise (Job, CronJob, test Pod, db Pod, and Service).
Solution
- The following specification defines the db Pod based on mongo:4.0.
apiVersion: v1
kind: Pod
metadata:
  name: db
  labels:
    app: db
spec:
  containers:
  - name: mongo
    image: mongo:4.0
Copy this specification to mongo-pod.yaml and create the Pod:
kubectl apply -f mongo-pod.yaml
Note: you can also use the following imperative command to create the db Pod:
kubectl run db --image=mongo:4.0
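You can then check that the db Pod is in the Running state:
kubectl get po db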
- The following specification defines the db Service of type ClusterIP. This service exposes the previous Pod inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  selector:
    app: db
  type: ClusterIP
  ports:
  - port: 27017
Copy this specification to mongo-svc.yaml and create the Service:
kubectl apply -f mongo-svc.yaml
Note: you can also create this Service with the following imperative command:
kubectl expose pod/db --port 27017 --target-port 27017
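To confirm that the Service selects the db Pod, you can list its endpoints; the Pod's IP should appear on port 27017:
kubectl get endpoints db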
- The following specification defines a Job that performs the database dump.
apiVersion: batch/v1
kind: Job
metadata:
  name: dump
spec:
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        app: dump
      containers:
      - name: mongo
        image: mongo:4.0
        command:
        - /bin/bash
        - -c
        - mongodump --gzip --host db --archive=/dump/db.gz
        volumeMounts:
        - name: dump
          mountPath: /dump
      volumes:
      - name: dump
        hostPath:
          path: /dump
Copy this specification to mongo-dump-job.yaml and create the Job:
kubectl apply -f mongo-dump-job.yaml
After a few seconds, you can verify that the Pod launched by the Job is in the Completed state:
$ kubectl get po
NAME         READY   STATUS      RESTARTS   AGE
dump-r5jg6   0/1     Completed   0          32s
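You can also check the Job itself; its COMPLETIONS column should show 1/1:
kubectl get job dump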
We can also look at the Pod’s logs to confirm that the dump was performed correctly:
$ kubectl logs dump-r5jg6
2022-05-24T20:23:23.865+0000 writing admin.system.version to archive '/dump/db.gz'
2022-05-24T20:23:23.870+0000 done dumping admin.system.version (1 document)
- The following specification defines a CronJob that dumps, every minute, the database exposed by the db Service.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dump
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            app: dump
          containers:
          - name: mongo
            image: mongo:4.0
            command:
            - /bin/bash
            - -c
            - mongodump --gzip --host db --archive=/dump/$(date +"%Y%m%dT%H%M%S")-db.gz
            volumeMounts:
            - name: dump
              mountPath: /dump
          restartPolicy: OnFailure
          volumes:
          - name: dump
            hostPath:
              path: /dump
Copy this specification to mongo-dump-cronjob.yaml and create the CronJob:
kubectl apply -f mongo-dump-cronjob.yaml
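After a minute or two, you can check that the CronJob is scheduled and that it creates a new Job every minute:
kubectl get cronjob dump
kubectl get jobs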
- The following command launches the requested test Pod:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  nodeSelector:
    app: dump
  containers:
  - name: test
    image: alpine:3.15
    command:
    - "sleep"
    - "10000"
    volumeMounts:
    - name: dump
      mountPath: /dump
  volumes:
  - name: dump
    hostPath:
      path: /dump
EOF
Then launch an interactive shell in the container of this Pod:
kubectl exec -ti test -- sh
From this shell, you can observe the created dumps:
# ls /dump
20220524T202900-db.gz 20220524T203000-db.gz 20220524T203100-db.gz db.gz
- Verifying the Dumps (Alternative Method)
The following command launches a debug Pod whose single alpine container runs in the PID and network namespaces of node NODE_NAME; the node's filesystem is automatically mounted in the container's /host directory:
kubectl debug node/NODE_NAME -it --image=alpine
The dumps are present in /host/dump from the alpine container.
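For example, from the debug container's shell, listing this directory should show the db.gz archive and the timestamped dumps created earlier:
ls /host/dump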
- The following command removes the different created resources:
kubectl delete job/dump cj/dump po/test po/db svc/db