Exercise

In this exercise, you will create a Job to dump a MongoDB database. You will then create a CronJob to perform dumps at regular intervals.

  1. Creating a MongoDB Pod

In a file named mongo-pod.yaml, define the specification for a Pod named db based on the mongo:4.0 image, then create this Pod.

Note: you can also create this Pod using the imperative command kubectl run

  1. Exposing the MongoDB Database

In a file named mongo-svc.yaml, define the specification for a Service named db of type ClusterIP to expose the previous Pod inside the cluster. Then create this Service.

Note: MongoDB listens by default on port 27017

Note: you can also create this Service using the imperative command kubectl expose

  1. Adding a Label to One of the Cluster Nodes

In the following questions, you will run a Job to dump the previously created database and a CronJob to perform this action at regular intervals. To ensure that the different dumps are created on the same node’s filesystem, add the label app=dump to one of your cluster’s nodes:

kubectl label node NODE_NAME app=dump
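
To confirm that the label landed on the intended node, you can list the nodes filtered by that label. The snippet below is a minimal sketch: the labeled_nodes helper name is hypothetical (it is not part of the exercise), and it assumes kubectl is configured for your cluster.

```shell
# Hypothetical helper (not part of the exercise): print the names of the
# nodes that carry a given label. Defaults to the app=dump label used here.
labeled_nodes() {
  kubectl get nodes -l "${1:-app=dump}" -o name
}

# Example call, assuming kubectl is configured for your cluster:
#   labeled_nodes
```

Exactly one node should be listed; if several appear, the dumps may end up spread across different nodes.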

Note: in a production context, we would ensure the dump is sent directly to external storage (NFS, S3, …).

  1. Defining a Job to Dump the Database

In a file named mongo-dump-job.yaml, define the specification for a Job that launches a Pod based on mongo:4.0.

Use the nodeSelector property to deploy the Pod on the previously labeled node (you can use the kubectl explain ... command to learn how to define this property).

The Pod launched by the Job must also define a volume that persists data in the /dump directory of the node it runs on. Use the volumes field of the Pod specification:

volumes:
- name: dump
  hostPath:
    path: /dump

The mongo container of this Pod must mount this volume at /dump. Use the volumeMounts field of the mongo container specification:

volumeMounts:
- name: dump
  mountPath: /dump

Additionally, ensure that the Pod’s container runs the following command to create the /dump/db.gz file containing the database dump:

/bin/bash -c "mongodump --gzip --host db --archive=/dump/db.gz"

Note: this command uses the mongodump binary present in the mongo:4.0 image. The container will connect to the db service you launched previously.
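
When you translate this command into the container's command list, keep in mind that bash -c treats only its next argument as the script to run; any further arguments become positional parameters. The whole mongodump line therefore has to be passed as a single list item. The following local illustration uses echo in place of mongodump and needs no cluster:

```shell
# Without quoting, bash -c runs only "echo"; "hello" becomes $0 and is not printed.
unquoted="$(/bin/bash -c echo hello)"

# With the whole command line passed as one argument, bash runs it as intended.
quoted="$(/bin/bash -c 'echo hello')"

printf 'unquoted: [%s]\nquoted:   [%s]\n' "$unquoted" "$quoted"
```

The first result is empty and the second is hello, which is why a mongodump line split into separate arguments would run mongodump without any of its options.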

Then launch this Job and verify that the Pod launched by this Job ran correctly.

  1. Defining a CronJob to Perform Database Dumps at Regular Intervals

In a file named mongo-dump-cronjob.yaml, define the specification for a CronJob that performs a mongo dump every minute.

Use the nodeSelector property to deploy the Pod on the previously labeled node (you can use the kubectl explain ... command to learn how to define this property).

To preserve the different dumps, make the Pod’s container run the following command (this adds a timestamp to the generated dump filename):

/bin/bash -c 'mongodump --gzip --host db --archive=/dump/$(date +"%Y%m%dT%H%M%S")-db.gz'
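
To see what filename this produces, you can evaluate the same date expression in a local shell. This is only an illustration of the naming scheme; the stamp and archive variable names are chosen for this example.

```shell
# Build a dump filename the same way the CronJob command does.
stamp="$(date +"%Y%m%dT%H%M%S")"
archive="/dump/${stamp}-db.gz"

# Prints a path made of the date, a literal T, the time, and the -db.gz suffix.
echo "$archive"
```

Because the timestamp has one-second resolution and the CronJob runs once per minute, each run writes to a distinct file.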

Then launch this CronJob.

  1. Verifying the Dumps

Launch a test Pod, ensuring that it is scheduled on the node labeled app=dump and has access to that node’s /dump directory.

From a shell in this Pod, verify that the dumps have been created.

  1. Verifying the Dumps (Alternative Method)

Use the kubectl debug command to launch an alpine Pod on the node labeled app=dump, then verify from this Pod that the dumps have been created.

  1. Finally, delete the previous Job and CronJob.

Solution
  1. The following specification defines the db Pod based on mongo:4.0.
mongo-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: db
  labels:
    app: db
spec:
  containers:
  - name: mongo
    image: mongo:4.0

Copy this specification to mongo-pod.yaml and create the Pod:

kubectl apply -f mongo-pod.yaml

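Note: you can also use the following imperative command to create the db Pod. By default, kubectl run labels the Pod run=db, so the --labels flag is used here to set the app=db label expected by the Service selector in the next step:

kubectl run db --image=mongo:4.0 --labels=app=db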
  1. The following specification defines the db Service of type ClusterIP. This service exposes the previous Pod inside the cluster.
mongo-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: db
spec:
  selector:
    app: db
  type: ClusterIP
  ports:
  - port: 27017

Copy this specification to mongo-svc.yaml and create the Service:

kubectl apply -f mongo-svc.yaml

Note: you can also create this Service with the following imperative command:

kubectl expose pod/db --port 27017 --target-port 27017
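
Before moving on, it can be worth checking that the Service selector actually matches the Pod. The sketch below assumes the Pod carries the app=db label from mongo-pod.yaml; the check_db_service helper name is hypothetical.

```shell
# Hypothetical helper: show the db Service (the wide output includes its
# selector) and the Pods labeled app=db, so the two can be compared.
check_db_service() {
  kubectl get service db -o wide &&
    kubectl get pods -l app=db -o wide
}

# Example call, assuming kubectl is configured for your cluster:
#   check_db_service
```

If the second command lists no Pod, the Service has no backend and mongodump will not be able to reach the database through the db hostname.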
  1. The following specification defines a Job that performs the database dump.
mongo-dump-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: dump
spec:
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        app: dump
      containers:
      - name: mongo
        image: mongo:4.0
        command:
        - /bin/bash
        - -c
        - mongodump --gzip --host db --archive=/dump/db.gz
        volumeMounts:
        - name: dump
          mountPath: /dump
      volumes:
      - name: dump
        hostPath:
          path: /dump

Copy this specification to mongo-dump-job.yaml and create the Job:

kubectl apply -f mongo-dump-job.yaml

After a few seconds, you can verify that the Pod launched by the Job is in the Completed state:

$ kubectl get po
NAME         READY   STATUS      RESTARTS   AGE
dump-r5jg6   0/1     Completed   0          32s

You can also look at the Pod’s logs to confirm that the dump was performed correctly:

$ kubectl logs dump-r5jg6
2022-05-24T20:23:23.865+0000	writing admin.system.version to archive '/dump/db.gz'
2022-05-24T20:23:23.870+0000	done dumping admin.system.version (1 document)
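
Rather than polling kubectl get po by hand, you can block until the Job reports completion and then read its logs through the Job itself, which avoids looking up the generated Pod name. A minimal sketch, with a hypothetical helper name and an assumed default timeout of 60 seconds:

```shell
# Hypothetical helper: wait until the given Job (default: dump) completes,
# then print the logs of its Pod. The second argument is the timeout.
wait_for_dump_job() {
  kubectl wait --for=condition=complete "job/${1:-dump}" --timeout="${2:-60s}" &&
    kubectl logs "job/${1:-dump}"
}

# Example call, assuming kubectl is configured for your cluster:
#   wait_for_dump_job
```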
  1. The following specification defines a CronJob that dumps the database, reached through the Service named db, every minute.
mongo-dump-cronjob.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: dump
spec:
  schedule: "* * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            app: dump
          containers:
          - name: mongo
            image: mongo:4.0
            command:
            - /bin/bash
            - -c
            - mongodump --gzip --host db --archive=/dump/$(date +"%Y%m%dT%H%M%S")-db.gz
            volumeMounts:
            - name: dump
              mountPath: /dump
          restartPolicy: OnFailure
          volumes:
          - name: dump
            hostPath:
              path: /dump

Copy this specification to mongo-dump-cronjob.yaml and create the CronJob:

kubectl apply -f mongo-dump-cronjob.yaml
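
After a minute or two, you can check that the CronJob is scheduling Jobs. The sketch below prints the CronJob and then the Jobs in the current namespace, oldest first; the show_cronjob_runs helper name is hypothetical.

```shell
# Hypothetical helper: show the CronJob (default: dump) and the Jobs in the
# current namespace, sorted by creation time.
show_cronjob_runs() {
  kubectl get cronjob "${1:-dump}" &&
    kubectl get jobs --sort-by=.metadata.creationTimestamp
}

# Example call, assuming kubectl is configured for your cluster:
#   show_cronjob_runs
```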
  1. The following command launches the requested test Pod:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  nodeSelector:
    app: dump
  containers:
  - name: test
    image: alpine:3.15
    command:
    - "sleep"
    - "10000"
    volumeMounts:
    - name: dump
      mountPath: /dump
  volumes:
  - name: dump
    hostPath:
      path: /dump
EOF

Then launch an interactive shell in the container of this Pod:

kubectl exec -ti test -- sh

From this shell, you can observe the created dumps:

# ls /dump
20220524T202900-db.gz  20220524T203000-db.gz  20220524T203100-db.gz  db.gz
  1. Verifying the Dumps (Alternative Method)

The following command launches a debug Pod whose single alpine container runs in the host’s PID and network namespaces on node NODE_NAME (here, the node labeled app=dump). The node’s filesystem is automatically mounted in the container’s /host directory:

kubectl debug node/NODE_NAME -it --image=alpine

The dumps are present in /host/dump from the alpine container.

  1. The following command deletes the resources created during this exercise:
kubectl delete job/dump cj/dump po/test po/db svc/db
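
The command above leaves the app=dump label on the node, and the dump files written under /dump on that node also remain. If you want to remove the label as well, kubectl label accepts a key followed by a dash. A minimal sketch, with a hypothetical helper name:

```shell
# Hypothetical helper: remove the "app" label from the given node.
# The trailing dash in "app-" tells kubectl to delete the label.
remove_dump_label() {
  kubectl label node "${1:?usage: remove_dump_label NODE_NAME}" app-
}

# Example call, replacing NODE_NAME with the node you labeled earlier:
#   remove_dump_label NODE_NAME
```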