Exercise

In this exercise, we illustrate multi-stage builds.

Reminder

A Dockerfile contains the list of instructions used to create an image. The first instruction is FROM, which defines the base image. This base image often contains many elements (binaries and libraries) that the final application doesn’t need (a compiler, …). These extra elements increase the image size and also widen its attack surface, which weakens security. This is where multi-stage builds come in…

An HTTP server written in Go

Let’s take the example of the following program written in Go.

In a new directory, create the file http.go containing the following code. It defines a simple HTTP server that listens on port 8080 and answers requests on the /whoami endpoint. For each request, it returns the hostname of the machine on which it runs (inside a container, this is the container’s hostname).

http.go
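package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

// handler writes the hostname of the machine (or container) serving the request.
func handler(w http.ResponseWriter, req *http.Request) {
	host, err := os.Hostname()
	if err != nil {
		io.WriteString(w, "unknown")
		return
	}
	io.WriteString(w, host)
}

func main() {
	http.HandleFunc("/whoami", handler)
	// ListenAndServe only returns on failure (for example, port already in use).
	log.Fatal(http.ListenAndServe(":8080", nil))
}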

Traditional Dockerfile

To create an image for this application, first create the Dockerfile with the following content (place this file in the same directory as http.go):

Dockerfile
FROM golang:1.17
WORKDIR /go/src/app
COPY http.go .
RUN go mod init
RUN CGO_ENABLED=0 GOOS=linux go build -o http .
CMD ["./http"]

Note: this Dockerfile uses the official golang image as the base image, copies the source file http.go into it, then compiles it.

You can then build the image and name it whoami:1.0:

docker image build -t whoami:1.0 .

List the images and note the size of the whoami:1.0 image:

docker image ls whoami

You should get the following result:

REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
whoami       1.0       16795cf36deb   2 seconds ago   962MB

The resulting image is very large because it contains the entire Go toolchain. However, once the binary has been compiled, the compiler is no longer needed in the final image.

Dockerfile using multi-stage build

A multi-stage build splits the build process into several stages within a single Dockerfile. Each stage can reuse artifacts (compilation results, web assets, …) created during previous stages. Such a Dockerfile has multiple FROM instructions, one per stage, but by default only the last stage ends up in the final image.

For the HTTP server above, we can first compile the source code using the golang image, which contains the compiler. Once the binary is created, we can start from an empty base image, named scratch, and copy the previously generated binary into it.

Replace the content of the Dockerfile with the following instructions:

Dockerfile
FROM golang:1.17 AS build
WORKDIR /go/src/app
COPY http.go .
RUN go mod init
RUN CGO_ENABLED=0 GOOS=linux go build -o http .

FROM scratch
COPY --from=build /go/src/app/http /http
CMD ["/http"]
Note: the example used here is based on an application written in Go. Go can compile a program into a static binary (here with CGO_ENABLED=0), meaning the binary doesn’t need to be linked to external libraries at run time. This is why we can start from the scratch image. For other languages, the base image used in the last stage of the build may be different (alpine, …).
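To make the idea of a static binary more concrete, here is a minimal sketch, assuming a Linux ELF binary, that lists the shared libraries a binary declares as dependencies using the standard debug/elf package. The helper name importedLibraries is hypothetical. An empty list suggests the binary can run on an image without any libraries, such as scratch.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// importedLibraries (hypothetical helper) returns the shared libraries that
// the ELF binary at path declares as dependencies. A binary without a
// dynamic section yields an empty list.
func importedLibraries(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

func main() {
	// Inspect the binary given as first argument, or this program itself.
	path, err := os.Executable()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	libs, err := importedLibraries(path)
	if err != nil {
		// For example, the file is missing or is not an ELF binary.
		fmt.Println("error:", err)
		return
	}
	if len(libs) == 0 {
		fmt.Println(path, "declares no shared library dependencies")
		return
	}
	fmt.Println(path, "depends on:", libs)
}
```

Pointed at the http binary built with CGO_ENABLED=0, this check would be expected to report no dependencies. The result for other binaries depends on how they were built.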

Build the image and tag it whoami:2.0 with the following command:

docker image build -t whoami:2.0 .

List the images and observe the size difference between them:

docker image ls whoami

You should get the following result:

REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
whoami       2.0       0a97315aeaaa   6 seconds ago   6.07MB
whoami       1.0       16795cf36deb   2 minutes ago   962MB

Launch a container based on the whoami:2.0 image:

docker container run -p 8080:8080 whoami:2.0

Using the curl command, send a GET request to the exposed endpoint. You should get, in return, the identifier of the container that processed the request.

curl localhost:8080/whoami

For this simple application, the multi-stage build removed the binaries and libraries that are unnecessary in the final image. An application written in Go is an extreme example, but multi-stage builds are a best practice worth adopting for many programming languages.