Exercise
In this exercise, we will illustrate multi-stage builds.
Reminder
A Dockerfile contains the list of instructions used to create an image. The first instruction is generally FROM, which defines the base image. This base image often contains many elements (binaries and libraries) that the final application doesn't need, such as a compiler. These extras can significantly increase the image size and also weaken its security, since they considerably enlarge its attack surface. This is where multi-stage builds come in.
An HTTP server written in Go
Let’s take the example of the following program written in Go.
In a new directory, create the file http.go containing the following code. It defines a simple HTTP server that listens on port 8080 and exposes the /whoami endpoint for GET requests. For each request, it returns the hostname of the machine on which it runs.
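package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

// handler writes the hostname of the machine (or container) serving the request.
func handler(w http.ResponseWriter, req *http.Request) {
	host, err := os.Hostname()
	if err != nil {
		io.WriteString(w, "unknown")
		return
	}
	io.WriteString(w, host)
}

func main() {
	http.HandleFunc("/whoami", handler)
	// ListenAndServe only returns on failure, so report the error and exit.
	log.Fatal(http.ListenAndServe(":8080", nil))
}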
Traditional Dockerfile
To create an image for this application, first create the Dockerfile with the following content (place this file in the same directory as http.go):
FROM golang:1.17
WORKDIR /go/src/app
COPY http.go .
RUN go mod init
RUN CGO_ENABLED=0 GOOS=linux go build -o http .
CMD ["./http"]
Note: in this Dockerfile, the official golang image is used as the base image; the source file http.go is copied into it and then compiled.
You can then build the image and name it whoami:1.0:
docker image build -t whoami:1.0 .
List the images and note the size of the whoami:1.0 image:
docker image ls whoami
You should get the following result:
REPOSITORY   TAG   IMAGE ID       CREATED         SIZE
whoami       1.0   16795cf36deb   2 seconds ago   962MB
The resulting image is very large because it contains the entire Go toolchain. However, once the binary has been compiled, we no longer need the compiler in the final image.
Dockerfile using multi-stage build
A multi-stage build splits the build process into several stages within a single Dockerfile. Each stage can reuse artifacts (compilation results, web assets, and so on) created during previous stages. Such a Dockerfile has multiple FROM instructions, but only the last stage is used to produce the final image.
If we take the example of the HTTP server above, we can first compile the source code using the golang image, which contains the compiler. Once the binary is created, we can start from an empty base image, named scratch, and copy the previously generated binary into it. Because the binary is built with CGO_ENABLED=0, it is statically linked and needs no libraries from the base image.
Replace the content of the Dockerfile with the following instructions:
FROM golang:1.17 AS build
WORKDIR /go/src/app
COPY http.go .
RUN go mod init
RUN CGO_ENABLED=0 GOOS=linux go build -o http .
FROM scratch
COPY --from=build /go/src/app/http .
CMD ["./http"]
Build the image and name it whoami:2.0:
docker image build -t whoami:2.0 .
List the images and observe the size difference between them:
docker image ls whoami
You should get the following result:
REPOSITORY   TAG   IMAGE ID       CREATED         SIZE
whoami       2.0   0a97315aeaaa   6 seconds ago   6.07MB
whoami       1.0   16795cf36deb   2 minutes ago   962MB
Launch a container based on the whoami:2.0 image:
docker container run -p 8080:8080 whoami:2.0
From another terminal, use curl to send a GET request to the exposed endpoint. The response should be the identifier of the container that processed the request, since a container's hostname defaults to its short ID.
curl localhost:8080/whoami
For this simple application, the multi-stage build removed the binaries and libraries that are unnecessary in the final image, shrinking it from 962MB to 6.07MB. A Go application is an extreme example, because it compiles to a single static binary, but multi-stage builds are a best practice worth adopting for many programming languages.