The multistage pattern splits the Docker image build into two or more stages. All stages are described in a single Dockerfile, each introduced by its own FROM instruction, and in our example they run one after the other. The main advantages are a smaller final image and a clear separation of concerns.

Here’s a minimal example to help us get started:

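# Stage 1: compile wheels with the full build toolchain available
FROM python:3.8-alpine AS builder
RUN apk update && apk add --upgrade alpine-sdk
# pip wheel . resolves against the working directory, so build from /app
WORKDIR /app
COPY . .
RUN pip install -U pip wheel setuptools
RUN pip wheel . -w /wheels/

# Stage 2: clean base image, no compiler installed
FROM python:3.8-alpine AS runner
COPY --from=builder /wheels /wheels
RUN pip install /wheels/*
COPY . /app
CMD ["python"]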

The first stage, the builder, installs alpine-sdk, a meta-package which in turn pulls in a compiler toolchain and quite a lot of other system dependencies needed for compilation. The next steps then build the Python wheels, including one for our own application.

In my example I’ve used the same base Docker image for both stages, but the builder could use a completely different base image from the one we use next in the runner. In general, though, you should avoid combining Alpine environments with Debian ones, for example. They use different C standard libraries (musl on Alpine, glibc on Debian), and there’s a good chance that packages compiled against one will not function correctly on the other.
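As a minimal sketch of a builder and a runner on different but compatible bases, the variant below assumes the Debian-based python:3.8 and python:3.8-slim tags, which both use glibc. It also assumes the full image already ships a compiler toolchain, which is why no extra system packages are installed; check that against your own dependencies.

```dockerfile
# Builder: full Debian-based image (glibc), assumed to include a compiler toolchain
FROM python:3.8 AS builder
WORKDIR /app
COPY . .
RUN pip wheel . -w /wheels/

# Runner: slim Debian-based image, same C library as the builder
FROM python:3.8-slim AS runner
COPY --from=builder /wheels /wheels
RUN pip install /wheels/*
CMD ["python"]
```

Wheels that link against shared system libraries may still need those libraries installed in the slim runner, so a matching C library is necessary but not always sufficient.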

The runner builds our final image. The process is lightweight: it simply installs the previously built packages and sets the command the container starts with. The trick here is the COPY --from=builder instruction, which lets us fetch content from the previous stage into the new, clean environment. This can mean significant image size savings and a simpler runner image, since the compiler toolchain never reaches it.
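One limitation of a plain COPY is that the wheel files themselves stay in a layer of the final image. As a sketch of an alternative (not what the example above does), a BuildKit bind mount can expose the builder's /wheels directory only for the duration of the install step. This assumes the image is built with BuildKit enabled.

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.8-alpine AS builder
RUN apk update && apk add --upgrade alpine-sdk
WORKDIR /app
COPY . .
RUN pip wheel . -w /wheels/

FROM python:3.8-alpine AS runner
# Mount /wheels from the builder stage for this step only, so the wheel
# files are not stored in the runner image.
# --no-index stops pip from falling back to PyPI for anything missing.
RUN --mount=type=bind,from=builder,source=/wheels,target=/wheels \
    pip install --no-index /wheels/*
CMD ["python"]
```

The trade-off is a dependency on BuildKit-specific syntax, so the COPY version remains the more portable choice.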

The builder stage is then left out of the result: only the runner is shipped as the final image, without the extra baggage associated with compiling.

Further build-time savings are possible if, for example, we push the built wheels to a private PyPI repository, or cache the whole Docker image whenever possible.
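To illustrate the private repository idea, here is a rough sketch in which the runner installs a prebuilt wheel straight from an index, so no builder stage is needed at all. The index URL and the package name my-app are hypothetical placeholders, and the sketch assumes the index already hosts wheels compatible with the Alpine base image.

```dockerfile
FROM python:3.8-alpine AS runner
# Hypothetical private index URL; override it with --build-arg at build time
ARG PIP_INDEX_URL=https://pypi.example.internal/simple/
# Build args are visible to RUN steps as environment variables,
# and pip reads its index URL from PIP_INDEX_URL.
# my-app is a placeholder for the application's package name.
RUN pip install my-app
CMD ["python"]
```

Build arguments can show up in the image history, so credentials for the index are better supplied some other way.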

Naming each stage with the AS keyword keeps this process nice and tidy, even though naming the final stage is not strictly required.
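For comparison, here is a sketch of the same build with unnamed stages, where COPY has to refer to the builder by its position in the file instead:

```dockerfile
FROM python:3.8-alpine
RUN apk update && apk add --upgrade alpine-sdk
WORKDIR /app
COPY . .
RUN pip wheel . -w /wheels/

FROM python:3.8-alpine
# Stage 0 is the first FROM in the file; inserting another stage above it
# would change which stage this index points to
COPY --from=0 /wheels /wheels
RUN pip install /wheels/*
CMD ["python"]
```

It works, but the numeric index is easy to break when stages are added or reordered, which is the main argument for naming them.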
