Multi-stage docker images

One requirement of DevOps work is to constantly optimize the build process. If you ship container images with Docker and your Dockerfiles aren't multi-stage, you're likely shipping unnecessary bloat to production. The bloat makes images large, and when you deploy to Kubernetes, every node that runs the workload has to pull and store that larger image. It also broadens the image's potential attack surface.

Almost any application, regardless of its type or language stack, has two kinds of dependencies: build-time and run-time.

The build-time dependencies (compilers, bundlers, test tools) are usually far more numerous than the run-time ones. Therefore, in most cases, we want only the run-time dependencies in the final image.

So let’s dive into how not to organize a Dockerfile in today’s world.

Node.js

FROM node:lts-slim

WORKDIR /app
COPY . .

RUN npm ci
RUN npm run build

ENV NODE_ENV=production
EXPOSE 3000

CMD ["node", "/app/.output/index.mjs"]

Notice what is wrong here: the final image ships everything that COPY . . brought in (the full source code) along with the whole node_modules directory, build-only dependencies included.

The Builder Pattern

The builder pattern describes the setup that many people use to build a container. It involves two Docker images:

  1. a "build-time" image with all the build tools installed, capable of creating production-ready application files.

  2. a "run-time" image capable of running the application.

So how does it work?

Here is what a "compound" Node.js application Dockerfile could look like:

# The "build" stage
FROM node:lts-slim AS build

WORKDIR /app
COPY . .

RUN npm ci
RUN npm run build

# The "runtime" stage
FROM node:lts-slim AS runtime

WORKDIR /app
COPY --from=build /app/.output ./.output

ENV NODE_ENV=production
EXPOSE 3000

CMD ["node", "/app/.output/index.mjs"]

Using the official terminology, every FROM instruction defines not an image but a stage, and technically the COPY happens --from a stage. However, as we saw above, thinking of stages as independent images is helpful for connecting the dots.
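To make the stage-versus-image distinction concrete, here is a minimal sketch of the two other things COPY --from accepts besides a stage name: a stage index and an external image reference. The nginx image and the /tmp destination are purely illustrative and not part of the application above.

```dockerfile
# Stage 0: unnamed, so later stages can only refer to it by index
FROM node:lts-slim
WORKDIR /app
COPY . .
RUN npm ci && npm run build

# Stage 1: the final image
FROM node:lts-slim
WORKDIR /app
# Copy from the stage at index 0 (same effect as naming it and using --from=<name>)
COPY --from=0 /app/.output ./.output
# Copy from an image that is not a stage in this file at all (illustrative only)
COPY --from=nginx:latest /etc/nginx/nginx.conf /tmp/nginx.conf
CMD ["node", "/app/.output/index.mjs"]
```

Named stages are usually the better choice, because an index silently changes meaning when a stage is inserted earlier in the file.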

Last but not least, when all stages and COPY --from=<stage> instructions are defined in one Dockerfile, the Docker build engine (BuildKit) can compute the right build order, skip unused stages, and execute independent stages concurrently.
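A small sketch of how that dependency graph plays out, assuming BuildKit is the builder. The stage names and the "lint" npm script are hypothetical; only "build" comes from the examples above.

```dockerfile
# Shared base: install dependencies once, cached until the lockfile changes
FROM node:lts-slim AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci

# Nothing in the final stage copies from "lint", so a default build skips it.
# It runs only when requested explicitly: docker build --target lint .
FROM deps AS lint
COPY . .
RUN npm run lint

# The final stage copies from "build", so "deps" and "build" run first
FROM deps AS build
COPY . .
RUN npm run build

FROM node:lts-slim AS runtime
WORKDIR /app
COPY --from=build /app/.output ./.output
ENV NODE_ENV=production
EXPOSE 3000
CMD ["node", "/app/.output/index.mjs"]
```

A plain docker build resolves runtime, then build, then deps, and never touches lint. Stages that do not depend on each other are the ones the engine is free to run in parallel.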

Go Language

FROM golang:1.23-alpine AS builder

# Install the packages whose files are copied into the final image
RUN apk add -U --no-cache ca-certificates tzdata git

# Use a valid GOPATH and copy the module files first,
# so the dependency layer is cached until go.mod or go.sum changes
WORKDIR /go/src/github.com/abc/abc
COPY go.mod go.sum ./
RUN go mod download
COPY . .

# Build a statically linked binary
RUN CGO_ENABLED=0 go build -a -installsuffix cgo -o abc .

# Use scratch as the production base for the smallest possible image
FROM scratch AS production
WORKDIR /
# Copy valid SSL certs from the builder for fetching github/gitlab/...
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
# Copy zoneinfo for getting the right cron timezone
COPY --from=builder /usr/share/zoneinfo /usr/share/zoneinfo
# Copy only the compiled binary (not the whole source directory) from the builder
COPY --from=builder /go/src/github.com/abc/abc/abc /abc/abc

ENTRYPOINT [ "/abc/abc" ]
CMD [ "/abc/conf.yml" ]

Rust

# Build stage
FROM rust:1.67 AS build

WORKDIR /usr/src/app

COPY . .
RUN cargo install --path .

# Runtime stage
FROM debian:bullseye-slim

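# "extra-runtime-dependencies" is a placeholder for the packages your binary needs
RUN apt-get update && \
    apt-get install -y extra-runtime-dependencies && \
    rm -rf /var/lib/apt/lists/*

# cargo install places the binary in /usr/local/cargo/bin (assumed to be named "app")
COPY --from=build /usr/local/cargo/bin/app /usr/local/bin/app

CMD ["app"]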

Java

# Base stage (reused by test and dev stages)
FROM eclipse-temurin:21-jdk-jammy AS base

WORKDIR /build

COPY --chmod=0755 mvnw mvnw
COPY .mvn/ .mvn/

# Test stage
FROM base AS test

WORKDIR /build

COPY ./src src/
RUN --mount=type=bind,source=pom.xml,target=pom.xml \
    --mount=type=cache,target=/root/.m2 \
    ./mvnw test

# Dependency stage: resolve and cache Maven dependencies
FROM base AS deps

WORKDIR /build

RUN --mount=type=bind,source=pom.xml,target=pom.xml \
    --mount=type=cache,target=/root/.m2 \
    ./mvnw dependency:go-offline -DskipTests

# Package stage: build the application JAR
FROM deps AS package

WORKDIR /build

COPY ./src src/
RUN --mount=type=bind,source=pom.xml,target=pom.xml \
    --mount=type=cache,target=/root/.m2 \
    ./mvnw package -DskipTests && \
    mv target/$(./mvnw help:evaluate -Dexpression=project.artifactId -q -DforceStdout)-$(./mvnw help:evaluate -Dexpression=project.version -q -DforceStdout).jar target/app.jar

# Extract stage: split the JAR into layers with Spring Boot layertools
FROM package AS extract

WORKDIR /build

RUN java -Djarmode=layertools -jar target/app.jar extract --destination target/extracted

# Development stage
FROM extract AS development

WORKDIR /build

RUN cp -r /build/target/extracted/dependencies/. ./
RUN cp -r /build/target/extracted/spring-boot-loader/. ./
RUN cp -r /build/target/extracted/snapshot-dependencies/. ./
RUN cp -r /build/target/extracted/application/. ./

ENV JAVA_TOOL_OPTIONS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:8000"

CMD [ "java", "-Dspring.profiles.active=postgres", "org.springframework.boot.loader.launch.JarLauncher" ]

# Runtime stage
FROM eclipse-temurin:21-jre-jammy AS runtime

ARG UID=10001

RUN adduser \
    --disabled-password \
    --gecos "" \
    --home "/nonexistent" \
    --shell "/sbin/nologin" \
    --no-create-home \
    --uid "${UID}" \
    appuser

USER appuser

COPY --from=extract build/target/extracted/dependencies/ ./
COPY --from=extract build/target/extracted/spring-boot-loader/ ./
COPY --from=extract build/target/extracted/snapshot-dependencies/ ./
COPY --from=extract build/target/extracted/application/ ./

EXPOSE 8080
ENTRYPOINT [ "java", "-Dspring.profiles.active=postgres", "org.springframework.boot.loader.launch.JarLauncher" ]

As a final tip, if you need to install additional packages with apt-get, avoid writing it like this:

RUN apt-get update  
RUN apt-get install -y curl vim  
RUN apt-get clean  
RUN rm -rf /var/lib/apt/lists/*

Do this instead. Chaining the commands in a single RUN minimizes the number of layers and ensures temporary files (e.g., the package cache) are removed within the same layer that created them, which keeps the image smaller and cleaner.

RUN apt-get update && apt-get install -y curl vim \
    && apt-get clean && rm -rf /var/lib/apt/lists/*