Description
Is your feature request related to a problem? Please describe.
The pulled image, mcr.microsoft.com/azure-cli, is about 1 GB, roughly 425 MB larger than it should be, because this step does not behave the way its author expected:
# Remove CLI source code from the final image and normalize line endings.
RUN rm -rf ./azure-cli && \
dos2unix /root/.bashrc /usr/local/bin/az
While running rm -rf ./azure-cli does, in fact, hide the directory from subsequent layers, it does not remove the directory or its bloat from the built image, because the earlier COPY layer that added it is still shipped. Proof of this can be seen by using a tool such as dive to inspect the image.
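To make the layering point concrete, here is a toy model in plain shell; it does not call Docker at all. Each directory stands in for one layer's tarball, and the `.wh.` prefix follows the whiteout-marker convention used by OCI image layers. The directory names (`layer_copy`, `layer_rm`) and the 1 MiB stand-in file are assumptions made up for this sketch.

```shell
#!/bin/sh
# Toy model of image layers: each directory stands in for one layer's tarball.
set -eu
work="$(mktemp -d)"
mkdir -p "$work/layer_copy/azure-cli" "$work/layer_rm"

# Layer 1 (COPY . /azure-cli): 1 MiB of stand-in "source code".
dd if=/dev/zero of="$work/layer_copy/azure-cli/source.bin" bs=1024 count=1024 2>/dev/null

# Layer 2 (RUN rm -rf ./azure-cli): only an empty whiteout marker is recorded.
: > "$work/layer_rm/.wh.azure-cli"

copy_bytes=$(wc -c < "$work/layer_copy/azure-cli/source.bin" | tr -d ' ')
rm_bytes=$(wc -c < "$work/layer_rm/.wh.azure-cli" | tr -d ' ')

# The image ships every layer, so its size is the sum of all of them.
total_bytes=$((copy_bytes + rm_bytes))

echo "COPY layer:  $copy_bytes bytes"
echo "rm layer:    $rm_bytes bytes"
echo "image total: $total_bytes bytes"
rm -rf "$work"
```

The deletion layer adds almost nothing, but it also subtracts nothing: the bytes added by the COPY layer are still part of the total.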
Describe the solution you'd like
Find a way to actually remove the no-longer-needed source code from the finished container.
The trickiness of this is compounded by the fact that this is Python and pip is involved. It's not as though a single binary is generated that could simply be built in a separate stage and copied to the final stage.
I found a solution for this locally, but I'm not sure how feasible it would be on your end considering it requires the Docker daemon to be running in experimental mode. The new Dockerfile would look something like this:
# syntax = docker/dockerfile:experimental
#---------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License. See License.txt in the project root for license information.
#---------------------------------------------------------------------------------------------
ARG PYTHON_VERSION="3.6.9"
FROM alpine:3.10 as azure-cli-source
WORKDIR azure-cli
COPY . /azure-cli
FROM python:${PYTHON_VERSION}-alpine3.10
ARG CLI_VERSION
# Metadata as defined at http://label-schema.org
ARG BUILD_DATE
LABEL maintainer="Microsoft" \
org.label-schema.schema-version="1.0" \
org.label-schema.vendor="Microsoft" \
org.label-schema.name="Azure CLI" \
org.label-schema.version=$CLI_VERSION \
org.label-schema.license="MIT" \
org.label-schema.description="The Azure CLI is used for all Resource Manager deployments in Azure." \
org.label-schema.url="https://docs.microsoft.com/cli/azure/overview" \
org.label-schema.usage="https://docs.microsoft.com/cli/azure/install-az-cli2#docker" \
org.label-schema.build-date=$BUILD_DATE \
org.label-schema.vcs-url="https://github.com/Azure/azure-cli.git" \
org.label-schema.docker.cmd="docker run -v \${HOME}/.azure:/root/.azure -it microsoft/azure-cli:$CLI_VERSION"
# bash gcc make openssl-dev libffi-dev musl-dev - dependencies required for CLI
# openssh - included for ssh-keygen
# ca-certificates
# curl - required for installing jp
# jq - we include jq as a useful tool
# pip wheel - required for CLI packaging
# jmespath-terminal - we include jpterm as a useful tool
# libintl and icu-libs - required by azure devops artifact (az extension add --name azure-devops)
RUN apk add --no-cache bash openssh ca-certificates jq curl openssl git zip \
&& apk add --no-cache --virtual .build-deps gcc make openssl-dev libffi-dev musl-dev linux-headers \
&& apk add --no-cache libintl icu-libs \
&& update-ca-certificates
ARG JP_VERSION="0.1.3"
RUN curl -L https://github.com/jmespath/jp/releases/download/${JP_VERSION}/jp-linux-amd64 -o /usr/local/bin/jp \
&& chmod +x /usr/local/bin/jp \
&& pip install --no-cache-dir --upgrade jmespath-terminal
WORKDIR azure-cli
# COPY . /azure-cli
# 1. Build packages and store in tmp dir
# 2. Install the cli and the other command modules that weren't included
# 3. Temporary fix - install azure-nspkg to remove import of pkg_resources in azure/__init__.py (to improve performance)
# RUN --mount=type=cache,from=azure-cli-source,target=/azure-cli /azure-cli/scripts/install_full.sh \
RUN --mount=type=cache,from=azure-cli-source,target=/azure-cli ./azure-cli/scripts/install_full.sh \
&& cat azure-cli/az.completion > ~/.bashrc \
&& runDeps="$( \
scanelf --needed --nobanner --recursive /usr/local \
| awk '{ gsub(/,/, "\nso:", $2); print "so:" $2 }' \
| sort -u \
| xargs -r apk info --installed \
| sort -u \
)" \
&& apk add --virtual .rundeps $runDeps \
&& rm -rf azure-cli/
WORKDIR /
# Remove CLI source code from the final image and normalize line endings.
RUN rm -rf ./azure-cli && \
dos2unix /root/.bashrc /usr/local/bin/az
CMD bash
The big changes are:
# syntax = docker/dockerfile:experimental
at the top of the Dockerfile, and then using the --mount flag in the RUN command:
RUN --mount=type=cache,from=azure-cli-source,target=/azure-cli ./azure-cli/scripts/install_full.sh \
&& cat azure-cli/az.completion > ~/.bashrc \
&& runDeps="$( \
scanelf --needed --nobanner --recursive /usr/local \
| awk '{ gsub(/,/, "\nso:", $2); print "so:" $2 }' \
| sort -u \
| xargs -r apk info --installed \
| sort -u \
)" \
&& apk add --virtual .rundeps $runDeps \
&& rm -rf azure-cli/
Doing this allows RUN to use the source code without a COPY and prevents it from being forever committed to an intermediate layer.
Building this version resulted in a functional image 586 MB in size.
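A possible variation on the same idea, offered as a sketch rather than something I have built against this repository: a bind mount of the build context instead of a cache mount from a helper stage. As far as I know, newer Dockerfile frontends accept `RUN --mount` under the stable `docker/dockerfile:1` syntax, which would avoid the experimental requirement. The fragment is meant to replace the cache-mount RUN step above. It assumes install_full.sh can be run from the repository root and that any files it writes into the source tree are not needed afterwards, since writes to an `rw` bind mount are discarded.

```dockerfile
# syntax=docker/dockerfile:1

# Mount the build context at /azure-cli for this step only.
# Nothing under the mount point is committed to the resulting layer.
# "rw" lets the script write into the tree; those writes are thrown away.
RUN --mount=type=bind,target=/azure-cli,rw \
    cd /azure-cli \
    && ./scripts/install_full.sh \
    && cat /azure-cli/az.completion > ~/.bashrc
```

With this form the separate azure-cli-source stage would no longer be needed, because the source never enters any stage's filesystem.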
Describe alternatives you've considered
If the source could be pulled down with a RUN instead of a COPY, it could be fetched, installed, and rm -rf'd in the same layer as the current install_full.sh command, so it would never be committed to the image.
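A rough sketch of what that alternative might look like, assuming the source is cloned from the repository named in the vcs-url label and that install_full.sh can be run from the repository root. It also assumes building from the remote repository is acceptable, which would not pick up uncommitted local changes. git is already installed by the earlier apk step.

```dockerfile
# Fetch, install, and delete the source inside a single RUN,
# so no layer ever contains the /azure-cli directory.
RUN git clone --depth 1 https://github.com/Azure/azure-cli.git /azure-cli \
    && cd /azure-cli \
    && ./scripts/install_full.sh \
    && cat /azure-cli/az.completion > ~/.bashrc \
    && cd / \
    && rm -rf /azure-cli
```

Because the clone and the rm -rf happen in the same step, the layer only records the filesystem state after the deletion.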
Additional context
It looks like the exact same thing is happening, at the very least, in Dockerfile.spot.
