Skip to content

{Packaging} Optimize Linux package and docker image by removing py file#25801

Open
bebound wants to merge 40 commits intoAzure:devfrom
bebound:trim_package
Open

{Packaging} Optimize Linux package and docker image by removing py file#25801
bebound wants to merge 40 commits intoAzure:devfrom
bebound:trim_package

Conversation

@bebound
Copy link
Contributor

@bebound bebound commented Mar 14, 2023

Description

We have used trim_sdk.py and only keep pyc file on Windows. Apply them on Linux and docker to reduce package size.

Trimming SKD is #26172
This is the second part, removing py file on Linux.

Final result:
Ubuntu 22.04 installed size 576 MB -> 326 MB, become 56% of original size.
RHEL 9 installed size 403 MB -> 172 MB, become 43% of original size.
Docker image size 956 MB -> 714 MB, become 75% of original size.

History Notes

[Component Name 1] BREAKING CHANGE: az command a: Make some customer-facing breaking change
[Component Name 2] az command b: Add some customer-facing feature


This checklist is used to make sure that common guidelines for a pull request are followed.

@azure-client-tools-bot-prd
Copy link

azure-client-tools-bot-prd bot commented Mar 14, 2023

️✔️acr
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️acs
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️advisor
️✔️latest
️✔️3.10
️✔️3.9
️✔️ams
️✔️latest
️✔️3.10
️✔️3.9
️✔️apim
️✔️latest
️✔️3.10
️✔️3.9
️✔️appconfig
️✔️latest
️✔️3.10
️✔️3.9
️✔️appservice
️✔️latest
️✔️3.10
️✔️3.9
️✔️aro
️✔️latest
️✔️3.10
️✔️3.9
️✔️backup
️✔️latest
️✔️3.10
️✔️3.9
️✔️batch
️✔️latest
️✔️3.10
️✔️3.9
️✔️batchai
️✔️latest
️✔️3.10
️✔️3.9
️✔️billing
️✔️latest
️✔️3.10
️✔️3.9
️✔️botservice
️✔️latest
️✔️3.10
️✔️3.9
️✔️cdn
️✔️latest
️✔️3.10
️✔️3.9
️✔️cloud
️✔️latest
️✔️3.10
️✔️3.9
️✔️cognitiveservices
️✔️latest
️✔️3.10
️✔️3.9
️✔️config
️✔️latest
️✔️3.10
️✔️3.9
️✔️configure
️✔️latest
️✔️3.10
️✔️3.9
️✔️consumption
️✔️latest
️✔️3.10
️✔️3.9
️✔️container
️✔️latest
️✔️3.10
️✔️3.9
️✔️core
️✔️2018-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️cosmosdb
️✔️latest
️✔️3.10
️✔️3.9
️✔️databoxedge
️✔️2019-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️dla
️✔️latest
️✔️3.10
️✔️3.9
️✔️dls
️✔️latest
️✔️3.10
️✔️3.9
️✔️dms
️✔️latest
️✔️3.10
️✔️3.9
️✔️eventgrid
️✔️latest
️✔️3.10
️✔️3.9
️✔️eventhubs
️✔️latest
️✔️3.10
️✔️3.9
️✔️feedback
️✔️latest
️✔️3.10
️✔️3.9
️✔️find
️✔️latest
️✔️3.10
️✔️3.9
️✔️hdinsight
️✔️latest
️✔️3.10
️✔️3.9
️✔️identity
️✔️latest
️✔️3.10
️✔️3.9
️✔️iot
️✔️2019-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️keyvault
️✔️2018-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️kusto
️✔️latest
️✔️3.10
️✔️3.9
️✔️lab
️✔️latest
️✔️3.10
️✔️3.9
️✔️managedservices
️✔️latest
️✔️3.10
️✔️3.9
️✔️maps
️✔️latest
️✔️3.10
️✔️3.9
️✔️marketplaceordering
️✔️latest
️✔️3.10
️✔️3.9
️✔️monitor
️✔️latest
️✔️3.10
️✔️3.9
️✔️netappfiles
️✔️latest
️✔️3.10
️✔️3.9
️✔️network
️✔️2018-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️policyinsights
️✔️latest
️✔️3.10
️✔️3.9
️✔️privatedns
️✔️latest
️✔️3.10
️✔️3.9
️✔️profile
️✔️latest
️✔️3.10
️✔️3.9
️✔️rdbms
️✔️latest
️✔️3.10
️✔️3.9
️✔️redis
️✔️latest
️✔️3.10
️✔️3.9
️✔️relay
️✔️latest
️✔️3.10
️✔️3.9
️✔️resource
️✔️2018-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️role
️✔️latest
️✔️3.10
️✔️3.9
️✔️search
️✔️latest
️✔️3.10
️✔️3.9
️✔️security
️✔️latest
️✔️3.10
️✔️3.9
️✔️servicebus
️✔️latest
️✔️3.10
️✔️3.9
️✔️serviceconnector
️✔️latest
️✔️3.10
️✔️3.9
️✔️servicefabric
️✔️latest
️✔️3.10
️✔️3.9
️✔️signalr
️✔️latest
️✔️3.10
️✔️3.9
️✔️sql
️✔️latest
️✔️3.10
️✔️3.9
️✔️sqlvm
️✔️latest
️✔️3.10
️✔️3.9
️✔️storage
️✔️2018-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️synapse
️✔️latest
️✔️3.10
️✔️3.9
️✔️telemetry
️✔️2018-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9
️✔️util
️✔️latest
️✔️3.10
️✔️3.9
️✔️vm
️✔️2018-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2019-03-01-hybrid
️✔️3.10
️✔️3.9
️✔️2020-09-01-hybrid
️✔️3.10
️✔️3.9
️✔️latest
️✔️3.10
️✔️3.9

@ghost ghost requested review from jiasli, kairu-ms, wangzelin007 and yonzhan March 14, 2023 08:22
@ghost ghost added the Auto-Assign Auto assign by bot label Mar 14, 2023
@ghost ghost assigned jiasli Mar 14, 2023
@ghost ghost added this to the March 2023 (2023-04-04) milestone Mar 14, 2023
@ghost ghost added the Packaging label Mar 14, 2023
@yonzhan
Copy link
Collaborator

yonzhan commented Mar 14, 2023

Packaging

# Python image has build-in env $PYTHON_VERSION=3.10.10.
# `ARG PYTHON_VERSION="3.10"` works on ARM64, but it can't override the default value on AMD64.
RUN ./scripts/install_full.sh && python ./scripts/trim_sdk.py \
&& python ./scripts/use_pyc.py /usr/local/lib/python${PYTHON_VERSION:0:4}/site-packages/ \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to make the pyc change in a separate PR, just in case something goes wrong and we need to roll back.

@bebound bebound changed the title {Packaging} Trim Linux package and docker image {Packaging} Trim Linux package and docker image by removing py file Apr 18, 2023
@bebound bebound changed the title {Packaging} Trim Linux package and docker image by removing py file {Packaging} Optimize Linux package and docker image by removing py file Apr 18, 2023
# If pip's py files are also removed, the error is raised when installing some packages.
# See https://github.com/Azure/azure-cli/pull/25801 for details.
if os.path.join('site-packages', 'pip') in file:
continue
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing py file has side effects.
For example: /opt/az/bin/python3 -m pip install --no-cache pgcli==3.4.1

× pip subprocess to install build dependencies did not run successfully.
│ exit code: 2
╰─> [1 lines of output]
    /opt/az/bin/python3: can't open file '/opt/az/lib/python3.10/site-packages/pip/__pip-runner__.py': [Errno 2] No such file or directory
    [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

The MSI also keeps py files under pip folder.


for f in glob.glob(f'{folder}/**/__pycache__', recursive=True):
# Remove emtpy __pycache__ folder
shutil.rmtree(f)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to add -B in Linux entry point to prevent generating pyc in pip folder.

@sodul
Copy link

sodul commented Mar 7, 2025

This work is interesting but I would argue that this effort is a hack around the poor design choices for the Azure SDK releases. It does not solve the core problem which is that the SDKs have multiple copies per year since 2015 and that it is NOT sustainable and shows poor respect for Microsoft customers that have been struggling with this design choice for years.

Who is responsible for refusing to do releases with a single SDK version, which is a strong ask from many azure SDK users? They should spend some time having a direct conversation with the users and hear the complaints.

I can see that this removes 25 to 57% of the disk space but when over 90% of the SDK versions are useless, this would achieve much better results. Especially considering more versions are going to be added over time and the sizes will keep on growing at an increasing rate even after deleting the source files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Auto-Assign Auto assign by bot Packaging

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants

Comments