ARROW-12112: [CI] Reduce footprint of conda-integration image #9891
da2d253 to 72b9f03
Does this really help? Docker layers are stacked on top of each other, so this only "hides" the files; it does not remove them from disk.
Ah... That's a good question. But I assume image layers are compressed? At least this addresses the uncompressed container size.
Once you install something in a layer, it gets added to the total image size. This is why we should prefer doing the install and cleanup steps in the same layer.
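To make the layer argument above concrete, here is a minimal toy model in Python. It is only a sketch: the `Layer` class, the paths, and the byte counts are all hypothetical, compression is ignored, and it does not reflect how Docker actually stores layers on disk. It only illustrates why deleting files in a later layer shrinks the merged container filesystem but not the stored image.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Toy model of one image layer (hypothetical, not Docker's real format)."""
    added: dict                       # path -> bytes written by this layer
    removed: frozenset = frozenset()  # paths from lower layers hidden by this layer


def image_size(layers):
    """Stored size: every layer's additions count in full; deletions free nothing."""
    return sum(sum(layer.added.values()) for layer in layers)


def container_size(layers):
    """Size of the merged filesystem a container sees (the uncompressed view)."""
    visible = {}
    for layer in layers:
        for path in layer.removed:
            visible.pop(path, None)
        visible.update(layer.added)
    return sum(visible.values())


# Install in one layer, clean up in a later one (made-up paths and sizes).
separate = [
    Layer({"/opt/pkgs/cache.tar": 500, "/opt/bin/tool": 100}),
    Layer({}, frozenset({"/opt/pkgs/cache.tar"})),
]
# Install and clean up in the same layer: only the net result is recorded.
combined = [Layer({"/opt/bin/tool": 100})]

print(image_size(separate), container_size(separate))  # 600 100
print(image_size(combined), container_size(combined))  # 100 100
```

In this model the later cleanup reduces what the container sees, but the stored image shrinks only when the cleanup happens in the layer that did the install.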
In this case, the install was done in conda-cpp...
We should check whether we need the clangdev and llvmdev packages. I would guess that llvm would be sufficient as we link dynamically. Not sure what part of clang(dev) we need.
Instead of focusing on shrinking the Docker images, I'd rather consider freeing up more space on the GitHub Actions hosted agents (if we can do so easily); see the space usage here: https://github.com/apache/arrow/pull/9814/checks?check_run_id=2205470543#step:4:83
The GHA hosted agents are provisioned using Packer; here is the configuration for Ubuntu 20.04: https://github.com/actions/virtual-environments/blob/main/images/linux/toolsets/toolset-2004.json
Presumably the Android SDK consumes a lot of space, but there are other toolsets we don't use either.
Well, we can do both. The size of Docker images and containers also affects local development.
I meant that deleting is probably easier and quicker; a reduced image size would be better, of course.
Hmm, the problem is that the delete step can take 1 minute if I remove the Android toolkit...
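One way to weigh the trade-off mentioned above (space freed versus time spent deleting) is to measure both for a candidate directory. The following is a minimal sketch using only the Python standard library; `dir_size` and `remove_and_report` are hypothetical helper names, and the demo runs on a throwaway temporary directory rather than on any real toolkit path.

```python
import shutil
import tempfile
import time
from pathlib import Path


def dir_size(path):
    """Total size in bytes of the regular files under path."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def remove_and_report(path):
    """Delete a directory tree; return (bytes_freed, seconds_taken)."""
    size = dir_size(path)
    start = time.perf_counter()
    shutil.rmtree(path)
    elapsed = time.perf_counter() - start
    return size, elapsed


# Demo on a temporary directory (stand-in for an unused toolset directory).
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "blob.bin").write_bytes(b"\0" * 4096)
freed, seconds = remove_and_report(demo_dir)
print(f"freed {freed} bytes in {seconds:.3f}s")
```

Running this against each candidate directory would show which deletions free the most space per second spent.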
ba11008 to 5ca8677
* Reduce conda packages footprint
* Make Rust install more minimal
* Free up more space on Github Actions
5ca8677 to 24e592d
I think I addressed your review comments @kszucs
kszucs left a comment
Thanks @pitrou!
The build failures are unrelated, merging.
This seems to save about 2GB in the image's root directory.