@pitrou pitrou commented Apr 5, 2021

This seems to save about 2GB in the image's root directory.

@pitrou pitrou requested a review from kszucs April 5, 2021 14:16
@pitrou pitrou force-pushed the ARROW-12191-conda-integration branch from da2d253 to 72b9f03 Compare April 5, 2021 15:07
github-actions bot commented Apr 5, 2021

Member

Does this really help? Docker layers are stacked on top of each other, so deleting files in a later layer only "hides" them; it does not remove them from disk.

Member Author

Ah... That's a good question. But I assume image layers are compressed? At least this addresses the uncompressed container size.

Member

Once you install something in a layer, it gets added to the total image size. This is why we should prefer to put the install and cleanup steps in the same layer.
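
To illustrate the point about layers, here is a minimal sketch in Python of a toy model, not Docker's actual storage format. It assumes that each layer stores whatever it adds, that a deletion in a later layer only hides a path from the merged view, and it ignores compression. The paths and sizes are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional

# Toy model (an assumption, not Docker's real on-disk format):
# a layer maps a path to its size in bytes, or to None for a deletion
# that hides the path from lower layers.
Layer = Dict[str, Optional[int]]


def image_size(layers: List[Layer]) -> int:
    """Bytes stored across all layers: added files count, deletions free nothing."""
    return sum(size for layer in layers for size in layer.values() if size is not None)


def container_size(layers: List[Layer]) -> int:
    """Bytes visible in the merged filesystem after applying layers in order."""
    visible: Dict[str, int] = {}
    for layer in layers:
        for path, size in layer.items():
            if size is None:
                visible.pop(path, None)  # deletion hides the path
            else:
                visible[path] = size
    return sum(visible.values())


GB = 1024 ** 3

# Hypothetical paths and sizes: install in one layer, clean up in a later one.
separate = [
    {"/opt/conda/pkgs/cache": 2 * GB, "/opt/conda/lib/libfoo.so": 1 * GB},
    {"/opt/conda/pkgs/cache": None},
]

# Install and clean up in the same layer: the cache is never stored.
combined = [
    {"/opt/conda/lib/libfoo.so": 1 * GB},
]

print(image_size(separate) // GB, container_size(separate) // GB)
print(image_size(combined) // GB, container_size(combined) // GB)
```

In this model, cleaning up in a later layer shrinks only the merged (container) view, while doing the install and cleanup in one layer shrinks the stored image as well.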

Member Author

In this case, the install was done in conda-cpp...

Member

We should check whether we need the clangdev and llvmdev packages. I would guess that llvm would be sufficient as we link dynamically. Not sure what part of clang(dev) we need.

Member

Instead of focusing on shrinking the Docker images, I'd rather consider freeing up more space on the GitHub Actions hosted agents (if we can do so easily). See the space usage here: https://github.com/apache/arrow/pull/9814/checks?check_run_id=2205470543#step:4:83

The GHA hosted agents are provisioned using Packer; here is the configuration for Ubuntu 20.04: https://github.com/actions/virtual-environments/blob/main/images/linux/toolsets/toolset-2004.json

Presumably the Android SDK consumes a lot of space, but there are other toolsets we don't use either.
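
One way to check which preinstalled toolsets take the most space on an agent is to measure candidate directories directly. The following is a minimal Python sketch under stated assumptions: the helper names `dir_size` and `largest_dirs` are hypothetical, and the directory paths in the usage section are guesses at where toolsets might live, not locations confirmed in this thread.

```python
import os
from typing import Iterable, List, Tuple


def dir_size(path: str) -> int:
    """Total size in bytes of regular files under path, without following symlinks."""
    total = 0
    # onerror swallows unreadable directories instead of raising
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def largest_dirs(candidates: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (path, size) pairs for existing directories, largest first."""
    sizes = [(p, dir_size(p)) for p in candidates if os.path.isdir(p)]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical candidate locations; adjust to what the agent actually contains.
    for path, size in largest_dirs(["/usr/local/lib/android", "/opt/hostedtoolcache"]):
        print(f"{size / 1024 ** 3:6.1f} GiB  {path}")
```

Directories that do not exist are skipped, so the same list of candidates can be reused across agent images.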

Member Author

Well, we can do both. The size of Docker images and containers also affects local development.

Member

I meant that deleting is probably easier and quicker; a reduced image size would be better, of course.

Member Author

Hmm, the problem is that the delete step can take 1 minute if I remove the Android toolkit...

@pitrou pitrou force-pushed the ARROW-12191-conda-integration branch 4 times, most recently from ba11008 to 5ca8677 Compare April 6, 2021 11:34
* Reduce conda packages footprint
* Make Rust install more minimal
* Free up more space on Github Actions
@pitrou pitrou force-pushed the ARROW-12191-conda-integration branch from 5ca8677 to 24e592d Compare April 6, 2021 17:25
pitrou commented Apr 6, 2021

I think I addressed your review comments @kszucs

@pitrou pitrou changed the title from "ARROW-12112: [CI] Reduce disk size of conda-integration image" to "ARROW-12112: [CI] Reduce footprint of conda-integration image" Apr 6, 2021
@kszucs kszucs left a comment

Thanks @pitrou !

kszucs commented Apr 7, 2021

The build failures are unrelated, merging.

@kszucs kszucs closed this in 0c02ff9 Apr 7, 2021
@pitrou pitrou deleted the ARROW-12191-conda-integration branch April 7, 2021 09:04