
Add step to load pods in cache before building and testing#19551

Closed
mokagio wants to merge 1 commit into trunk from mokagio/pre-load-pods

Conversation

@mokagio
Contributor

@mokagio mokagio commented Nov 2, 2022

It occurred to me that, because of our step parallelism, we can run into situations where a commit that changes the Podfile.lock results in multiple steps in the same CI build installing the pods from scratch. For example, see this build, where all three IPA deployment steps spent 15+ minutes in the CocoaPods part of the build because no cached version of the pods was available:
https://buildkite.com/automattic/wordpress-ios/builds/10636.

By introducing a step at the start of the pipeline that installs the pods, and on which the following steps depend, we can make sure parallel steps won't waste time building the same new version of the Pods.
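In Buildkite terms, this maps to a step with a `key` that later steps reference via `depends_on`. A minimal sketch of the shape (the `Build and Test` step name and its command are placeholders for illustration, not the actual pipeline entries):

```yaml
# Hypothetical pipeline sketch: one step installs the pods, the others wait on it.
- label: ':cocoapods: Pre-load Pods'
  key: pods
  command: .buildkite/commands/load-pods.sh

- label: 'Build and Test'                         # placeholder step name
  depends_on: pods                                # won't start until the pods step finishes
  command: .buildkite/commands/build-and-test.sh  # placeholder command
```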

Testing

I'll add a few example builds here if we decide to move forward with this idea.

RFC – Is this worth it & How to improve it

Based on the crude build-time numbers mentioned above, on a release CI build that doesn't have a cached Pods/ available, this could save ~30 minutes: 15 minutes to build the cache once, then three app builds running with the cache available, versus three app builds each spending 15+ minutes building the cache themselves. The savings should be similar on the regular pipeline, where we have 2 installable build steps plus the one that builds the app for testing. We'll also soon have the step that validates the .strings generation, which will run CocoaPods too.

However, on builds that do have a cache available, this will introduce a ~4-minute delay.

That means the added delay will have caught up with the saving from non-cached builds every 7-8 cached builds. I don't have stats but, intuitively, I don't think we change the Podfile.lock that frequently.
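The break-even arithmetic can be written down explicitly (these are the rough estimates quoted above, not measured values):

```python
# Back-of-the-napkin break-even, using the rough numbers from the text above.
cache_miss_saving_min = 30  # minutes saved per build when there is no cache
cache_hit_cost_min = 4      # minutes added per build when the cache exists

# How many warm-cache builds it takes for the added delay to eat the saving:
break_even_builds = cache_miss_saving_min / cache_hit_cost_min
print(break_even_builds)  # 7.5, i.e. every 7-8 cached builds
```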

The time saved might even be less than ~30 minutes on a 3-step build because, let's not forget, each step will still need to download the cache.

So, I wonder whether this is worth pursuing. On one hand, I "can't unsee" the time waste that occurs when the cache is not available. On the other, my back-of-the-napkin math points to the time saving becoming irrelevant once we take into account that Podfile.lock doesn't change that often.

Improvement ideas:

  • I estimate we could make the pre-load step twice as fast by checking whether the cache is available before running install_gems and install_pods. To do so, we'd have to extract the logic that checks that into a dedicated command at the bash-cache level.
  • Podfile.lock doesn't change often in development builds, but the build-to-change ratio is much higher for release builds. The code freeze build will almost always change the pods. We always run at least two additional CI release builds: localization smoke testing and release finalization. On top of those, we may or may not have additional betas to ship. Even accounting for 3 additional betas, the number of CI release builds after the cache is still such that ....

I need a break! I'll finish this later because I'm becoming more and more unsure of my math.

Regression Notes

  1. Potential unintended areas of impact – N.A.
  2. What I did to test those areas of impact (or what existing automated tests I relied on) – N.A.
  3. What automated tests I added (or what prevented me from doing so) – N.A.

  • I have completed the Regression Notes.
  • I have considered adding unit tests for my changes. N.A.
  • I have considered adding accessibility improvements for my changes. N.A.
  • I have considered if this change warrants user-facing release notes and have added them to RELEASE-NOTES.txt if necessary. N.A.

@mokagio mokagio force-pushed the mokagio/pre-load-pods branch from c923ffc to d88a72e on November 2, 2022 01:51
@wpmobilebot
Contributor

You can test the changes in WordPress from this Pull Request by:
  • Clicking here or scanning the QR code to access App Center
  • Then installing build number pr19551-d88a72e on your iPhone

If you need access to App Center, please ask a maintainer to add you.

@wpmobilebot
Contributor

You can test the changes in Jetpack from this Pull Request by:
  • Clicking here or scanning the QR code to access App Center
  • Then installing build number pr19551-d88a72e on your iPhone

If you need access to App Center, please ask a maintainer to add you.

Contributor

@AliSoftware AliSoftware left a comment


As you said in the PR description, I'm not entirely sure this is worth it, especially if it adds ≈4 minutes to each build when there's a cache hit.

Besides, the solution you implemented in this PR wouldn't work, because independent steps can run on different agents, so we'd need to also upload then download a tar of the Pods folder in between those new steps for it to work.

One alternative approach one could try, though, would be to add a step similar to your :cocoapods: Pre-load Pods one, but that would instead:

  • Only check whether the cache would hit or miss, ideally without having to set up Ruby and run the pod command. Maybe we can just compute the Podfile checksum like here, and check whether there exists a cache for it on S3… but without downloading it.
  • If there's a cache hit, do nothing. In particular, don't bother downloading and installing said cache in that step; we just needed the check here, not the actual Pods. The subsequent dependent steps (which might run on different agents) will just use the existing cache.
  • If there's a cache miss, then set up Ruby, install the (new) pods, then save and upload them to a new cache. The subsequent dependent steps would then just detect and use that cache automatically.

If we can make this work, the precheck step we'd add to validate that the Pods cache is up to date should only take a couple of seconds in the most common case of a cache hit (compute the Podfile hash, then check if a cache entry for it exists, without downloading it), while still saving those duplicated ≈15 minutes on subsequent steps if there was a cache miss.
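A sketch of that cheap pre-check, assuming the cache lives in an S3 bucket and is keyed by the SHA-256 of Podfile.lock. The bucket variable, key format, and function names here are all made up for illustration; only `aws s3api head-object` is a real AWS CLI call:

```shell
# Hypothetical pre-check: detect a cache hit/miss without downloading anything.
compute_cache_key() {
  # Key the cache entry on the checksum of the lockfile
  sha256sum "$1" | cut -d ' ' -f 1
}

check_pods_cache() {
  local key
  key="$(compute_cache_key Podfile.lock)"
  # head-object fetches only metadata, so a hit costs seconds, not minutes
  if aws s3api head-object \
      --bucket "$CACHE_BUCKET" --key "pods-$key.tar" > /dev/null 2>&1; then
    echo "hit"
  else
    echo "miss"
  fi
}
```

On a miss, the step would then set up Ruby, install the pods, and upload the new cache entry; on a hit it exits after the metadata check, without touching the cache contents.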

Comment on lines +16 to +23
# Make sure there is a cached version of the pods available before testing
# the app or deploying installable builds to avoid parallel steps wasting
# time building the same `Podfile.lock` from scratch.
- label: ':cocoapods: Pre-load Pods'
key: pods
command: .buildkite/commands/load-pods.sh
env: *common_env
plugins: *common_plugins
Contributor


Does this really work? I thought that each step in a Buildkite pipeline was independent from the others in terms of the agent it might be using, which means that this step downloading / installing pods might run on one agent, while the following steps that depend on it might run on a different one:

However you add steps to your pipeline, keep in mind that steps may run on different agents. It is good practice to install your dependencies in the same step that you run them.
Source: https://buildkite.com/docs/pipelines/defining-steps#adding-steps

So if we want this to work as expected, we'd also have to e.g. upload a .tar of the Pods/ folder that got installed by this step as an artefact, then download and uncompress that Pods.tar back in the subsequent steps (similar to how build-for-testing uploads the Build Products here and unit-tests.sh downloads them back here).

Contributor Author


So if we want this to work as expected, we'd also have to e.g...

That's what install_cocoapods does at the bash-cache plugin level.

Contributor Author


Or, I should say, "what install_cocoapods should be doing". I haven't gotten to the point of testing a build that would trigger a new cache generation, but I'm pretty sure it works, as install_cocoapods is tried and tested.

Contributor


Oh, I didn't realize that install_cocoapods was already doing the cache dance! Your initial approach makes more sense now 🙂

So I guess the only thing left to solve is how to tell install_cocoapods to only install the pods (and thus ultimately rebuild the cache) if there was a cache miss… but not bother installing them during this check step in case of a cache hit.
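One hypothetical shape for that check-then-install flow. Every name here is a stand-in (the bash-cache plugin doesn't expose these commands under these names; extracting such a check is exactly the open work item), and the cache check is stubbed with an environment variable so the sketch is runnable:

```shell
# Stand-in for the cache check the plugin would need to expose.
pods_cache_hit() {
  # Stub: a real implementation would compare the Podfile.lock hash against S3
  [ "${FAKE_CACHE_STATE:-miss}" = "hit" ]
}

preload_pods() {
  if pods_cache_hit; then
    # Cache hit: skip Ruby setup and pod install entirely
    echo "cache hit: nothing to do"
  else
    # Cache miss: do the full install, which also rebuilds the cache
    echo "cache miss: installing pods and rebuilding the cache"
    # install_gems && install_pods would do the real work here
  fi
}
```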

@mokagio
Contributor Author

mokagio commented Nov 2, 2022

@AliSoftware thank you for having a look here and validating my doubts on the ROI of the setup as it is right now.

Your suggestion of doing a quick cache check at the start and only if we get a miss do the hard work of setting things up is exactly what I was thinking 😄

I'll have a look at some point.

@mokagio mokagio added the Tooling Build, Release, and Validation Tools label Nov 2, 2022
@mokagio mokagio added this to the Someday milestone Nov 2, 2022
@jkmassel jkmassel closed this Jun 21, 2024
@jkmassel jkmassel deleted the mokagio/pre-load-pods branch July 26, 2024 18:21