Add step to load pods in cache before building and testing #19551
Conversation
It occurred to me that, because of our step parallelism, we can run into situations where a commit that changes the `Podfile.lock` results in multiple steps in the same CI build installing the pods from scratch. For example, see this build, where all three IPA deployment steps spend 15+ minutes in the CocoaPods part of the build because there was no cached version of the pods available: https://buildkite.com/automattic/wordpress-ios/builds/10636. By introducing a step at the start of the pipeline, on which the following steps depend, to install the pods, we can make sure parallel steps won't waste time building the same new version of the Pods.
Force-pushed from c923ffc to d88a72e
You can test the changes in WordPress from this Pull Request by:
You can test the changes in Jetpack from this Pull Request by:
As you said in the PR description, I'm not entirely sure that this is worth it, especially if it adds ≈4 min to each build when there's a cache hit.
Besides, the solution you implemented in this PR wouldn't work, because independent steps can run on different agents, so we'd also need to upload and then download a tar of the Pods folder between those new steps for it to work.
One alternative approach one could try, though, would be to add a step similar to your ':cocoapods: Pre-load Pods' one, but that would instead:
- Only check whether the cache would hit or miss, ideally without having to set up Ruby and run the `pod` command. Like, maybe we can just compute the `Podfile` checksum like here, and check if there exists a cache for it on S3… but without downloading it.
- If there's a cache hit, do nothing (in particular, don't bother downloading and installing said cache in that step – we just needed the check here, not the actual Pods; the subsequent dependent steps – which might run on different agents – will just use the existing cache).
- If there's a cache miss, then set up Ruby, then install the (new) pods, then save and upload them to a new cache. The subsequent dependent steps would then just detect and use that cache automatically.

If we can make this work, the pre-check step we would add to validate that the Pods cache is up to date should only take a couple of seconds (compute the `Podfile` hash, then check if a cache entry for it exists, without downloading it) in the most common case of a cache hit, while still saving those duplicated ≈15 min on subsequent steps if there was a cache miss.
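A minimal sketch of what such a pre-check step could look like, assuming the cache is stored on S3 and keyed by the `Podfile.lock` checksum. Everything here – the `CACHE_BUCKET` variable, the key format, and the `preload_pods` helper names – is hypothetical and not the actual `bash-cache` implementation; `aws s3api head-object` is just one way to test for an object's existence without downloading it:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical cache key derived from the Podfile.lock checksum.
# (sha256sum on Linux, shasum on macOS.)
pods_cache_key() {
  local lockfile="$1" checksum
  checksum=$({ sha256sum "$lockfile" 2>/dev/null || shasum -a 256 "$lockfile"; } | cut -d ' ' -f 1)
  echo "pods-cache-${checksum}.tar.gz"
}

# Check whether a cache entry exists on S3 *without* downloading it.
# CACHE_BUCKET is an assumed environment variable, not a real setting.
pods_cache_exists() {
  aws s3api head-object --bucket "${CACHE_BUCKET}" --key "$1" > /dev/null 2>&1
}

# The pre-check step itself: do nothing on a hit; install and upload on a miss.
preload_pods() {
  local key
  key=$(pods_cache_key Podfile.lock)
  if pods_cache_exists "$key"; then
    echo "Pods cache hit for $key, nothing to do."
  else
    echo "Pods cache miss, installing pods and uploading a new cache."
    install_gems                 # bash-cache helper, per the discussion above
    bundle exec pod install
    tar -czf /tmp/pods.tar.gz Pods
    aws s3 cp /tmp/pods.tar.gz "s3://${CACHE_BUCKET}/$key"
  fi
}
```

On the common cache-hit path this only hashes one file and makes one HTTP HEAD-style request, which is where the "couple of seconds" estimate comes from.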
```yaml
# Make sure there is a cached version of the pods available before testing
# the app or deploying installable builds to avoid parallel steps wasting
# time building the same `Podfile.lock` from scratch.
- label: ':cocoapods: Pre-load Pods'
  key: pods
  command: .buildkite/commands/load-pods.sh
  env: *common_env
  plugins: *common_plugins
```
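For context, subsequent steps would gate on this one via its `pods` key using Buildkite's `depends_on`; a sketch, where the label and command are illustrative rather than taken from the actual pipeline:

```yaml
- label: ':hammer: Build for Testing'
  depends_on: pods   # wait for the ':cocoapods: Pre-load Pods' step above
  command: .buildkite/commands/build-for-testing.sh
  env: *common_env
  plugins: *common_plugins
```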
Does this really work? I thought that each step in a Buildkite pipeline was independent from the others in terms of the agents it might use, which means that this step downloading / installing pods might run on one agent, while the following steps that depend on it might run on a different one:
> However you add steps to your pipeline, keep in mind that steps may run on different agents. It is good practice to install your dependencies in the same step that you run them.

— Source: https://buildkite.com/docs/pipelines/defining-steps#adding-steps
So if we want this to work as expected, we'd also have to e.g. upload a `.tar` of the `Pods/` folder that got installed by this step as an artefact, then download and uncompress that `Pods.tar` back in the subsequent steps (similar to how we do it for build-for-testing uploading the Build Products here, then `unit-tests.sh` downloading them back here)
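That tar-through-artifacts dance could look roughly like this, using the real `buildkite-agent artifact upload` / `artifact download` commands; the function names and the `pods.tar.gz` file name are made up for illustration:

```shell
#!/bin/bash
set -euo pipefail

# Run in the pre-load step: archive the installed Pods/ folder and upload it
# as a build artifact so steps on other agents can fetch it.
save_pods_artifact() {
  tar -czf pods.tar.gz Pods
  buildkite-agent artifact upload pods.tar.gz
}

# Run in each subsequent step, possibly on a different agent: fetch the
# archive back from the build's artifacts and unpack it in place.
restore_pods_artifact() {
  buildkite-agent artifact download pods.tar.gz .
  tar -xzf pods.tar.gz
}
```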
So if we want this to work as expected, we'd also have to e.g...
That's what `install_cocoapods` does at the `bash-cache` plugin level.
Or, I should say, "`install_cocoapods` should be doing". I haven't gotten to the point of testing a build that would trigger a new cache generation. But I'm pretty sure it works, as `install_cocoapods` is tried and tested.
Oh, I didn't realize that `install_cocoapods` was already doing the whole cache dance! Your initial approach makes more sense now 🙂
So I guess the only thing left to solve is how to tell `install_cocoapods` to only install them (and thus ultimately rebuild the cache) if there was a cache miss… but not bother installing them during this check step in case of a cache hit.
@AliSoftware thank you for having a look here and validating my doubts on the ROI of the setup as it is right now. Your suggestion of doing a quick cache check at the start and only if we get a miss do the hard work of setting things up is exactly what I was thinking 😄 I'll have a look at some point.
Testing
I'll add a few example builds here if we decide to move forward with this idea
RFC – Is this worth it & How to improve it
Based on the crude build time numbers mentioned above, on a release CI build that doesn't have a cached `Pods/` available, this could save ~30 minutes: 15 minutes for the cache, then run 3 app builds with the cache available, vs. running 3 app builds each building the cache. The savings should be similar on the regular pipeline, where we have 2 installable build steps plus the one that builds the app for testing. We'll also soon have the step to validate the `.strings` generation, which will run CocoaPods, too.

However, on builds that do have a cache available, this will introduce a ~4 minute delay. That means we will have caught up with the saving on non-cached builds every 7–8 cached builds. I don't have stats but, intuitively, I don't think we change the `Podfile.lock` that frequently.

The time saved might even be less than ~30 minutes on a 3-step build because, let's not forget, each step will still need to download the cache.

So, I wonder whether this is worth pursuing. From one point of view, I "can't unsee" the time waste that occurs when the cache is not available. From the other, my back-of-the-napkin math points to the time saving becoming irrelevant once we take into account that `Podfile.lock` doesn't change that often.

Improvement ideas:

- … `install_gems` and `install_pods`. To do so, we'd have to extract the logic that checks that in a dedicated command at the `bash-cache` level.
- `Podfile.lock` doesn't change often in development builds, but the build-to-change ratio is much higher for release builds. The code freeze build will almost always change the pods. We always run at least two additional CI release builds: localization smoke testing and release finalization. On top of those, we may or may not have additional betas to ship. Even accounting for 3 additional betas, the number of CI release builds after the cache is still such that …. I need a break! I'll finish this later because I'm becoming more and more unsure of my math.
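The back-of-the-napkin break-even math above can be sanity-checked quickly, plugging in the estimates from this RFC (~30 min saved per cache miss, ~4 min of overhead per cache hit):

```shell
#!/bin/bash
set -eu

SAVED_PER_MISS=30    # ~minutes saved on a build that hits a cache miss
OVERHEAD_PER_HIT=4   # ~minutes added to every build with a cache hit

# Ceiling division: after this many cached builds, the accumulated
# overhead has eaten the savings of one non-cached build.
BREAK_EVEN=$(( (SAVED_PER_MISS + OVERHEAD_PER_HIT - 1) / OVERHEAD_PER_HIT ))
echo "break-even after ${BREAK_EVEN} cached builds"
```

which lands on 8 cached builds, consistent with the "every 7–8 cached builds" figure above.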
Regression Notes
… `RELEASE-NOTES.txt` if necessary. N.A.