fix: Restart buildkit after containerd when provisioning#461
fix: Restart buildkit after containerd when provisioning#461pendo324 merged 2 commits intorunfinch:mainfrom
Conversation
|
Seems like all of the errors on the tests are similar to this: Wondering if its because of the way the |
Been seeing some of these failures across a few different PRs. Potentially not related to your changes. |
I reran the CI on #455, and it didn't hit these errors. I think its something related to this one |
|
The tests are failing here on this PR and on the finch-core repo. The similarity? Both are now running with synced containerd/buildkit worker UUIDs. In other words, the "correct" behavior appears to be that these tests fail. Looking into when/why this started happening. |
|
I've pulled your changes down locally, verified that the UUIDs are matching, but can't reproduce the e2e test failures locally. These tests have been failing on finch-core since e2e tests were added. The difference between finch and finch-core e2e tests is that finch-core e2e 1/ runs a slightly different Fedora VM and 2/ runs tests with
True, but for some reason the Fedora VM used in finch-core e2e tests doesn't require this buildkit systemd service ordering change to enforce that UUID match... see https://github.com/runfinch/finch-core/blob/eaeaa96443a633d49d51143373650a93ab83a467/e2e/fedora.yaml#L148-L150 Here's my results using https://github.com/runfinch/finch-core/blob/main/e2e/fedora.yaml: The changes in this PR do what we expect, which is sync the UUIDs for buildkit and containerd. |
|
tl;dr for CI issues here is garbage collection labels not being set properly. I got to the bottom of it and opened an issue on nerdctl containerd/nerdctl#2372, but in reality this is a bug with buildkitv0.11.x . The bug has been patched in moby/buildkit#3972 which was included in buildkitv0.12.0 which was released just yesterday: https://github.com/moby/buildkit/releases/tag/v0.12.0. Opened up containerd/nerdctl#2374 to upgrade nerdctl's buildkit to v0.12.0. Let's keep this PR open until we have both a new nerdctl release with the upgrade and a new Lima release pointing to the new nerdctl release. |
|
Should be able to rebase and re-run CI now #521 has been merged with dependency upgrades |
Signed-off-by: Justin Alvarez <alvajus@amazon.com>
Signed-off-by: Justin Alvarez <alvajus@amazon.com>
19392e1 to
e36f90b
Compare
Just rebased 🤞 |
|
Benchmarking steps:
|
🤖 I have created a release *beep* *boop* --- ## [0.8.0](v0.7.0...v0.8.0) (2023-08-16) ### Features * adding config option for SOCI installation on VM ([#506](#506)) ([a2e077b](a2e077b)) ### Bug Fixes * configure aws creds in sync submodules/deps action ([#518](#518)) ([b67452e](b67452e)) * give pull request write permissions to sync job ([#520](#520)) ([55b5235](55b5235)) * give token write perms to sync-submodules ([#519](#519)) ([8b639ea](8b639ea)) * Mount /var/folders to finch vm ([#525](#525)) ([c97d2e9](c97d2e9)) * option to use installed lima for SOCI e2e tests ([#533](#533)) ([8b66659](8b66659)) * quote recursive calls to make ([#515](#515)) ([d603096](d603096)) * Restart buildkit after containerd when provisioning ([#461](#461)) ([fca1828](fca1828)) ### Build System or External Dependencies * **deps:** Bump github.com/docker/cli from 24.0.4+incompatible to 24.0.5+incompatible ([#495](#495)) ([e9e8617](e9e8617)) * **deps:** Bump github.com/docker/docker from 24.0.4+incompatible to 24.0.5+incompatible ([#497](#497)) ([6f1afbb](6f1afbb)) * **deps:** Bump github.com/lima-vm/lima from 0.16.0 to 0.17.2 ([#531](#531)) ([6e33d15](6e33d15)) * **deps:** Bump github.com/onsi/gomega from 1.27.8 to 1.27.10 ([#496](#496)) ([d08d102](d08d102)) * **deps:** Bump github.com/pkg/sftp from 1.13.5 to 1.13.6 ([#530](#530)) ([09b3846](09b3846)) * **deps:** Bump github.com/shirou/gopsutil/v3 from 3.23.6 to 3.23.7 ([#513](#513)) ([83bd718](83bd718)) * **deps:** Bump golang.org/x/tools from 0.11.0 to 0.11.1 ([#509](#509)) ([e826bcf](e826bcf)) * **deps:** Bump golang.org/x/tools from 0.11.1 to 0.12.0 ([#523](#523)) ([09d6514](09d6514)) * **deps:** Bump k8s.io/apimachinery from 0.27.3 to 0.27.4 ([#487](#487)) ([444bbc0](444bbc0)) * **deps:** Bump k8s.io/apimachinery from 0.27.4 to 0.28.0 ([#535](#535)) ([8df84cf](8df84cf)) * **deps:** Bump submodules and dependencies ([#521](#521)) ([1b3ad94](1b3ad94)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Issue #, if available: *Description of changes:* - Fixes e2e tests by restarting BuildKit when containerd is restarted (similar issue faced in runfinch/finch#461) - Uses a more up to date yaml file for testing (will automatically update the OS version too since that's included in the makefile) *Testing done:* - [x] I've reviewed the guidance in CONTRIBUTING.md #### License Acceptance By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. --------- Signed-off-by: Justin Alvarez <alvajus@amazon.com>
Issue #, if available: Fixes #412
Description of changes:
Testing done:
Local testing
I've reviewed the guidance in CONTRIBUTING.md
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.