solver: fix possible race for provenance ResolveImageConfig #4157
Merged
tonistiigi merged 1 commit into moby:master on Aug 17, 2023
ResolveImageConfig can be called concurrently. For example, during
conversion, dockerfile2llb loops through each stage and resolves the
base image for that stage.
In the case that two calls to ResolveImageConfig finish at roughly the
same time, we can hit an edge case where we attempt to modify the
bridge's image records at the same time.
To fix this, we just need to use the bridge's mutex to prevent
concurrent access here.
This should fix the following stack trace found in CI:
sandbox.go:144: goroutine 1079 [running]:
sandbox.go:144: github.com/moby/buildkit/solver/llbsolver.(*provenanceBridge).ResolveImageConfig(0xc000431e00, {0x1c2b040?, 0xc0008e5b30?}, {0xc00094ba00?, 0xc0003728f0?}, {0x0, 0xc0006cb580, {0x19ba868, 0x7}, {0xc0008f7500, ...}, ...})
sandbox.go:144: /src/solver/llbsolver/provenance.go:139 +0x1fb
sandbox.go:144: github.com/moby/buildkit/frontend/dockerfile/dockerfile2llb.toDispatchState.func3.1()
sandbox.go:144: /src/frontend/dockerfile/dockerfile2llb/convert.go:405 +0x5fe
sandbox.go:144: golang.org/x/sync/errgroup.(*Group).Go.func1()
sandbox.go:144: /src/vendor/golang.org/x/sync/errgroup/errgroup.go:75 +0x64
sandbox.go:144: created by golang.org/x/sync/errgroup.(*Group).Go
sandbox.go:144: /src/vendor/golang.org/x/sync/errgroup/errgroup.go:72 +0xa5
--- FAIL: TestIntegration/TestNoCache/worker=oci-rootless/frontend=builtin (4.45s)
No other explanation for this failure makes sense - `b` cannot be `nil`
at this point, since a call to `b.llbBridge.ResolveImageConfig` has just
succeeded (also because that would be very strange).
Signed-off-by: Justin Chadwell <me@jedevc.com>
crazy-max approved these changes on Aug 17, 2023
tonistiigi approved these changes on Aug 17, 2023
Member
Author
@tonistiigi it looks like this is causing panics downstream: docker/buildx#2064. Any objections to doing a v0.12.3 release? Happy to help prep for it - we have a few useful fixes to pull in: https://github.com/moby/buildkit/issues?q=label%3Aneeds-cherry-pick%2Fv0.12+sort%3Aupdated-desc+is%3Aclosed.
Member
+1 for backporting (based on the diff, which seems very small/focussed) /cc @neersighted (we'll probably need a revendor in moby)
This was referenced Aug 14, 2024
The stack trace above was found in this CI run: https://github.com/moby/buildkit/actions/runs/5889475633/job/15972815280?pr=4041

Note: I can't manage to reproduce this with the race checker, so I'm not actually 100% sure that this is the issue that caused the CI failure, but the code here is definitely not thread-safe, so at the very least, this improves that.