Skip to content

Clone ResultProxy at gateway boundaries#2929

Merged
tonistiigi merged 1 commit intomoby:masterfrom
jedevc:cloneable-resultproxy
Jul 30, 2022
Merged

Clone ResultProxy at gateway boundaries#2929
tonistiigi merged 1 commit intomoby:masterfrom
jedevc:cloneable-resultproxy

Conversation

@jedevc
Copy link
Copy Markdown
Member

@jedevc jedevc commented Jun 24, 2022

Previously, a frontend returning two identical references caused a
double-release error, as the gateway expects each ref to be unique.
However, there are scenarios where a frontend might reasonably be
expected to provide identical refs, e.g. for 2 platforms that have
binary compatibility, or for the upcoming attestations use-case.

To solve this issue, we refactor the ResultProxy implementation to
support cloning. Since the underlying Result supports reference
counting, we can avoid needing to add another layer of refcounts, and
simply clone the underlying refs.

Additionally, we then use the new Cloning functionality in the gateway
and forwarder (e.g. dockerfile) frontends, and ensure that the same
ResultProxy is never returned twice in a result. This means that
identical refs over the wire are translated into the correct
shared-memory structures.

@jedevc jedevc requested a review from tonistiigi June 24, 2022 11:43
@jedevc
Copy link
Copy Markdown
Member Author

jedevc commented Jun 24, 2022

Not quite sure how to test that this, it doesn't look like there are dedicated tests around the gateway - maybe having a fake frontend in client_test.go to create the duplicate refs is the right approach? Though I'm then not sure how to insert checks into the solver internals to check for the lack of double-frees.

@jedevc jedevc force-pushed the cloneable-resultproxy branch from 260b1df to 4b4050a Compare June 24, 2022 14:52
@tonistiigi
Copy link
Copy Markdown
Member

Not quite sure how to test that this,

There is already some detection for cases like this https://github.com/moby/buildkit/blob/master/solver/result.go#L51 . As well as https://github.com/moby/buildkit/blob/master/cache/refs.go#L1674-L1676 that we set on tests.

Comment thread solver/llbsolver/bridge.go Outdated
@jedevc jedevc force-pushed the cloneable-resultproxy branch from 4b4050a to b4ea127 Compare June 27, 2022 14:23
@jedevc jedevc force-pushed the cloneable-resultproxy branch from b4ea127 to 721daa0 Compare July 19, 2022 11:11
Comment thread solver/types.go Outdated
}

type ResultProxy interface {
Clone() ResultProxy
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit bothered by the change of this interface.

Let's say there is a different implementation of ResultProxy. The Clone() is not really part of that implementation. It is just a reference counting on top of it that doesn't change if the actual implementation changes.

Instead of calling .Clone() I think the code should add a private wrapper around it that implements own Release() but leaves other methods as passthrough.

smth like

type ref struct {
	main      sharedResultProxy
}

type sharedResultProxy {
     solver.ResultProxy
     ... // ref count fields
}

func (srp *sharedResultProxy) clone()

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having a private clone() method means that we have to duplicate that logic in both the gateway and forward packages.

I agree though, I like the idea of isolating the ref counting. Instead of going for a sharedResultProxy, I like the idea of using a resultProxySplit, similar to how we have splitResult in https://github.com/moby/buildkit/blob/v0.10/solver/result.go#L43-L47, with a public helper function like SplitResultProxy() (ResultProxy, ResultProxy). Details of GC are then entirely in the splitting, and we avoid needing to have duplication of clone.

@jedevc jedevc force-pushed the cloneable-resultproxy branch 2 times, most recently from 486ec15 to b447e77 Compare July 28, 2022 13:16
@jedevc
Copy link
Copy Markdown
Member Author

jedevc commented Jul 28, 2022

🎉 should be working now! Because of the way we can now do splitting of result proxies, we need to change the way that the forwarder cleans up it's references - specifically, we no longer need to track which results are part of the final result from toFrontendResult, since the ref counting will automatically clean up in these cases (and likewise in the gateway).

@jedevc jedevc force-pushed the cloneable-resultproxy branch from b447e77 to 8f05861 Compare July 28, 2022 14:11
Comment thread frontend/gateway/forwarder/forward.go Outdated
Comment thread frontend/gateway/gateway.go Outdated
Comment thread frontend/gateway/gateway.go Outdated
return r, nil
}

func (lbf *llbBridgeForwarder) convertRefShared(id string) (solver.ResultProxy, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be called cloneRef (Shared only makes sense for a type that has a reference counter on it).

Just to clean this up, there seems to be only one place that calls convertRef that could be also made to use this with extra release call. (In another PR seems that container releasing needs some fixes as well).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, will do a follow-up to this to tidy up around the container releasing, I'm definitely less familiar with it.

continue
}
}
r.Release(context.TODO())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If lbf.err is not nil then all lbf.result fields need to be released here as well?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch 😄

Have fixed this up to release the FrontendResult if we encounter an error, and also applied the same check in the forwarder, which means we do actually need to keep track of the cloned result proxies, so we can tidy them up in case of an error.

Comment thread frontend/gateway/forwarder/forward.go Outdated
@jedevc jedevc force-pushed the cloneable-resultproxy branch from 8f05861 to 23647b4 Compare July 29, 2022 09:25
Previously, a frontend returning two identical references caused a
double-release error, as the gateway expects each ref to be unique.
However, there are scenarios where a frontend might reasonably be
expected to provide identical refs, e.g. for 2 platforms that have
binary compatibility, or for the upcoming attestations use-case.

To solve this issue, we refactor the ResultProxy implementation to
support cloning via splits.

Additionally, we then use the new splitting functionality in the gateway
and forwarder (e.g. dockerfile) frontends, and ensure that the same
ResultProxy is never returned twice in a result. This means that
identical refs over the wire are translated into the correct
shared-memory structures.

Signed-off-by: Justin Chadwell <me@jedevc.com>
@jedevc jedevc force-pushed the cloneable-resultproxy branch from 23647b4 to 863d9d7 Compare July 29, 2022 10:38
@tonistiigi tonistiigi merged commit 54a0df8 into moby:master Jul 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants