Add resource usage monitoring for build steps #3860
Conversation
Force-pushed 9b68967 to 853b11e
Is this something we could do with containerd's cgroups package? Does this mean we now have 3 separate implementations (runc, containerd, buildkit)?

I don't see what could be reused from there.

Isn't the "stats" code in there that collects metrics of the container's cgroup? Would be nice if the structs could be shared. But they are not the same, for example:

We could take this one type https://pkg.go.dev/github.com/containerd/cgroups/v3@v3.0.1/cgroup2/stats#CPUStat and embed it into our type that has the pressure support. But I don't think it makes sense to include a big package for just one struct with 6 fields.
Needs rebase.
Can we add comments linking to the specifications for the underlying file formats in https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html?
```go
	*cpuStat.UserNanos = value * 1000
case cpuSystemUsec:
	cpuStat.SystemNanos = new(uint64)
	*cpuStat.SystemNanos = value * 1000
```
We can use uint64Ptr (from io.go) to simplify this.
```go
	}
	return nil, releaseContainer(context.TODO())
	// …
	return rec, rec.CloseAsync(releaseContainer)
```
Instead of doing this, can we not spawn a goroutine here, so we can have a standard Close function:

```go
go func() {
	// errors are explicitly discarded in this example
	_ = rec.Close()
	releaseContainer()
}()
return rec, nil
```
I already did a new implementation before I realized why I did it this way.
If the daemon is closed before releaseContainer can be called, then it can leak resources. Monitor will take care of this and will block the daemon from shutting down before all the release calls have been invoked. So if releaseContainer has not been called synchronously, we still need to register it so that it is guaranteed to be called before shutdown. In a new goroutine this guarantee would not exist. 451e18c
This would be cleaner if Executor itself would have a Shutdown function. Then in here, it could make it block until release has been called.
We should add this context in a comment, I think it's too easy to accidentally remove in a refactor or similar.
I'll be honest, I'm not sure I have a good grasp on why we need both per-step resource usage and sysUsage - what are they both tracking, and why are they different? Is sysUsage capturing from BuildKit, while the per-step is from each ExecOp? Ideally we could have some comments inline that could elaborate 😄
@jedevc |
Force-pushed 853b11e to e0a8e08

Needs rebase again.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

This can be used to convert step usage to relative units.
Force-pushed e0a8e08 to 262b708
```go
	}
	// …
	withUsage := false
	if v, ok := attrs["capture-usage"]; ok {
```
jedevc left a comment
A couple of unresolved comments on the PR (would be nice to leave some comments/links inline to make it easier to read later, but shouldn't be a blocker).
This ends up crashing on Windows due to the fact that there is no …
This adds the possibility of capturing resource usage of build steps (CPU, memory, IO, network) so it can be used for performance analysis or, in the future, for resource controls (#2108).
The usage data is available in the provenance attestation. To opt in, the user needs to set `capture-usage=true`. For provenance available via the history API, usage is available automatically.

Some examples (search for resourceUsage and sysUsage):
https://explore.ggcr.dev/?blob=tonistiigi/buildkit@sha256:c771192c6f500ab46dc75c8aeced2d45cd564ca18dc230458aa225fbdd718936&mt=application%2Fvnd.in-toto%2Bjson&size=101624
https://explore.ggcr.dev/?blob=tonistiigi/buildkit@sha256:55904ca0b7a18fe76c10438c2a2c756eb73302628f6027138cd430b37de13d24&mt=application%2Fvnd.in-toto%2Bjson&size=109356
https://explore.ggcr.dev/?blob=docker.io/tonistiigi/buildkit@sha256:84fc0b9059458cce035df1d49402e30b441ff2a1ebe79f1d7aad6afa3e80754d&mt=application%2Fvnd.in-toto%2Bjson&size=61853
This feature requires cgroupv2. Pressure fields require a kernel with `CONFIG_PSI` enabled. The `memory.peak` file requires kernel 5.19+. Related fields are empty if requirements are not met or some cgroup controllers are not enabled. I don't think it makes sense to add any fallbacks for cgroupv1 or non-PSI kernels; these will be requirements for #2108 anyway in the future.

Network monitoring only works with the CNI provider.
Samples are taken at the end of the step, and also during execution if the step takes a long time. The minimum sample interval is 2 sec and the maximum number of samples is 10 per build step.
Please check if I'm missing any fields that could be useful, or if you think some of the fields are useless. I didn't add all fields, but I think it is better to add more than to miss out on something that could become useful in the future.
System samples for CPU/memory are added as well so they can be compared against step information to understand how much of the available resources a step used. The maximum number of system samples for the whole build is 20 (same 2 sec minimum interval).