-
Notifications
You must be signed in to change notification settings - Fork 327
Add cache save and restore using github.com/buildkite/zstash #3551
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cache save and restore using github.com/buildkite/zstash #3551
Conversation
5cd1e8e to
db78c9f
Compare
|
Tagging myself so I'm subscribed |
|
I am working on reducing the footprint of the library via buildkite/zstash#79 and buildkite/zstash#80 |
8240d88 to
bb98e0a
Compare
zhming0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a preliminary review. My comments are mainly recommendations.
I don't see any red flag. But I have doubt about the concept of cache id being overly prominent in the code base.
clicommand/cache_restore.go
Outdated
| paths: | ||
| - node_modules |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be wise to we document the behavior when it comes to symlink in the folder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhming0 yeah this will need to be captured as it also overlaps with security, which makes it very complicated.
| The command automatically uses the following environment variables when available: | ||
| - BUILDKITE_BRANCH (for branch scoping) | ||
| - BUILDKITE_PIPELINE_SLUG (for pipeline scoping) | ||
| - BUILDKITE_ORGANIZATION_SLUG (for organization scoping)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[question] If these aren't available, are these information by default extracted from job token?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhming0 at the moment this is just run from the CLI so it depends on the environment variables, this will change once we add some configuration to the pipeline yaml as we can then trigger it within the agent lifecycle without needing to invoke the cli.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main question was on the security aspect. If in the backend we validate BUILDKITE_BRANCH against agent job token then I think it's ✅ .
| } | ||
|
|
||
| // setupCacheClient creates a cache client and determines which cache IDs to process | ||
| func setupCacheClient(ctx context.Context, l logger.Logger, cfg Config) (*zstash.Cache, []string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[non-blocking] as we will eventually introduce more ways to specify cache configuration, I am not sure if we should use cache id as the first level concept here. Maybe we should turn it into a concrete CacheConfig object and pass that around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, will add validation of cache ids.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my main point is that maybe this method should return a list of cache configurations instead of cache ids. Cache ID as a concept doesn't seem to serve much purpose beyond system boundary. They might be more of an inconvenience when we want to refactor this down the track.
Like this method, maybe it can just return
| func setupCacheClient(ctx context.Context, l logger.Logger, cfg Config) (*zstash.Cache, []string, error) { | |
| func setupCacheClient(ctx context.Context, l logger.Logger, cfg Config) (*zstash.Cache, []CacheConfiguration, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zhming0 sorry commented on the wrong change here, cache ids are used to selectively restore caches in a step. The common example is a rails app wiht ruby and node deps, being able to only restore node or ruby deps or both in a step is useful.
This model follows other ci providers in their structure.
A good example of a bigger cache is something we experimented here https://github.com/bk-playground/kitesocial/blob/main/.buildkite/cache.hosted.yml#L1-L9 which has a docker cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I acknowledge the value of cache ids in boundary/edge (in people's pipeline yaml as you mentioned), but I don't see a great value preserving this concept beyond that.
What I mean is that maybe we can just take the cache IDs, parse them into into config configuration at the earliest code path, so we don't need to pass cache ids around internall?
DrJosh9000
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First thoughts
c7790ce to
f762297
Compare
zhming0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great starting point. I have doubt about some details but none of them are blocking. So it's a LGTM from me.
This change introduces some new sub commands. * buildkite-agent cache save * buildkite-agent cache restore It is currently using some conventions from zstash which will change as it is integrated more with the agent.
This structure allows additional options at the top level of the configuration.
f762297 to
04c0aa1
Compare
|
@DrJosh9000 did you have any concerns with me merging this? |
|
@wolfeidau I planned to take another look this afternoon, but don't let me stop you if you want it in now |
|
@DrJosh9000 if you find anything happy to remediate via a follow up PR! |
Description
Adding buildkite cache support to the agent.
Context
https://linear.app/buildkite/issue/MDC-723/add-the-restore-and-save-commands-to-the-buildkite-agent
Changes
This change introduces some new sub commands.
It is currently using some conventions from zstash which will change as it is integrated more with the agent.
This change is just adding these subcommands to the agent for use from a plugin.
Testing
Currently most of the testing for this feature is in github.com/buildkite/zstash given this is just a minimal integration currently.
Disclosures / Credits
Claude Code was used to write and review code in this PR.