@wolfeidau
Member

@wolfeidau wolfeidau commented Oct 22, 2025

Description

Adding buildkite cache support to the agent.

Context

https://linear.app/buildkite/issue/MDC-723/add-the-restore-and-save-commands-to-the-buildkite-agent

Changes

This change introduces two new subcommands:

  • buildkite-agent cache save
  • buildkite-agent cache restore

It currently uses some conventions from zstash, which will change as it is integrated more closely with the agent.

This change is just adding these subcommands to the agent for use from a plugin.

Testing

Most of the testing for this feature currently lives in github.com/buildkite/zstash, given that this is just a minimal integration.

Disclosures / Credits

Claude Code was used to write and review code in this PR.

@wolfeidau wolfeidau force-pushed the mdc-723-add-the-restore-and-save-commands-to-the-buildkite-agent branch 2 times, most recently from 5cd1e8e to db78c9f Compare October 22, 2025 05:55
@DrJosh9000 DrJosh9000 self-requested a review October 22, 2025 06:11
@DrJosh9000
Contributor

Tagging myself so I'm subscribed

@wolfeidau
Member Author

I am working on reducing the footprint of the library via buildkite/zstash#79 and buildkite/zstash#80

@wolfeidau wolfeidau marked this pull request as ready for review October 23, 2025 05:40
@wolfeidau wolfeidau requested review from a team and zhming0 October 23, 2025 21:28
@wolfeidau wolfeidau force-pushed the mdc-723-add-the-restore-and-save-commands-to-the-buildkite-agent branch from 8240d88 to bb98e0a Compare October 24, 2025 00:41
Contributor

@zhming0 zhming0 left a comment

I did a preliminary review. My comments are mainly recommendations.

I don't see any red flags, but I have doubts about the concept of cache ID being so prominent in the codebase.

Comment on lines 48 to 49
paths:
- node_modules
Contributor

It might be wise to document the behavior when the folder contains symlinks.

Member Author

@zhming0 yeah this will need to be captured as it also overlaps with security, which makes it very complicated.
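
To make the symlink concern concrete, here is a minimal sketch of one check that documentation could describe: flagging symlinks inside a cached path whose targets point outside that path. It is not how zstash or the agent handles symlinks; `escapesRoot` and `findEscapingSymlinks` are hypothetical helpers, and the check is purely lexical (it does not follow chains of links).

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// escapesRoot reports whether a symlink located at linkPath and pointing at
// target resolves (lexically) to a location outside root.
// Hypothetical helper, not part of zstash or the agent.
func escapesRoot(root, linkPath, target string) bool {
	resolved := target
	if !filepath.IsAbs(target) {
		// Relative targets are interpreted relative to the link's directory.
		resolved = filepath.Join(filepath.Dir(linkPath), target)
	}
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(resolved))
	if err != nil {
		// For example, an absolute target with a relative root: treat as escaping.
		return true
	}
	return rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// findEscapingSymlinks walks root (without following links) and returns the
// symlinks whose targets leave root.
func findEscapingSymlinks(root string) ([]string, error) {
	var escaping []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type()&fs.ModeSymlink == 0 {
			return nil // not a symlink
		}
		target, err := os.Readlink(path)
		if err != nil {
			return err
		}
		if escapesRoot(root, path, target) {
			escaping = append(escaping, path)
		}
		return nil
	})
	return escaping, err
}

func main() {
	// "node_modules" mirrors the path used in the example under review.
	links, err := findEscapingSymlinks("node_modules")
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
		return
	}
	for _, l := range links {
		fmt.Println("symlink leaves cached path:", l)
	}
}
```

Whether such links should be skipped, stored as links, or rejected is exactly the policy question raised above; the sketch only shows how they could be detected.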

Comment on lines +58 to +62
The command automatically uses the following environment variables when available:
- BUILDKITE_BRANCH (for branch scoping)
- BUILDKITE_PIPELINE_SLUG (for pipeline scoping)
- BUILDKITE_ORGANIZATION_SLUG (for organization scoping)`
Contributor

[question] If these aren't available, is this information extracted from the job token by default?

Member Author

@zhming0 at the moment this is only run from the CLI, so it depends on the environment variables. This will change once we add some configuration to the pipeline YAML, as we can then trigger it within the agent lifecycle without needing to invoke the CLI.

Contributor

My main question was about the security aspect. If the backend validates BUILDKITE_BRANCH against the agent job token, then I think it's ✅ .
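
For illustration, a minimal sketch of how a CLI command could collect the three scoping variables named in the help text. The `cacheScope` type and `scopeFromEnv` function are hypothetical and do not come from the PR. The sketch also shows why the validation question matters: every value is read from the caller's environment, so it is only as trustworthy as whoever set it.

```go
package main

import (
	"fmt"
	"os"
)

// cacheScope groups the scoping values from the help text.
// Hypothetical type, introduced only for this example.
type cacheScope struct {
	Branch       string
	Pipeline     string
	Organization string
}

// scopeFromEnv builds a scope using the supplied lookup function.
// Passing os.Getenv reads the real environment; tests can pass a fake.
func scopeFromEnv(getenv func(string) string) cacheScope {
	return cacheScope{
		Branch:       getenv("BUILDKITE_BRANCH"),
		Pipeline:     getenv("BUILDKITE_PIPELINE_SLUG"),
		Organization: getenv("BUILDKITE_ORGANIZATION_SLUG"),
	}
}

// missing returns the names of the scoping variables that were empty or unset.
func (s cacheScope) missing() []string {
	var names []string
	if s.Branch == "" {
		names = append(names, "BUILDKITE_BRANCH")
	}
	if s.Pipeline == "" {
		names = append(names, "BUILDKITE_PIPELINE_SLUG")
	}
	if s.Organization == "" {
		names = append(names, "BUILDKITE_ORGANIZATION_SLUG")
	}
	return names
}

func main() {
	scope := scopeFromEnv(os.Getenv)
	if m := scope.missing(); len(m) > 0 {
		fmt.Fprintln(os.Stderr, "missing scope variables:", m)
	}
	fmt.Printf("%+v\n", scope)
}
```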

}

// setupCacheClient creates a cache client and determines which cache IDs to process
func setupCacheClient(ctx context.Context, l logger.Logger, cfg Config) (*zstash.Cache, []string, error) {
Contributor

[non-blocking] As we will eventually introduce more ways to specify cache configuration, I am not sure we should use cache ID as the first-level concept here. Maybe we should turn it into a concrete CacheConfig object and pass that around.

Member Author

Good point, will add validation of cache ids.

Contributor

@zhming0 zhming0 Oct 26, 2025

I think my main point is that maybe this method should return a list of cache configurations instead of cache IDs. Cache ID as a concept doesn't seem to serve much purpose beyond the system boundary, and it might become an inconvenience when we want to refactor this down the track.

For this method, maybe it can just return the configurations:

Suggested change
- func setupCacheClient(ctx context.Context, l logger.Logger, cfg Config) (*zstash.Cache, []string, error) {
+ func setupCacheClient(ctx context.Context, l logger.Logger, cfg Config) (*zstash.Cache, []CacheConfiguration, error) {

Member Author

@zhming0 sorry, I commented on the wrong change here. Cache IDs are used to selectively restore caches in a step. The common example is a Rails app with Ruby and Node deps: being able to restore only the Node deps, only the Ruby deps, or both in a step is useful.

This model follows the structure used by other CI providers.

A good example of a bigger cache is something we experimented with here https://github.com/bk-playground/kitesocial/blob/main/.buildkite/cache.hosted.yml#L1-L9 which has a docker cache.

Contributor

@zhming0 zhming0 Oct 26, 2025

I acknowledge the value of cache IDs at the boundary/edge (in people's pipeline YAML, as you mentioned), but I don't see much value in preserving this concept beyond that.

What I mean is that maybe we can just take the cache IDs and parse them into cache configurations at the earliest point in the code path, so we don't need to pass cache IDs around internally?
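
A minimal sketch of that idea, assuming a hypothetical `CacheConfiguration` struct (the name comes from the suggested change above; its fields are guesses) and a hypothetical `resolveCaches` helper. Cache IDs are accepted once at the boundary and turned into configuration values, so downstream code never handles bare IDs. Treating an empty ID list as "all caches" is an assumption of this sketch, not confirmed behavior.

```go
package main

import "fmt"

// CacheConfiguration is a hypothetical shape for one configured cache.
// The fields are assumptions made for this example.
type CacheConfiguration struct {
	ID    string
	Paths []string
}

// resolveCaches converts the IDs requested for a step into configurations.
// An empty ids slice selects every configured cache (assumed behavior).
// Unknown IDs are rejected here, so callers only ever see valid configurations.
func resolveCaches(all []CacheConfiguration, ids []string) ([]CacheConfiguration, error) {
	if len(ids) == 0 {
		return all, nil
	}
	byID := make(map[string]CacheConfiguration, len(all))
	for _, c := range all {
		byID[c.ID] = c
	}
	selected := make([]CacheConfiguration, 0, len(ids))
	for _, id := range ids {
		c, ok := byID[id]
		if !ok {
			return nil, fmt.Errorf("unknown cache id %q", id)
		}
		selected = append(selected, c)
	}
	return selected, nil
}

func main() {
	// The Rails example from the thread: separate Ruby and Node caches.
	all := []CacheConfiguration{
		{ID: "ruby", Paths: []string{"vendor/bundle"}},
		{ID: "node", Paths: []string{"node_modules"}},
	}
	// A step that only needs the Node dependencies.
	selected, err := resolveCaches(all, []string{"node"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range selected {
		fmt.Println("would restore", c.ID, c.Paths)
	}
}
```

With this shape, selective restore by ID still works for pipeline authors, while a function such as `setupCacheClient` could return the resolved configurations directly.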

Contributor

@DrJosh9000 DrJosh9000 left a comment

First thoughts

@wolfeidau wolfeidau force-pushed the mdc-723-add-the-restore-and-save-commands-to-the-buildkite-agent branch 2 times, most recently from c7790ce to f762297 Compare October 27, 2025 04:48
Contributor

@zhming0 zhming0 left a comment

Great starting point. I have doubts about some details, but none of them are blocking, so it's an LGTM from me.

@wolfeidau wolfeidau force-pushed the mdc-723-add-the-restore-and-save-commands-to-the-buildkite-agent branch from f762297 to 04c0aa1 Compare October 29, 2025 01:41
@wolfeidau
Member Author

@DrJosh9000 did you have any concerns with me merging this?

@DrJosh9000
Contributor

@wolfeidau I planned to take another look this afternoon, but don't let me stop you if you want it in now

@wolfeidau wolfeidau merged commit 0d2e270 into main Oct 29, 2025
1 check passed
@wolfeidau wolfeidau deleted the mdc-723-add-the-restore-and-save-commands-to-the-buildkite-agent branch October 29, 2025 02:25
@wolfeidau
Member Author

@DrJosh9000 if you find anything, I'm happy to remediate it via a follow-up PR!
