install-agent-team: pin catalog SHA instead of @main to avoid gh-aw#27407 (silent SHA-resolution fallback) #47

@verkyyi

Summary

A fresh install via /install-agent-team can produce lockfiles whose embedded SHA-256 hash doesn't match the .md frontmatter that ends up committed, causing the first real run of the implementer agent to fail with:

ERR_CONFIG: Lock file '.github/workflows/implementer-agent.lock.yml' is outdated!
The workflow file '.github/workflows/implementer-agent.md' frontmatter has changed.

Hit this on verkyyi/agentfolio during the first dispatched task (issue #105 → implementer run 24679888189). Recovery PR: verkyyi/agentfolio#107.

Root cause (revised after deeper investigation)

Earlier framing of this issue blamed gh aw validate for silently recompiling and reverting the OAuth sed tweak. That was incorrect: gh aw validate passes NoEmit: true internally (pkg/cli/validate_command.go) and does not write lockfiles, and my own post-install check confirmed that the sed tweak survives it. The real cause is upstream in gh-aw core.

The actual source of the ERR_CONFIG is an upstream bug I've filed at github/gh-aw#27407:

  • pkg/cli/fetch.go:86-92 silently swallows transient failures of ResolveRefToSHAForHost (rate-limit, network hiccup) and sets commitSHA = "".
  • pkg/cli/spec.go:398-416 then falls back to @<ref> (i.e., @main) when commitSHA is empty.
  • The resulting .md / .lock.yml pair ships in an inconsistent state: the .md has @main in source:, but the lockfile's stored hash was computed from a different canonical form.

Evidence from my install commit (3c36033):

spec-agent.md:        source: ...spec-agent.md@cb66d12806d7f00d220f11e964bc27dfec672913
planner-agent.md:     source: ...planner-agent.md@cb66d12806d7f00d220f11e964bc27dfec672913
reviewer-agent.md:    source: ...reviewer-agent.md@cb66d12806d7f00d220f11e964bc27dfec672913
implementer-agent.md: source: ...implementer-agent.md@main                      ← outlier

All four were installed with identical gh aw add <path>@main calls in a single shell session. One transient API failure → one unpinned workflow → one runtime ERR_CONFIG.
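A quick way to spot this kind of outlier after an install is to scan the workflow files for source: refs that are not full commit SHAs. The following is a minimal sketch, not part of the plugin or of gh-aw: the function name find_unpinned_sources is hypothetical, and it assumes each installed .md has a single frontmatter line of the form source: <spec>@<ref> and that SHAs are 40-character lowercase hex (SHA-1 repositories).

```shell
#!/usr/bin/env bash
# Hypothetical helper: list workflow .md files whose `source:` ref is not a
# full 40-character commit SHA.
# Assumption: one frontmatter line per file shaped like
#   source: <owner>/<repo>/<path>@<ref>
find_unpinned_sources() {
  local dir="${1:-.github/workflows}"
  local file line ref
  for file in "$dir"/*.md; do
    # Skip the literal glob when the directory has no .md files.
    [ -f "$file" ] || continue
    # First `source:` line only; files without one are ignored.
    line=$(grep -m1 -E '^[[:space:]]*source:' "$file") || continue
    # Everything after the last '@' is the ref; drop trailing whitespace/CR.
    ref="${line##*@}"
    ref="${ref%%[[:space:]]*}"
    if ! printf '%s\n' "$ref" | grep -Eq '^[0-9a-f]{40}$'; then
      printf '%s: %s\n' "$(basename "$file")" "$ref"
    fi
  done
}
```

Calling find_unpinned_sources from the repository root prints one "file: ref" line per unpinned workflow and nothing when every source is pinned, so it can double as a post-install gate.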

What the plugin can do (still valuable, but for a different reason)

The plugin-level fix is SHA pinning, and it's still the right change — not because gh aw validate is buggy, but because pinning to an explicit SHA avoids the ResolveRefToSHAForHost codepath entirely. If the source ref is already a 40-char SHA, gh aw add has nothing to resolve and the silent-fallback bug cannot trigger.

Proposed change in skills/install-agent-team/SKILL.md

Replace Step 4 ("install all four workflows") with an approach that pins each gh aw add to a specific catalog SHA:

# Compute the plugin's own catalog SHA (this file lives inside the plugin, so HEAD is always deterministic)
PLUGIN_SHA=$(git -C "$CLAUDE_PLUGIN_ROOT" rev-parse HEAD)

gh aw add verkyyi/github-agent-runner/catalog/agent-team/spec-agent.md@$PLUGIN_SHA
gh aw add verkyyi/github-agent-runner/catalog/agent-team/planner-agent.md@$PLUGIN_SHA
gh aw add verkyyi/github-agent-runner/catalog/agent-team/implementer-agent.md@$PLUGIN_SHA
gh aw add verkyyi/github-agent-runner/catalog/agent-team/reviewer-agent.md@$PLUGIN_SHA

Benefits:

  • Every install is reproducible against the plugin version that ran the skill.
  • Bypasses the upstream ResolveRefToSHAForHost flake by not depending on ref resolution at install time.
  • Tangential supply-chain benefit: users can audit the exact catalog content they installed.
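One caveat with the snippet above: if $CLAUDE_PLUGIN_ROOT is not a git checkout, rev-parse fails and $PLUGIN_SHA ends up empty. A slightly more defensive variant could refuse to proceed unless the ref really is a full SHA. This is a sketch only; pinned_specs is a hypothetical helper name, and the 40-character lowercase hex check assumes a SHA-1 repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one pinned `gh aw add` spec per catalog workflow,
# or fail if the given ref is not a full 40-character commit SHA.
pinned_specs() {
  local sha="$1" name
  if ! printf '%s\n' "$sha" | grep -Eq '^[0-9a-f]{40}$'; then
    echo "refusing to install: '$sha' is not a full commit SHA" >&2
    return 1
  fi
  for name in spec-agent planner-agent implementer-agent reviewer-agent; do
    printf 'verkyyi/github-agent-runner/catalog/agent-team/%s.md@%s\n' "$name" "$sha"
  done
}

# Usage sketch (not executed here):
#   PLUGIN_SHA=$(git -C "$CLAUDE_PLUGIN_ROOT" rev-parse HEAD) || exit 1
#   specs=$(pinned_specs "$PLUGIN_SHA") || exit 1
#   for spec in $specs; do gh aw add "$spec"; done
```

Failing before the first gh aw add keeps a bad or empty ref from ever reaching the resolution codepath, which is the whole point of the pin.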

Scratched: the "reorder validate" recommendation

Earlier I proposed reordering step 7 (gh aw validate) so the OAuth sed tweak would be the last thing to touch the lockfiles. That recommendation was based on my incorrect theory that validate mutates source. It doesn't, and the reorder is unnecessary. Keep gh aw validate where it is (or skip it — it adds little after each gh aw add already compiles). No change needed to that step.

Suggested acceptance

  • install-agent-team skill pins $CLAUDE_PLUGIN_ROOT's current SHA into each gh aw add call instead of @main.
  • Docstring/comment in the skill explains why (links to gh-aw#27407 for the underlying bug it's working around).
  • No change to the OAuth sed order (step 7 stays as-is, or is removed entirely — either is fine).

Once gh-aw#27407 is fixed upstream, the SHA-pinning becomes a belt-and-suspenders safety net rather than a necessity — but it's still the better contract (reproducible installs) and should stay.

Happy to send a PR if you'd like.
