Skip to content

fix: gh-aw skill portability — 3-model audit findings#771

Open
PureWeen wants to merge 2 commits intomainfrom
fix/skill-portability-audit
Open

fix: gh-aw skill portability — 3-model audit findings#771
PureWeen wants to merge 2 commits intomainfrom
fix/skill-portability-audit

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

3-model adversarial audit (Opus/Sonnet/Codex) found 8 portability issues:

SKILL.md:

  • Removed empty orphaned heading
  • Re-linked references/architecture.md from Further Reading

references/architecture.md:

  • Replaced 6 Checkout-GhAwPr.ps1 references with inline bash example
  • Changed 3 instances of 'our workflows' to generic language
  • Fixed .github/skills/ path to .github/scripts/ in anti-pattern example
  • Inlined complete checkout-then-restore pattern

scripts/:

  • Renamed Check-WorkflowSecurity.ps1Test-GhAwWorkflowSecurity.ps1 (Test- is approved PowerShell verb)
  • Fixed fix-message to not reference repo-specific script name

Adversarial review by Opus/Sonnet/Codex found 8 portability issues:

SKILL.md:
- Removed empty 'When to Read the Full Reference' heading
- Re-linked references/architecture.md from Further Reading section

references/architecture.md:
- Replaced all Checkout-GhAwPr.ps1 references with inline bash
  example (6 instances)
- Changed 'our workflows' to generic language (3 instances)
- Fixed '.github/skills/some-script.ps1' to '.github/scripts/'
- Inlined the full checkout-then-restore pattern with comments

scripts/:
- Renamed Check-WorkflowSecurity.ps1 → Test-GhAwWorkflowSecurity.ps1
  (Test- is the approved PowerShell verb for validation)
- Fixed fix-message to not reference Checkout-GhAwPr.ps1 by name

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

Design-Level Findings (outside the diff)

🟡 MODERATE — Stale Checkout-GhAwPr.ps1 references not updated (3/3 reviewers)

Three files outside this PR still reference the old script name. The most impactful is the active agent instruction file which will direct agents to use a pattern that no longer exists in the documentation:

File Line Impact
.github/instructions/gh-aw-workflows.instructions.md ~21 🔴 Active instruction — rule 9 still says "Use Checkout-GhAwPr.ps1 for workflow_dispatch workflows...". Agents following these instructions will conflict with the updated architecture.md.
.github/instructions/gh-aw-workflows.sync.yaml ~15 Stale divergence reason: "Documents our custom Checkout-GhAwPr.ps1 script"
.claude/skills/instruction-drift/SKILL.md ~90 Stale divergence reason referencing old script name

Suggested fix: Update these files in the same PR to use the generic "checkout-then-restore" terminology, especially the instructions file.


🟢 MINOR — Restore scope broadened without documentation note (2/3 reviewers)

The old script restored three specific items (.github/skills/, .github/instructions/, .github/copilot-instructions.md). The new pattern restores the entire .github/ and .agents/ directories. This is actually a security improvement (broader coverage), but the undocumented scope change could confuse maintainers who expect the patterns to be equivalent. A one-line note explaining the intentional broadening would help.

Generated by Expert Code Review (auto) for issue #771 · ● 8M ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generated by Expert Code Review (auto) for issue #771 · ● 8M

Comment on lines +280 to 284
# 3. Check out the PR branch
gh pr checkout "$PR_NUMBER"
# 4. Restore trusted agent infrastructure from base branch
git checkout "$BASE_SHA" -- .github/ .agents/ 2>/dev/null || true
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL — Silent failure on security-critical restore step

The old PowerShell script documented restore as fatal on failure (step 4: "Restores ... from the base branch SHA (fatal on failure)"). The new inline pattern uses:

git checkout "$BASE_SHA" -- .github/ .agents/ 2>/dev/null || true

2>/dev/null || true silently swallows ALL failures — including the failure to restore .github/. If the base branch doesn't contain a .github/ directory (new repo), or the SHA is unreachable (shallow clone, GC'd), or the command errors for any reason, the workflow continues with the untrusted PR branch's .github/ files intact. For workflow_dispatch, where there's no platform restore, this means fork-controlled agent infrastructure (skills, instructions, copilot-instructions.md) would be used by the agent.

Concrete scenario: An attacker with write access opens a PR that modifies .github/copilot-instructions.md to inject malicious agent instructions. A workflow_dispatch run triggers; the git checkout restore silently fails (e.g. due to a shallow clone missing $BASE_SHA); the agent runs with attacker-controlled instructions.

Suggested fix: Split into two operations — one that must succeed (.github/), one that may not exist (.agents/):

# MUST succeed — fatal if .github/ can't be restored
git checkout "$BASE_SHA" -- .github/
# Optional — .agents/ may not exist in all repos
git checkout "$BASE_SHA" -- .agents/ 2>/dev/null || true

PR_NUMBER: ${{ github.event.pull_request.number || inputs.pr_number }}
run: pwsh .github/scripts/Checkout-GhAwPr.ps1
run: |
# 1. Verify PR author has write access, reject forks
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Comment claims "reject forks" but code doesn't

The comment says # 1. Verify PR author has write access, reject forks but the code only checks the permission level. The old script explicitly rejected fork PRs as a separate check. The write-access check does not fully subsume fork rejection: a collaborator with write access can still submit a PR from their personal fork, and that PR's .github/ would be fork-controlled until the restore step runs.

This is partially mitigated by the restore in step 4 (which overwrites .github/ from the base branch), but if the restore silently fails (see the || true finding above), the fork-controlled .github/ survives.

Suggested fix: Add an explicit fork check before the permission check:

HEAD_REPO=$(gh pr view "$PR_NUMBER" --json headRepository --jq '.headRepository.owner.login')
BASE_REPO=$(gh pr view "$PR_NUMBER" --json baseRepository --jq '.baseRepository.owner.login')
if [[ "$HEAD_REPO" != "$BASE_REPO" ]]; then
  echo "::error::Fork PRs not allowed via workflow_dispatch"
  exit 1
fi

Or remove "reject forks" from the comment to avoid being misleading.

# 3. Check out the PR branch
gh pr checkout "$PR_NUMBER"
# 4. Restore trusted agent infrastructure from base branch
git checkout "$BASE_SHA" -- .github/ .agents/ 2>/dev/null || true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL2>/dev/null || true silently swallows a security-critical restore failure

The old Checkout-GhAwPr.ps1 script was documented as fatal on failure for the restore step (PowerShell $ErrorActionPreference = 'Stop'). The new inline pattern inverts that — 2>/dev/null || true suppresses all errors and exits 0.

Failing scenario: For workflow_dispatch, the platform skips checkout_pr_branch.cjs entirely, so this restore IS the final workspace state. If git checkout fails (empty $BASE_SHA, shallow clone, git error), the PR branch's .github/ and .agents/ remain in place — untrusted agent infrastructure survives, defeating the entire defense-in-depth pattern.

Suggested fix: Remove 2>/dev/null || true so the step fails hard on restore error:

git checkout "$BASE_SHA" -- .github/ .agents/

If suppressing stderr for legitimately non-existent paths is needed, use a targeted pre-check rather than blanket error suppression.

Flagged by: 3/3 reviewers

PR_NUMBER: ${{ github.event.pull_request.number || inputs.pr_number }}
run: pwsh .github/scripts/Checkout-GhAwPr.ps1
run: |
# 1. Verify PR author has write access, reject forks
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — Comment says "reject forks" but no fork detection exists

The code checks the author's collaborator permission (write/admin/maintain) — it does not check head.repo.id != base.repo.id or isCrossRepository. A collaborator with write access who submits from a fork would pass this check.

This is a documentation accuracy issue, not a security gap — the real defense is step 4 (the restore), and the write-access gate is a valid authorization check on its own. But the misleading comment could confuse security reviewers into believing fork rejection happens here when it doesn't.

Suggested fix: Reword the comment:

# 1. Verify PR author is a trusted collaborator (requires write+ access)

Flagged by: 3/3 reviewers after follow-up adjudication (originally 1/3, confirmed by 2 follow-up reviewers at reduced severity)

exit 1
fi
# 2. Capture base branch SHA before checkout
BASE_SHA=$(gh pr view "$PR_NUMBER" --json baseRefOid --jq '.baseRefOid')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICALBASE_SHA not validated; empty value cascades into silent restore bypass

If PR_NUMBER is unset, the API call fails, or jq returns null, BASE_SHA will be empty. The subsequent git checkout "" -- .github/ .agents/ will fail — but that error is swallowed by 2>/dev/null || true on line 283. The restore is silently skipped.

Suggested fix: Add validation immediately after assignment:

BASE_SHA=$(gh pr view "$PR_NUMBER" --json baseRefOid --jq '.baseRefOid')
if [[ -z "$BASE_SHA" ]]; then
  echo "::error::Could not determine base branch SHA for PR $PR_NUMBER"
  exit 1
fi

Flagged by: 2/3 reviewers (compounded by Finding 1 — both reviewers who flagged the || true issue also identified this failure path)

- architecture.md line 3: removed 'SKILL.md' self-reference
- architecture.md line 70: 'Our gh-aw workflows' → generic
- SKILL.md line 439: removed 'in this skill directory'

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant