feat: gh-aw skills, security scanner, and instruction drift tooling #653

Merged
PureWeen merged 3 commits into main from feat/gh-aw-skills-and-security
Apr 21, 2026

@PureWeen
Owner

Summary

Comprehensive gh-aw infrastructure for PolyPilot — skills, security enforcement, drift detection, and workflow hardening. Synced with dotnet/maui#35027 and expanded with PolyPilot-specific security tooling.

What's Included

gh-aw-guide skill (.github/skills/gh-aw-guide/)

SKILL.md — Quick-start reference:

  • 21-row anti-patterns table (manual reimplementations → built-in alternatives)
  • LabelOps (label_command: vs names: filtering)
  • Concurrency race condition warnings for slash_command
  • "Approve and run" gate risk documentation
  • 4 security-critical patterns with code examples
  • checkout: false, rate-limit:, web-fetch tool docs

references/architecture.md — Deep reference:

  • Execution model with credential availability matrix
  • Authorization model (on.roles: defaults, roles: all danger)
  • Dangerous Triggers Checklist (what gh-aw#23769 fixed and what it didn't)
  • Security boundaries (9 defense layers table)
  • Integrity filtering hierarchy
  • Fork PR handling (5-trigger behavior matrix)
  • Safe outputs quick reference (30+ types)
  • Troubleshooting table (12 entries)

scripts/Check-WorkflowSecurity.ps1 — Enforcement scanner:

  • 8 rules, including: pull_request_target without integrity, roles: all on PR workflows, workflow_run without branches, slash_command + cancel-in-progress: true, missing allowed-events on reviews, missing protected-files, workspace script execution after checkout
  • Tested against our own workflows — found and fixed 2 issues
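To make one of the scanner rules concrete, here is a minimal Python sketch of the slash_command + cancel-in-progress: true check. It is an illustration only, not the logic of Check-WorkflowSecurity.ps1 (which is PowerShell and not shown in this PR). The function names and the assumption that both keys appear as plain lines in the workflow's `---` frontmatter are hypothetical.

```python
import re


def frontmatter(markdown: str) -> str:
    """Return the text between the leading '---' fences, or '' if absent."""
    match = re.match(r"^---\r?\n(.*?)\r?\n---", markdown, re.DOTALL)
    return match.group(1) if match else ""


def slash_command_cancel_risk(markdown: str) -> bool:
    """Flag workflows that combine slash_command with cancel-in-progress: true.

    Assumption: a line-based text match is good enough for a sketch;
    a real scanner would likely parse the YAML instead.
    """
    fm = frontmatter(markdown)
    has_slash = re.search(r"^\s*slash_command\s*:", fm, re.MULTILINE) is not None
    cancels = re.search(
        r"^\s*cancel-in-progress\s*:\s*true\b", fm, re.MULTILINE
    ) is not None
    return has_slash and cancels
```

A text match like this can produce false positives (for example, a key that appears inside a multi-line string), which is one reason to treat it as a starting point rather than an enforcement rule.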

instruction-drift skill (.github/skills/instruction-drift/)

SKILL.md — Drift detection for instruction files:

  • Coverage gaps tracking, index crawling, issue discovery
  • P0-P3 classification (factually wrong → nice-to-have)

scripts/Check-Staleness.ps1 (646 lines):

  • Content hashing (SHA256) to detect doc page changes
  • Index crawling (Get-IndexPageLinks) to discover untracked pages
  • Issue state comparison (actual vs expected from manifest)
  • Release tracking (Get-GitHubLatestRelease)
  • Recently closed issue discovery (90-day window)
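The content-hashing idea can be sketched in a few lines of Python. This is an analogue for illustration, not the code in Check-Staleness.ps1. The whitespace normalization step and the names `content_hash` and `is_stale` are assumptions that do not come from the PR.

```python
import hashlib


def content_hash(page_text: str) -> str:
    """Return a SHA-256 hex digest of the page text.

    Assumption: line endings and trailing spaces are normalized first so
    that cosmetic differences do not register as a change.
    """
    normalized = "\n".join(line.rstrip() for line in page_text.splitlines())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_stale(page_text: str, recorded_hash: str) -> bool:
    """True when the fetched page no longer matches the recorded hash."""
    return content_hash(page_text) != recorded_hash
```

In this sketch the recorded hash would live alongside each tracked page, and a mismatch marks the page for review.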

scripts/Scan-GhAwUpdates.ps1 — Upstream knowledge extraction:

  • Mines github/gh-aw commits for high-signal changes
  • Watermark-based (only processes new commits per run)
  • Categorizes: safe-output, trigger, compiler, security, engine, breaking
  • Samples shared/ workflow configs for real-world patterns
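A watermark-based scan with keyword categorization might look roughly like the Python sketch below. It is an assumption-laden illustration rather than the behavior of Scan-GhAwUpdates.ps1: the keyword map, the newest-first commit ordering, and the function names are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical keyword map; the real script's categorization rules
# are not shown in this PR.
CATEGORY_KEYWORDS = {
    "safe-output": ("safe-output", "safe output"),
    "trigger": ("trigger", "slash_command"),
    "security": ("security", "integrity"),
    "breaking": ("breaking",),
}


def new_commits(commits: List[Dict[str, str]], watermark: Optional[str]) -> List[Dict[str, str]]:
    """Return commits newer than the watermark SHA.

    Assumes `commits` is ordered newest-first. If the watermark is None
    or is not found, every commit is treated as new.
    """
    fresh = []
    for commit in commits:
        if watermark is not None and commit["sha"] == watermark:
            break
        fresh.append(commit)
    return fresh


def categorize(message: str) -> List[str]:
    """Return every category whose keywords appear in the commit message."""
    lowered = message.lower()
    return [
        category
        for category, words in CATEGORY_KEYWORDS.items()
        if any(word in lowered for word in words)
    ]
```

After a run, the newest processed SHA would be stored as the next watermark so that later runs skip everything already seen.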

Workflow hardening

  • COMMENT-only reviews (allowed-events: [COMMENT]) on all review workflows; prevents stale CHANGES_REQUESTED reviews that can't be auto-dismissed (upstream gap documented)
  • Concurrency groups — Cross-workflow collision (intentional, documented)
  • dep-update.md — Added protected-files: fallback-to-issue
  • Slim instructions — 26KB → 34 lines referencing the skill

Sync manifest

.github/instructions/gh-aw-workflows.sync.yaml tracks 8 doc pages, 5 issues, and gh-aw releases for drift detection.

Motivation

Comprehensive gh-aw infrastructure for PolyPilot, synced with
dotnet/maui#35027 and expanded with security enforcement tooling.

## gh-aw-guide skill (.github/skills/gh-aw-guide/)

- SKILL.md — Quick-start reference with 21-row anti-patterns table,
  LabelOps, concurrency race warnings, 4 security-critical patterns,
  checkout:false, rate-limit:, web-fetch tool documentation
- references/architecture.md — Full execution model, authorization
  model (on.roles), security boundaries, dangerous triggers checklist
  (gh-aw#23769 analysis), integrity filtering, fork PR handling,
  safe outputs reference, troubleshooting
- scripts/Check-WorkflowSecurity.ps1 — Scans workflow .md files for
  8 dangerous patterns (pull_request_target+roles:all, workflow_run
  without branches, slash_command+cancel-in-progress:true, missing
  allowed-events on reviews, missing protected-files, etc.)

## instruction-drift skill (.github/skills/instruction-drift/)

- SKILL.md — Drift detection for instruction files tracking upstream
  docs, issues, releases. P0-P3 classification system.
- scripts/Check-Staleness.ps1 — 646-line script with content hashing,
  index crawling, issue discovery, release tracking, coverage gaps
- scripts/Scan-GhAwUpdates.ps1 — Mines github/gh-aw commits for
  new features (safe-outputs, triggers, compiler, security, engines)

## Workflow updates

- review.agent.md + review-on-open.agent.md — COMMENT-only reviews
  (no REQUEST_CHANGES), concurrency groups with intentional cross-
  workflow collision documented
- shared/review-shared.md — COMMENT-only policy, updated prompt
- expert-reviewer.agent.md — COMMENT-only verdict
- dep-update.md — Added protected-files: fallback-to-issue

## Slim instructions file

- gh-aw-workflows.instructions.md slimmed from 26KB to 10 essential
  rules plus a reference to the gh-aw-guide skill
- gh-aw-workflows.sync.yaml drift tracking manifest

## Key security findings addressed

- Stale REQUEST_CHANGES reviews can't be auto-dismissed (mitigated by
  COMMENT-only reviews)
- slash_command + cancel-in-progress:true kills agent runs (fixed)
- dep-update missing protected-files policy (fixed)
- gh-aw#23769 platform restore analysis documented

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen and others added 2 commits April 21, 2026 11:30
Weekly scheduled workflow that:
1. Runs Check-Staleness.ps1 against sync manifests
2. If stale, runs Scan-GhAwUpdates.ps1 to identify upstream changes
3. Classifies changes as P0-P3 (factually wrong → nice-to-have)
4. Auto-fixes P0/P1 (factual errors, security-relevant changes)
5. Creates a draft PR with updates via create-pull-request
6. Runs Check-WorkflowSecurity.ps1 to verify no regressions
7. If fresh, calls noop (no unnecessary issues/PRs)

Respects divergence_sections from sync manifest — never removes
repo-specific content (stale review limitation, security boundaries).

Uses protected-files: allowed since it intentionally modifies
.github/skills/ files. Draft PR requires human review before merge.
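The triage in steps 3 to 7 can be summarized with a small Python sketch. It is illustrative only: the workflow itself is an agent prompt, not Python. The `plan_run` function, its return shape, and the assumption that a draft PR is opened whenever any finding exists are hypothetical.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Priority(IntEnum):
    P0 = 0  # factually wrong
    P1 = 1
    P2 = 2
    P3 = 3  # nice-to-have


def plan_run(findings: List[Tuple[Priority, str]]) -> Dict[str, object]:
    """Decide what a run should do with its classified findings.

    No findings means the instructions are fresh, so the run ends with
    noop. Otherwise P0/P1 items are auto-fixed and the rest are only
    reported (an assumption about how P2/P3 are handled).
    """
    if not findings:
        return {"action": "noop", "auto_fix": [], "report_only": []}
    auto_fix = [text for priority, text in findings if priority <= Priority.P1]
    report_only = [text for priority, text in findings if priority > Priority.P1]
    return {
        "action": "create-pull-request",
        "auto_fix": auto_fix,
        "report_only": report_only,
    }
```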

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two bugs found during live testing:

1. Sync manifest missing resolution_expected: true on all 5 tracked
   issues. Script defaulted to expecting 'open', causing false stale
   signals on every run.

2. Get-RecentClosedIssues JSON parsing failed because gh --paginate
   returns an Object[] (one string per line), not a single string.
   The old -replace on the joined string missed the array structure.
   Fixed by joining array elements with commas before wrapping in [].

Verified: all 5 issues now show 'closed (expected: closed)' and
recently-closed-issues query returns 7146 results without errors.
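The second fix follows a general pattern: when a command yields one JSON object per output line, the lines have to be joined with commas and wrapped in brackets before they parse as a single array. A Python analogue of that idea is sketched below. The actual fix is in PowerShell inside Get-RecentClosedIssues, and `parse_paginated` is a hypothetical name.

```python
import json
from typing import Any, List


def parse_paginated(lines: List[str]) -> List[Any]:
    """Combine one-JSON-object-per-line output into a single parsed list.

    Assumes each non-empty element is a complete JSON value; blank lines
    are dropped before joining.
    """
    records = [line for line in lines if line.strip()]
    return json.loads("[" + ",".join(records) + "]")
```

For example, `parse_paginated(['{"number": 1}', '{"number": 2}'])` yields a two-element list of dictionaries, and an empty input yields an empty list.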

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen merged commit 41558f0 into main Apr 21, 2026
@PureWeen PureWeen deleted the feat/gh-aw-skills-and-security branch April 21, 2026 18:54
PureWeen added a commit that referenced this pull request Apr 21, 2026
Three issues discovered during maui-labs deployment:

1. Draft PR guard — review-on-open fires on drafts, wasting 90min
   of 3-model review on unfinished work. Added if: draft == false.

2. Sub-agent recursion risk — sub-agent prompt said 'Read and follow
   expert-reviewer.agent.md' which contains task dispatch instructions.
   Sub-agents could recursively spawn 9+ leaf agents. Fixed by inlining
   review dimensions and adding explicit 'Do NOT dispatch sub-agents'
   guard.

3. Missing concurrency group on review-on-open — lost during merge
   of PR #653. Restored with cancel-in-progress: false.

Also learned: gh-aw compiler v0.62.2 accepts allowed-events: [COMMENT]
in source but does NOT enforce it at runtime (validation.json still
permits APPROVE/REQUEST_CHANGES). The prompt-level 'never APPROVE'
instruction is the real guard, not the frontmatter.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Apr 22, 2026
## Problem

`/review` is broken on main — every run crashes with:
```
MCPG Error: allow-only must include repos
```

## Root Cause

`min-integrity: approved` (added in #653) causes the gh-aw compiler
(v0.62.2) to emit an incomplete guard policy — it sets `min-integrity`
but not `repos`. The MCP Gateway (v0.1.19) requires both fields.

## Fix

Remove the hardcoded `min-integrity: approved`. The runtime
`determine-automatic-lockdown` step correctly sets both `min-integrity`
and `repos` dynamically.

Verified working in dotnet/maui-labs (4 successful runs after this fix).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen added a commit that referenced this pull request Apr 22, 2026
Third iteration fixing the instruction-drift workflow.

**Problem chain:**
1. PR #653: Scripts ran inside agent container → gh CLI not
authenticated → scripts failed
2. PR #662: Moved to `steps:` (activation job) → scripts ran with
GH_TOKEN ✅ but files written to activation filesystem aren't available
in agent job (different runner)
3. **This PR:** Pass data via `$GITHUB_OUTPUT` → template substitution
inlines JSON directly into the agent prompt

**How it works now:**
```
activation job (steps:, has GH_TOKEN):
  ├── Check-Staleness.ps1 → $GITHUB_OUTPUT (changes_detected + report JSON)
  └── Scan-GhAwUpdates.ps1 → $GITHUB_OUTPUT (upstream JSON, only if stale)
      ↓ template substitution
agent prompt (data inlined, no file I/O):
  ├── changes_detected: false → call noop immediately
  └── changes_detected: true → analyze + create PR
```

The agent sees the actual JSON data right in its prompt — no `cat`
commands, no file paths, no filesystem dependency.
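Passing multi-line JSON through `$GITHUB_OUTPUT` relies on the delimiter syntax that GitHub Actions documents for multi-line values. The Python sketch below shows that format under stated assumptions: the scripts in this PR are PowerShell, and `format_output` and `write_output` are hypothetical helpers, not part of the workflow.

```python
import os
import uuid
from typing import Optional


def format_output(name: str, value: str, delimiter: Optional[str] = None) -> str:
    """Render one step output in the multi-line `name<<DELIM` form."""
    # A random delimiter makes a collision with the payload unlikely.
    delimiter = delimiter or f"ghadelimiter_{uuid.uuid4().hex}"
    if delimiter in value:
        raise ValueError("delimiter must not appear in the value")
    return f"{name}<<{delimiter}\n{value}\n{delimiter}\n"


def write_output(name: str, value: str) -> None:
    """Append a step output to the file that GITHUB_OUTPUT points to."""
    with open(os.environ["GITHUB_OUTPUT"], "a", encoding="utf-8") as handle:
        handle.write(format_output(name, value))
```

With this shape, a step could call `write_output("changes_detected", "false")` and `write_output("report", report_json)`, and later template substitution would see the values verbatim.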

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>