Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
105 changes: 91 additions & 14 deletions .claude/skills/instruction-drift/scripts/Check-Staleness.ps1
Original file line number Diff line number Diff line change
Expand Up @@ -127,8 +127,8 @@ function Get-GitHubLatestRelease {
}
}
$release = $json | ConvertFrom-Json
# Truncate release notes to first 2000 chars to keep report manageable
$body = if ($release.body.Length -gt 2000) { $release.body.Substring(0, 2000) + '...' } else { $release.body }
# Truncate release notes to first 5000 chars
$body = if ($release.body.Length -gt 5000) { $release.body.Substring(0, 5000) + '...' } else { $release.body }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — Pre-existing null-safety note (not a regression)

Both Get-GitHubLatestRelease (here) and the new Get-GitHubReleasesSince (line 173) access $release.body.Length / $r.body.Length without a null guard. This works in PowerShell 7 ($null.Length → 0, so the else branch returns $null — no crash), but would fail in PowerShell 5.

Since the workflow invokes pwsh (PowerShell 7), this is safe today. Just noting for awareness — if portability matters, a guard like if ($release.body -and $release.body.Length -gt 5000) would be more robust.

return @{
status = 'ok'
repo = $Repo
Expand All @@ -149,6 +149,50 @@ function Get-GitHubLatestRelease {
}
}

function Get-GitHubReleasesSince {
param([string]$Repo, [string]$SinceTag)

try {
# Fetch recent releases (up to 10)
$json = gh api "repos/$Repo/releases?per_page=10" --jq '[.[] | {tag_name: .tag_name, name: .name, published_at: .published_at, body: .body}]' 2>&1
if ($LASTEXITCODE -ne 0) {
return @{
status = 'error'
repo = $Repo
error = "Failed to fetch releases: $json"
}
}
if (-not $json -or $json -eq 'null' -or $json -eq '[]') {
return @{ status = 'ok'; repo = $Repo; releases = @() }
}
$allReleases = $json | ConvertFrom-Json
# Collect releases newer than $SinceTag
$newReleases = @()
foreach ($r in $allReleases) {
Comment on lines +157 to +171
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Silent history gap when >10 releases exist since watermark (Flagged by: 3/3 reviewers)

per_page=10 fetches at most 10 releases. If the watermark tag ($SinceTag) falls outside this window, the foreach loop never hits the break and silently returns all 10 as "new." The agent then updates last_reviewed_release to the latest tag — permanently skipping all intermediate releases beyond position 10.

Failing scenario: Watermark at v0.50, repo now at v0.80 with 20 releases since. Only the 10 newest are returned; v0.51v0.70 are never reviewed, yet the watermark jumps to v0.80.

Suggested fix: Return a watermark_found boolean in the result so the caller can warn or paginate:

$foundWatermark = $false
foreach ($r in $allReleases) {
    if ($r.tag_name -eq $SinceTag) { $foundWatermark = $true; break }
    # ... existing logic
}
# Include in return:
#   watermark_found = $foundWatermark

if ($r.tag_name -eq $SinceTag) { break }
Comment on lines +156 to +172
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Silent data loss when watermark tag is not in the top 10 releases

per_page=10 fetches at most 10 releases. The loop breaks when it finds $SinceTag, but if the watermark is older than the 10th release (e.g., drift check was disabled for a month, or a burst of releases occurred), the tag is never found. In that case:

  1. All 10 releases are returned as "new" — but there may be more between release Unify Chat and Dashboard into single view #11 and the watermark.
  2. The agent then processes those 10 and updates last_reviewed_release to the newest tag.
  3. Releases Unify Chat and Dashboard into single view #11 through the old watermark are permanently skipped — their notes are never reviewed.

There's no signal to the caller that the watermark wasn't found, so this is a silent failure.

Fix suggestion: Track whether the watermark was hit and surface it:

$watermarkFound = $false
foreach ($r in $allReleases) {
    if ($r.tag_name -eq $SinceTag) { $watermarkFound = $true; break }
    # ... collect release ...
}
return @{
    status          = 'ok'
    repo            = $Repo
    releases        = $newReleases
    watermark_found = $watermarkFound
}

Then the caller (or the agent) can warn/paginate when watermark_found is $false.

$body = if ($r.body.Length -gt 5000) { $r.body.Substring(0, 5000) + '...' } else { $r.body }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Null body on release notes causes silent failure (Flagged by: 3/3 reviewers)

GitHub releases with no body have body: null. In PowerShell 7, $null.Length returns $null (not an exception), so the else branch runs and $body = $null. The agent's Rule 4 instructs it to "Read ALL release notes" — if it calls string methods on this null value, processing fails.

The same pattern exists in Get-GitHubLatestRelease (line ~129) but is lower risk there since the fallback path doesn't process notes.

Suggested fix:

$body = if ($r.body -and $r.body.Length -gt 5000) { $r.body.Substring(0, 5000) + '...' } else { if ($r.body) { $r.body } else { '' } }

$newReleases += @{
tag = $r.tag_name
name = $r.name
published_at = $r.published_at
release_notes = $body
}
}
return @{
status = 'ok'
repo = $Repo
releases = $newReleases
}
}
catch {
return @{
status = 'error'
repo = $Repo
error = $_.Exception.Message
}
}
}

function Get-IndexPageLinks {
<#
.SYNOPSIS
Expand Down Expand Up @@ -372,6 +416,11 @@ function ConvertFrom-SyncManifest {
$currentItem.resolution_expected = $Matches[1].Trim() -eq 'true'
}
}
elseif ($trimmed -match '^last_reviewed_release:\s*(.+)$') {
if ($currentItem) {
$currentItem.last_reviewed_release = $Matches[1].Trim()
}
}
elseif ($trimmed -match '^coverage_gaps:') {
if ($currentItem) {
$currentItem.coverage_gaps = @()
Expand Down Expand Up @@ -533,21 +582,45 @@ foreach ($manifestPath in $manifests) {
}
'releases' {
Write-Host " 📦 Checking releases for $($source.repo)..." -NoNewline
$result = Get-GitHubLatestRelease -Repo $source.repo
if ($result.status -eq 'ok' -and $result.latest) {
Write-Host " ✅ latest=$($result.latest.tag)" -ForegroundColor Green
}
elseif ($result.status -eq 'ok') {
Write-Host " ℹ️ No releases found" -ForegroundColor Yellow
$sinceTag = if ($source.ContainsKey('last_reviewed_release')) { $source.last_reviewed_release } else { $null }
if ($sinceTag) {
$result = Get-GitHubReleasesSince -Repo $source.repo -SinceTag $sinceTag
if ($result.status -eq 'ok' -and $result.releases.Count -gt 0) {
Write-Host " 🆕 $($result.releases.Count) new release(s) since $sinceTag" -ForegroundColor Yellow
}
elseif ($result.status -eq 'ok') {
Write-Host " ✅ up to date (last reviewed: $sinceTag)" -ForegroundColor Green
}
else {
Write-Host " ❌ Error: $($result.error)" -ForegroundColor Red
}
}
else {
Write-Host " ❌ Error: $($result.error)" -ForegroundColor Red
# No last_reviewed_release — fall back to latest-only
$singleResult = Get-GitHubLatestRelease -Repo $source.repo
$result = @{
status = $singleResult.status
repo = $source.repo
releases = if ($singleResult.status -eq 'ok' -and $singleResult.latest) { @($singleResult.latest) } else { @() }
error = $singleResult.error
}
if ($result.status -eq 'ok' -and $result.releases.Count -gt 0) {
Write-Host " ✅ latest=$($result.releases[0].tag) (no last_reviewed_release set — add to manifest)" -ForegroundColor Yellow
}
elseif ($result.status -eq 'ok') {
Write-Host " ℹ️ No releases found" -ForegroundColor Yellow
}
else {
Write-Host " ❌ Error: $($result.error)" -ForegroundColor Red
}
}
$sourceResults += @{
type = 'releases'
repo = $source.repo
result = $result
$entry = @{
type = 'releases'
repo = $source.repo
last_reviewed_release = $sinceTag
result = $result
}
$sourceResults += $entry
}
}
}
Expand Down Expand Up @@ -632,16 +705,20 @@ foreach ($manifestPath in $manifests) {
# - closed issues where resolution_expected is true (instruction may reference outdated workarounds)
# - untracked pages discovered via index crawling
# - untracked recently closed issues
# - new releases since last_reviewed_release
$actionableChanges = $results | ForEach-Object { $_.sources } | Where-Object {
$_.result.status -eq 'error' -or
($_.type -eq 'issue' -and $_.resolution_expected -and $_.result.status -eq 'ok' -and $_.result.state -eq 'closed')
}
$hasNewReleases = ($results | ForEach-Object { $_.sources } | Where-Object {
$_.type -eq 'releases' -and $_.result.status -eq 'ok' -and $_.result.releases.Count -gt 0 -and $_.last_reviewed_release
Comment on lines +713 to +714
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — Fallback releases (no watermark) never trigger changes_detected (Flagged by: 3/3 reviewers after follow-up)

The $hasNewReleases filter requires $_.last_reviewed_release to be truthy. When a releases: source is added without a last_reviewed_release field, the fallback path fetches the latest release and logs "add to manifest," but $hasNewReleases evaluates to $false — so changes_detected is never set from this source.

Failing scenario: A new releases: entry is added to a sync manifest without a watermark. The script logs an advisory but never triggers drift detection, so no update PR is created until someone manually adds the field.

Suggested fix: Include fallback sources in the check, or treat "missing watermark" as actionable:

$hasMissingWatermark = ($results | ForEach-Object { $_.sources } | Where-Object {
    $_.type -eq 'releases' -and -not $_.last_reviewed_release -and $_.result.releases.Count -gt 0
} | Measure-Object).Count -gt 0

} | Measure-Object).Count -gt 0
$hasUntrackedPages = ($results | Where-Object { $_.untracked_pages.Count -gt 0 } | Measure-Object).Count -gt 0
$hasUntrackedIssues = ($results | Where-Object { $_.untracked_closed_issues.Count -gt 0 } | Measure-Object).Count -gt 0
$report = @{
checked_at = (Get-Date -Format 'o')
manifests = $results
changes_detected = (($actionableChanges | Measure-Object).Count -gt 0) -or $hasUntrackedPages -or $hasUntrackedIssues
changes_detected = (($actionableChanges | Measure-Object).Count -gt 0) -or $hasUntrackedPages -or $hasUntrackedIssues -or $hasNewReleases
}

Write-Host "`n📊 Report:" -ForegroundColor Cyan
Expand Down
1 change: 1 addition & 0 deletions .github/instructions/gh-aw-workflows.sync.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@ sources:

# GitHub releases — check for new versions
- releases: github/gh-aw
last_reviewed_release: v0.69.3

style: |
Match existing section structure. Use tables for feature comparisons.
Expand Down
5 changes: 3 additions & 2 deletions .github/workflows/instruction-drift.agent.md
Original file line number Diff line number Diff line change
Expand Up @@ -121,8 +121,9 @@ Read the staleness report above. For each signal, read the affected skill files
### Rules
1. **Respect `divergence` sections** — NEVER remove: "Security Boundaries", "Safe Pattern: Checkout + Restore", "Common Patterns"
2. **Classify P0-P3:** P0=factually wrong, P1=security, P2=new features, P3=nice-to-have
3. **Only auto-fix P0 and P1.** Note P2 in PR description. Skip P3.
4. **Update sync manifest** — `resolution_expected`, `last_reviewed`, new issues
3. **Auto-fix P0, P1, AND P2.** Only skip P3 (cosmetic/nice-to-have).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — P2 auto-fix now includes security-sensitive content without explicit human gate (Flagged by: 2/3 reviewers)

Changing Rule 3 from "Note P2 in PR description" to "Auto-fix P0, P1, AND P2" means the agent can autonomously modify the security patterns table, anti-patterns table, and trigger guide based solely on its interpretation of release notes. Rule 4 explicitly instructs it to add to "security patterns as appropriate."

The draft PR does provide a human review gate before merge, which mitigates the risk. However, the prior P2-as-note-only policy was a deliberate safeguard — if the agent misinterprets a release note, incorrect security guidance ends up in draft content that reviewers may rubber-stamp.

Suggestion: Consider keeping P2-as-note for changes that touch divergence sections or security guidance, and only auto-fix P2 changes to non-security documentation sections.

4. **For new releases:** Read ALL release notes in the report. For each new feature, check if our skill covers it. Add new features to the anti-patterns table, trigger guide, or security patterns as appropriate. Update the `last_reviewed_release` field in `gh-aw-workflows.sync.yaml` to the latest tag after processing.
5. **Update sync manifest** — `resolution_expected`, `last_reviewed_release`, new issues

Commit each change:
```bash
Expand Down
Loading