🟢 Trigger selection was excellent across all scenarios — agents correctly used pull_request with paths: filters, workflow_run, schedule, and other appropriate triggers (5/5 average)
🟢 Tool selection was correct for all scenarios — GitHub MCP toolsets were always preferred over direct API access (5/5 average)
🟢 Safe-outputs were always configured correctly — add-pr-comment, create-issue, create-discussion as appropriate
🔴 Critical security gap: 3 of 4 scenarios incorrectly added write permissions to the agent job (e.g., pull-requests: write, discussions: write) instead of keeping the agent read-only and relying solely on safe-outputs
🟢 Prompts were consistently well-structured, actionable, and included helpful output format templates
Top Patterns
Most common trigger: pull_request: types: [opened, synchronize] for PR-based workflows
Tool default: github: toolsets: [default] (DevOps scenario correctly used specific toolsets [actions, issues, repos])
Security pattern inconsistency: agents split between adding write permissions directly to the agent job and keeping it read-only
View High Quality Responses (Top 2)
DevOps — Deployment Incident Reporter (5.0/5.0)
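Used workflow_run: types: [completed] trigger, correct and precise
Kept the agent job read-only (actions: read, contents: read) ✅
Used specific toolsets [actions, issues, repos] instead of generic [default]
Applied max: 1 + expires: 1h rate limiting on the create-issue safe-output, preventing issue flooding
Prompt included conditional early-exit logic ("if conclusion is not failure, stop")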
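As a rough sketch, the frontmatter for a workflow of this shape might look like the following. It is pieced together only from the settings quoted in this report. The upstream workflow name "Deploy", the tools: parent key, and the exact nesting are assumptions, not the scenario's actual output.

```yaml
# Sketch only: reconstructed from the settings named in this report.
on:
  workflow_run:
    workflows: ["Deploy"]   # hypothetical upstream workflow name
    types: [completed]

# Agent job stays read-only.
permissions:
  actions: read
  contents: read

tools:                      # parent key assumed
  github:
    toolsets: [actions, issues, repos]

# The only write path: one issue at most, expiring after an hour.
safe-outputs:
  create-issue:
    max: 1
    expires: 1h
```

The early-exit rule ("if conclusion is not failure, stop") belongs in the prompt body, not the frontmatter, so it is not shown here.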
Backend Engineer — DB Migration Review (4.4/5.0)
Excellent use of paths: filter on pull_request trigger to scope to SQL files only
Comprehensive migration safety checklist in prompt (critical vs. warning vs. informational)
Well-structured output template for PR comment
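A minimal sketch of a paths-scoped trigger is shown below. The glob is a hypothetical example of "SQL files only", and the permissions block shows the read-only pattern this report recommends, not necessarily what this scenario generated.

```yaml
# Sketch only: the glob and the nesting are assumptions.
on:
  pull_request:
    types: [opened, synchronize]
    paths:
      - "**/*.sql"          # hypothetical glob matching SQL migration files

# Recommended pattern: the agent reads, safe-outputs write.
permissions:
  contents: read
  pull-requests: read

safe-outputs:
  add-pr-comment:           # the review comment is posted through this safe-output
```

With a paths: filter, the workflow is skipped for pull requests that touch no matching files, so the agent does not run on unrelated changes.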
View Areas for Improvement
Critical: Write permissions on agent job (3/4 scenarios affected)
The agent repeatedly added write permissions directly to the agent job:
# ❌ Generated (incorrect)
permissions:
  pull-requests: write # Should NOT be here
The correct pattern per .github/aw/create-agentic-workflow.md:
# ✅ Correct: agent stays read-only
permissions:
  pull-requests: read
  contents: read
safe-outputs:
  add-pr-comment: # Writes go through safe-outputs only
The DevOps scenario was the only one that got this right — likely because its write operation (create-issue) was clearly separated from reading GitHub data.
Minor: Generic toolset [default] vs. specific toolsets
Most scenarios defaulted to toolsets: [default], which works but is broader than necessary. Scoped toolsets (e.g., [issues, pull_requests]) would better follow least-privilege principles.
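To illustrate the difference, a scoped configuration could look like the sketch below. The tools: parent key is an assumption, and the right toolset list depends on what the workflow actually reads.

```yaml
# Sketch only: choose toolsets to match what the workflow needs.
tools:                      # parent key assumed
  github:
    # toolsets: [default]   # works, but broader than necessary
    toolsets: [issues, pull_requests]
```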
Recommendations
Strengthen the permission model guidance in .github/aw/create-agentic-workflow.md — add a prominently-placed "permission quick-reference table" showing which safe-output requires which read permission (e.g., add-pr-comment → pull-requests: read, never write). The current guidance exists but agents still generate write permissions 75% of the time.
Add a "safe-outputs permission mapping" section to .github/aw/github-agentic-workflows.md — explicitly document that all safe-outputs operate independently of the agent job's permission scope, and that adding write permissions to the agent job is always wrong when safe-outputs are in use.
Encourage specific GitHub toolsets over [default] — the DevOps scenario demonstrated better least-privilege by using [actions, issues, repos]. Consider adding guidance to scope toolsets to only what the workflow needs.