Description
Context
I hit a concurrency failure mode in a comment-triggered workflow that uses:
- workflow-level concurrency.cancel-in-progress: true
- workflow_dispatch with an issue_number input
- issue_comment as a trigger
- safe-outputs.github-app for comments and other write operations
The issue is not that the concurrency key is empty or malformed. The key is
correctly populated. The problem is that a passive follow-up run triggered by
the workflow's own App-authored comment can resolve to the same workflow-level
concurrency group as the primary run that produced the comment.
When that happens, GitHub Actions cancels the original run mid-flight.
This looks like a gh-aw workflow-authoring footgun at the boundary between:
- documented safe-outputs.github-app behavior
- documented workflow-level concurrency behavior
- slash-command / issue-comment style workflows that serialize by issue number
Problem
gh-aw explicitly supports using a GitHub App token for all safe output
operations.
From .github/aw/github-agentic-workflows.md:
- safe-outputs.github-app "uses it for all safe output operations"
- safe-outputs.concurrency-group exists specifically to control safe-outputs
job concurrency and uses cancel-in-progress: false
That means safe-outputs writes are a first-class product surface, not an
incidental repo customization.
The failure mode is:
- a primary run starts in a workflow-level concurrency group keyed by issue or
dispatch input
- safe_outputs posts a comment using a GitHub App token
- that comment triggers a passive issue_comment run of the same workflow
- the passive run resolves to the same workflow-level concurrency group
- cancel-in-progress: true cancels the original run
This is especially easy to hit in workflows that serialize by issue number and
listen to issue_comment for real slash commands.
In other words: App-authored safe outputs can re-enter the same workflow and
collide with the run that emitted them.
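The failure mode above can be sketched as a toy model. This is only an illustration of the cancellation rule, not GitHub Actions' actual scheduler: the Run and Scheduler types and the group name gh-aw-test-15 are hypothetical, and the model assumes cancel-in-progress: true for every group.

```go
package main

import "fmt"

// Run is a simplified stand-in for a workflow run (hypothetical type,
// not part of any GitHub API).
type Run struct {
	ID        int
	Group     string
	Cancelled bool
}

// Scheduler models one rule only: with cancel-in-progress enabled, a new
// run entering a concurrency group cancels the run already in that group.
type Scheduler struct {
	active map[string]*Run
}

func NewScheduler() *Scheduler {
	return &Scheduler{active: make(map[string]*Run)}
}

// Start registers r in its group and returns the run it displaced, if any.
// The displacement happens at start time, before any job of r executes.
func (s *Scheduler) Start(r *Run) *Run {
	prev := s.active[r.Group]
	if prev != nil {
		prev.Cancelled = true
	}
	s.active[r.Group] = r
	return prev
}

func main() {
	s := NewScheduler()

	// Primary run: dispatched for issue 15.
	primary := &Run{ID: 1, Group: "gh-aw-test-15"}
	s.Start(primary)

	// The primary run's App-authored comment triggers a passive
	// issue_comment run that resolves to the same group.
	passive := &Run{ID: 2, Group: "gh-aw-test-15"}
	s.Start(passive)

	fmt.Println("primary cancelled:", primary.Cancelled)
}
```

In this model the passive run never has to do any work: entering the group is enough to cancel the primary run.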
I am filing this primarily as a warning/docs/guardrail issue, not as a demand
for gh-aw to infer arbitrary user intent from custom concurrency expressions.
Why this seems upstream-relevant
I am not claiming gh-aw can guess every custom concurrency expression.
The upstream issue is that gh-aw currently exposes all the ingredients for this
failure mode without warning or guardrails:
- first-class support for safe-outputs.github-app
- first-class support for workflow-level concurrency with
cancel-in-progress: true
- first-class support for comment-triggered workflows / slash-command style
activation
- first-class support for safe-outputs job concurrency, but not for the
workflow-level self-trigger collision this creates
Right now it is easy to build a workflow that is individually valid in each
part, but self-cancels once those parts interact.
I do not think this necessarily implies a universal compiler fix for arbitrary
user-defined concurrency expressions. But it does look like something gh-aw
should warn about or document more explicitly.
Observed reproduction
Observed in:
- repo: samuelkahessay/daily-habit-tracker
- gh-aw version: v0.62.2
- original run: 23341636440
Timeline:
1. workflow_dispatch for issue #15
2. workflow-level concurrency group resolves to a per-issue key for #15
3. agent completes successfully
4. safe_outputs posts a comment on issue #15 using GitHub App auth
5. one or more passive issue_comment runs are triggered
6. those runs resolve to the same workflow-level concurrency key
7. cancel-in-progress: true cancels the original run
8. safe_outputs and downstream post-agent jobs are marked cancelled
9. any final job with its own non-cancelling concurrency group can still
survive
Important detail:
- the spawned issue_comment run does not need to reach pre_activation
or prove it is actionable to cancel the original run
- workflow-level concurrency is decided when the new run starts
Concrete local shape that reproduces it
This was reproducible with a workflow-level concurrency pattern like:

    concurrency:
      group: >-
        gh-aw-${{ github.workflow }}-${{
        github.event.issue.number ||
        github.event.pull_request.number ||
        github.event.inputs.issue_number ||
        github.run_id
        }}
      cancel-in-progress: true

paired with:

    safe-outputs:
      github-app:
        app-id: ${{ vars.APP_ID }}
        private-key: ${{ secrets.APP_PRIVATE_KEY }}
      add-comment:

and an issue_comment trigger in the same workflow.
That is enough for a passive App-authored follow-up comment to enter the same
concurrency group as the original targeted run.
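To make the collision concrete, here is a rough Go sketch of how that fallback chain resolves for the two events involved. It is an approximation for illustration only: the Event struct and GroupKey function are hypothetical names, empty strings stand in for absent context values, and the workflow name "test" and run IDs are made up.

```go
package main

import "fmt"

// Event carries only the context values the group expression reads.
// An empty string stands in for a value that is absent on that event.
type Event struct {
	IssueNumber      string // github.event.issue.number
	PRNumber         string // github.event.pull_request.number
	InputIssueNumber string // github.event.inputs.issue_number
	RunID            string // github.run_id
}

// firstNonEmpty mimics the `a || b || c` chain: the first truthy operand wins.
func firstNonEmpty(vals ...string) string {
	for _, v := range vals {
		if v != "" {
			return v
		}
	}
	return ""
}

// GroupKey mirrors the shape of the workflow-level group expression.
func GroupKey(workflow string, e Event) string {
	return "gh-aw-" + workflow + "-" +
		firstNonEmpty(e.IssueNumber, e.PRNumber, e.InputIssueNumber, e.RunID)
}

func main() {
	// Primary run: workflow_dispatch with issue_number=15.
	dispatch := Event{InputIssueNumber: "15", RunID: "1001"}
	// Passive run: issue_comment on issue 15, created by the App-authored comment.
	comment := Event{IssueNumber: "15", RunID: "1002"}

	fmt.Println(GroupKey("test", dispatch))
	fmt.Println(GroupKey("test", comment))
	fmt.Println(GroupKey("test", dispatch) == GroupKey("test", comment))
}
```

Both events reach the same key through different branches of the chain, which is why the dispatch input and the comment's issue number end up serialized together.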
Minimal standalone repro shape
This does not depend on any repo-specific prompt content. The minimal shape is:

    ---
    on:
      workflow_dispatch:
        inputs:
          issue_number:
            required: false
      slash_command:
        name: test
        events: [issue_comment]
    concurrency:
      group: >-
        gh-aw-${{ github.workflow }}-${{
        github.event.issue.number ||
        github.event.inputs.issue_number ||
        github.run_id
        }}
      cancel-in-progress: true
    safe-outputs:
      github-app:
        app-id: ${{ vars.APP_ID }}
        private-key: ${{ secrets.APP_PRIVATE_KEY }}
      add-comment:
    engine: copilot
    ---
    Post a comment to the triggering issue.

If that workflow is dispatched with an issue_number, and safe_outputs posts
an App-authored comment back onto that same issue, the passive issue_comment
follow-up run can enter the same workflow-level concurrency group and cancel
the original run.
Product evidence
gh-aw explicitly documents GitHub App auth for safe-outputs
From .github/aw/github-agentic-workflows.md:
- safe-outputs.github-app is documented as the token used "for all safe
output operations"
So App-authored follow-up writes are a supported configuration, not an odd
hack.
gh-aw already recognizes that safe-outputs needs its own concurrency handling
From docs/src/content/docs/reference/concurrency.md:
- safe-outputs.concurrency-group exists to serialize the safe_outputs job
- when set, it uses cancel-in-progress: false
So gh-aw already treats safe-outputs concurrency as something that needs
product-level controls. The missing piece is the interaction with
workflow-level concurrency when safe-outputs itself can trigger new runs.
safe-outputs.concurrency-group does not protect against this workflow-level collision
The existing safe-outputs.concurrency-group control is useful, but it only
changes concurrency on the safe_outputs job itself.
It does not prevent a passive comment-triggered workflow run from entering the
same workflow-level concurrency group and cancelling the originating run.
That makes this particular failure mode easy to miss even for users who are
already trying to use the safe-outputs concurrency controls correctly.
Top-level GitHub App config falls through to safe-outputs automatically
From pkg/workflow/safe_outputs_config.go, lines 675-682:

    // mergeAppFromIncludedConfigs merges app configuration from included safe-outputs configurations
    // If the top-level workflow has an app configured, it takes precedence
    // Otherwise, the first app configuration found in included configs is used
    func (c *Compiler) mergeAppFromIncludedConfigs(...) {
        // If top-level workflow already has app configured, use it (no merge needed)
        if topSafeOutputs != nil && topSafeOutputs.GitHubApp != nil {
            safeOutputsAppLog.Print("Using top-level app configuration")
            return topSafeOutputs.GitHubApp, nil
        }

So a workflow that configures github-app at the top level automatically gets
App-authored safe-outputs without explicitly setting safe-outputs.github-app.
That makes this collision easier to hit, since any top-level App config is enough.
Impact
This can cause the original run to self-cancel after the main agent work is
done but before all post-agent jobs complete.
In the observed case that meant:
- safe_outputs ended as cancelled
- downstream post-agent work was also cancelled
- the run looked partially successful even though post-agent state was cut off
Any workflow with the same trigger/concurrency pattern is exposed, especially:
- targeted issue implementation flows
- slash-command workflows that listen on issue_comment
- workflows that post App-authored status/progress comments back onto the same
issue or PR they serialize on
Expected behavior
gh-aw should make this interaction harder to trip over.
At minimum, users should not have to discover by incident that:
- safe-outputs.github-app can re-trigger the same workflow via comments
- those passive follow-up runs can cancel the original run before they even
reach activation
Proposed fixes
I think any of these would be a reasonable upstream response. A docs-only or
warning-only fix would already be useful:
- Docs warning in concurrency and safe-outputs references
  - explicitly document that GitHub App-authored safe outputs can re-trigger
    comment-based workflows and collide with workflow-level concurrency groups
  - explicitly note that safe-outputs.concurrency-group does not protect
    against this specific workflow-level self-cancellation pattern
- Heuristic compiler warning for the dangerous combination
  - warn when all of the following are true:
    - workflow-level cancel-in-progress: true
    - comment-based triggers are enabled
    - safe-outputs.github-app is enabled
    - workflow concurrency group references issue / PR identity
  - explain that passive App-authored follow-up events can cancel the
    originating run
  - if this cannot be detected reliably for arbitrary expressions, it could be
    limited to common patterns or skipped in favor of docs only
- Provide a documented passive-event pattern
  - recommend giving passive comment-triggered runs a unique key such as
    github.run_id, while preserving per-issue serialization for real slash
    commands / targeted dispatch
- Longer-term: add a first-class helper or frontmatter abstraction
  - e.g. a built-in way to distinguish passive follow-up comment events from
    real command activations in concurrency configuration
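As a rough illustration of the heuristic compiler warning proposed above, the check could look something like the following. This is a sketch under assumptions, not gh-aw compiler code: WorkflowConfig, its fields, and SelfCancelWarning are hypothetical names, and the substring match only covers the common identity references seen in this report.

```go
package main

import (
	"fmt"
	"strings"
)

// WorkflowConfig is a hypothetical, flattened view of the frontmatter fields
// the heuristic needs. It does not correspond to gh-aw's real types.
type WorkflowConfig struct {
	CancelInProgress bool   // workflow-level cancel-in-progress: true
	CommentTrigger   bool   // issue_comment or slash-command style trigger
	SafeOutputsApp   bool   // safe-outputs.github-app is configured
	GroupExpr        string // raw workflow-level concurrency group expression
}

// identityRefs lists common ways a group expression keys on issue/PR identity.
// Arbitrary expressions are deliberately out of scope.
var identityRefs = []string{
	"github.event.issue.number",
	"github.event.pull_request.number",
	"github.event.inputs.issue_number",
}

func referencesIdentity(expr string) bool {
	for _, ref := range identityRefs {
		if strings.Contains(expr, ref) {
			return true
		}
	}
	return false
}

// SelfCancelWarning returns a message and true only when all four
// conditions from the proposal hold at once.
func SelfCancelWarning(c WorkflowConfig) (string, bool) {
	if c.CancelInProgress && c.CommentTrigger && c.SafeOutputsApp &&
		referencesIdentity(c.GroupExpr) {
		return "App-authored safe outputs may trigger a passive run that " +
			"cancels the originating run in the same concurrency group", true
	}
	return "", false
}

func main() {
	cfg := WorkflowConfig{
		CancelInProgress: true,
		CommentTrigger:   true,
		SafeOutputsApp:   true,
		GroupExpr:        "gh-aw-${{ github.workflow }}-${{ github.event.issue.number || github.run_id }}",
	}
	if msg, warn := SelfCancelWarning(cfg); warn {
		fmt.Println("warning:", msg)
	}
}
```

Matching on known substrings keeps false positives low at the cost of missing custom expressions, which fits the "limited to common patterns" option in the proposal.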
Workaround
The local workaround was to special-case passive issue_comment events so they
get a unique workflow concurrency key:

    concurrency:
      group: >-
        gh-aw-${{ github.workflow }}-${{
        github.event_name == 'issue_comment' &&
        !(startsWith(github.event.comment.body, '/test ') || github.event.comment.body == '/test') &&
        format('passive-comment-{0}', github.run_id) ||
        github.event.issue.number ||
        github.event.pull_request.number ||
        github.event.inputs.issue_number ||
        github.run_id
        }}
      cancel-in-progress: true

That preserves serialization for real /test commands while preventing
App-authored follow-up comments from cancelling the primary run.
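The branching in that expression can be hard to read inline, so here is an approximate Go rendering of the same decision. The Event struct, isCommand, and GroupSuffix are hypothetical names used only for this sketch, empty strings stand in for absent context values, and the run IDs and comment bodies are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// Event holds the context values the workaround expression reads.
type Event struct {
	Name             string // github.event_name
	CommentBody      string // github.event.comment.body
	IssueNumber      string // github.event.issue.number
	PRNumber         string // github.event.pull_request.number
	InputIssueNumber string // github.event.inputs.issue_number
	RunID            string // github.run_id
}

// isCommand reports whether body is the bare command or the command
// followed by a space, matching the two checks in the expression.
func isCommand(body, name string) bool {
	cmd := "/" + name
	return body == cmd || strings.HasPrefix(body, cmd+" ")
}

// GroupSuffix returns the per-run part of the concurrency key.
func GroupSuffix(e Event, command string) string {
	// Passive comments get a key unique to their own run, so they
	// cannot share a group with the run that produced the comment.
	if e.Name == "issue_comment" && !isCommand(e.CommentBody, command) {
		return "passive-comment-" + e.RunID
	}
	// Everything else keeps the original per-issue fallback chain.
	for _, v := range []string{e.IssueNumber, e.PRNumber, e.InputIssueNumber, e.RunID} {
		if v != "" {
			return v
		}
	}
	return ""
}

func main() {
	dispatch := Event{Name: "workflow_dispatch", InputIssueNumber: "15", RunID: "1001"}
	passive := Event{Name: "issue_comment", CommentBody: "Done.", IssueNumber: "15", RunID: "1002"}
	command := Event{Name: "issue_comment", CommentBody: "/test now", IssueNumber: "15", RunID: "1003"}

	fmt.Println(GroupSuffix(dispatch, "test"))
	fmt.Println(GroupSuffix(passive, "test"))
	fmt.Println(GroupSuffix(command, "test"))
}
```

One consequence worth noting: a real /test comment still shares the per-issue key with a dispatched run, so an explicit command can still cancel an in-progress run for the same issue, which matches the stated intent of serializing real commands.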
Environment
- Repository observed on: samuelkahessay/daily-habit-tracker
- gh-aw version observed: v0.62.2
- Original run: 23341636440
- Passive follow-up runs: observed after the App-authored comment
- Relevant gh-aw docs/code:
  - .github/aw/github-agentic-workflows.md
  - docs/src/content/docs/reference/concurrency.md
  - pkg/workflow/safe_outputs_config.go (lines 675-682)