Warn when GitHub App-authored safe-outputs can self-cancel comment-triggered workflows via shared concurrency #21975

@samuelkahessay

Context

I hit a concurrency failure mode in a comment-triggered workflow that uses:

  • workflow-level concurrency.cancel-in-progress: true
  • workflow_dispatch with an issue_number input
  • issue_comment as a trigger
  • safe-outputs.github-app for comments and other write operations

The issue is not that the concurrency key is empty or malformed. The key is
correctly populated. The problem is that a passive follow-up run triggered by
the workflow's own App-authored comment can resolve to the same workflow-level
concurrency group as the primary run that produced the comment.

When that happens, GitHub Actions cancels the original run mid-flight.

This looks like a gh-aw workflow-authoring footgun at the boundary between:

  • documented safe-outputs.github-app behavior
  • documented workflow-level concurrency behavior
  • slash-command / issue-comment style workflows that serialize by issue number

Problem

gh-aw explicitly supports using a GitHub App token for all safe output
operations.

From .github/aw/github-agentic-workflows.md:

  • safe-outputs.github-app is documented as the token used "for all safe output operations"
  • safe-outputs.concurrency-group exists specifically to control safe-outputs
    job concurrency and uses cancel-in-progress: false

That means safe-outputs writes are a first-class product surface, not an
incidental repo customization.

The failure mode is:

  1. a primary run starts in a workflow-level concurrency group keyed by issue or
    dispatch input
  2. safe_outputs posts a comment using a GitHub App token
  3. that comment triggers a passive issue_comment run of the same workflow
  4. the passive run resolves to the same workflow-level concurrency group
  5. cancel-in-progress: true cancels the original run

This is especially easy to hit in workflows that serialize by issue number and
listen to issue_comment for real slash commands.

In other words: App-authored safe outputs can re-enter the same workflow and
collide with the run that emitted them.

I am filing this primarily as a warning/docs/guardrail issue, not as a demand
for gh-aw to infer arbitrary user intent from custom concurrency expressions.

Why this seems upstream-relevant

I am not claiming gh-aw can guess every custom concurrency expression.

The upstream issue is that gh-aw currently exposes all the ingredients for this
failure mode without warning or guardrails:

  • first-class support for safe-outputs.github-app
  • first-class support for workflow-level concurrency with
    cancel-in-progress: true
  • first-class support for comment-triggered workflows / slash-command style
    activation
  • first-class support for safe-outputs job concurrency, but not for the
    workflow-level self-trigger collision this creates

Right now it is easy to build a workflow that is individually valid in each
part, but self-cancels once those parts interact.

I do not think this necessarily implies a universal compiler fix for arbitrary
user-defined concurrency expressions. But it does look like something gh-aw
should warn about or document more explicitly.

Observed reproduction

Observed in:

  • repo: samuelkahessay/daily-habit-tracker
  • gh-aw version: v0.62.2
  • original run: 23341636440

Timeline:

  1. workflow_dispatch for issue #15
  2. workflow-level concurrency group resolves to a per-issue key for #15
  3. agent completes successfully
  4. safe_outputs posts a comment on issue #15 using GitHub App auth
  5. one or more passive issue_comment runs are triggered
  6. those runs resolve to the same workflow-level concurrency key
  7. cancel-in-progress: true cancels the original run
  8. safe_outputs and downstream post-agent jobs are marked cancelled
  9. any final job with its own non-cancelling concurrency group can still
    survive

Important detail:

  • the spawned issue_comment run cancels the original run without ever
    reaching pre_activation or proving that it is actionable
  • workflow-level concurrency is evaluated when the new run starts, before any
    of its jobs execute
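
To make the ordering concrete, here is a deliberately simplified Python model of workflow-level concurrency. It is a sketch for illustration only: the class, method, and run names are hypothetical, and it is not GitHub's implementation. It only captures the point above, that the group is checked when a run is admitted, so job-level conditions in the new run never get a say.

```python
# Simplified, hypothetical model of workflow-level concurrency (not GitHub's code).
class ConcurrencyModel:
    def __init__(self):
        self.in_progress = {}  # group name -> run id currently holding the group
        self.status = {}       # run id -> "in_progress" | "cancelled" | "pending"

    def start_run(self, run_id, group, cancel_in_progress):
        """Admit a new run. The group check happens here, before any job runs."""
        holder = self.in_progress.get(group)
        if holder is None:
            # Nobody holds the group: the new run proceeds.
            self.in_progress[group] = run_id
            self.status[run_id] = "in_progress"
        elif cancel_in_progress:
            # The current holder is cancelled and the new run takes the group.
            self.status[holder] = "cancelled"
            self.in_progress[group] = run_id
            self.status[run_id] = "in_progress"
        else:
            # Without cancel-in-progress the new run waits instead.
            self.status[run_id] = "pending"
        return self.status[run_id]


model = ConcurrencyModel()
model.start_run("primary", "gh-aw-demo-15", cancel_in_progress=True)
# The App-authored comment spawns a passive run that lands in the same group.
model.start_run("passive", "gh-aw-demo-15", cancel_in_progress=True)
print(model.status)  # {'primary': 'cancelled', 'passive': 'in_progress'}
```

In this model the passive run never evaluates whether it is actionable; being admitted into the group is enough to cancel the primary run.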

Concrete local shape that reproduces it

This was reproducible with a workflow-level concurrency pattern like:

concurrency:
  group: >-
    gh-aw-${{ github.workflow }}-${{
      github.event.issue.number ||
      github.event.pull_request.number ||
      github.event.inputs.issue_number ||
      github.run_id
    }}
  cancel-in-progress: true

paired with:

safe-outputs:
  github-app:
    app-id: ${{ vars.APP_ID }}
    private-key: ${{ secrets.APP_PRIVATE_KEY }}
  add-comment:

and an issue_comment trigger in the same workflow.

That is enough for a passive App-authored follow-up comment to enter the same
concurrency group as the original targeted run.
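
The collision can be checked by hand with a small Python model of the group expression above. This is a sketch under assumptions: resolve_group is a hypothetical helper, the workflow name "demo" and the run ids are made up, and Python's `or` stands in for the `||` chain (the first truthy operand wins).

```python
def resolve_group(workflow, event, run_id):
    """Mimic the `||` fallback chain from the concurrency group expression."""
    issue = (event.get("issue") or {}).get("number")
    pull_request = (event.get("pull_request") or {}).get("number")
    dispatch_input = (event.get("inputs") or {}).get("issue_number")
    suffix = issue or pull_request or dispatch_input or run_id
    return f"gh-aw-{workflow}-{suffix}"


# Primary run: workflow_dispatch with issue_number=15 (dispatch inputs are strings).
primary = resolve_group("demo", {"inputs": {"issue_number": "15"}}, run_id=1001)
# Passive run: issue_comment fired by the App-authored comment on issue 15.
passive = resolve_group("demo", {"issue": {"number": 15}}, run_id=1002)

print(primary)  # gh-aw-demo-15
print(passive)  # gh-aw-demo-15
```

Both events resolve to the same key even though they arrive through different fields of the event payload, which is why the per-run github.run_id fallback never comes into play.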

Minimal standalone repro shape

This does not depend on any repo-specific prompt content. The minimal shape is:

---
on:
  workflow_dispatch:
    inputs:
      issue_number:
        required: false
  slash_command:
    name: test
    events: [issue_comment]
concurrency:
  group: >-
    gh-aw-${{ github.workflow }}-${{
      github.event.issue.number ||
      github.event.inputs.issue_number ||
      github.run_id
    }}
  cancel-in-progress: true
safe-outputs:
  github-app:
    app-id: ${{ vars.APP_ID }}
    private-key: ${{ secrets.APP_PRIVATE_KEY }}
  add-comment:
engine: copilot
---

Post a comment to the triggering issue.

If that workflow is dispatched with an issue_number, and safe_outputs posts
an App-authored comment back onto that same issue, the passive issue_comment
follow-up run can enter the same workflow-level concurrency group and cancel
the original run.

Product evidence

gh-aw explicitly documents GitHub App auth for safe-outputs

From .github/aw/github-agentic-workflows.md:

  • safe-outputs.github-app is documented as the token used "for all safe
    output operations"

So App-authored follow-up writes are a supported configuration, not an odd
hack.

gh-aw already recognizes that safe-outputs needs its own concurrency handling

From docs/src/content/docs/reference/concurrency.md:

  • safe-outputs.concurrency-group exists to serialize the safe_outputs job
  • when set, it uses cancel-in-progress: false

So gh-aw already treats safe-outputs concurrency as something that needs
product-level controls. The missing piece is the interaction with
workflow-level concurrency when safe-outputs itself can trigger new runs.

safe-outputs.concurrency-group does not protect against this workflow-level collision

The existing safe-outputs.concurrency-group control is useful, but it only
changes concurrency on the safe_outputs job itself.

It does not prevent a passive comment-triggered workflow run from entering the
same workflow-level concurrency group and cancelling the originating run.

That makes this particular failure mode easy to miss even for users who are
already trying to use the safe-outputs concurrency controls correctly.

Top-level GitHub App config falls through to safe-outputs automatically

From pkg/workflow/safe_outputs_config.go, lines 675-682:

// mergeAppFromIncludedConfigs merges app configuration from included safe-outputs configurations
// If the top-level workflow has an app configured, it takes precedence
// Otherwise, the first app configuration found in included configs is used
func (c *Compiler) mergeAppFromIncludedConfigs(...) {
    // If top-level workflow already has app configured, use it (no merge needed)
    if topSafeOutputs != nil && topSafeOutputs.GitHubApp != nil {
        safeOutputsAppLog.Print("Using top-level app configuration")
        return topSafeOutputs.GitHubApp, nil
    }

So a workflow that configures github-app at the top level automatically gets
App-authored safe-outputs without explicitly setting safe-outputs.github-app.
That makes this collision easier to hit: any top-level App config is enough.

Impact

This can cause the original run to self-cancel after the main agent work is
done but before all post-agent jobs complete.

In the observed case that meant:

  • safe_outputs ended as cancelled
  • downstream post-agent work was also cancelled
  • the run looked partially successful even though post-agent state was cut off

Any workflow with the same trigger/concurrency pattern is exposed, especially:

  • targeted issue implementation flows
  • slash-command workflows that listen on issue_comment
  • workflows that post App-authored status/progress comments back onto the same
    issue or PR they serialize on

Expected behavior

gh-aw should make this interaction harder to trip over.

At minimum, users should not have to discover through an incident that:

  • safe-outputs.github-app can re-trigger the same workflow via comments
  • those passive follow-up runs can cancel the original run before they even
    reach activation

Proposed fixes

I think any of these would be a reasonable upstream response. A docs-only or
warning-only fix would already be useful:

  1. Docs warning in concurrency and safe-outputs references

    • explicitly document that GitHub App-authored safe outputs can re-trigger
      comment-based workflows and collide with workflow-level concurrency groups
    • explicitly note that safe-outputs.concurrency-group does not protect
      against this specific workflow-level self-cancellation pattern
  2. Heuristic compiler warning for the dangerous combination

    • warn when all of the following are true:
      • workflow-level cancel-in-progress: true
      • comment-based triggers are enabled
      • safe-outputs.github-app is enabled
      • workflow concurrency group references issue / PR identity
    • explain that passive App-authored follow-up events can cancel the
      originating run
    • if this cannot be detected reliably for arbitrary expressions, it could be
      limited to common patterns or skipped in favor of docs only
  3. Provide a documented passive-event pattern

    • recommend giving passive comment-triggered runs a unique key such as
      github.run_id, while preserving per-issue serialization for real slash
      commands / targeted dispatch
  4. Longer-term: add a first-class helper or frontmatter abstraction

    • e.g. a built-in way to distinguish passive follow-up comment events from
      real command activations in concurrency configuration
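
As a rough illustration of option 2, the four conditions could be checked with a simple substring heuristic. The Python sketch below is hypothetical and is not gh-aw's compiler: self_cancel_warning, the constant names, and the warning text are invented here, and the frontmatter is assumed to be already parsed into a plain dict. It only recognizes the two common identity references, in line with the "limited to common patterns" caveat.

```python
# Hypothetical lint sketch; not gh-aw's compiler. Assumes frontmatter is a dict.
IDENTITY_REFS = ("github.event.issue.number", "github.event.pull_request.number")
COMMENT_TRIGGERS = ("issue_comment", "slash_command")


def self_cancel_warning(frontmatter):
    """Return a warning string when all four listed conditions hold, else None."""
    concurrency = frontmatter.get("concurrency")
    if not isinstance(concurrency, dict):
        return None  # no mapping form, so cancel-in-progress cannot be set here
    triggers = frontmatter.get("on") or {}
    safe_outputs = frontmatter.get("safe-outputs") or {}
    group = str(concurrency.get("group", ""))
    if (
        concurrency.get("cancel-in-progress") is True
        and any(name in triggers for name in COMMENT_TRIGGERS)
        and "github-app" in safe_outputs
        and any(ref in group for ref in IDENTITY_REFS)
    ):
        return (
            "App-authored safe-output comments can re-trigger this workflow "
            "and cancel the originating run via the shared concurrency group"
        )
    return None


repro = {
    "on": {
        "workflow_dispatch": {},
        "slash_command": {"name": "test", "events": ["issue_comment"]},
    },
    "concurrency": {
        "group": "gh-aw-${{ github.workflow }}-${{ github.event.issue.number || github.run_id }}",
        "cancel-in-progress": True,
    },
    "safe-outputs": {"github-app": {}, "add-comment": None},
}
print(self_cancel_warning(repro) is not None)  # True
```

A substring match like this will miss unusual expressions and can flag workflows that already special-case passive events, so it fits a warning better than a hard error.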

Workaround

The local workaround was to special-case passive issue_comment events so they
get a unique workflow concurrency key:

concurrency:
  group: >-
    gh-aw-${{ github.workflow }}-${{
      github.event_name == 'issue_comment' &&
      !(startsWith(github.event.comment.body, '/test ') || github.event.comment.body == '/test') &&
      format('passive-comment-{0}', github.run_id) ||
      github.event.issue.number ||
      github.event.pull_request.number ||
      github.event.inputs.issue_number ||
      github.run_id
    }}
  cancel-in-progress: true

That preserves serialization for real /test commands while preventing
App-authored follow-up comments from cancelling the primary run.
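
The effect of the workaround expression can be sanity-checked with the same kind of Python model. Again this is a sketch under assumptions: resolve_group_with_workaround is a hypothetical helper, the workflow name and run ids are made up, and Python's `and`/`or` stand in for `&&`/`||`. The comment body is lowercased because GitHub Actions string comparisons and startsWith ignore case.

```python
def resolve_group_with_workaround(workflow, event_name, event, run_id):
    """Mimic the workaround: passive comments get a unique, run-scoped key."""
    body = ((event.get("comment") or {}).get("body") or "").lower()
    is_command = body.startswith("/test ") or body == "/test"
    # False for non-passive events, otherwise the unique passive key.
    passive_key = (
        event_name == "issue_comment"
        and not is_command
        and f"passive-comment-{run_id}"
    )
    issue = (event.get("issue") or {}).get("number")
    pull_request = (event.get("pull_request") or {}).get("number")
    dispatch_input = (event.get("inputs") or {}).get("issue_number")
    suffix = passive_key or issue or pull_request or dispatch_input or run_id
    return f"gh-aw-{workflow}-{suffix}"


dispatch = resolve_group_with_workaround(
    "demo", "workflow_dispatch", {"inputs": {"issue_number": "15"}}, 1001)
command = resolve_group_with_workaround(
    "demo", "issue_comment",
    {"issue": {"number": 15}, "comment": {"body": "/test now"}}, 1002)
passive = resolve_group_with_workaround(
    "demo", "issue_comment",
    {"issue": {"number": 15}, "comment": {"body": "Done."}}, 1003)

print(dispatch)  # gh-aw-demo-15
print(command)   # gh-aw-demo-15
print(passive)   # gh-aw-demo-passive-comment-1003
```

One limitation follows directly from the expression: a comment whose body itself starts with "/test " still resolves to the per-issue key, whoever authored it.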

Environment

  • Repository observed on: samuelkahessay/daily-habit-tracker
  • gh-aw version observed: v0.62.2
  • Original run: 23341636440
  • Passive follow-up runs: observed after the App-authored comment
  • Relevant gh-aw docs/code:
    • .github/aw/github-agentic-workflows.md
    • docs/src/content/docs/reference/concurrency.md
    • pkg/workflow/safe_outputs_config.go (lines 675-682)
