on.github-app credentials cannot be sourced from a custom job's outputs (jobs.{pre_activation,activation}.pre-steps splicing/needs bugs + missing on.needs API) #27670

@bbonafed

Summary

Follow-up to #26719 / #27472. The "extend safe-outputs.needs from frontmatter" fix shipped in v0.69.1 solves the cross-job credential plumbing problem for safe_outputs only. The same problem still exists for the pre_activation and activation jobs that mint a GitHub App token for on.github-app (skip-if-no-match queries, reaction posting, role/membership checks, etc.).

Concretely, three things stand in the way of fully sourcing on.github-app credentials from an external secret manager via a custom workflow job (the pattern the maintainer hinted at when closing #27472):

  1. Compiler bug — YAML splicing corruption in jobs.pre_activation.pre-steps and jobs.activation.pre-steps: any pre-step (whether uses: or run:) gets injected inside the auto-emitted Setup Scripts step, producing invalid YAML with duplicate top-level keys.
  2. Compiler bug — needs: duplication when jobs.activation.pre-steps is set: the activation job is added a second time to the agent's needs: list (and equivalent built-in jobs would form self-cycles if the same pattern were tried on jobs.agent.pre-steps, jobs.safe_outputs.pre-steps, or jobs.conclusion.pre-steps).
  3. Missing API — no on.needs (or equivalent) for pre_activation/activation: even if (1) and (2) were fixed, pre_activation has no mechanism to declare a dependency on a custom job at all (it is the structurally-first job), and activation only auto-picks-up custom jobs that declare needs: [pre_activation]. There is no symmetric counterpart to safe-outputs.needs on the on: side.

The combined effect: tools.github.github-app (in agent) and safe-outputs.github-app (in safe_outputs) can both be fully driven from ${{ needs.<custom-job>.outputs.* }} in v0.69.1. But on.github-app (in pre_activation / activation) still requires ${{ secrets.* }} / ${{ vars.* }} for app-id / private-key. Workflows that aren't allowed to put GitHub App credentials in GitHub Actions Secrets at all (e.g. enterprises that mandate an external secret manager for all credentials) cannot use on.github-app end-to-end.

Reproduction

Reproduced on v0.69.1 with a spike workflow that tries to source on.github-app via a custom secrets_fetcher job whose outputs come from a hypothetical external-secret-manager action:

---
on:
  schedule:
    - cron: "0 * * * *"
  workflow_dispatch:
  github-app:
    app-id: ${{ needs.secrets_fetcher.outputs.app_id }}
    private-key: ${{ needs.secrets_fetcher.outputs.app_private_key }}
    owner: "example-org"
    repositories: ["*"]
  skip-if-no-match: '...some search query...'

permissions:
  contents: read
  issues: read

engine: copilot

jobs:
  secrets_fetcher:
    runs-on: ubuntu-latest
    outputs:
      app_id: ${{ steps.get-secrets.outputs.WORKFLOW_APP_ID }}
      app_private_key: ${{ steps.get-secrets.outputs.WORKFLOW_APP_PRIVATE_KEY }}
    steps:
      - name: Get App Credentials from External Secret Manager
        id: get-secrets
        uses: org-internal/secrets@v2
        with:
          username: ${{ secrets.SECRET_MANAGER_USERNAME }}
          password: ${{ secrets.SECRET_MANAGER_PASSWORD }}
          secrets_to_retrieve: |
            WORKFLOW_APP_ID,
            WORKFLOW_APP_PRIVATE_KEY
---

# Spike: full external-secret-manager backing for on.github-app

make compile succeeds (the expression validator accepts needs.secrets_fetcher.outputs.* in on.github-app), but at runtime the pre_activation and activation jobs cannot resolve needs.secrets_fetcher.outputs.* because neither job declares secrets_fetcher as a dependency. There is no frontmatter knob to add it.

The only workarounds the docs / source suggest are:

a) Try jobs.pre_activation.pre-steps: to fetch the secrets in-job and read them via steps.get-secrets.outputs.* — this hits the YAML splicing bug below.

b) Try jobs.activation.pre-steps: — same splicing bug, plus the needs: duplication bug below.

c) Define secrets_fetcher with needs: [pre_activation] so that configureActivationNeedsAndCondition auto-adds it to activation.needs — this works for activation but does nothing for pre_activation, which still has no way to depend on a custom job.

Bug 1: YAML splicing corruption in jobs.{pre_activation,activation}.pre-steps

Reproduction

Add a single pre-step under jobs.pre_activation:

jobs:
  pre_activation:
    pre-steps:
      - name: Fetch app credentials
        id: get-secrets
        uses: org-internal/secrets@v2
        with:
          secrets_to_retrieve: |
            WORKFLOW_APP_ID

Expected (from ADR-27138 §"Compilation and Placement — Built-in Jobs")

The pre-step is inserted after the step with id: setup and before the first actions/checkout@* step.

Actual

The pre-step is spliced between two lines of the existing Setup Scripts step, producing invalid YAML such as:

      - name: Setup Scripts
        id: setup
      - name: Fetch app credentials             # <-- inserted HERE
        id: get-secrets
        uses: org-internal/secrets@v2
        with:
          secrets_to_retrieve: |
            WORKFLOW_APP_ID
        uses: <pinned setup action>             # <-- duplicate key, orphaned tail of original Setup Scripts step
        with:
          destination: ...
          job-name: ${{ github.job }}

This is rejected by actionlint (and by GitHub Actions at runtime) as a duplicate uses: / duplicate with: key.

Root cause

pkg/workflow/compiler_jobs.go:

  • exactSetupStepIDPattern = regexp.MustCompile(`(?m)^\s*id:\s*setup\s*$`) is the marker insertPreStepsAfterSetupBeforeCheckout uses to find the splice point.

  • pkg/workflow/compiler_yaml_step_generation.go generateSetupStep (and the equivalent helpers for the activation/pre-activation Setup Scripts step) returns the step as multiple separate []string entries — one entry per YAML line:

    lines := []string{
        "      - name: Setup Scripts\n",                                    // index N
        "        id: setup\n",                                              // index N+1   <-- regex matches this line only
        fmt.Sprintf("        uses: %s\n", setupActionRef),                  // index N+2
        "        with:\n",                                                  // index N+3
        ...,
    }
  • insertPreStepsAfterSetupBeforeCheckout then sets lastSetupIdx = N+1 and insertIdx = lastSetupIdx + 1 = N+2, splicing the pre-steps between id: setup and the rest of the same step's lines instead of after the whole step.

ADR-27138 §"Negative" already flagged that "the step insertion logic depends on detecting the id: setup marker in the serialized YAML step string, which is a fragile heuristic", but the actual splicing implementation assumes one step == one []string entry, which doesn't hold for the auto-generated Setup Scripts step on built-in jobs.
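The off-by-one described above can be reproduced outside the compiler. The following is a minimal sketch, not the actual gh-aw code: naiveInsertIndex and spliceAt are hypothetical helpers, the regex is assumed to have the shape of exactSetupStepIDPattern, and example/setup@v1 is a placeholder action reference.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Assumed shape of the marker pattern: a line holding only "id: setup".
var setupIDPattern = regexp.MustCompile(`(?m)^\s*id:\s*setup\s*$`)

// naiveInsertIndex mimics the reported behaviour: find the last entry that
// matches the marker and splice directly after it.
func naiveInsertIndex(steps []string) int {
	last := -1
	for i, s := range steps {
		if setupIDPattern.MatchString(s) {
			last = i
		}
	}
	return last + 1
}

// spliceAt returns a new slice with pre inserted before steps[idx].
func spliceAt(steps, pre []string, idx int) []string {
	out := make([]string, 0, len(steps)+len(pre))
	out = append(out, steps[:idx]...)
	out = append(out, pre...)
	return append(out, steps[idx:]...)
}

func main() {
	// One slice entry per YAML line, as the issue describes for Setup Scripts.
	steps := []string{
		"      - name: Setup Scripts\n",
		"        id: setup\n",
		"        uses: example/setup@v1\n", // placeholder action ref
		"        with:\n",
		"          destination: /tmp/example\n",
	}
	pre := []string{
		"      - name: Fetch app credentials\n",
		"        run: echo fetching\n",
	}
	out := spliceAt(steps, pre, naiveInsertIndex(steps))
	fmt.Print(strings.Join(out, ""))
}
```

With line-per-entry input the pre-step lands between id: setup and uses:, which is the corruption shown under "Actual". If the whole Setup Scripts step were a single entry, the same index arithmetic would place the pre-step after the step.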

Suggested fix

Two options:

  • A (preferred): Treat the Setup Scripts step as a single unit by either (i) emitting it as a single []string entry containing the full multi-line YAML, or (ii) tracking the end of the step (e.g. by noticing the next entry that begins with - at the same indentation) and using that as the splice point instead of lastSetupIdx + 1.
  • B: Add a sentinel marker emitted at the end of the Setup Scripts step (e.g. a comment line # end-setup) and have insertPreStepsAfterSetupBeforeCheckout splice after that sentinel.

Either fix should be covered by a new test in pkg/workflow/compiler_jobs_test.go that compiles a workflow with jobs.pre_activation.pre-steps containing both a uses:-style step and a run:-style step, and asserts the compiled .lock.yml is valid YAML and passes actionlint.
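Option A(ii) could look roughly like the sketch below. It is only an illustration under assumptions: endOfSetupStep and indentOf are hypothetical names, the regex is assumed to match exactSetupStepIDPattern, and the real compiler may track steps differently. The idea is to locate the "- " line that opens the step containing id: setup and then scan forward to the next "- " entry at the same indentation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var setupIDPattern = regexp.MustCompile(`(?m)^\s*id:\s*setup\s*$`)

// indentOf returns the number of leading spaces of an entry.
func indentOf(line string) int {
	return len(line) - len(strings.TrimLeft(line, " "))
}

// startsStep reports whether an entry opens a new list item ("- ...").
func startsStep(line string) bool {
	return strings.HasPrefix(strings.TrimLeft(line, " "), "- ")
}

// endOfSetupStep returns the index just past the step that contains the
// "id: setup" marker, or -1 when no marker is found.
func endOfSetupStep(lines []string) int {
	marker := -1
	for i, l := range lines {
		if setupIDPattern.MatchString(l) {
			marker = i
		}
	}
	if marker < 0 {
		return -1
	}
	// Walk back to the entry that opens the step holding the marker.
	start := marker
	for start > 0 && !startsStep(lines[start]) {
		start--
	}
	dash := indentOf(lines[start])
	// Walk forward to the next step at the same indentation.
	for i := marker + 1; i < len(lines); i++ {
		if startsStep(lines[i]) && indentOf(lines[i]) == dash {
			return i
		}
	}
	return len(lines)
}

func main() {
	lines := []string{
		"      - name: Setup Scripts\n",
		"        id: setup\n",
		"        uses: example/setup@v1\n", // placeholder action ref
		"        with:\n",
		"          destination: /tmp/example\n",
		"      - name: Checkout\n",
		"        uses: actions/checkout@v4\n",
	}
	fmt.Println("splice at index:", endOfSetupStep(lines))
}
```

Because the walk-back stops immediately when the matching entry itself starts with "- ", the same function also handles a step emitted as one multi-line entry, so it would not depend on which generator produced the Setup Scripts step.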

Bug 2: needs: duplication / would-be self-cycle in jobs.<builtin>.pre-steps

Reproduction

Add jobs.activation.pre-steps: to any workflow. The compiled agent job ends up with needs: [activation, activation]. (For jobs.agent.pre-steps:, jobs.safe_outputs.pre-steps:, or jobs.conclusion.pre-steps:, the same path attempts to add the built-in to its own needs: list, producing a self-cycle that GitHub Actions rejects.)

Root cause

pkg/workflow/compiler_main_job.go (current main):

// Skip jobs.pre-activation (or pre_activation) as it's handled specially
if jobName == string(constants.PreActivationJobName) || jobName == "pre-activation" {
    continue
}

// Only add as direct dependency if it doesn't depend on pre_activation or agent
if configMap, ok := data.Jobs[jobName].(map[string]any); ok {
    if !jobDependsOnPreActivation(configMap) && !jobDependsOnAgent(configMap) {
        depends = append(depends, jobName)
    }
}

This loop iterates data.Jobs, which (after ADR-27138 introduced jobs.<builtin>.pre-steps) now legitimately contains entries keyed by built-in job names (agent, activation, safe_outputs, conclusion, …) that are intended as customization-only, not as dependencies. The skip list only covers pre_activation / pre-activation, so:

  • activation → re-added to agent.Needs (already there from the unconditional depends = []string{"activation"} above) → duplicate
  • agent → would be added to agent.Needs → self-cycle
  • safe_outputs → added to agent.Needs, but safe_outputs already needs agent → cycle
  • conclusion → similar cycle
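The four outcomes listed above are simple properties of the needs: graph and can be checked mechanically. The sketch below is a standalone illustration; duplicateNeeds and hasCycle are hypothetical helpers and not part of the gh-aw codebase.

```go
package main

import "fmt"

// duplicateNeeds returns every entry that repeats an earlier one.
func duplicateNeeds(needs []string) []string {
	seen := map[string]bool{}
	var dups []string
	for _, n := range needs {
		if seen[n] {
			dups = append(dups, n)
		}
		seen[n] = true
	}
	return dups
}

// hasCycle runs a depth-first search over job -> needs edges and reports
// whether any job can reach itself.
func hasCycle(graph map[string][]string) bool {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := map[string]int{}
	var visit func(job string) bool
	visit = func(job string) bool {
		switch state[job] {
		case inProgress:
			return true // back edge: the job is already on the current path
		case done:
			return false
		}
		state[job] = inProgress
		for _, dep := range graph[job] {
			if visit(dep) {
				return true
			}
		}
		state[job] = done
		return false
	}
	for job := range graph {
		if visit(job) {
			return true
		}
	}
	return false
}

func main() {
	// jobs.activation.pre-steps case: duplicate entry, no cycle.
	fmt.Println(duplicateNeeds([]string{"activation", "activation"}))
	// jobs.safe_outputs.pre-steps case: agent <-> safe_outputs cycle.
	fmt.Println(hasCycle(map[string][]string{
		"agent":        {"activation", "safe_outputs"},
		"safe_outputs": {"agent"},
	}))
}
```

A check along these lines in the compiler's validation pass would flag both symptoms regardless of which code path produced them.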

Suggested fix

Extend the skip-list to all built-in job names that ADR-27138 explicitly recognises as customization-only targets:

if isBuiltInJobName(jobName) {
    continue
}

…where isBuiltInJobName recognises (at minimum): pre_activation, pre-activation, activation, agent, safe_outputs, safe-outputs, conclusion, and any future built-in jobs. ADR-27138 §"Duplicate Job Prevention" already establishes this categorisation; the dependency-addition loop just hasn't been updated to honour it.

Add regression tests under pkg/workflow/compiler_main_job_test.go for each built-in job name, asserting that a frontmatter jobs.<builtin>.pre-steps entry compiles cleanly and does not appear in any other job's needs: list.
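A minimal sketch of the proposed skip-list follows. It is an assumption-laden illustration: the name set only covers the built-ins named in this issue, agentNeeds is a hypothetical stand-in for the dependency loop, and the existing jobDependsOnPreActivation / jobDependsOnAgent checks are left out for brevity.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Built-in job names mentioned in this issue; the real list may be longer.
var builtInJobNames = map[string]bool{
	"pre_activation": true,
	"activation":     true,
	"agent":          true,
	"safe_outputs":   true,
	"conclusion":     true,
}

// isBuiltInJobName accepts both underscore and hyphen spellings.
func isBuiltInJobName(name string) bool {
	return builtInJobNames[strings.ReplaceAll(name, "-", "_")]
}

// agentNeeds is a simplified stand-in for the dependency loop: it starts
// from the unconditional "activation" entry and adds only custom jobs.
func agentNeeds(jobs map[string]any) []string {
	depends := []string{"activation"}
	names := make([]string, 0, len(jobs))
	for name := range jobs {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic output for the example
	for _, name := range names {
		if isBuiltInJobName(name) {
			continue // customization-only entry, never a dependency
		}
		depends = append(depends, name)
	}
	return depends
}

func main() {
	jobs := map[string]any{
		"activation":      map[string]any{"pre-steps": []any{}},
		"safe-outputs":    map[string]any{"pre-steps": []any{}},
		"secrets_fetcher": map[string]any{"runs-on": "ubuntu-latest"},
	}
	fmt.Println(agentNeeds(jobs))
}
```

Normalising hyphens to underscores keeps the check to a single lookup and covers the pre-activation / safe-outputs spellings without listing each variant.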

Limitation 3: no on.needs (or equivalent) on the on: side

Even with Bug 1 and Bug 2 fixed, sourcing on.github-app credentials from a custom job's outputs is not possible because:

  • pkg/workflow/compiler_pre_activation_job.go buildPreActivationJob does not consume any user-supplied needs: for pre_activation. The pre-activation job is structurally the first in the graph and has no slot to depend on a custom job.
  • pkg/workflow/compiler_activation_job_builder.go configureActivationNeedsAndCondition only auto-picks-up custom jobs that already declare needs: [pre_activation] (via getCustomJobsDependingOnPreActivation). There is no symmetric on.needs: / activation.needs: knob equivalent to the new safe-outputs.needs:.

So even if a secrets_fetcher custom job is declared with needs: [pre_activation] and activation therefore depends on it, pre_activation itself still cannot reference ${{ needs.secrets_fetcher.outputs.* }} in its on.github-app.app-id / private-key fields, because pre_activation has no mechanism to depend on secrets_fetcher.

Suggested fix

Mirror the new safe-outputs.needs API on the on: side. Two reasonable shapes:

  • A (preferred, parallel to safe-outputs.needs): add on.needs: [<custom-job-name>, …]. The compiler then:

    • Emits these as additional needs: for both pre_activation and activation.
    • Skips the "is structurally first" assumption for pre_activation when on.needs is non-empty (the listed jobs become its dependencies).
    • Validates that any ${{ needs.<job>.outputs.* }} reference in on.github-app.* resolves against on.needs.
  • B: introduce a dedicated pre-activation.needs / activation.needs pair under jobs.<builtin> that is honoured by the existing Needs: field of those jobs (analogous to how safe-outputs.needs was wired in for safe_outputs). This keeps on: purely about triggers and concentrates the dependency knob alongside the other per-job customisation that ADR-27138 introduced.

Either shape is a small, additive frontmatter change with the same surface area as the safe-outputs.needs work in #27472 — the difference is purely which job's needs: list it feeds.
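The validation step in option A (every needs.<job>.outputs.* reference in on.github-app.* must resolve against on.needs) could be sketched as below. on.needs does not exist today, and unresolvedNeeds and needsRef are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// needsRef captures the job name in expressions like needs.<job>.outputs.<x>.
var needsRef = regexp.MustCompile(`\bneeds\.([A-Za-z_][A-Za-z0-9_-]*)\.outputs\.`)

// unresolvedNeeds returns the sorted job names referenced in exprs that are
// not declared in onNeeds (the proposed on.needs list).
func unresolvedNeeds(exprs, onNeeds []string) []string {
	declared := map[string]bool{}
	for _, n := range onNeeds {
		declared[n] = true
	}
	missing := map[string]bool{}
	for _, e := range exprs {
		for _, m := range needsRef.FindAllStringSubmatch(e, -1) {
			if !declared[m[1]] {
				missing[m[1]] = true
			}
		}
	}
	out := make([]string, 0, len(missing))
	for n := range missing {
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	// The on.github-app values from the spike workflow in this issue.
	exprs := []string{
		"${{ needs.secrets_fetcher.outputs.app_id }}",
		"${{ needs.secrets_fetcher.outputs.app_private_key }}",
	}
	fmt.Println(unresolvedNeeds(exprs, nil))
	fmt.Println(unresolvedNeeds(exprs, []string{"secrets_fetcher"}))
}
```

Running such a check at compile time would turn the current silent runtime failure into a compile error that names the missing dependency.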

Why this matters

on.github-app is the only configuration site where GitHub App credentials are required to live in ${{ secrets.* }} / ${{ vars.* }} after v0.69.1. For deployment environments that mandate an external secret manager and prohibit storing application credentials in GitHub Actions Secrets at all, this single remaining gap blocks the entire on.github-app feature surface (skip-if-no-match, reactions, role/membership checks via App auth, cross-org App-token lookups, etc.). The only options today are a hybrid setup that defeats the secret-manager mandate or giving up those features entirely.

Environment

  • gh-aw version: v0.69.1
  • Engine: copilot (also reproduces on claude)
  • Trigger: schedule + workflow_dispatch in the spike (any trigger that requires on.github-app for pre_activation skip-if-no-match or activation reactions)
