Fix template injection validator false positives when env: follows run: #11586

Merged
pelikhan merged 4 commits into main from copilot/fix-template-injection-validator on Jan 24, 2026

Copilot AI commented Jan 24, 2026

The template injection validator incorrectly flagged expressions in env: blocks when the YAML serializer outputs env: after run: (common in custom job steps). The original implementation used regex patterns to extract run blocks, which was fragile and couldn't properly distinguish between run content and subsequent YAML keys.

Changes

Refactored to use proper YAML parsing instead of regex

  • Removed complex regex pattern (runBlockRegex) that tried to match YAML structure
  • Implemented extractRunBlocks() function that recursively walks the parsed YAML tree
  • Uses github.com/goccy/go-yaml to parse workflow and extract all run: field values
  • More robust approach that handles any YAML key ordering correctly
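The tree walk described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it assumes the workflow has already been decoded into generic Go values (`map[string]any`, `[]any`, scalars), for example by unmarshalling the YAML into an `any`. The name `collectRunBlocks` is hypothetical, and the real `extractRunBlocks()` may differ in signature and detail.

```go
package main

import "fmt"

// collectRunBlocks walks a decoded YAML tree and returns the string value of
// every "run" key it finds. Hypothetical helper for illustration only.
func collectRunBlocks(node any) []string {
	var blocks []string
	switch n := node.(type) {
	case map[string]any:
		for key, value := range n {
			if key == "run" {
				// Only string values are shell scripts; other types are skipped.
				if s, ok := value.(string); ok {
					blocks = append(blocks, s)
				}
				continue
			}
			// Sibling keys such as env: are walked, but their scalar
			// values are never collected as run content.
			blocks = append(blocks, collectRunBlocks(value)...)
		}
	case []any:
		for _, item := range n {
			blocks = append(blocks, collectRunBlocks(item)...)
		}
	}
	return blocks
}

func main() {
	// Hand-built stand-in for a parsed step where env: follows run:.
	step := map[string]any{
		"name": "Sign image",
		"run":  "echo \"digest: $DIGEST\"\n",
		"env":  map[string]any{"DIGEST": "${{ steps.build.outputs.digest }}"},
	}
	workflow := map[string]any{
		"jobs": map[string]any{
			"build": map[string]any{"steps": []any{step}},
		},
	}
	for _, block := range collectRunBlocks(workflow) {
		fmt.Printf("%q\n", block)
	}
}
```

Because the walk works on parsed keys rather than on text layout, the position of `env:` relative to `run:` in the serialized file has no effect on what is collected.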

Added comprehensive test coverage

  • 11 tests for YAML key ordering variations (env before/after run with both pipe indicators)
  • 10 tests for invalid YAML handling (malformed syntax, empty YAML, parse errors)
  • 14 tests for YAML parsing edge cases (non-string values, complex nesting, multiline formats)
  • Fixed existing test YAML syntax to use proper quoting for valid YAML
  • Total: 69 template injection tests (all passing)

Example

Before: False positive when env follows run

- name: Sign image
  run: |
    echo "digest: $DIGEST"
  env:
    DIGEST: ${{ steps.build.outputs.digest }}  # Incorrectly flagged as unsafe

After: Correctly recognized as safe

# Same YAML now compiles without errors

Why YAML Parsing Is Better

Previous approach (regex):

  • Complex regex trying to parse YAML structure patterns
  • Fragile - broke with different key orderings
  • Hard to maintain and understand
  • Required special patterns to detect YAML boundaries

Current approach (YAML parsing):

  • Leverages proper YAML parser to understand structure
  • Works with any key ordering automatically
  • More maintainable and easier to understand
  • Extracts actual run: field values from parsed tree
  • Handles invalid YAML gracefully (skips validation without crashing)

Test Coverage

The validator is now thoroughly tested with:

  • Valid YAML with safe patterns (expressions in env blocks)
  • Valid YAML with unsafe patterns (expressions in run blocks)
  • Invalid/malformed YAML (graceful error handling)
  • Edge cases (null values, non-string fields, complex nesting)
  • All YAML multiline formats and chomping indicators

The validator handles env blocks correctly regardless of YAML serialization order while maintaining detection of actual template injection vulnerabilities.

Original prompt

This section details the original issue you should resolve

<issue_title>Template injection validator incorrectly flags env: blocks due to YAML serialization order</issue_title>
<issue_description>## Bug Description

The template injection validator incorrectly flags expressions in env: blocks as unsafe, even though using environment variables is the documented safe pattern. This happens because:

  1. The validator regex expects env: to appear before run: in the YAML
  2. The YAML serializer outputs run: before env: for custom job steps
  3. The regex then captures env: blocks as part of run: blocks, flagging safe code as unsafe

Reproduction Steps

  1. Create a workflow with a custom job that uses step outputs in env blocks:
---
name: Test Workflow
on: workflow_dispatch
permissions:
  contents: read
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Build
        id: build
        uses: docker/build-push-action@v5
        with:
          context: ./
          
      - name: Sign image
        env:
          DIGEST: ${{ steps.build.outputs.digest }}
        run: |
          echo "Signing image with digest: $DIGEST"
---
Test prompt
  2. Run gh aw compile

  3. Observe the error:

✗ error: template injection vulnerabilities detected in compiled workflow

  steps.*.outputs context (1 occurrence(s)):
    - ${{ steps.build.outputs.digest }}
      in: DIGEST: ${{ steps.build.outputs.digest }}

Expected Behavior

The workflow should compile successfully because the expression is used in an env: block, which is the documented safe pattern.

Actual Behavior

The compiler flags the expression as unsafe, even though it's in an env: block.

Root Cause Analysis

1. Validator Regex Issue

In pkg/workflow/template_injection_validation.go:69, the regex for multi-line run blocks:

runBlockRegex = regexp.MustCompile(`(?m)^\s+run:\s*\|\s*\n((?:[ \t]+.+\n?)+?)\s*(?:^[ \t]*-\s|\z)|^\s+run:\s*(.+)$`)

This regex captures content until it finds:

  • A line starting with - (next step)
  • End of string

But env: lines at the same indentation don't match either pattern, so they get included in the capture.
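The over-capture can be reproduced in isolation with a small program. This is a sketch for illustration: the regex is copied verbatim from the issue text above, while the two sample inputs and the helper names `capturedRunBlock` and `capturesExpression` are assumptions made for the demo and are not part of the validator.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// runBlockRegex is copied from the issue text.
var runBlockRegex = regexp.MustCompile(`(?m)^\s+run:\s*\|\s*\n((?:[ \t]+.+\n?)+?)\s*(?:^[ \t]*-\s|\z)|^\s+run:\s*(.+)$`)

// runFirst is a step serialized with run: before env:.
const runFirst = `      - name: Sign image
        run: |
          echo "$DIGEST"
        env:
          DIGEST: ${{ steps.build.outputs.digest }}
`

// envFirst is the same step serialized with env: before run:.
const envFirst = `      - env:
          DIGEST: ${{ steps.build.outputs.digest }}
        name: Sign image
        run: |
          echo "$DIGEST"
`

// capturedRunBlock returns what the regex treats as the multi-line run body
// (capture group 1) for the first match, or "" when nothing matches.
func capturedRunBlock(yamlText string) string {
	m := runBlockRegex.FindStringSubmatch(yamlText)
	if m == nil {
		return ""
	}
	return m[1]
}

// capturesExpression reports whether the captured "run body" contains a
// GitHub Actions expression.
func capturesExpression(yamlText string) bool {
	return strings.Contains(capturedRunBlock(yamlText), "${{")
}

func main() {
	fmt.Println("run first:", capturesExpression(runFirst))
	fmt.Println("env first:", capturesExpression(envFirst))
}
```

With run: first, the capture only stops at end of input, so the `env:` key and its `DIGEST` value end up inside the supposed run body. With env: first, the capture holds only the `echo` line.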

2. YAML Serialization Order Issue

The compiled YAML has different key ordering:

For AI agent steps (correct order - validation passes):

      - env:
          DRY_RUN: ${{ github.event.inputs.dry_run }}
        name: Check dry run mode
        run: |
          ...

For custom job steps (incorrect order - validation fails):

      - name: Sign image
        run: |
          echo "$DIGEST"
        env:
          DIGEST: ${{ steps.build.outputs.digest }}

3. Test Case Confirms the Issue

In pkg/workflow/template_injection_validation_test.go:382-394, the safe test case has env: BEFORE run::

        env:
          GATEWAY_PID: ${{ steps.start-mcp-gateway.outputs.gateway-pid }}
        run: |
          bash /opt/gh-aw/actions/stop_mcp_gateway.sh "$GATEWAY_PID"

And the test passes. But when run: comes before env: (as in the custom jobs serialization), the validation fails.

Suggested Fixes

Either:

  1. Fix the YAML serializer to output env: before run: for custom job steps (consistent with AI agent steps)

  2. Fix the regex to properly detect YAML step boundaries by stopping at any YAML key at the same indentation level (env:, if:, name:, with:, etc.), not just - markers

  3. Use proper YAML parsing instead of regex to extract run block content

Environment

  • gh-aw version: v0.37.3
  • OS: Linux</issue_description>

Comments on the Issue (you are @copilot in this section)

@Mossaka ## Minimal Reproduction Found

The bug triggers when:

  1. The YAML serializer outputs run: | (literal block with the default clip chomping, no indicator)
  2. env: appears after run: in the compiled YAML

It does NOT trigger when the serializer outputs run: |- (literal block with strip indicator).
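One plausible explanation for the difference, inferred only from the regex quoted in the issue's root cause analysis and not confirmed in this thread: with `run: |-` the multi-line alternative cannot match, because it requires a newline right after the pipe. The single-line alternative then matches and captures just the indicator text. The sample input and the helper name `matchedGroups` below are assumptions made for the demo.

```go
package main

import (
	"fmt"
	"regexp"
)

// runBlockRegex is copied from the issue text.
var runBlockRegex = regexp.MustCompile(`(?m)^\s+run:\s*\|\s*\n((?:[ \t]+.+\n?)+?)\s*(?:^[ \t]*-\s|\z)|^\s+run:\s*(.+)$`)

// stripStep uses the strip chomping indicator (|-) with env: after run:.
const stripStep = `      - name: Use value safely
        run: |-
          echo "Value: $MY_VALUE"
        env:
          MY_VALUE: ${{ steps.get_value.outputs.value }}
`

// matchedGroups returns the multi-line capture (group 1) and the
// single-line capture (group 2) for the first run: match.
func matchedGroups(yamlText string) (multiLine, singleLine string) {
	m := runBlockRegex.FindStringSubmatch(yamlText)
	if m == nil {
		return "", ""
	}
	return m[1], m[2]
}

func main() {
	multi, single := matchedGroups(stripStep)
	fmt.Printf("multi-line capture: %q\nsingle-line capture: %q\n", multi, single)
}
```

If this reading is right, the absence of a false positive with `|-` is a side effect: the script body is not captured at all in that case, so an unsafe expression inside it would not be seen by this regex either.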

Reproduction Files

Triggers bug (trailing blank line before --- causes run: |):

---
name: Test Bug
on: workflow_dispatch
permissions:
  contents: read
tools:
  bash:
    - "*"
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - id: get_value
        run: echo "value=test" >> $GITHUB_OUTPUT
      - name: Use value safely
        env:
          MY_VALUE: ${{ steps.get_value.outputs.value }}
        run: |
          echo "Line 1"
          echo "Value: $MY_VALUE"

---
Test prompt

Compiled output (note run: | and env: after run:):

      - name: Use value safely
        run: |
          echo "Line 1"
          echo "Value: $MY_VALUE"
        env:
          MY_VALUE: ${{ steps.get_value.outputs.value }}

Does NOT trigger bug (no trailing blank line causes run: |-):

        run: |
          echo "Line 1"
          echo "Value: $MY_VALUE"
          echo "Line 3"
--...

</details>




- Fixes githubnext/gh-aw#11378


- Updated runBlockRegex to match both run: | and run: |- indicators
- Fixed regex to stop at any YAML key at step level (env:, if:, with:, etc.)
- Added comprehensive tests for various YAML key orderings
- Tests verify safe patterns (env after run) and unsafe patterns still detected

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix template injection validator for env: blocks Fix template injection validator false positives when env: follows run: Jan 24, 2026
Copilot AI requested a review from pelikhan January 24, 2026 03:30
@pelikhan pelikhan left a comment

@copilot tests

Copilot AI commented Jan 24, 2026

@copilot tests

All template injection validator tests pass successfully:

Template injection tests (45/45 passing):

  • TestValidateNoTemplateInjection: 14/14 pass
  • TestTemplateInjectionYAMLKeyOrdering: 11/11 pass (new tests for this fix)
  • TestTemplateInjectionHeredocFiltering: 6/6 pass
  • TestTemplateInjectionEdgeCases: 12/12 pass
  • TestTemplateInjectionRealWorldPatterns: 2/2 pass

Manual verification:

./gh-aw compile workflow.md
⚠ Compiled 1 workflow(s): 0 error(s), 1 warning(s)

The fix correctly handles expressions in env: blocks regardless of YAML key ordering, while still detecting unsafe patterns where expressions are used directly in run: blocks.
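The safe/unsafe distinction can be illustrated with a deliberately simplified check. This is only a sketch under stated assumptions: the pattern below flags every `${{ ... }}` expression in a run script, whereas the actual validator appears to group findings by context (the error output in the issue mentions a `steps.*.outputs context`) and its real patterns are not shown in this thread. The names `expressionRegex` and `unsafeExpressions` are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// expressionRegex matches GitHub Actions expressions such as
// ${{ steps.build.outputs.digest }}. Simplified assumption, not the
// validator's actual pattern.
var expressionRegex = regexp.MustCompile(`\$\{\{[^}]*\}\}`)

// unsafeExpressions returns the expressions interpolated directly into a
// run script. Values under env: are never passed to this function, which is
// why the env-based pattern is treated as safe.
func unsafeExpressions(runScript string) []string {
	return expressionRegex.FindAllString(runScript, -1)
}

func main() {
	safe := "echo \"digest: $DIGEST\"\n"
	unsafe := "echo \"digest: ${{ steps.build.outputs.digest }}\"\n"

	fmt.Println("safe script findings:", unsafeExpressions(safe))
	fmt.Println("unsafe script findings:", unsafeExpressions(unsafe))
}
```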

Note: There are some pre-existing test failures in pkg/workflow unrelated to this change (MCP config rendering, numeric reactions, etc.).

Copilot AI requested a review from pelikhan January 24, 2026 05:45
// It detects cases where GitHub Actions expressions are used directly in shell commands
// instead of being passed through environment variables
func validateNoTemplateInjection(yamlContent string) error {
	templateInjectionValidationLog.Print("Validating compiled YAML for template injection risks")
@copilot try parsing the yaml string, and walk the tree to extract run fields

Refactored to use YAML parsing instead of regex (commit 9450b7f).

Changes:

  • Replaced regex-based extraction with extractRunBlocks() that recursively walks the parsed YAML tree
  • Uses github.com/goccy/go-yaml to parse workflow and extract all run: field values
  • More robust - handles any YAML key ordering correctly

Benefits:

  • No fragile regex patterns trying to match YAML structure
  • Works with env: before or after run:
  • More maintainable and easier to understand

All 45 template injection tests pass ✅

- Replaced regex-based extraction with proper YAML tree walking
- Added extractRunBlocks() to recursively extract run: fields from parsed YAML
- Fixed test YAML to use proper quoting (invalid YAML was causing parse errors)
- Validates by parsing YAML and walking the tree structure
- More robust and handles all YAML key orderings correctly

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI requested a review from pelikhan January 24, 2026 05:59
@pelikhan

@copilot add more tests, including invalid yaml

- Added TestTemplateInjectionInvalidYAML with 10 test cases for malformed YAML
- Added TestTemplateInjectionYAMLParsingEdgeCases with 14 test cases for YAML parsing edge cases
- Tests cover: invalid YAML syntax, empty/null values, non-string run fields, complex nesting
- Tests verify graceful handling of parse errors (skip validation without failing)
- All 69 template injection tests pass

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI commented Jan 24, 2026

@copilot add more tests, including invalid yaml

Added comprehensive test coverage (commit dab38bc):

TestTemplateInjectionInvalidYAML - 10 tests

  • Malformed YAML (missing brackets, invalid indentation, tabs)
  • Empty/whitespace-only YAML
  • Unquoted colons in values
  • Complex nested structures with unsafe expressions

TestTemplateInjectionYAMLParsingEdgeCases - 14 tests

  • Non-string run values (null, numeric, boolean, array, map)
  • Deeply nested job structures
  • Mix of uses/run steps
  • All YAML multiline formats (pipe, folded, chomping indicators)

Total: 69 template injection tests (all passing) ✅

Invalid YAML is handled gracefully - parse errors cause validation to skip without crashing.
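The skip-on-parse-error behaviour can be sketched as below. This is an illustration, not the PR's code: `extractFunc` is a hypothetical seam standing in for "parse the YAML and collect the run: values", and `brokenExtractor` and `fixedExtractor` are stubs that simulate a parse failure and a successful parse so the example runs without a YAML library.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// extractFunc stands in for "parse the YAML and collect run: values".
// Hypothetical seam for illustration only.
type extractFunc func(yamlContent string) ([]string, error)

// checkRunScripts reports an error when a run script embeds an expression,
// and deliberately returns nil when the YAML cannot be parsed.
func checkRunScripts(yamlContent string, extract extractFunc) error {
	scripts, err := extract(yamlContent)
	if err != nil {
		// Fail open: a parse error skips validation instead of failing.
		return nil
	}
	for _, script := range scripts {
		if strings.Contains(script, "${{") {
			return fmt.Errorf("expression used directly in run script: %q", script)
		}
	}
	return nil
}

// brokenExtractor simulates a YAML parse failure.
func brokenExtractor(string) ([]string, error) {
	return nil, errors.New("simulated YAML parse error")
}

// fixedExtractor simulates a successful parse that found the given scripts.
func fixedExtractor(scripts ...string) extractFunc {
	return func(string) ([]string, error) { return scripts, nil }
}

func main() {
	fmt.Println(checkRunScripts("jobs: [", brokenExtractor))
	fmt.Println(checkRunScripts("", fixedExtractor("echo ${{ github.event.issue.title }}")))
}
```

The trade-off of failing open is that this check never reports malformed input itself; it relies on some other stage to reject YAML that does not parse.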

@pelikhan pelikhan marked this pull request as ready for review January 24, 2026 10:49
@pelikhan pelikhan merged commit b54af87 into main Jan 24, 2026
118 checks passed
@pelikhan pelikhan deleted the copilot/fix-template-injection-validator branch January 24, 2026 10:57