Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
88 changes: 88 additions & 0 deletions docs/adr/26666-add-pre-agent-steps-support.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
# ADR-26666: Introduce `pre-agent-steps` as a Distinct Workflow Extension Point

**Date**: 2026-04-16
**Status**: Draft
**Deciders**: pelikhan, Copilot

---

## Part 1 — Narrative (Human-Friendly)

### Context

The gh-aw workflow compiler translates markdown workflow files into GitHub Actions YAML. It already supports two custom step extension points: `pre-steps` (injected before the checkout step, very early in the agent job) and `post-steps` (injected after AI engine execution). A gap exists for steps that must run after all framework initialization — including checkout, credential setup, and environment preparation — but *immediately before* the AI engine starts. Use cases include final context preparation scripts, last-moment environment variable injection, and validations that depend on checked-out code being available.

### Decision

We will add a new `pre-agent-steps` frontmatter field that is injected into the compiled GitHub Actions YAML immediately before the engine execution step, after all setup steps (including the CLI proxy start step). Like `pre-steps`, `pre-agent-steps` participates in the import merge system: imported `pre-agent-steps` are prepended before the main workflow's `pre-agent-steps`, preserving a deterministic, layered execution order. The existing `validateStepsSecrets` infrastructure is extended to cover the new field, applying identical secret expression restrictions.

### Alternatives Considered

#### Alternative 1: Overload the existing `steps` field with ordering semantics

The existing `steps` field (custom steps) could be repurposed or extended with explicit positioning metadata (e.g., `position: before-engine`). This was rejected because `steps` already has a defined execution context and adding positional metadata would complicate both the schema and the compiler without providing a clearer mental model. A dedicated named field communicates intent more directly.

#### Alternative 2: Extend `pre-steps` with a phase option

`pre-steps` could accept an optional `phase: post-setup` attribute to signal late-bound execution. This was rejected because it conflates two very different lifecycle positions (before checkout vs. just before the engine) into a single field, making the execution model harder to reason about. Keeping phase-specific fields separate aligns with how GitHub Actions itself structures job steps.

#### Alternative 3: Require users to use `post-steps` with negated conditions

Users could approximate "run before engine" behaviour by placing steps in `post-steps` with `if: always()` or condition expressions that short-circuit. This is semantically backwards, error-prone to write, and would not actually execute *before* the AI engine — it would run after, defeating the use case entirely.

### Consequences

#### Positive
- Users can inject pre-execution logic (context preparation, validation, environment mutations) at exactly the right lifecycle point — after setup, before the agent starts.
- The new field mirrors the naming and import-merge semantics of `pre-steps` and `post-steps`, keeping the mental model consistent and the feature discoverable.
- Secret expression validation is automatically enforced on the new field using the existing `validateStepsSecrets` infrastructure, preserving the security model without new code.
- Integration tests cover placement order (after clean-git-credentials, before engine execution) and import merge ordering.

#### Negative
- The workflow extension model now has three named step positions (`pre-steps`, `pre-agent-steps`, `post-steps`) plus the main `steps` field; the distinction between `pre-steps` and `pre-agent-steps` requires documentation to avoid confusion.
- Each new step type requires coordinated changes across the schema, type definitions, import extractor, compiler builder, YAML generator, serialization, and documentation — increasing the cost of future schema evolution.

#### Neutral
- The import merge ordering convention (imported steps prepend, main steps append) is the same as `pre-steps`; this is a neutral consistency consequence rather than a new design choice.
- The `docs/adr/` filename uses the PR number as the sequence identifier, consistent with existing ADR naming in this repository.

---

## Part 2 — Normative Specification (RFC 2119)

> The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this section are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

### Frontmatter Schema

1. Workflows **MAY** define a `pre-agent-steps` top-level frontmatter field containing an array of GitHub Actions step definitions.
2. The `pre-agent-steps` field **MUST** conform to the same step schema as `post-steps` (object or string items, with `additionalProperties: true`).
3. Implementations **MUST NOT** treat `pre-agent-steps` and `pre-steps` as interchangeable; they represent different lifecycle positions and **MUST** be processed independently.

### Compilation and Placement

1. Implementations **MUST** inject all resolved `pre-agent-steps` into the compiled GitHub Actions YAML immediately before the engine execution step.
2. `pre-agent-steps` **MUST** be placed after all framework setup steps (including the CLI proxy start step) and **MUST NOT** be placed before any framework-owned initialization steps.
3. If no `pre-agent-steps` are defined (neither in the main workflow nor in any imports), the compiler **MUST NOT** emit any placeholder or empty steps block.

### Import Merge Ordering

1. When a workflow imports other workflows that also define `pre-agent-steps`, implementations **MUST** prepend imported `pre-agent-steps` before the main workflow's `pre-agent-steps` in the merged output.
2. Import merge order **MUST** follow the same topological ordering used for all other merged fields.

### Secret Expression Validation

1. Implementations **MUST** apply secret expression validation to the `pre-agent-steps` field using the same rules applied to `pre-steps`, `steps`, and `post-steps`.
2. In strict mode, secrets expressions in non-`env`/`with` contexts within `pre-agent-steps` **MUST** cause a compilation error.
3. In non-strict mode, secrets expressions in `pre-agent-steps` **SHOULD** emit a warning.

### Action Pinning

1. Implementations **MUST** apply action pin resolution (SHA substitution) to `uses:` references in `pre-agent-steps`, consistent with how pinning is applied to `pre-steps` and `post-steps`.

### Conformance

An implementation is considered conformant with this ADR if it satisfies all **MUST** and **MUST NOT** requirements above. Specifically: `pre-agent-steps` are injected immediately before the engine execution step (not before setup steps), imports are merged in prepend order, secret validation is applied, and action pinning is applied. Failure to meet any **MUST** or **MUST NOT** requirement constitutes non-conformance.

---

*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/24519930482) workflow. The PR author must review, complete, and finalize this document before the PR can merge.*
14 changes: 14 additions & 0 deletions docs/src/content/docs/reference/frontmatter.md
Original file line number Diff line number Diff line change
Expand Up @@ -649,6 +649,20 @@ Use custom steps to precompute data, filter triggers, or prepare context for AI

Custom steps run outside the firewall sandbox. These steps execute with standard GitHub Actions security.

## Pre-Agent Steps (`pre-agent-steps:`)

Add custom steps immediately before the agent execution step, after all initialization/setup logic in the agent job.

```yaml wrap
pre-agent-steps:
- name: Finalize Context
run: ./scripts/prepare-agent-context.sh
```

Use pre-agent steps when work must happen right before the engine runs (for example, final context preparation or last-moment validations).

Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new pre-agent-steps section doesn’t mention that these steps execute outside the firewall sandbox (like steps: and post-steps: do). Consider adding a sentence clarifying the security/runtime context for pre-agent-steps for consistency and to avoid confusion.

Suggested change
Pre-agent steps run outside the firewall sandbox. These steps execute with standard GitHub Actions security.

Copilot uses AI. Check for mistakes.
Pre-agent steps run outside the firewall sandbox. These steps execute with standard GitHub Actions security.

## Post-Execution Steps (`post-steps:`)

Add custom steps after agentic execution. Run after AI engine completes regardless of success/failure (unless conditional expressions are used).
Expand Down
10 changes: 9 additions & 1 deletion pkg/parser/import_field_extractor.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ type importAccumulator struct {
stepsBuilder strings.Builder
copilotSetupStepsBuilder strings.Builder // Steps from copilot-setup-steps.yml (inserted at start)
preStepsBuilder strings.Builder
preAgentStepsBuilder strings.Builder
runtimesBuilder strings.Builder
servicesBuilder strings.Builder
networkBuilder strings.Builder
Expand Down Expand Up @@ -78,7 +79,7 @@ func newImportAccumulator() *importAccumulator {
// extractAllImportFields extracts all frontmatter fields from a single imported file
// and accumulates the results. Handles tools, engines, mcp-servers, safe-outputs,
// mcp-scripts, steps, runtimes, services, network, permissions, secret-masking, bots,
// skip-roles, skip-bots, pre-steps, post-steps, labels, cache, and features.
// skip-roles, skip-bots, pre-steps, pre-agent-steps, post-steps, labels, cache, and features.
func (acc *importAccumulator) extractAllImportFields(content []byte, item importQueueItem, visited map[string]bool) error {
log.Printf("Extracting all import fields: path=%s, section=%s, inputs=%d, content_size=%d bytes", item.fullPath, item.sectionName, len(item.inputs), len(content))

Expand Down Expand Up @@ -344,6 +345,12 @@ func (acc *importAccumulator) extractAllImportFields(content []byte, item import
acc.preStepsBuilder.WriteString(preStepsContent + "\n")
}

// Extract pre-agent-steps from imported file (prepend in order)
preAgentStepsContent, err := extractYAMLFieldFromMap(fm, "pre-agent-steps")
if err == nil && preAgentStepsContent != "" {
acc.preAgentStepsBuilder.WriteString(preAgentStepsContent + "\n")
}

// Extract post-steps from imported file (append in order)
postStepsContent, err := extractYAMLFieldFromMap(fm, "post-steps")
if err == nil && postStepsContent != "" {
Expand Down Expand Up @@ -459,6 +466,7 @@ func (acc *importAccumulator) toImportsResult(topologicalOrder []string) *Import
MergedSteps: acc.stepsBuilder.String(),
CopilotSetupSteps: acc.copilotSetupStepsBuilder.String(),
MergedPreSteps: acc.preStepsBuilder.String(),
MergedPreAgentSteps: acc.preAgentStepsBuilder.String(),
MergedRuntimes: acc.runtimesBuilder.String(),
MergedRunInstallScripts: acc.runInstallScripts,
MergedServices: acc.servicesBuilder.String(),
Expand Down
1 change: 1 addition & 0 deletions pkg/parser/import_processor.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ type ImportsResult struct {
MergedSteps string // Merged steps configuration from all imports (excluding copilot-setup-steps)
CopilotSetupSteps string // Steps from copilot-setup-steps.yml (inserted at start)
MergedPreSteps string // Merged pre-steps configuration from all imports (prepended in order)
MergedPreAgentSteps string // Merged pre-agent-steps configuration from all imports (prepended in order)
MergedRuntimes string // Merged runtimes configuration from all imports
MergedRunInstallScripts bool // true if any imported workflow sets run-install-scripts: true (global or node-level)
MergedServices string // Merged services configuration from all imports
Expand Down
31 changes: 31 additions & 0 deletions pkg/parser/schemas/main_workflow_schema.json
Original file line number Diff line number Diff line change
Expand Up @@ -3374,6 +3374,37 @@
}
]
},
"pre-agent-steps": {
"description": "Custom workflow steps to run immediately before AI execution, after all initialization and setup steps in the agent job.",
"oneOf": [
{
"type": "object",
"additionalProperties": true
},
{
"type": "array",
"items": {
"oneOf": [
{
"type": "string"
},
{
"type": "object",
"additionalProperties": true
}
]
},
"examples": [
[
{
"name": "Prepare final context",
"run": "echo \"ready\""
}
]
]
}
]
},
"post-steps": {
"description": "Custom workflow steps to run after AI execution",
"oneOf": [
Expand Down
3 changes: 3 additions & 0 deletions pkg/workflow/compiler_orchestrator_workflow.go
Original file line number Diff line number Diff line change
Expand Up @@ -185,6 +185,9 @@ func (c *Compiler) ParseWorkflowFile(markdownPath string) (*WorkflowData, error)
// Process and merge pre-steps
c.processAndMergePreSteps(result.Frontmatter, workflowData, engineSetup.importsResult)

// Process and merge pre-agent-steps
c.processAndMergePreAgentSteps(result.Frontmatter, workflowData, engineSetup.importsResult)

// Process and merge post-steps
c.processAndMergePostSteps(result.Frontmatter, workflowData, engineSetup.importsResult)

Expand Down
59 changes: 59 additions & 0 deletions pkg/workflow/compiler_orchestrator_workflow_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -530,6 +530,65 @@ func TestProcessAndMergePreSteps_WithImportedPreSteps(t *testing.T) {
assert.Less(t, importedIdx, mainIdx, "Imported pre-steps should come before main pre-steps")
}

// TestProcessAndMergePreAgentSteps_NoPreAgentSteps tests processAndMergePreAgentSteps with no pre-agent-steps
func TestProcessAndMergePreAgentSteps_NoPreAgentSteps(t *testing.T) {
compiler := NewCompiler()
workflowData := &WorkflowData{}
frontmatter := map[string]any{}
importsResult := &parser.ImportsResult{}

compiler.processAndMergePreAgentSteps(frontmatter, workflowData, importsResult)

assert.Empty(t, workflowData.PreAgentSteps)
}

// TestProcessAndMergePreAgentSteps_WithPreAgentSteps tests processAndMergePreAgentSteps with pre-agent-steps defined
func TestProcessAndMergePreAgentSteps_WithPreAgentSteps(t *testing.T) {
compiler := NewCompiler()
workflowData := &WorkflowData{}

frontmatter := map[string]any{
"pre-agent-steps": []any{
map[string]any{"name": "Prepare final context", "run": "echo 'prepare'"},
},
}
importsResult := &parser.ImportsResult{}

compiler.processAndMergePreAgentSteps(frontmatter, workflowData, importsResult)

assert.NotEmpty(t, workflowData.PreAgentSteps)
assert.Contains(t, workflowData.PreAgentSteps, "Prepare final context")
}

// TestProcessAndMergePreAgentSteps_WithImportedPreAgentSteps tests that imported pre-agent-steps are prepended
func TestProcessAndMergePreAgentSteps_WithImportedPreAgentSteps(t *testing.T) {
compiler := NewCompiler()
workflowData := &WorkflowData{}

frontmatter := map[string]any{
"pre-agent-steps": []any{
map[string]any{"name": "Main pre-agent step", "run": "echo 'main'"},
},
}

importedPreAgentStepsYAML, err := yaml.Marshal([]any{
map[string]any{"name": "Imported pre-agent step", "run": "echo 'imported'"},
})
require.NoError(t, err, "yaml.Marshal should not fail for well-formed pre-agent-steps")
importsResult := &parser.ImportsResult{
MergedPreAgentSteps: string(importedPreAgentStepsYAML),
}

compiler.processAndMergePreAgentSteps(frontmatter, workflowData, importsResult)

assert.Contains(t, workflowData.PreAgentSteps, "Main pre-agent step")
assert.Contains(t, workflowData.PreAgentSteps, "Imported pre-agent step")

importedIdx := strings.Index(workflowData.PreAgentSteps, "Imported pre-agent step")
mainIdx := strings.Index(workflowData.PreAgentSteps, "Main pre-agent step")
assert.Less(t, importedIdx, mainIdx, "Imported pre-agent-steps should come before main pre-agent-steps")
}

// TestProcessAndMergeServices_NoServices tests processAndMergeServices with no services
func TestProcessAndMergeServices_NoServices(t *testing.T) {
compiler := NewCompiler()
Expand Down
126 changes: 126 additions & 0 deletions pkg/workflow/compiler_pre_agent_steps_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
//go:build !integration

package workflow

import (
"os"
"path/filepath"
"strings"
"testing"

"github.com/github/gh-aw/pkg/testutil"
)

func TestPreAgentStepsGeneration(t *testing.T) {
tmpDir := testutil.TempDir(t, "pre-agent-steps-test")

testContent := `---
on: push
permissions:
contents: read
pre-agent-steps:
- name: Finalize prompt context
run: echo "finalize"
engine: claude
strict: false
---

Test pre-agent-steps.
`

testFile := filepath.Join(tmpDir, "test-pre-agent-steps.md")
if err := os.WriteFile(testFile, []byte(testContent), 0644); err != nil {
t.Fatal(err)
}

compiler := NewCompiler()
if err := compiler.CompileWorkflow(testFile); err != nil {
t.Fatalf("Unexpected error compiling workflow with pre-agent-steps: %v", err)
}

lockFile := filepath.Join(tmpDir, "test-pre-agent-steps.lock.yml")
content, err := os.ReadFile(lockFile)
if err != nil {
t.Fatalf("Failed to read generated lock file: %v", err)
}
lockContent := string(content)

if !strings.Contains(lockContent, "- name: Finalize prompt context") {
t.Error("Expected pre-agent-step to be in generated workflow")
}

cleanGitCredsIndex := indexInNonCommentLines(lockContent, "- name: Clean git credentials")
preAgentStepIndex := indexInNonCommentLines(lockContent, "- name: Finalize prompt context")
aiStepIndex := indexInNonCommentLines(lockContent, "- name: Execute Claude Code CLI")
if cleanGitCredsIndex == -1 || preAgentStepIndex == -1 || aiStepIndex == -1 {
t.Fatal("Could not find expected steps in generated workflow")
}
if preAgentStepIndex <= cleanGitCredsIndex {
t.Errorf("Pre-agent-step (%d) should appear after clean git credentials (%d)", preAgentStepIndex, cleanGitCredsIndex)
}
if preAgentStepIndex >= aiStepIndex {
t.Errorf("Pre-agent-step (%d) should appear before AI execution step (%d)", preAgentStepIndex, aiStepIndex)
}
}

func TestPreAgentStepsImportsMergeOrder(t *testing.T) {
tmpDir := testutil.TempDir(t, "pre-agent-steps-imports-test")

sharedContent := `---
pre-agent-steps:
- name: Imported pre-agent step
run: echo "imported"
---

Shared steps.
`
sharedFile := filepath.Join(tmpDir, "shared.md")
if err := os.WriteFile(sharedFile, []byte(sharedContent), 0644); err != nil {
t.Fatal(err)
}

mainContent := `---
on: issues
permissions:
contents: read
imports:
- ./shared.md
pre-agent-steps:
- name: Main pre-agent step
run: echo "main"
engine: claude
strict: false
---

Main workflow.
`
mainFile := filepath.Join(tmpDir, "main.md")
if err := os.WriteFile(mainFile, []byte(mainContent), 0644); err != nil {
t.Fatal(err)
}

compiler := NewCompiler()
if err := compiler.CompileWorkflow(mainFile); err != nil {
t.Fatalf("Unexpected error compiling workflow with imported pre-agent-steps: %v", err)
}

lockFile := filepath.Join(tmpDir, "main.lock.yml")
content, err := os.ReadFile(lockFile)
if err != nil {
t.Fatalf("Failed to read generated lock file: %v", err)
}
lockContent := string(content)

importedIdx := indexInNonCommentLines(lockContent, "- name: Imported pre-agent step")
mainIdx := indexInNonCommentLines(lockContent, "- name: Main pre-agent step")
aiStepIdx := indexInNonCommentLines(lockContent, "- name: Execute Claude Code CLI")
if importedIdx == -1 || mainIdx == -1 || aiStepIdx == -1 {
t.Fatal("Could not find expected pre-agent and AI steps in generated workflow")
}
if importedIdx >= mainIdx {
t.Errorf("Imported pre-agent-step (%d) should appear before main pre-agent-step (%d)", importedIdx, mainIdx)
}
if mainIdx >= aiStepIdx {
t.Errorf("Main pre-agent-step (%d) should appear before AI execution step (%d)", mainIdx, aiStepIdx)
}
}
Loading