From 648b8a9ec29c2eadaf23569e0475ce9297e61512 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com>
Date: Tue, 14 Apr 2026 10:50:50 +0000
Subject: [PATCH] docs: add package specifications for cli, parser, and workflow packages
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add README.md specifications for the three packages that were missing documentation:

- pkg/cli: CLI command implementations for the gh aw extension including all command groups (add, compile, run, audit, logs, mcp, update, etc.), key types, and exported functions
- pkg/parser: Markdown frontmatter parsing, import resolution, GitHub URL handling, schema validation, and schedule parsing
- pkg/workflow: Workflow compilation, validation, engine integration, safe-outputs, and GitHub Actions YAML generation — the compilation core

Each README follows W3C specification style with types, functions, usage examples, design decisions, and dependency documentation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 pkg/cli/README.md      | 200 ++++++++++++++++++
 pkg/parser/README.md   | 243 ++++++++++++++++++++++
 pkg/workflow/README.md | 458 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 901 insertions(+)
 create mode 100644 pkg/cli/README.md
 create mode 100644 pkg/parser/README.md
 create mode 100644 pkg/workflow/README.md

diff --git a/pkg/cli/README.md b/pkg/cli/README.md
new file mode 100644
index 00000000000..e231916732f
--- /dev/null
+++ b/pkg/cli/README.md
@@ -0,0 +1,200 @@
+# cli Package
+
+> CLI command implementations for the `gh aw` extension — the primary user interface for authoring, compiling, running, and monitoring agentic GitHub workflows.
+
+## Overview
+
+The `cli` package implements all commands exposed through the `gh aw` CLI extension. Each command is implemented as a Cobra command with a dedicated `New*Command()` constructor and a `Run*()` function that encapsulates the testable business logic.
+
+The package is intentionally decomposed into many small files grouped by feature domain (e.g., `compile_*.go`, `audit_*.go`, `run_*.go`, `mcp_*.go`). This structure keeps individual files under 300 lines and promotes independent testing of each sub-domain.
+
+All diagnostic output MUST go to `stderr` using `console` formatting helpers. Structured output (JSON, hashes, graphs) goes to `stdout`.
+
+## Command Groups
+
+| Command | Entry Point | Description |
+|---------|-------------|-------------|
+| `gh aw add` | `NewAddCommand` | Add remote or local workflows to the repository |
+| `gh aw add-wizard` | `NewAddWizardCommand` | Interactive wizard for adding workflows |
+| `gh aw compile` | (compile_command.go) | Compile `.md` workflow files into GitHub Actions `.lock.yml` |
+| `gh aw run` | `NewRunCommand` (run_command.go) | Dispatch and monitor workflow runs |
+| `gh aw audit` | `NewAuditCommand` | Audit a specific workflow run by run ID |
+| `gh aw audit diff` | `NewAuditDiffSubcommand` | Diff audit data between multiple runs |
+| `gh aw logs` | `NewLogsCommand` | Download and analyze workflow run logs |
+| `gh aw mcp` | `NewMCPCommand` | Manage MCP server configurations |
+| `gh aw mcp add` | `NewMCPAddSubcommand` | Add an MCP tool to a workflow |
+| `gh aw mcp inspect` | `NewMCPInspectSubcommand` | Inspect MCP servers in a workflow |
+| `gh aw mcp list` | `NewMCPListSubcommand` | List workflows using MCP servers |
+| `gh aw mcp list-tools` | `NewMCPListToolsSubcommand` | List tools for a specific MCP server |
+| `gh aw mcp server` | `NewMCPServerCommand` | Run as an MCP server (for IDE integration) |
+| `gh aw update` | `NewUpdateCommand` | Update workflows from upstream sources |
+| `gh aw upgrade` | `NewUpgradeCommand` | Upgrade workflows to latest format |
+| `gh aw validate` | `NewValidateCommand` | Validate workflow files without compiling |
+| `gh aw fix` | `NewFixCommand` | Apply automatic codemods to fix deprecated patterns |
+| `gh aw status` | `NewStatusCommand` | Show status of workflows in the repository |
+| `gh aw health` | `NewHealthCommand` | Compute health metrics across workflow runs |
+| `gh aw checks` | `NewChecksCommand` | Show CI check results for a PR |
+| `gh aw domains` | `NewDomainsCommand` | List domains used by workflows |
+| `gh aw hash` | `NewHashCommand` | Print frontmatter hash of a workflow file |
+| `gh aw init` | `NewInitCommand` | Initialize a repository for agentic workflows |
+| `gh aw list` | `NewListCommand` | List installed workflows |
+| `gh aw pr` | `NewPRCommand` | Pull-request helpers |
+| `gh aw project` | `NewProjectCommand` | Project management helpers |
+| `gh aw remove` | `NewRemoveCommand` | Remove workflow files from the repository |
+| `gh aw secrets` | `NewSecretsCommand` | Manage workflow secrets |
+| `gh aw trial` | `NewTrialCommand` | Run trial workflow executions |
+| `gh aw deps` | (deps_*.go) | Dependency inspection and security advisories |
+| `gh aw completion` | `NewCompletionCommand` | Generate shell completion scripts |
+
+## Public API
+
+### Key Types
+
+| Type | File | Description |
+|------|------|-------------|
+| `CompileConfig` | `compile_config.go` | Configuration for `CompileWorkflows` — file list, flags, validation options |
+| `ValidationResult` | `compile_config.go` | Result of a compilation validation pass |
+| `AddOptions` | `add_command.go` | Options controlling workflow addition behavior |
+| `AddWorkflowsResult` | `add_command.go` | Result of `AddWorkflows` / `AddResolvedWorkflows` |
+| `ResolvedWorkflow` | `add_workflow_resolution.go` | A single resolved workflow with source metadata |
+| `ResolvedWorkflows` | `add_workflow_resolution.go` | Collection of resolved workflows |
+| `RunOptions` | `run_workflow_execution.go` | Options for `RunWorkflowOnGitHub` |
+| `WorkflowRunResult` | `run_workflow_execution.go` | Result of a triggered workflow run |
+| `AuditData` | `audit_report.go` | Full audit data structure for a workflow run |
+| `AuditDiff` | `audit_diff.go` | Diff between two audit runs |
+| `CrossRunAuditReport` | `audit_cross_run.go` | Cross-run trend analysis |
+| `HealthConfig` | `health_command.go` | Configuration for health computation |
+| `WorkflowHealth` | `health_metrics.go` | Per-workflow health metrics |
+| `HealthSummary` | `health_metrics.go` | Aggregate health across all workflows |
+| `DependencyReport` | `deps_report.go` | Full dependency report |
+| `OutdatedDependency` | `deps_outdated.go` | An outdated dependency entry |
+| `SecurityAdvisory` | `deps_security.go` | A security advisory entry |
+| `WorkflowStatus` | `status_command.go` | Run status for a single workflow |
+| `MCPRegistryClient` | `mcp_registry.go` | Client for the MCP registry API |
+| `ToolGraph` | `tool_graph.go` | Dependency graph of MCP tools |
+| `DependencyGraph` | `dependency_graph.go` | Dependency graph across workflows |
+| `FileTracker` | `file_tracker.go` | Tracks files modified during an operation |
+| `RepeatOptions` | `retry.go` | Options for `ExecuteWithRepeat` polling loop |
+| `PollOptions` | `signal_aware_poll.go` | Options for `PollWithSignalHandling` |
+| `FixConfig` | `fix_command.go` | Configuration for `RunFix` codemods |
+| `TrialOptions` | `trial_types.go` | Options for `RunWorkflowTrials` |
+| `WorkflowTrialResult` | `trial_types.go` | Result of a trial run |
+| `UpgradeConfig` | `upgrade_command.go` | Configuration for `NewUpgradeCommand` |
+| `ChecksConfig` | `checks_command.go` | Configuration for `RunChecks` |
+| `ChecksResult` | `checks_command.go` | Result of `FetchChecksResult` |
+
+### Key Functions
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `CompileWorkflows` | `func(ctx, CompileConfig) ([]*workflow.WorkflowData, error)` | Orchestrates compilation of one or more workflow files |
+| `CompileWorkflowWithValidation` | `func(*workflow.Compiler, filePath string, ...) error` | Compiles and validates a single workflow file |
+| `AddWorkflows` | `func([]string, AddOptions) (*AddWorkflowsResult, error)` | Adds workflows from string specs |
+| `ResolveWorkflows` | `func([]string, bool) (*ResolvedWorkflows, error)` | Resolves workflow specs to local paths and metadata |
+| `RunWorkflowOnGitHub` | `func(ctx, string, RunOptions) error` | Dispatches a single workflow run on GitHub |
+| `RunWorkflowsOnGitHub` | `func(ctx, []string, RunOptions) error` | Dispatches multiple workflows |
+| `AuditWorkflowRun` | `func(ctx, runID int64, ...) error` | Downloads and renders an audit report for a run |
+| `RunAuditDiff` | `func(ctx, baseRunID, compareRunIDs, ...) error` | Renders a diff between audit runs |
+| `DownloadWorkflowLogs` | `func(ctx, workflowName string, ...) error` | Downloads and analyzes workflow logs |
+| `RunListWorkflows` | `func(repo, path, pattern string, ...) error` | Lists installed workflows |
+| `StatusWorkflows` | `func(pattern string, ...) error` | Prints workflow run status |
+| `GetWorkflowStatuses` | `func(pattern, ref, ...) ([]WorkflowStatus, error)` | Fetches workflow statuses |
+| `RunHealth` | `func(HealthConfig) error` | Computes and renders workflow health metrics |
+| `CalculateWorkflowHealth` | `func(string, []WorkflowRun, float64) WorkflowHealth` | Pure health computation for a single workflow |
+| `CalculateHealthSummary` | `func([]WorkflowHealth, string, float64) HealthSummary` | Aggregate health computation |
+| `RunFix` | `func(FixConfig) error` | Applies automatic codemods |
+| `GetAllCodemods` | `func() []Codemod` | Returns all available codemods |
+| `InitRepository` | `func(InitOptions) error` | Initializes a repo with the `gh-aw` setup |
+| `NewWorkflow` | `func(string, bool, bool, string) error` | Creates a new workflow markdown file |
+| `IsRunnable` | `func(string) (bool, error)` | Checks whether a workflow file is runnable |
+| `RunWorkflowInteractively` | `func(ctx, ...) error` | Interactive workflow selection and dispatch |
+| `AddMCPTool` | `func(string, string, ...) error` | Adds an MCP server to a workflow file |
+| `InspectWorkflowMCP` | `func(string, ...) error` | Inspects MCP server configurations |
+| `ListWorkflowMCP` | `func(string, bool) error` | Lists MCP server info for a workflow |
+| `UpdateActions` | `func(bool, bool, bool) error` | Bulk-updates GitHub Action versions in workflows |
+| `UpdateWorkflows` | `func([]string, ...) error` | Updates workflows from upstream sources |
+| `RemoveWorkflows` | `func(string, bool, string) error` | Removes workflow files |
+| `ValidateWorkflowName` | `func(string) error` | Validates a workflow name identifier |
+| `GetBinaryPath` | `func() (string, error)` | Returns the path to the `gh-aw` binary |
+| `GetCurrentRepoSlug` | `func() (string, error)` | Returns `owner/repo` for the current directory |
+| `GetVersion` | `func() string` | Returns the current CLI version |
+| `SetVersionInfo` | `func(string)` | Sets the version at startup |
+| `EnableWorkflowsByNames` | `func([]string, string) error` | Enables GitHub Actions workflows |
+| `DisableWorkflowsByNames` | `func([]string, string) error` | Disables GitHub Actions workflows |
+| `CheckOutdatedDependencies` | `func(bool) ([]OutdatedDependency, error)` | Checks for outdated dependencies |
+| `CheckSecurityAdvisories` | `func(bool) ([]SecurityAdvisory, error)` | Checks for known CVEs |
+| `GenerateDependencyReport` | `func(bool) (*DependencyReport, error)` | Full dependency analysis report |
+| `InstallShellCompletion` | `func(bool, CommandProvider) error` | Installs shell completions |
+| `PollWithSignalHandling` | `func(PollOptions) error` | Polls a predicate with SIGINT handling |
+| `ExecuteWithRepeat` | `func(RepeatOptions) error` | Repeats an operation with delay |
+| `IsRunningInCI` | `func() bool` | Detects CI environment |
+| `DetectShell` | `func() ShellType` | Detects the user's current shell |
+
+## Usage Examples
+
+### Compiling a workflow
+
+```go
+data, err := cli.CompileWorkflows(ctx, cli.CompileConfig{
+    MarkdownFiles: []string{".github/workflows/my-workflow.md"},
+    Verbose: true,
+    Validate: true,
+    Strict: false,
+})
+```
+
+### Running a workflow
+
+```go
+err := cli.RunWorkflowOnGitHub(ctx, "my-workflow", cli.RunOptions{
+    Repo: "owner/repo",
+    Verbose: true,
+})
+```
+
+### Auditing a run
+
+```go
+err := cli.AuditWorkflowRun(ctx, runID, "owner", "repo", "github.com",
+    "/tmp/output", true, true, false, 0, 0, nil)
+```
+
+### Checking workflow health
+
+```go
+err := cli.RunHealth(cli.HealthConfig{
+    Pattern: "*.md",
+    Threshold: 0.8,
+    Period: "30d",
+})
+```
+
+## Design Decisions
+
+- **File-per-feature decomposition**: Large feature domains (compile, audit, logs, run) are split into multiple files (`_command.go`, `_config.go`, `_helpers.go`, `_orchestrator.go`, etc.) to keep each file focused and under 300 lines.
+- **Testable Run functions**: Every command has a `New*Command()` for Cobra wiring and a `Run*()` function with explicit parameters for unit testing without CLI arg parsing overhead.
+- **Stderr for diagnostics**: All user-visible messages use `console.Format*Message` helpers and write to `stderr`, preserving `stdout` for structured machine-readable output.
+- **Context propagation**: Long-running operations accept `context.Context` to support cancellation (SIGINT, timeouts).
+- **Config structs**: Command options are collected into dedicated `*Config` or `*Options` structs rather than passed as long argument lists, improving readability and testability.
+
+## Dependencies
+
+**Internal**:
+- `pkg/workflow` — workflow compilation and data types
+- `pkg/parser` — markdown frontmatter parsing
+- `pkg/console` — terminal output formatting
+- `pkg/logger` — structured debug logging
+- `pkg/constants` — engine names, job names, feature flags
+- `pkg/stringutil`, `pkg/fileutil`, `pkg/gitutil`, `pkg/repoutil` — utilities
+
+**External**:
+- `github.com/spf13/cobra` — CLI framework
+- `github.com/cli/go-gh/v2` — GitHub CLI integration
+
+## Thread Safety
+
+Individual command `Run*` functions are not concurrently safe unless explicitly documented. The `CompileWorkflows` orchestrator serializes compilation by default; parallel compilation is gated by `CompileConfig` flags.
+
+---
+
+*This specification is automatically maintained by the [spec-extractor](../../.github/workflows/spec-extractor.md) workflow.*
diff --git a/pkg/parser/README.md b/pkg/parser/README.md
new file mode 100644
index 00000000000..07934759324
--- /dev/null
+++ b/pkg/parser/README.md
@@ -0,0 +1,243 @@
+# parser Package
+
+> Markdown frontmatter parsing, import resolution, GitHub URL handling, and schema validation for agentic workflow files.
+
+## Overview
+
+The `parser` package is responsible for extracting and processing YAML frontmatter from agentic workflow `.md` files. Frontmatter defines the workflow's entire configuration — triggers, permissions, tools, safe outputs, engine settings, network restrictions, and runtime overrides. The markdown body that follows the frontmatter serves as the AI agent's prompt text.
+
+Beyond basic frontmatter extraction, the package provides a rich import system that resolves `@import` directives (local files, GitHub URLs, fragments), an include expander for `@include` directives in the markdown body, a schedule parser that converts natural-language schedules into cron expressions, MCP server configuration extraction, and JSON schema–backed validation with actionable error messages.
+
+The package is designed for use both in the main CLI binary and in WebAssembly contexts (see `*_wasm.go` files). Build constraints separate platform-specific implementations for remote fetching and filesystem access.
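
As a rough illustration of the frontmatter/body split described in the overview, the sketch below separates a leading `---`-delimited block from the markdown that follows. `splitFrontmatter` is a hypothetical helper written for this note, not the package's `ExtractFrontmatterFromContent`; it does no YAML decoding, import resolution, or validation, and the `engine` key in the sample document is an assumed example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// splitFrontmatter is a hypothetical helper, not part of pkg/parser.
// It returns the raw YAML text between the opening and closing "---"
// lines, plus everything after the closing delimiter as the body.
func splitFrontmatter(content string) (yamlText, body string, err error) {
	lines := strings.Split(content, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		// No frontmatter: the whole document is treated as the body.
		return "", content, nil
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == "---" {
			yamlText = strings.Join(lines[1:i], "\n")
			body = strings.Join(lines[i+1:], "\n")
			return yamlText, body, nil
		}
	}
	return "", "", errors.New("frontmatter opened with --- but never closed")
}

func main() {
	doc := "---\non: push\nengine: copilot\n---\n# My Workflow\n\nSummarize the diff.\n"
	fm, body, err := splitFrontmatter(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("frontmatter:\n%s\n\nbody:\n%s", fm, body)
}
```

Returning the raw YAML text rather than a decoded map keeps the sketch dependency-free; a real extractor would hand that text to a YAML library next.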
+
+## Public API
+
+### Types
+
+| Type | Kind | Description |
+|------|------|-------------|
+| `FrontmatterResult` | struct | Result of extracting frontmatter from markdown content |
+| `ImportCache` | struct | Cache of resolved imports to avoid redundant fetches (not concurrency-safe; see Thread Safety) |
+| `ImportDirectiveMatch` | struct | Parsed `@import` or `@include` directive line |
+| `ImportError` | struct | Structured error for import resolution failures |
+| `ImportCycleError` | struct | Structured error for circular import chains |
+| `FormattedParserError` | struct | Pre-formatted parser error with display-ready message |
+| `ImportsResult` | struct | Result of `ProcessImportsFromFrontmatterWithSource` |
+| `ImportInputDefinition` | struct | Input definition from an imported workflow fragment |
+| `ImportSpec` | struct | Resolved import specification (path, ref, optional flag) |
+| `GitHubURLType` | string alias | Classifies a GitHub URL (`tree`, `blob`, `raw`, `run`, `pr`, etc.) |
+| `GitHubURLComponents` | struct | Parsed components of a GitHub URL (owner, repo, ref, path, etc.) |
+| `JSONPathLocation` | struct | Line/column location of a JSON path in YAML content |
+| `JSONPathInfo` | struct | JSON path with human-readable description |
+| `NestedSection` | struct | Locates nested YAML sections for error reporting |
+| `PathSegment` | struct | A single segment in a resolved JSON path |
+| `MCPServerConfig` | struct | Parsed MCP server configuration (type, command, URL, env, etc.) |
+| `MCPServerInfo` | struct | Metadata about an MCP server entry |
+| `ScheduleParser` | struct | Converts natural-language schedules to cron expressions |
+| `DeprecatedField` | struct | A deprecated frontmatter field with migration guidance |
+| `FileReader` | func type | `func(filePath string) ([]byte, error)` — abstraction for file reading |
+
+### Functions
+
+#### Frontmatter Extraction
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ExtractFrontmatterFromContent` | `func(content string) (*FrontmatterResult, error)` | Extracts YAML frontmatter between `---` delimiters from markdown |
+| `ExtractFrontmatterFromBuiltinFile` | `func(path string, content []byte) (*FrontmatterResult, error)` | Extracts frontmatter from an embedded/built-in workflow file |
+| `ExtractMarkdownContent` | `func(content string) (string, error)` | Returns the markdown body (everything after frontmatter) |
+| `ExtractMarkdownSection` | `func(content, sectionName string) (string, error)` | Extracts a named `##` section from markdown |
+| `ExtractWorkflowNameFromMarkdownBody` | `func(markdownBody, virtualPath string) (string, error)` | Derives the workflow name from the first `#` heading |
+| `ExtractWorkflowNameFromContent` | `func(content, virtualPath string) (string, error)` | Combines frontmatter extraction and name derivation |
+
+#### Import Processing
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ProcessImportsFromFrontmatterWithSource` | `func(frontmatter map[string]any, baseDir string, cache *ImportCache, ...) (*ImportsResult, error)` | Resolves all `@import` directives in frontmatter, merging imported configs |
+| `ParseImportDirective` | `func(line string) *ImportDirectiveMatch` | Parses a single `@import` or `@include` line |
+| `NewImportCache` | `func(repoRoot string) *ImportCache` | Creates a new import cache rooted at the repository |
+| `ExpandIncludesWithManifest` | `func(content, baseDir string, extractTools bool) (string, []string, error)` | Expands `@include` directives in markdown body and returns included file paths |
+| `ExpandIncludesForEngines` | `func(content, baseDir string) ([]string, error)` | Returns engine names referenced via `@include` |
+| `ExpandIncludesForSafeOutputs` | `func(content, baseDir string) ([]string, error)` | Returns safe output types referenced via `@include` |
+
+#### GitHub URL Parsing
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ParseGitHubURL` | `func(urlStr string) (*GitHubURLComponents, error)` | Parses any GitHub URL into structured components |
+| `ParseRunURLExtended` | `func(input string) (*GitHubURLComponents, error)` | Parses a workflow run URL (extended formats) |
+| `ParsePRURL` | `func(prURL string) (owner, repo string, prNumber int, err error)` | Parses a pull request URL |
+| `ParseRepoFileURL` | `func(fileURL string) (owner, repo, ref, filePath string, err error)` | Parses a repository file URL |
+| `IsValidGitHubIdentifier` | `func(s string) bool` | Validates a GitHub username/org/repo name |
+| `GetGitHubHost` | `func() string` | Returns the GitHub host (supports GHES via `GH_HOST`) |
+| `GetGitHubHostForRepo` | `func(owner, repo string) string` | Returns the GitHub host for a specific repo |
+| `GetGitHubToken` | `func() (string, error)` | Returns the GitHub auth token from the environment |
+
+#### Remote Fetching
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ResolveIncludePath` | `func(filePath, baseDir string, cache *ImportCache) (string, error)` | Resolves a relative or GitHub URL path to an absolute path or fetches remotely |
+| `DownloadFileFromGitHub` | `func(owner, repo, path, ref string) ([]byte, error)` | Downloads a file from GitHub via the API |
+| `DownloadFileFromGitHubForHost` | `func(owner, repo, path, ref, host string) ([]byte, error)` | Downloads a file from a specific GitHub host |
+| `ResolveRefToSHAForHost` | `func(owner, repo, ref, host string) (string, error)` | Resolves a branch/tag ref to a commit SHA |
+| `ListWorkflowFiles` | `func(owner, repo, ref, workflowPath string) ([]string, error)` | Lists workflow files in a remote repository |
+
+#### MCP Configuration
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ExtractMCPConfigurations` | `func(frontmatter map[string]any, serverFilter string) ([]MCPServerConfig, error)` | Extracts all MCP server configurations from frontmatter |
+| `ParseMCPConfig` | `func(toolName string, mcpSection any, toolConfig map[string]any) (MCPServerConfig, error)` | Parses a single MCP server entry |
+| `IsMCPType` | `func(typeStr string) bool` | Validates an MCP transport type string |
+
+#### Schedule Parsing
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ParseSchedule` | `func(input string) (cron, original string, err error)` | Parses natural-language or cron schedule to a cron expression |
+| `ScatterSchedule` | `func(fuzzyCron, workflowIdentifier string) (string, error)` | Deterministically scatters a daily/hourly cron to reduce thundering herd |
+| `IsDailyCron` | `func(cron string) bool` | Detects whether a cron expression runs daily |
+| `IsHourlyCron` | `func(cron string) bool` | Detects whether a cron expression runs hourly |
+| `IsWeeklyCron` | `func(cron string) bool` | Detects whether a cron expression runs weekly |
+| `IsFuzzyCron` | `func(cron string) bool` | Detects whether a cron is a fuzzy wildcard |
+| `IsCronExpression` | `func(input string) bool` | Detects whether a string is already a cron expression |
+
+#### Schema Validation
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ValidateMainWorkflowFrontmatterWithSchemaAndLocation` | `func(frontmatter map[string]any, filePath string) error` | JSON-schema validates frontmatter and returns located errors |
+| `GetCompiledRepoConfigSchema` | `func() (*jsonschema.Schema, error)` | Returns the compiled JSON schema for repo config |
+| `GetSafeOutputTypeKeys` | `func() ([]string, error)` | Returns valid safe-output type keys from the schema |
+| `GetMainWorkflowDeprecatedFields` | `func() ([]DeprecatedField, error)` | Returns deprecated frontmatter fields with migration notes |
+| `FindDeprecatedFieldsInFrontmatter` | `func(map[string]any, []DeprecatedField) []DeprecatedField` | Finds deprecated fields present in a parsed frontmatter map |
+| `FindClosestMatches` | `func(target string, candidates []string, maxResults int) []string` | Finds the closest string matches (for typo suggestions) |
+| `LevenshteinDistance` | `func(a, b string) int` | Computes edit distance between two strings |
+
+#### Frontmatter Hashing
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ComputeFrontmatterHashFromFile` | `func(filePath string, cache *ImportCache) (string, error)` | Computes a stable hash of a workflow's frontmatter (including imports) |
+| `ComputeFrontmatterHashFromFileWithParsedFrontmatter` | `func(filePath string, parsedFrontmatter map[string]any, ...) (string, error)` | Computes hash from already-parsed frontmatter |
+| `ComputeFrontmatterHashFromFileWithReader` | `func(filePath string, cache *ImportCache, fileReader FileReader) (string, error)` | Computes hash with a custom file reader |
+
+#### Error Formatting
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `FormatImportCycleError` | `func(*ImportCycleError) error` | Formats a cycle error with the import chain |
+| `FormatImportError` | `func(*ImportError, yamlContent string) error` | Formats an import error with YAML context |
+| `NewFormattedParserError` | `func(formatted string) *FormattedParserError` | Creates a pre-formatted parser error |
+
+#### JSON Path Location
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `ExtractJSONPathFromValidationError` | `func(err error) []JSONPathInfo` | Extracts JSON path info from a schema validation error |
+| `LocateJSONPathInYAML` | `func(yamlContent, jsonPath string) JSONPathLocation` | Maps a JSON path to a line number in YAML text |
+| `LocateJSONPathInYAMLWithAdditionalProperties` | `func(yamlContent, jsonPath, errorMessage string) JSONPathLocation` | Maps path with additional-property context |
+
+#### Trigger Helpers
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `IsLabelOnlyEvent` | `func(eventValue any) bool` | Detects whether a trigger only activates on label events |
+| `IsNonConflictingCommandEvent` | `func(eventValue any) bool` | Detects whether a trigger is a non-conflicting slash command |
+
+### Constants / Variables
+
+| Name | Type | Description |
+|------|------|-------------|
+| `ValidMCPTypes` | `[]string` | Valid MCP transport types: `"stdio"`, `"http"`, `"local"` |
+| `IncludeDirectivePattern` | `*regexp.Regexp` | Matches `@import`, `@include`, and `{{#import ...}}` directives |
+| `LegacyIncludeDirectivePattern` | `*regexp.Regexp` | Matches legacy `@import`/`@include` forms |
+| `DefaultFileReader` | `FileReader` | Default file reader using `os.ReadFile` |
+| `RepoConfigSchema` | `string` | Embedded JSON schema for repo-level configuration |
+
+## Usage Examples
+
+### Parse frontmatter from a workflow file
+
+```go
+content, _ := os.ReadFile("my-workflow.md")
+result, err := parser.ExtractFrontmatterFromContent(string(content))
+if err != nil {
+    log.Fatal(err)
+}
+fmt.Println("Triggers:", result.Frontmatter["on"])
+fmt.Println("Prompt:", result.MarkdownBody)
+```
+
+### Resolve imports
+
+```go
+cache := parser.NewImportCache("/path/to/repo")
+imports, err := parser.ProcessImportsFromFrontmatterWithSource(
+    result.Frontmatter,
+    filepath.Dir("my-workflow.md"),
+    cache,
+    "my-workflow.md",
+    result.FrontmatterYAML,
+)
+```
+
+### Parse a schedule
+
+```go
+cron, original, err := parser.ParseSchedule("every day at 9am")
+// cron = "0 9 * * *"
+```
+
+### Validate frontmatter
+
+```go
+err := parser.ValidateMainWorkflowFrontmatterWithSchemaAndLocation(
+    frontmatter, "my-workflow.md",
+)
+```
+
+### Extract MCP server configurations
+
+```go
+servers, err := parser.ExtractMCPConfigurations(frontmatter, "")
+for _, s := range servers {
+    fmt.Printf("%s: type=%s\n", s.Name, s.Type)
+}
+```
+
+## Architecture
+
+The parsing pipeline for a workflow file proceeds as:
+
+1. **Read** the raw markdown file content.
+2. **Extract** the YAML frontmatter block between `---` delimiters (`ExtractFrontmatterFromContent`).
+3. **Process imports**: resolve all `@import` directives recursively, merge imported YAML configurations, and deduplicate (`ProcessImportsFromFrontmatterWithSource`).
+4. **Validate** the merged frontmatter against the JSON schema (`ValidateMainWorkflowFrontmatterWithSchemaAndLocation`).
+5. **Expand includes** in the markdown body (`ExpandIncludesWithManifest`).
+6. **Pass** the merged frontmatter and markdown body to `pkg/workflow` for compilation.
+
+Import caching is crucial for performance and cycle detection. The `ImportCache` tracks visited paths within a single compilation run to prevent infinite recursion.
+
+## Dependencies
+
+**Internal**:
+- `pkg/types` — `BaseMCPServerConfig`
+- `pkg/logger` — debug logging
+
+**External**:
+- `github.com/santhosh-tekuri/jsonschema/v5` — JSON schema validation
+- `go.yaml.in/yaml/v3` — YAML parsing
+- `github.com/goccy/go-yaml` — YAML 1.1/1.2 compatible parsing (for GitHub Actions compatibility)
+
+## Thread Safety
+
+`ImportCache` is designed for use within a single goroutine per compilation run. Its internal map is not concurrency-safe. For concurrent compilations, create a separate `ImportCache` per compilation.
+
+The `DefaultFileReader` variable is safe to read but MUST NOT be reassigned by production code after package initialization. Tests may replace it with a custom `FileReader` to inject virtual filesystem content, provided nothing else reads it concurrently.
+
+---
+
+*This specification is automatically maintained by the [spec-extractor](../../.github/workflows/spec-extractor.md) workflow.*
diff --git a/pkg/workflow/README.md b/pkg/workflow/README.md
new file mode 100644
index 00000000000..bc458cb251a
--- /dev/null
+++ b/pkg/workflow/README.md
@@ -0,0 +1,458 @@
+# workflow Package
+
+> Workflow compilation, validation, engine integration, safe-outputs, and GitHub Actions YAML generation for agentic workflow files.
+
+## Overview
+
+The `workflow` package is the compilation core of `gh-aw`. It transforms parsed markdown frontmatter (from `pkg/parser`) and markdown body text into complete GitHub Actions `.lock.yml` files. Compilation covers the full lifecycle: frontmatter parsing into strongly-typed configuration structs, multi-pass validation (schema, permissions, security, strict mode), engine-specific step generation (Copilot, Claude, Codex, custom), safe-output job construction, and final YAML serialization.
+
+The package is organized around three major subsystems:
+
+1. **Compiler** (`compiler*.go`, `compiler_types.go`): The `Compiler` struct drives the main compilation pipeline. It accepts a markdown file path (or pre-parsed `WorkflowData`), builds the full GitHub Actions workflow YAML, and writes the `.lock.yml` file only when the content has changed.
+
+2. **Engine registry** (`agentic_engine.go`, `*_engine.go`): A pluggable engine architecture where each AI engine (`copilot`, `claude`, `codex`, `custom`) implements a set of focused interfaces (`Engine`, `CapabilityProvider`, `WorkflowExecutor`, `MCPConfigProvider`, etc.). Engines are registered in a global `EngineRegistry` and looked up by name at compile time.
+
+3. **Validation** (`validation.go`, `strict_mode_*.go`, `*_validation.go`): A layered validation system organized by domain. Each validator is a focused file under 300 lines. Validation runs both at compile time and optionally in strict mode for production deployments.
+
+The package is intentionally large (~320 source files) because it encodes all GitHub Actions generation logic, including per-action job builders for every supported safe-output type (add comment, add labels, assign to user, close issue, update PR, etc.).
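
To make the engine-registry subsystem described in this overview more concrete, here is a minimal sketch of name-based registration and lookup. Only the `GetID` and `GetDisplayName` method names are taken from the `Engine` interface this README documents; `registry`, `stubEngine`, `Register`, `Get`, and `IDs` are hypothetical stand-ins, and the real `EngineRegistry` API may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Engine carries only two of the identity methods named in this README.
type Engine interface {
	GetID() string
	GetDisplayName() string
}

// stubEngine is a hypothetical placeholder for a real engine implementation.
type stubEngine struct{ id, name string }

func (e stubEngine) GetID() string          { return e.id }
func (e stubEngine) GetDisplayName() string { return e.name }

// registry is a hypothetical stand-in for EngineRegistry: a map keyed by engine ID.
type registry struct {
	engines map[string]Engine
}

func newRegistry() *registry {
	return &registry{engines: make(map[string]Engine)}
}

// Register stores the engine under its own ID, replacing any earlier entry.
func (r *registry) Register(e Engine) { r.engines[e.GetID()] = e }

// IDs returns the registered engine IDs in sorted order.
func (r *registry) IDs() []string {
	ids := make([]string, 0, len(r.engines))
	for id := range r.engines {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

// Get looks an engine up by name and reports the known IDs on a miss.
func (r *registry) Get(id string) (Engine, error) {
	e, ok := r.engines[id]
	if !ok {
		return nil, fmt.Errorf("unknown engine %q (known: %v)", id, r.IDs())
	}
	return e, nil
}

func main() {
	r := newRegistry()
	r.Register(stubEngine{id: "copilot", name: "Copilot"})
	r.Register(stubEngine{id: "claude", name: "Claude"})
	r.Register(stubEngine{id: "codex", name: "Codex"})

	if e, err := r.Get("claude"); err == nil {
		fmt.Println("selected:", e.GetDisplayName())
	}
	if _, err := r.Get("not-registered"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Listing the known IDs in the lookup error is a design choice of this sketch: it gives a compile-time failure an immediately actionable message.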
+
+## Public API
+
+### Core Compiler Types
+
+| Type | Kind | Description |
+|------|------|-------------|
+| `Compiler` | struct | Main compilation engine; use `NewCompiler(opts...)` |
+| `CompilerOption` | func type | Functional option for configuring a `Compiler` |
+| `WorkflowData` | struct | Complete in-memory representation of a compiled workflow |
+| `FileTracker` | interface | Abstraction for tracking written files |
+
+#### `Compiler` Methods
+
+| Method | Signature | Description |
+|--------|-----------|-------------|
+| `CompileWorkflow` | `func(markdownPath string) error` | Compiles a markdown file and writes the `.lock.yml` |
+| `CompileWorkflowData` | `func(workflowData *WorkflowData, markdownPath string) error` | Compiles pre-parsed `WorkflowData` |
+
+#### Compiler Options
+
+| Function | Description |
+|----------|-------------|
+| `WithVerbose(bool)` | Enable verbose diagnostic output |
+| `WithEngineOverride(string)` | Override the AI engine |
+| `WithSkipValidation(bool)` | Skip schema validation |
+| `WithNoEmit(bool)` | Validate without writing lock files |
+| `WithFailFast(bool)` | Stop at first validation error |
+| `WithWorkflowIdentifier(string)` | Set the workflow identifier |
+| `NewCompiler(opts ...CompilerOption)` | Creates a new `Compiler` |
+| `NewCompilerWithVersion(string)` | Creates a `Compiler` with a specific version |
+
+### Engine Architecture
+
+| Type | Kind | Description |
+|------|------|-------------|
+| `Engine` | interface | Core identity: `GetID()`, `GetDisplayName()`, `GetDescription()`, `IsExperimental()` |
+| `CapabilityProvider` | interface | Optional feature detection (`SupportsToolsAllowlist`, `SupportsMaxTurns`, etc.) |
+| `WorkflowExecutor` | interface | Compilation: `GetDeclaredOutputFiles`, `GetInstallationSteps`, `GetExecutionSteps` |
+| `MCPConfigProvider` | interface | MCP configuration generation |
+| `LogParser` | interface | Log parsing for audit/metrics |
+| `SecurityProvider` | interface | Security-related configuration |
+| `ModelEnvVarProvider` | interface | Model environment variable mapping |
+| `AgentFileProvider` | interface | Custom agent file support |
+| `ConfigRenderer` | interface | Configuration file rendering |
+| `DriverProvider` | interface | Driver-level execution configuration |
+| `CodingAgentEngine` | interface | Composite interface combining all engine capabilities |
+| `BaseEngine` | struct | Base implementation shared by all engines |
+| `EngineRegistry` | struct | Global registry mapping engine names to implementations |
+| `CopilotEngine` | struct | Copilot coding agent engine |
+| `ClaudeEngine` | struct | Claude coding agent engine |
+| `CodexEngine` | struct | OpenAI Codex coding agent engine |
+
+#### Engine Registry Functions
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `NewEngineRegistry` | `func() *EngineRegistry` | Creates a new engine registry |
+| `GetGlobalEngineRegistry` | `func() *EngineRegistry` | Returns the singleton global engine registry |
+| `NewCopilotEngine` | `func() *CopilotEngine` | Creates the Copilot engine |
+| `NewClaudeEngine` | `func() *ClaudeEngine` | Creates the Claude engine |
+| `NewCodexEngine` | `func() *CodexEngine` | Creates the Codex engine |
+
+### Frontmatter Configuration Types
+
+| Type | Kind | Description |
+|------|------|-------------|
+| `FrontmatterConfig` | struct | Full parsed frontmatter with typed and legacy fields |
+| `RuntimeConfig` | struct | Single runtime version configuration (version string) |
+| `RuntimesConfig` | struct | All runtime versions (node, python, go, uv, bun, deno) |
+| `PermissionsConfig` | struct | GitHub Actions permissions (shorthand + detailed fields) |
+| `GitHubActionsPermissionsConfig` | struct | Detailed permissions with all scope fields |
+| `GitHubAppPermissionsConfig` | struct | GitHub App permission scopes |
+| `ObservabilityConfig` | struct | OTLP/observability configuration |
+| `RateLimitConfig` | struct | Rate limit settings |
+| `OTLPConfig` | struct | OpenTelemetry protocol configuration |
+
+### Permissions System
+
+| Type | Kind | Description |
+|------|------|-------------|
+| `Permissions` | struct | Runtime permissions representation for the compiled workflow |
+| `PermissionLevel` | string alias | Permission level: `read`, `write`, `none` |
+| `PermissionScope` | string alias | Permission scope (e.g., `contents`, `issues`, `pull-requests`) |
+| `PermissionsParser` | struct | Parses YAML permissions blocks into `Permissions` |
+| `PermissionsValidationResult` | struct | Result of `ValidatePermissions` |
+
+#### Permissions Functions
+
+| Function | Signature | Description |
+|----------|-----------|-------------|
+| `NewPermissionsParser` | `func(permissionsYAML string) *PermissionsParser` | Creates a parser from YAML text |
+| `NewPermissionsParserFromValue` | `func(permissionsValue any) *PermissionsParser` | Creates a parser from parsed YAML value |
+| `ValidatePermissions` | `func(*Permissions, ValidatableTool) *PermissionsValidationResult` | Validates permissions for a given tool |
+| `FormatValidationMessage` | `func(*PermissionsValidationResult, bool) string` | Formats a validation result as a human-readable message |
+| `ComputePermissionsForSafeOutputs` | `func(*SafeOutputsConfig) *Permissions` | Computes required permissions for safe-output types |
+| `SortPermissionScopes` | `func([]PermissionScope)` | Sorts permission scopes alphabetically |
+
+#### Permissions Factory (common combinations)
+
+| Function | Description |
+|----------|-------------|
+| `NewPermissionsContentsWritePRWrite()` | contents:write + pull-requests:write |
+|
`NewPermissionsContentsWriteIssuesWritePRWrite()` | contents:write + issues:write + pull-requests:write | +| `NewPermissionsContentsReadDiscussionsWrite()` | contents:read + discussions:write | +| `NewPermissionsContentsReadIssuesWriteDiscussionsWrite()` | contents:read + issues:write + discussions:write | +| `NewPermissionsContentsReadPRWrite()` | contents:read + pull-requests:write | +| `NewPermissionsContentsReadSecurityEventsWrite()` | contents:read + security-events:write | +| `NewPermissionsContentsReadProjectsWrite()` | contents:read + projects:write | + +### Tools Configuration + +| Type | Kind | Description | +|------|------|-------------| +| `ToolsConfig` | struct | Parsed `tools:` block with all tool configurations | +| `Tools` | type alias | Alias for `ToolsConfig` | +| `GitHubToolConfig` | struct | GitHub MCP tool configuration (toolsets, allowed tools, integrity) | +| `PlaywrightToolConfig` | struct | Playwright browser automation tool config | +| `BashToolConfig` | struct | Bash execution tool config | +| `WebFetchToolConfig` | struct | Web fetch tool config | +| `WebSearchToolConfig` | struct | Web search tool config | +| `EditToolConfig` | struct | File edit tool config | +| `AgenticWorkflowsToolConfig` | struct | Nested agentic workflows tool config | +| `CacheMemoryToolConfig` | struct | Cache-memory persistence tool config | +| `MCPServerConfig` | struct | Generic MCP server configuration | +| `MCPGatewayRuntimeConfig` | struct | MCP Gateway runtime configuration | +| `GitHubToolName` | string alias | Named GitHub MCP tool (e.g., `"issue_read"`) | +| `GitHubAllowedTools` | `[]GitHubToolName` | Typed slice with conversion helpers | +| `GitHubToolset` | string alias | Named GitHub toolset (e.g., `"default"`, `"repos"`) | +| `GitHubToolsets` | `[]GitHubToolset` | Typed slice with conversion helpers | +| `GitHubIntegrityLevel` | string alias | Integrity level (`"low"`, `"medium"`, `"high"`) | + +#### Tools Functions + +| Function | Signature | 
Description | +|----------|-----------|-------------| +| `NewTools` | `func(map[string]any) *Tools` | Creates a `Tools` from a raw map | +| `ParseToolsConfig` | `func(map[string]any) (*ToolsConfig, error)` | Parses the `tools:` frontmatter section | +| `ValidateGitHubToolsAgainstToolsets` | `func([]string, []string) error` | Validates tool names against enabled toolsets | +| `GetPlaywrightTools` | `func() []any` | Returns the standard Playwright tool definitions | +| `GetSafeOutputToolOptions` | `func() []SafeOutputToolOption` | Returns valid safe-output tool option definitions | +| `GetValidationConfigJSON` | `func(enabledTypes []string) (string, error)` | Returns JSON validation config for given safe-output types | + +### Safe Outputs + +| Type | Kind | Description | +|------|------|-------------| +| `SafeOutputsConfig` | struct | Parsed `safe-outputs:` configuration | +| `SafeOutputTargetConfig` | struct | Target configuration for a safe-output job | +| `SafeOutputFilterConfig` | struct | Filter configuration for a safe-output job | +| `SafeOutputToolOption` | struct | A valid safe-output tool option | + +#### Safe Output Functions + +| Function | Signature | Description | +|----------|-----------|-------------| +| `HasSafeOutputsEnabled` | `func(*SafeOutputsConfig) bool` | Returns whether any safe-output type is enabled | +| `ParseTargetConfig` | `func(map[string]any) (SafeOutputTargetConfig, bool)` | Parses a target configuration block | +| `ParseFilterConfig` | `func(map[string]any) SafeOutputFilterConfig` | Parses a filter configuration block | +| `SafeOutputsConfigFromKeys` | `func([]string) *SafeOutputsConfig` | Creates a config from a list of type keys | + +### Network Permissions + +| Type | Kind | Description | +|------|------|-------------| +| `NetworkPermissions` | struct | Parsed `network:` block with `allowed` and `blocked` domain lists | + +#### Network Functions + +| Function | Signature | Description | +|----------|-----------|-------------| +| 
`GetAllowedDomains` | `func(*NetworkPermissions) []string` | Returns the full list of allowed domains | +| `GetDomainEcosystem` | `func(domain string) string` | Returns the ecosystem name for a domain | +| `GetAllowedDomainsForEngine` | `func(EngineName, *NetworkPermissions, ...) string` | Returns allowed domains for a specific engine | +| `GetCopilotAllowedDomainsWithToolsAndRuntimes` | `func(*NetworkPermissions, ...) string` | Copilot-specific allowed domains | +| `GetCodexAllowedDomainsWithToolsAndRuntimes` | `func(*NetworkPermissions, ...) string` | Codex-specific allowed domains | +| `GetClaudeAllowedDomainsWithToolsAndRuntimes` | `func(*NetworkPermissions, ...) string` | Claude-specific allowed domains | +| `GetThreatDetectionAllowedDomains` | `func(*NetworkPermissions) string` | Allowed domains for threat detection jobs | + +### Error Types + +| Type | Kind | Description | +|------|------|-------------| +| `WorkflowValidationError` | struct | Validation error with field, value, reason, and suggestion | +| `OperationError` | struct | Error from a workflow operation with entity context | +| `ConfigurationError` | struct | Configuration error with config key and suggested fix | +| `ErrorCollector` | struct | Collects multiple errors; supports `failFast` mode | +| `SharedWorkflowError` | struct | Error for shared/reusable workflow violations | + +#### Error Functions + +| Function | Signature | Description | +|----------|-----------|-------------| +| `NewValidationError` | `func(field, value, reason, suggestion string) *WorkflowValidationError` | Creates a validation error | +| `NewOperationError` | `func(operation, entityType, entityID string, cause error, suggestion string) *OperationError` | Creates an operation error | +| `NewConfigurationError` | `func(configKey, value, reason, suggestion string) *ConfigurationError` | Creates a configuration error | +| `NewErrorCollector` | `func(failFast bool) *ErrorCollector` | Creates an error collector | + +### 
Workflow Resolution + +| Function | Signature | Description | +|----------|-----------|-------------| +| `ResolveWorkflowName` | `func(string) (string, error)` | Resolves a workflow input string to a canonical name | +| `FindWorkflowName` | `func(string) (string, error)` | Finds the workflow name from a string or file path | +| `GetWorkflowLockFileName` | `func(string) (string, error)` | Returns the `.lock.yml` path for a workflow | +| `GetAllWorkflows` | `func() ([]WorkflowNameMatch, error)` | Returns all installed workflow names | +| `GetWorkflowIDFromPath` | `func(string) string` | Derives the workflow ID from its markdown path | + +### Action Pinning + +| Type | Kind | Description | +|------|------|-------------| +| `ActionPin` | struct | An action pin (repo + SHA) | +| `ActionPinsData` | struct | Map of all action pins | +| `ActionMode` | string alias | Action reference mode (`sha`, `tag`, `local`) | +| `ActionCache` | struct | Cache for resolved action SHAs | +| `ActionResolver` | struct | Resolves action SHAs from GitHub | + +| Function | Signature | Description | +|----------|-----------|-------------| +| `GetActionPin` | `func(actionRepo string) string` | Returns the pinned SHA for an action | +| `GetActionPinByRepo` | `func(string) (ActionPin, bool)` | Looks up a pin by repo | +| `DetectActionMode` | `func(version string) ActionMode` | Detects the action reference mode | +| `ApplyActionPinsToTypedSteps` | `func([]*WorkflowStep, *WorkflowData) []*WorkflowStep` | Applies pins to all steps | +| `ValidateActionSHAsInLockFile` | `func(string, *ActionCache, bool) error` | Validates action SHAs in a lock file | + +### String Utilities (Workflow-Specific) + +| Function | Signature | Description | +|----------|-----------|-------------| +| `SanitizeName` | `func(string, *SanitizeOptions) string` | Sanitizes a name for use in GitHub Actions | +| `SanitizeWorkflowName` | `func(string) string` | Sanitizes a workflow name | +| `SanitizeIdentifier` | `func(string) 
string` | Sanitizes a generic identifier | +| `SanitizeWorkflowIDForCacheKey` | `func(string) string` | Sanitizes a workflow ID for use as a cache key | +| `PrettifyToolName` | `func(string) string` | Returns a human-readable tool name | +| `ShortenCommand` | `func(string) string` | Shortens a long command for display | +| `GenerateHeredocDelimiterFromSeed` | `func(name, seed string) string` | Generates a stable heredoc delimiter | +| `ValidateHeredocContent` | `func(content, delimiter string) error` | Validates heredoc content safety | +| `ValidateHeredocDelimiter` | `func(string) error` | Validates a heredoc delimiter | + +### Secret Handling + +| Function | Signature | Description | +|----------|-----------|-------------| +| `ExtractSecretName` | `func(string) string` | Extracts the secret name from a `${{ secrets.NAME }}` expression | +| `ExtractSecretsFromValue` | `func(string) map[string]string` | Extracts all secrets from a template value | +| `ReplaceSecretsWithEnvVars` | `func(string, map[string]string) string` | Replaces secret references with env var references | +| `ExtractGitHubContextExpressionsFromValue` | `func(string) map[string]string` | Extracts GitHub context expressions | +| `CollectSecretReferences` | `func(string) []string` | Collects all secret references from YAML content | +| `CollectActionReferences` | `func(string) []string` | Collects all action references from YAML content | + +### Concurrency & Scheduling + +| Function | Signature | Description | +|----------|-----------|-------------| +| `GenerateConcurrencyConfig` | `func(*WorkflowData, bool) string` | Generates `concurrency:` YAML for a workflow | +| `GenerateJobConcurrencyConfig` | `func(*WorkflowData) string` | Generates job-level concurrency YAML | +| `ResolveRelativeDate` | `func(dateStr string, baseTime time.Time) (string, error)` | Resolves relative date strings (e.g., "2 weeks ago") | + +### YAML Utilities + +| Function | Signature | Description | 
+|----------|-----------|-------------| +| `UnquoteYAMLKey` | `func(yamlStr, key string) string` | Removes unnecessary quotes from a YAML key | +| `MarshalWithFieldOrder` | `func(map[string]any, []string) ([]byte, error)` | Marshals a map with priority-ordered fields | +| `OrderMapFields` | `func(map[string]any, []string) yaml.MapSlice` | Returns an ordered map slice | +| `CleanYAMLNullValues` | `func(string) string` | Removes null values from YAML output | +| `ConvertStepToYAML` | `func(map[string]any) (string, error)` | Converts a step map to YAML text | + +### Trigger Parsing + +| Type | Kind | Description | +|------|------|-------------| +| `TriggerIR` | struct | Intermediate representation of a workflow trigger | + +| Function | Signature | Description | +|----------|-----------|-------------| +| `ParseTriggerShorthand` | `func(string) (*TriggerIR, error)` | Parses a trigger shorthand string | + +### AWF Command Building + +| Function | Signature | Description | +|----------|-----------|-------------| +| `BuildAWFCommand` | `func(AWFCommandConfig) string` | Builds the `gh aw` command string for a workflow step | +| `BuildAWFArgs` | `func(AWFCommandConfig) []string` | Builds CLI argument list for `gh aw` | +| `GetAWFCommandPrefix` | `func(*WorkflowData) string` | Returns the `gh aw` command prefix | +| `WrapCommandInShell` | `func(string) string` | Wraps a command in a shell `run:` block | +| `GetCopilotAPITarget` | `func(*WorkflowData) string` | Returns the Copilot API target URL | + +### Versioning + +| Function | Signature | Description | +|----------|-----------|-------------| +| `SetVersion` | `func(string)` | Sets the package version at startup | +| `GetVersion` | `func() string` | Returns the current package version | +| `SetIsRelease` | `func(bool)` | Marks whether this is a release build | +| `IsRelease` | `func() bool` | Returns whether this is a release build | +| `IsReleasedVersion` | `func(string) bool` | Checks whether a version string is a 
release | + +### Validation Functions + +| Function | Signature | Description | +|----------|-----------|-------------| +| `ValidateEventFilters` | `func(map[string]any) error` | Validates `on:` event filter patterns | +| `ValidateGlobPatterns` | `func(map[string]any) error` | Validates glob patterns in trigger filters | + +### Step Types + +| Type | Kind | Description | +|------|------|-------------| +| `WorkflowStep` | struct | A single GitHub Actions step with all standard fields | +| `GitHubActionStep` | `[]string` | A multi-line run step (slice of command strings) | + +| Function | Signature | Description | +|----------|-----------|-------------| +| `MapToStep` | `func(map[string]any) (*WorkflowStep, error)` | Converts a YAML map to a typed `WorkflowStep` | +| `SliceToSteps` | `func([]any) ([]*WorkflowStep, error)` | Converts a YAML slice to typed steps | +| `StepsToSlice` | `func([]*WorkflowStep) []any` | Converts typed steps back to a YAML slice | + +### Repository Configuration + +| Type | Kind | Description | +|------|------|-------------| +| `RepoConfig` | struct | Repository-level configuration from `.github/gh-aw.yml` | + +| Function | Signature | Description | +|----------|-----------|-------------| +| `LoadRepoConfig` | `func(gitRoot string) (*RepoConfig, error)` | Loads and parses the repo config file | +| `FormatRunsOn` | `func(RunsOnValue, string) string` | Formats a `runs-on:` value for YAML output | + +### Threat Detection + +| Type | Kind | Description | +|------|------|-------------| +| `ThreatDetectionConfig` | struct | Configuration for the threat detection job | + +| Function | Signature | Description | +|----------|-----------|-------------| +| `IsDetectionJobEnabled` | `func(*SafeOutputsConfig) bool` | Returns whether threat detection is enabled | + +### Safe Update Manifest + +| Type | Kind | Description | +|------|------|-------------| +| `GHAWManifest` | struct | Signed manifest embedded in lock files for integrity checking | + +| 
Function | Signature | Description | +|----------|-----------|-------------| +| `NewGHAWManifest` | `func(secretNames, actionRefs []string, containers []GHAWManifestContainer) *GHAWManifest` | Creates a new manifest | +| `ExtractGHAWManifestFromLockFile` | `func(string) (*GHAWManifest, error)` | Extracts the manifest from a lock file | +| `EnforceSafeUpdate` | `func(*GHAWManifest, []string, []string) error` | Validates that a lock file update passes manifest checks | + +## Usage Examples + +### Compile a workflow file + +```go +compiler := workflow.NewCompiler( + workflow.WithVerbose(true), + workflow.WithEngineOverride("copilot"), +) +err := compiler.CompileWorkflow(".github/workflows/my-workflow.md") +``` + +### Look up an engine + +```go +registry := workflow.GetGlobalEngineRegistry() +engine, ok := registry.Get("copilot") +if ok { + steps := engine.GetExecutionSteps(workflowData) +} +``` + +### Compute permissions for safe outputs + +```go +perms := workflow.ComputePermissionsForSafeOutputs(safeOutputsConfig) +``` + +### Resolve a workflow name + +```go +name, err := workflow.ResolveWorkflowName("my-workflow") +lockFile, err := workflow.GetWorkflowLockFileName(name) +``` + +## Architecture + +``` +markdown file + │ + ▼ +pkg/parser ─── ExtractFrontmatterFromContent + │ ProcessImportsFromFrontmatterWithSource + │ + ▼ +pkg/workflow ── FrontmatterConfig (typed structs) + │ Compiler.CompileWorkflow() + │ ├─ schema validation + │ ├─ permissions computation + │ ├─ engine step generation + │ ├─ safe-output job generation + │ ├─ YAML serialization + │ └─ lock file write (if changed) + │ + ▼ +.github/workflows/my-workflow.lock.yml +``` + +## Design Decisions + +- **File-per-domain decomposition**: Each validation concern and job-builder lives in its own file. The 300-line limit is enforced by convention; validation files exceeding it SHOULD be split. 
+- **Functional compiler options**: `CompilerOption` functions follow the standard Go functional-options pattern, keeping `NewCompiler` signature stable as options are added. +- **Engine interface composition**: Rather than one monolithic `Engine` interface, capabilities are split into focused interfaces (`CapabilityProvider`, `WorkflowExecutor`, etc.) and combined via `CodingAgentEngine`. This prevents engines from being forced to implement unused methods. +- **Content-addressed lock files**: Lock files are only written when the normalized YAML content changes (heredoc delimiters are normalized before comparison). This avoids unnecessary git churn. +- **YAML 1.1/1.2 compatibility**: The package uses `goccy/go-yaml` for all GitHub Actions YAML generation to ensure compatibility with GitHub Actions' YAML parser. + +## Dependencies + +**Internal**: +- `pkg/parser` — frontmatter extraction and import processing +- `pkg/constants` — engine names, feature flags, job/step IDs +- `pkg/console` — terminal formatting +- `pkg/logger` — debug logging +- `pkg/stringutil`, `pkg/fileutil`, `pkg/gitutil`, `pkg/sliceutil` — utilities +- `pkg/types` — shared MCP types + +**External**: +- `goccy/go-yaml` — YAML 1.1/1.2 compatible marshaling +- `go.yaml.in/yaml/v3` — standard YAML marshaling for non-Actions YAML + +## Thread Safety + +`Compiler` instances are NOT safe for concurrent use. Create a new `Compiler` for each concurrent compilation. The `GetGlobalEngineRegistry()` singleton is initialized once at startup and is safe for concurrent reads thereafter. + +Constants (`MaxLockFileSize`) and action pin data are read-only after initialization and are safe for concurrent access. + +--- + +*This specification is automatically maintained by the [spec-extractor](../../.github/workflows/spec-extractor.md) workflow.*
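The Thread Safety guidance above (one `Compiler` per concurrent compilation) can be illustrated with a small sketch. To stay self-contained, it does not import `pkg/workflow`; `sketchCompiler`, `newSketchCompiler`, and `compileAll` are hypothetical stand-ins, and the `.md` to `.lock.yml` renaming is only a placeholder for real compilation work.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// sketchCompiler is a hypothetical stand-in for a compiler that keeps
// mutable scratch state and therefore must not be shared across goroutines.
type sketchCompiler struct {
	buf strings.Builder // scratch buffer mutated on every Compile call
}

// newSketchCompiler returns a fresh compiler with its own scratch state.
func newSketchCompiler() *sketchCompiler {
	return &sketchCompiler{}
}

// Compile pretends to compile one markdown path and returns the
// corresponding lock file name.
func (c *sketchCompiler) Compile(path string) string {
	c.buf.Reset()
	c.buf.WriteString(strings.TrimSuffix(path, ".md"))
	c.buf.WriteString(".lock.yml")
	return c.buf.String()
}

// compileAll compiles every path concurrently. Each goroutine builds its
// own compiler and writes to a distinct slice index, so no state is shared.
func compileAll(paths []string) []string {
	results := make([]string, len(paths))
	var wg sync.WaitGroup
	for i, p := range paths {
		wg.Add(1)
		go func(i int, p string) {
			defer wg.Done()
			c := newSketchCompiler() // one instance per goroutine
			results[i] = c.Compile(p)
		}(i, p)
	}
	wg.Wait()
	return results
}

func main() {
	for _, out := range compileAll([]string{"a.md", "b.md"}) {
		fmt.Println(out)
	}
}
```

Creating the compiler inside the goroutine, not outside the loop, is the point of the sketch: sharing a single instance would have every goroutine mutating the same scratch buffer.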