diff --git a/pkg/agentdrain/README.md b/pkg/agentdrain/README.md new file mode 100644 index 00000000000..7e40067deb0 --- /dev/null +++ b/pkg/agentdrain/README.md @@ -0,0 +1,196 @@ +# agentdrain Package + +The `agentdrain` package implements the [Drain](https://jiemingzhu.github.io/pub/pjhe_icws2017.pdf) log template mining algorithm adapted for analyzing structured agent pipeline events. It is used for anomaly detection in agentic workflow runs. + +## Overview + +Drain is an online log parsing algorithm that groups log lines into clusters based on token similarity. Each cluster has a *template* — a tokenized log pattern where variable tokens are replaced with a wildcard (`<*>`). When a new log line arrives, Drain finds the most similar existing cluster or creates a new one. + +In GitHub Agentic Workflows, `agentdrain` processes `AgentEvent` records emitted by pipeline stages (e.g. `"plan"`, `"tool_call"`, `"finish"`) to: +1. Build a model of normal behavior by training on events from successful runs. +2. Detect anomalies in new runs by comparing events against the learned model. + +## Types + +### `Config` + +Tuning parameters for the Drain miner. + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `Depth` | `int` | `4` | Parse-tree depth | +| `SimThreshold` | `float64` | `0.4` | Minimum similarity score to match a cluster | +| `MaxChildren` | `int` | `100` | Maximum children per tree node | +| `ParamToken` | `string` | `"<*>"` | Wildcard inserted at variable positions | +| `RareClusterThreshold` | `int` | `2` | Clusters with `Size ≤` this value are flagged as rare | +| `MaskRules` | `[]MaskRule` | (see below) | Regex substitutions applied before tokenization | +| `ExcludeFields` | `[]string` | `["session_id", "trace_id", "span_id", "timestamp"]` | Event fields excluded from flattening | + +Use `DefaultConfig()` for production-ready defaults. 
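The role of `SimThreshold` and `ParamToken` can be illustrated with a small standalone sketch. This is not the package's implementation: `similarity` is a hypothetical helper, and both the equal-length requirement and the choice to ignore wildcard positions are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// paramToken mirrors the documented default for Config.ParamToken.
const paramToken = "<*>"

// similarity is a hypothetical helper, not part of agentdrain. It returns the
// fraction of non-wildcard template positions whose token equals the token at
// the same position in the line.
func similarity(template, line []string) float64 {
	if len(template) != len(line) {
		return 0 // assumption: only sequences of equal length are compared
	}
	fixed, matched := 0, 0
	for i, tok := range template {
		if tok == paramToken {
			continue // wildcard positions are not counted
		}
		fixed++
		if tok == line[i] {
			matched++
		}
	}
	if fixed == 0 {
		return 1 // an all-wildcard template matches any line of the same length
	}
	return float64(matched) / float64(fixed)
}

func main() {
	template := strings.Fields("tool_call name <*> status ok")
	line := strings.Fields("tool_call name grep status error")
	score := similarity(template, line)
	// 0.4 is the documented default for Config.SimThreshold.
	fmt.Printf("similarity=%.2f joins_cluster=%t\n", score, score >= 0.4)
	// prints: similarity=0.75 joins_cluster=true
}
```

Under these assumptions, raising `SimThreshold` makes the miner stricter about reusing clusters and so tends to produce more templates.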
+ +### `MaskRule` + +A regex-based substitution applied to log lines before tokenization to normalize variable content. + +```go +type MaskRule struct { + Name string // Human-readable identifier + Pattern string // Regular expression + Replacement string // Substitution string +} +``` + +Default mask rules normalize UUIDs, session IDs, numeric values, URLs, quoted strings, and timestamps. + +### `AgentEvent` + +A structured event from an agent pipeline stage. + +```go +type AgentEvent struct { + Stage string // e.g. "plan", "tool_call", "finish" + Fields map[string]string // Key-value pairs from the log line +} +``` + +### `Cluster` + +A group of log lines that share the same template. + +```go +type Cluster struct { + ID int // Unique identifier + Template []string // Tokenized template with wildcards + Size int // Number of lines assigned to this cluster + Stage string // Pipeline stage that generated this cluster +} +``` + +### `MatchResult` + +Returned after processing a log line. + +```go +type MatchResult struct { + ClusterID int // Matched or newly created cluster ID + Template string // Space-joined template string + Params []string // Actual token values at wildcard positions + Similarity float64 // Fraction of non-wildcard tokens that matched exactly + Stage string // Pipeline stage of the matched cluster +} +``` + +### `AnomalyReport` + +Describes anomalies detected for a log line. + +```go +type AnomalyReport struct { + IsNewTemplate bool // Line created a new cluster + LowSimilarity bool // Best match score was below SimThreshold + RareCluster bool // Matched cluster has been seen ≤ RareClusterThreshold times + NewClusterCreated bool // This event produced a brand-new cluster + AnomalyScore float64 // Weighted composite score in [0, 1] + Reason string // Human-readable anomaly description +} +``` + +## Core Components + +### `Miner` + +The single-stage Drain miner. Processes one pipeline stage at a time. 
+ +```go +cfg := agentdrain.DefaultConfig() +miner, err := agentdrain.NewMiner(cfg) + +// Training phase — call for known-good events +result, err := miner.TrainEvent(evt) + +// Analysis phase — call for events to check +result, report, err := miner.AnalyzeEvent(evt) + +// Inspect clusters +clusters := miner.Clusters() +count := miner.ClusterCount() +``` + +#### Persistence + +```go +// Save miner state to JSON +data, err := miner.SaveJSON() + +// Restore miner state from JSON +err = miner.LoadJSON(data) +``` + +### `Coordinator` + +Manages a separate `Miner` per pipeline stage, routing events to the correct miner. + +```go +stages := []string{"plan", "tool_call", "finish"} +coord, err := agentdrain.NewCoordinator(cfg, stages) + +// Load default trained weights +err = coord.LoadDefaultWeights() + +// Analyze an event +result, report, err := coord.AnalyzeEvent(evt) + +// Access all clusters across all stages +allClusters := coord.AllClusters() + +// Save/restore snapshots +snapshots, err := coord.SaveSnapshots() +err = coord.LoadSnapshots(snapshots) + +// Save/restore coordinator weights as JSON +data, err := coord.SaveWeightsJSON() +err = coord.LoadWeightsJSON(data) +``` + +### `AnomalyDetector` + +Post-processes `MatchResult` values to produce an `AnomalyReport`. + +```go +detector := agentdrain.NewAnomalyDetector(cfg.SimThreshold, cfg.RareClusterThreshold) +report := detector.Analyze(result, isNew, cluster) +``` + +### `Masker` + +Applies `MaskRule` substitutions to log lines before tokenization. + +```go +masker, err := agentdrain.NewMasker(cfg.MaskRules) +masked := masker.Mask(rawLine) +``` + +### Utility Functions + +#### `FlattenEvent(evt AgentEvent, excludeFields []string) string` + +Converts an `AgentEvent` to a single string for tokenization, omitting fields listed in `excludeFields`. Fields are sorted for deterministic output. + +#### `Tokenize(line string) []string` + +Splits a log line into tokens on whitespace boundaries. 
+ +#### `StageSequence(events []AgentEvent) string` + +Returns a comma-separated string of the stages from a slice of events. Useful for summarizing pipeline execution paths. + +## Default Weights + +The package embeds a set of default trained weights (in `data/`) via `//go:embed`. Call `coord.LoadDefaultWeights()` to initialize the coordinator with pre-trained cluster weights rather than starting cold. + +## Design Notes + +- The Drain algorithm is O(n·d) per event, where `n` is the number of tokens and `d` is `Depth`. +- `SimThreshold` of `0.4` means at least 40% of tokens must match exactly (excluding wildcards) for a line to join an existing cluster. +- The `Coordinator` routes each `AgentEvent` to its stage-specific `Miner` so that templates from different stages do not interfere. +- `SaveJSON`/`LoadJSON` serialize the parse tree and cluster list to enable persistence across workflow runs. diff --git a/pkg/console/README.md b/pkg/console/README.md index 20a246740e9..a313eea0124 100644 --- a/pkg/console/README.md +++ b/pkg/console/README.md @@ -1,6 +1,6 @@ -# Console Rendering Package +# Console Package -The `console` package provides utilities for rendering Go structs and data structures to formatted console output, as well as progress bar and spinner components for long-running operations. +The `console` package provides utilities for formatting and rendering terminal output in GitHub Agentic Workflows. It covers message formatting, table and section rendering, interactive prompts, progress bars, spinners, struct rendering, and accessibility support. ## Design Philosophy @@ -309,3 +309,246 @@ To migrate existing rendering code to use the new system: } fmt.Print(console.RenderTable(config)) ``` + +## Message Formatting Functions + +All `Format*` functions return a styled string ready to be printed to `os.Stderr`. Colors adapt automatically to the terminal background. 
+ +| Function | Style | Typical use | +|----------|-------|-------------| +| `FormatSuccessMessage(message string) string` | Green, bold | Operation completed successfully | +| `FormatInfoMessage(message string) string` | Cyan, bold | General informational output | +| `FormatWarningMessage(message string) string` | Orange, bold | Non-fatal warnings | +| `FormatErrorMessage(message string) string` | Red, bold | Recoverable error messages | +| `FormatCommandMessage(command string) string` | Purple | CLI commands and code snippets | +| `FormatProgressMessage(message string) string` | Yellow | In-progress status updates | +| `FormatPromptMessage(message string) string` | Cyan | Interactive prompt labels | +| `FormatVerboseMessage(message string) string` | Muted/comment | Verbose/debug detail | +| `FormatListItem(item string) string` | Foreground | Individual list entries | +| `FormatSectionHeader(header string) string` | Bold, bordered | Section titles in output | + +### Usage Pattern + +```go +import ( + "fmt" + "os" + "github.com/github/gh-aw/pkg/console" +) + +// Always write formatted messages to stderr +fmt.Fprintln(os.Stderr, console.FormatSuccessMessage("Workflow compiled successfully")) +fmt.Fprintln(os.Stderr, console.FormatInfoMessage("Processing 3 files...")) +fmt.Fprintln(os.Stderr, console.FormatWarningMessage("Network access is unrestricted")) +fmt.Fprintln(os.Stderr, console.FormatErrorMessage("File not found: workflow.md")) +fmt.Fprintln(os.Stderr, console.FormatCommandMessage("gh aw compile workflow.md")) +fmt.Fprintln(os.Stderr, console.FormatProgressMessage("Downloading release...")) +``` + +## Error Formatting + +### `FormatError(err CompilerError) string` + +Formats a structured `CompilerError` with position information, source context lines, and an optional fix hint. Used by the compiler to display actionable error messages. 
+ +```go +err := console.CompilerError{ + Position: console.ErrorPosition{File: "workflow.md", Line: 12, Column: 5}, + Type: "error", + Message: "unknown engine: 'myengine'", + Context: []string{"engine: myengine"}, + Hint: "Valid engines are: copilot, claude, codex, gemini", +} +fmt.Fprint(os.Stderr, console.FormatError(err)) +``` + +### `FormatErrorChain(err error) string` + +Formats an error together with its entire `%w`-wrapped cause chain. Each level of the chain is shown on a new indented line for easy debugging. + +```go +fmt.Fprintln(os.Stderr, console.FormatErrorChain(err)) +``` + +## Section Rendering Functions + +These functions return `[]string` slices (lines of output) that can be composed using `RenderComposedSections`. + +### `RenderTitleBox(title string, width int) []string` + +Returns a rounded-border box containing `title`, padded to at least `width` characters. + +### `RenderErrorBox(title string) []string` + +Returns a red-bordered error box displaying `title`. + +### `RenderInfoSection(content string) []string` + +Returns `content` wrapped in a left-bordered info section with muted styling. + +### `RenderComposedSections(sections []string)` + +Prints multiple rendered sections to `os.Stderr`, separated by blank lines. + +```go +lines := append( + console.RenderTitleBox("Audit Report", 60), + console.RenderInfoSection("3 jobs completed")..., +) +console.RenderComposedSections(lines) +``` + +### `RenderTable(config TableConfig) string` + +Renders a formatted table with optional title and total row. See the `TableConfig` type for configuration options. + +```go +table := console.RenderTable(console.TableConfig{ + Headers: []string{"Name", "Status", "Duration"}, + Rows: [][]string{ + {"build", "success", "1m30s"}, + {"test", "failure", "45s"}, + }, + Title: "Job Results", +}) +fmt.Print(table) +``` + +## Types + +### `CompilerError` + +Structured error with source position, type, message, context lines, and a fix hint. 
+ +```go +type CompilerError struct { + Position ErrorPosition // Source file position + Type string // "error", "warning", "info" + Message string + Context []string // Source lines shown around the error + Hint string // Optional actionable fix suggestion +} +``` + +### `ErrorPosition` + +```go +type ErrorPosition struct { + File string + Line int + Column int +} +``` + +### `TableConfig` + +```go +type TableConfig struct { + Headers []string + Rows [][]string + Title string // Optional table title + ShowTotal bool // Display a total row + TotalRow []string // Content for the total row +} +``` + +### `TreeNode` + +Represents a node in a hierarchical tree for tree-style rendering. + +```go +type TreeNode struct { + Value string + Children []TreeNode +} +``` + +### `SelectOption` + +A labeled option for interactive select fields. + +```go +type SelectOption struct { + Label string + Value string +} +``` + +### `FormField` + +Configuration for a single field in an interactive form. + +```go +type FormField struct { + Type string // "input", "password", "confirm", "select" + Title string + Description string + Placeholder string + Value any // Pointer to the field's result value + Options []SelectOption // For "select" type + Validate func(string) error // For "input" and "password" types +} +``` + +### `ListItem` + +An item in an interactive list with title, description, and an internal value. Create with `NewListItem(title, description, value string)`. + +## Interactive Prompts + +### `ConfirmAction(title, affirmative, negative string) (bool, error)` + +Displays an interactive yes/no confirmation dialog using the `huh` library. Returns `true` when the user selects `affirmative`. + +```go +confirmed, err := console.ConfirmAction( + "Delete all compiled workflows?", + "Yes, delete", + "Cancel", +) +if err != nil || !confirmed { + return +} +``` + +> **Note**: `ConfirmAction` is only available in non-WASM builds. In WASM environments the function is unavailable. 
+ +## Utility Functions + +### `FormatFileSize(size int64) string` + +Formats a byte count as a human-readable string with appropriate unit suffix. + +```go +console.FormatFileSize(0) // "0 B" +console.FormatFileSize(1500) // "1.5 KB" +console.FormatFileSize(2_100_000) // "2.0 MB" +``` + +### `FormatBanner() string` + +Returns the `gh aw` ASCII art banner as a styled string. + +### `PrintBanner()` + +Prints the banner to `os.Stderr`. + +## Accessibility + +### `IsAccessibleMode() bool` + +Returns `true` when the terminal is in accessibility mode based on environment variables: +- `ACCESSIBLE` is set (any value) +- `TERM` is `"dumb"` +- `NO_COLOR` is set (any value) + +When accessibility mode is active: +- Spinner animations are disabled. +- The `huh` confirmation dialog uses accessible (plain-text) mode. +- All `Format*` functions still work normally but rendered output may differ if called with lipgloss styles. + +```go +if console.IsAccessibleMode() { + // Use simpler, non-animated output +} +``` diff --git a/pkg/constants/README.md b/pkg/constants/README.md new file mode 100644 index 00000000000..7d2937d3a22 --- /dev/null +++ b/pkg/constants/README.md @@ -0,0 +1,178 @@ +# constants Package + +The `constants` package provides shared semantic type aliases and named constants used across multiple `gh-aw` packages. Centralizing these values ensures consistency and type safety throughout the codebase. 
+ +## Overview + +The package is organized into focused files: + +| File | Contents | +|------|----------| +| `constants.go` | Core types, formatting constants, runtime config, container images | +| `engine_constants.go` | AI engine names, options, system secrets, Copilot CLI commands | +| `feature_constants.go` | Feature flag identifiers | +| `job_constants.go` | GitHub Actions job names, step IDs, artifact names, output keys | +| `tool_constants.go` | Allowed GitHub tool expressions and default tool lists | +| `url_constants.go` | URL semantic types and well-known URL constants | +| `version_constants.go` | Default version strings for all pinned dependencies | + +## Semantic Types + +The package uses typed aliases to prevent mixing unrelated string or integer values: + +| Type | Description | Example constant | +|------|-------------|-----------------| +| `EngineName` | AI engine identifier | `CopilotEngine`, `ClaudeEngine`, `CodexEngine`, `GeminiEngine` | +| `FeatureFlag` | Feature flag identifier | `MCPGatewayFeatureFlag`, `MCPScriptsFeatureFlag` | +| `JobName` | GitHub Actions job name | `AgentJobName`, `ActivationJobName` | +| `StepID` | GitHub Actions step identifier | `CheckMembershipStepID`, `CheckRateLimitStepID` | +| `MCPServerID` | MCP server identifier | `SafeOutputsMCPServerID`, `MCPScriptsMCPServerID` | +| `LineLength` | Character count for formatting | `MaxExpressionLineLength` (120) | +| `CommandPrefix` | CLI command prefix | `CLIExtensionPrefix` ("gh aw") | +| `WorkflowID` | User-provided workflow basename (no `.md`) | — | +| `Version` | Software version string | `DefaultCopilotVersion`, `DefaultNodeVersion` | +| `ModelName` | AI model name | — | +| `URL` | URL string | `DefaultMCPRegistryURL`, `PublicGitHubHost` | +| `DocURL` | Documentation URL | — | + +All semantic types implement `String() string` and `IsValid() bool` methods. 
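As a rough illustration of why distinct types help, the sketch below defines two local string-based types. The names `engineName` and `jobName` are lowercase stand-ins, and the non-empty rule in `IsValid` is an assumption; the package's own validation may differ.

```go
package main

import "fmt"

// engineName is a local illustration of a semantic string type; it is not the
// package's EngineName.
type engineName string

// String returns the underlying string value.
func (e engineName) String() string { return string(e) }

// IsValid reports whether the value is non-empty (an assumed rule).
func (e engineName) IsValid() bool { return e != "" }

// jobName is a second, distinct type built on string.
type jobName string

// describeEngine accepts only engineName values.
func describeEngine(e engineName) string {
	return fmt.Sprintf("engine=%s valid=%t", e, e.IsValid())
}

func main() {
	const copilot engineName = "copilot"
	fmt.Println(describeEngine(copilot)) // prints: engine=copilot valid=true

	var job jobName = "agent"
	// describeEngine(job) would not compile: jobName is not engineName.
	// An explicit conversion makes the intent visible at the call site.
	fmt.Println(describeEngine(engineName(job))) // prints: engine=agent valid=true
}
```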
+ +## Engine Constants + +```go +import "github.com/github/gh-aw/pkg/constants" + +// Engine names +constants.CopilotEngine // "copilot" +constants.ClaudeEngine // "claude" +constants.CodexEngine // "codex" +constants.GeminiEngine // "gemini" +constants.DefaultEngine // "copilot" + +// Get engine metadata +opt := constants.GetEngineOption("copilot") +// opt.Label = "GitHub Copilot" +// opt.SecretName = "COPILOT_GITHUB_TOKEN" +// opt.KeyURL = "https://github.com/settings/personal-access-tokens/new" + +// Get all secret names for all engines +secrets := constants.GetAllEngineSecretNames() +``` + +### `EngineOption` + +Describes a selectable AI engine with display metadata and required secret information: +- `Value`, `Label`, `Description` — display information +- `SecretName` — the primary secret required (e.g. `COPILOT_GITHUB_TOKEN`) +- `AlternativeSecrets` — secondary secret names that can be used instead +- `KeyURL` — URL where users can obtain their API key +- `WhenNeeded` — human-readable description of when this secret is needed + +## Feature Flags + +```go +constants.MCPScriptsFeatureFlag // "mcp-scripts" +constants.MCPGatewayFeatureFlag // "mcp-gateway" +constants.DisableXPIAPromptFeatureFlag // "disable-xpia-prompt" +constants.CopilotRequestsFeatureFlag // "copilot-requests" +constants.CliProxyFeatureFlag // "cli-proxy" +constants.IntegrityReactionsFeatureFlag // "integrity-reactions" +``` + +## Job and Step Constants + +```go +// Job names +constants.AgentJobName // "agent" +constants.ActivationJobName // "activation" +constants.PreActivationJobName // "pre_activation" +constants.DetectionJobName // "detection" +constants.SafeOutputsJobName // "safe_outputs" +constants.ConclusionJobName // "conclusion" + +// Artifact names +constants.SafeOutputArtifactName // "safe-output" +constants.AgentOutputArtifactName // "agent-output" +constants.ActivationArtifactName // "activation" + +// Step IDs +constants.CheckMembershipStepID // "check_membership" 
+constants.CheckRateLimitStepID // "check_rate_limit" + +// Step output keys +constants.IsTeamMemberOutput // "is_team_member" +constants.ActivatedOutput // "activated" +constants.MatchedCommandOutput // "matched_command" +``` + +## Version Constants + +All pinned dependency versions are defined here: + +```go +constants.DefaultCopilotVersion // Copilot CLI version +constants.DefaultClaudeCodeVersion // Claude Code version +constants.DefaultCodexVersion // Codex version +constants.DefaultGeminiVersion // Gemini CLI version +constants.DefaultGitHubMCPServerVersion // GitHub MCP server version +constants.DefaultFirewallVersion // AWF firewall version +constants.DefaultNodeVersion // Node.js runtime version +constants.DefaultPythonVersion // Python runtime version +constants.DefaultGoVersion // Go runtime version +``` + +## Formatting Constants + +```go +constants.MaxExpressionLineLength // 120 — maximum line length for YAML expressions +constants.ExpressionBreakThreshold // 100 — threshold at which long lines get broken +``` + +## Runtime Configuration + +```go +constants.GhAwRootDir // "${{ runner.temp }}/gh-aw" +constants.GhAwRootDirShell // "${RUNNER_TEMP}/gh-aw" +constants.DefaultAgenticWorkflowTimeout // 20 minutes +constants.DefaultToolTimeout // 60 seconds +constants.DefaultMCPStartupTimeout // 120 seconds +constants.DefaultRateLimitMax // 5 runs per window +constants.DefaultRateLimitWindow // 60 minutes + +// GetWorkflowDir returns the workflows directory respecting GH_AW_WORKFLOWS_DIR env var +dir := constants.GetWorkflowDir() +``` + +## Container Images + +```go +constants.DefaultNodeAlpineLTSImage // "node:lts-alpine" +constants.DefaultPythonAlpineLTSImage // "python:alpine" +constants.DefaultAlpineImage // "alpine:latest" +constants.DefaultMCPGatewayContainer // ghcr.io/github/gh-aw-mcpg +constants.DefaultFirewallRegistry // ghcr.io/github/gh-aw-firewall +``` + +## Tool Lists + +```go +// GitHub API
tools allowed in workflow expressions +constants.AllowedExpressions // []string of allowed GitHub tool names +constants.AllowedExpressionsSet // map[string]struct{} for O(1) lookup + +// Dangerous property names (blocked in expressions) +constants.DangerousPropertyNames +constants.DangerousPropertyNamesSet + +// Default tools for read-only GitHub operations +constants.DefaultReadOnlyGitHubTools +constants.DefaultGitHubTools +constants.DefaultBashTools +``` + +## Design Notes + +- All semantic types implement `String()` and `IsValid()` to allow consistent validation across the codebase. +- Version constants are intentionally plain string literals (not derived from build tags or embedded files) so that individual upgrades can be made as targeted one-line changes. +- `GetWorkflowDir()` reads `GH_AW_WORKFLOWS_DIR` from the environment at call time, allowing the directory to be overridden in tests and CI. diff --git a/pkg/envutil/README.md b/pkg/envutil/README.md new file mode 100644 index 00000000000..42476acda9f --- /dev/null +++ b/pkg/envutil/README.md @@ -0,0 +1,46 @@ +# envutil Package + +The `envutil` package provides utilities for reading and validating environment variables with bounds checking. + +## Overview + +This package centralizes the pattern of reading integer-valued environment variables, validating them against configured minimum and maximum bounds, and falling back to a default value when the variable is absent or out of range. It emits warning messages to stderr when an invalid value is encountered, following the console formatting conventions of the rest of the codebase. 
+ +## Usage + +### GetIntFromEnv + +```go +import ( + "github.com/github/gh-aw/pkg/envutil" + "github.com/github/gh-aw/pkg/logger" +) + +var log = logger.New("mypackage:config") + +// Read GH_AW_MAX_CONCURRENT_DOWNLOADS, constrained to [1, 20], default 5 +concurrency := envutil.GetIntFromEnv("GH_AW_MAX_CONCURRENT_DOWNLOADS", 5, 1, 20, log) +``` + +**Behavior**: +- Returns `defaultValue` when the environment variable is not set. +- Returns `defaultValue` and emits a warning when the value cannot be parsed as an integer. +- Returns `defaultValue` and emits a warning when the value is outside `[minValue, maxValue]`. +- Logs the accepted value at debug level when `log` is non-nil. +- Pass `nil` for `log` to suppress debug output. + +### Parameters + +| Parameter | Type | Description | +|-----------|------|-------------| +| `envVar` | `string` | Environment variable name (e.g. `"GH_AW_TIMEOUT"`) | +| `defaultValue` | `int` | Value returned when env var is absent or invalid | +| `minValue` | `int` | Minimum allowed value (inclusive) | +| `maxValue` | `int` | Maximum allowed value (inclusive) | +| `log` | `*logger.Logger` | Optional logger for debug output; pass `nil` to disable | + +## Design Notes + +- Warning messages use `console.FormatWarningMessage` so they render consistently in terminals. +- All warnings go to `os.Stderr` to avoid polluting structured stdout output. +- The function only handles integers; floating-point or string env vars should be read directly via `os.Getenv`. diff --git a/pkg/fileutil/README.md b/pkg/fileutil/README.md new file mode 100644 index 00000000000..3f50b3f14fe --- /dev/null +++ b/pkg/fileutil/README.md @@ -0,0 +1,90 @@ +# fileutil Package + +The `fileutil` package provides utility functions for safe file path validation and common file operations. + +## Overview + +This package focuses on security-conscious file handling: path validation, boundary enforcement, and straightforward file/directory operations. 
It also provides a cross-platform tar extraction helper. + +## Functions + +### Path Validation + +#### `ValidateAbsolutePath(path string) (string, error)` + +Validates that a file path is absolute and safe to use. The function: +1. Rejects empty paths. +2. Cleans the path with `filepath.Clean` to normalize `.` and `..` components. +3. Verifies the cleaned path is absolute. + +Returns the cleaned absolute path on success, or an error otherwise. Use this before any file operation to defend against relative path traversal. + +```go +import "github.com/github/gh-aw/pkg/fileutil" + +cleanPath, err := fileutil.ValidateAbsolutePath(userInput) +if err != nil { + return fmt.Errorf("invalid path: %w", err) +} +content, err := os.ReadFile(cleanPath) +``` + +#### `MustBeWithin(base, candidate string) error` + +Checks that `candidate` is located within the `base` directory tree. Both paths are resolved through `filepath.EvalSymlinks` (falling back to `filepath.Abs` for paths that do not yet exist on disk) before comparison, preventing both `..` traversal and symlink escapes. + +```go +if err := fileutil.MustBeWithin("/workspace", outputPath); err != nil { + return fmt.Errorf("output path escapes workspace: %w", err) +} +``` + +### File and Directory Checks + +#### `FileExists(path string) bool` + +Returns `true` if `path` exists and is a regular file (not a directory). + +#### `DirExists(path string) bool` + +Returns `true` if `path` exists and is a directory. + +#### `IsDirEmpty(path string) bool` + +Returns `true` if the directory at `path` contains no entries. Returns `true` if the directory cannot be read. + +### File Operations + +#### `CopyFile(src, dst string) error` + +Copies the file at `src` to `dst` using buffered I/O. Creates `dst` if it does not exist; truncates it if it does. Calls `Sync` on the destination before closing. 
+ +```go +if err := fileutil.CopyFile("source.txt", "destination.txt"); err != nil { + return fmt.Errorf("copy failed: %w", err) +} +``` + +### Archive Operations + +#### `ExtractFileFromTar(data []byte, path string) ([]byte, error)` + +Extracts a single file by `path` from a tar archive stored in `data`. Uses Go's `archive/tar` for cross-platform compatibility. + +Security guarantees: +- `path` must be a local, relative path (no `..` components or absolute paths). +- Individual tar entries with unsafe names are skipped, not extracted. + +```go +tarData, _ := io.ReadAll(response.Body) +content, err := fileutil.ExtractFileFromTar(tarData, "bin/gh") +if err != nil { + return fmt.Errorf("binary not found in release archive: %w", err) +} +``` + +## Design Notes + +- All debug output uses `logger.New("fileutil:fileutil")` and `logger.New("fileutil:tar")` and is only emitted when `DEBUG=fileutil:*`. +- `MustBeWithin` resolves symlinks before comparison, providing defence-in-depth against symlink attacks in addition to the `..` checking that `ValidateAbsolutePath` provides. +- `ExtractFileFromTar` rejects path-traversal payloads in both the caller-supplied path and in tar entry names using `filepath.IsLocal`. diff --git a/pkg/gitutil/README.md b/pkg/gitutil/README.md new file mode 100644 index 00000000000..c6b92aaa55e --- /dev/null +++ b/pkg/gitutil/README.md @@ -0,0 +1,85 @@ +# gitutil Package + +The `gitutil` package provides utility functions for interacting with Git repositories and classifying GitHub API errors. + +## Overview + +This package contains helpers for: +- Detecting rate-limit and authentication errors from GitHub API responses. +- Validating hex strings (e.g. commit SHAs). +- Extracting base repository slugs from action paths. +- Finding the root directory of the current Git repository. +- Reading file contents from the `HEAD` commit. 
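Two of the helpers listed above are simple enough to sketch with the standard library. The functions below are hypothetical local equivalents (`isHex`, `baseRepo`), not the package's code. In particular, returning a path with fewer than two segments unchanged is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// isHex reports whether s is non-empty and contains only hexadecimal digits.
func isHex(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		switch {
		case r >= '0' && r <= '9', r >= 'a' && r <= 'f', r >= 'A' && r <= 'F':
			// valid hexadecimal digit
		default:
			return false
		}
	}
	return true
}

// baseRepo keeps the first two slash-separated segments of an action path.
// Paths with fewer than two segments are returned unchanged (an assumption).
func baseRepo(path string) string {
	parts := strings.Split(path, "/")
	if len(parts) < 2 {
		return path
	}
	return parts[0] + "/" + parts[1]
}

func main() {
	fmt.Println(isHex("a313eea0124"))                          // prints: true
	fmt.Println(isHex("not-a-sha"))                            // prints: false
	fmt.Println(baseRepo("github/codeql-action/upload-sarif")) // prints: github/codeql-action
}
```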
+ +## Functions + +### Error Classification + +#### `IsRateLimitError(errMsg string) bool` + +Returns `true` when `errMsg` indicates a GitHub API rate-limit error (HTTP 403 "API rate limit exceeded" or HTTP 429). + +```go +if gitutil.IsRateLimitError(err.Error()) { + // Back off and retry +} +``` + +#### `IsAuthError(errMsg string) bool` + +Returns `true` when `errMsg` indicates an authentication or authorization failure (`GH_TOKEN`, `GITHUB_TOKEN`, `unauthorized`, `forbidden`, SAML enforcement, etc.). + +```go +if gitutil.IsAuthError(err.Error()) { + fmt.Fprintln(os.Stderr, "Check that GH_TOKEN is set correctly") +} +``` + +### String Utilities + +#### `IsHexString(s string) bool` + +Returns `true` if `s` consists entirely of hexadecimal characters (`0–9`, `a–f`, `A–F`). Returns `false` for the empty string. + +```go +if gitutil.IsHexString(sha) { + // Valid commit SHA +} +``` + +#### `ExtractBaseRepo(repoPath string) string` + +Extracts the `owner/repo` portion from an action path that may include a sub-folder. + +```go +gitutil.ExtractBaseRepo("actions/checkout") // → "actions/checkout" +gitutil.ExtractBaseRepo("github/codeql-action/upload-sarif") // → "github/codeql-action" +``` + +### Repository Operations + +#### `FindGitRoot() (string, error)` + +Returns the absolute path of the root directory of the current Git repository by running `git rev-parse --show-toplevel`. Returns an error if the working directory is not inside a Git repository. + +```go +root, err := gitutil.FindGitRoot() +if err != nil { + return fmt.Errorf("not in a git repository: %w", err) +} +``` + +#### `ReadFileFromHEADWithRoot(filePath, gitRoot string) (string, error)` + +Reads a file's content from the `HEAD` commit without touching the working tree. `gitRoot` must be the repository root (typically from `FindGitRoot`). The function rejects paths that escape the repository (i.e. paths containing `..` after resolution). 
+ +```go +root, _ := gitutil.FindGitRoot() +content, err := gitutil.ReadFileFromHEADWithRoot("pkg/workflow/compiler.go", root) +``` + +## Design Notes + +- All debug output uses `logger.New("gitutil:gitutil")` and is only emitted when `DEBUG=gitutil:*`. +- Error classification is case-insensitive string matching — no external dependency on GitHub API client types. +- `ReadFileFromHEADWithRoot` uses `git show HEAD:` and resolves paths with `filepath.Rel` to prevent path-traversal attacks. diff --git a/pkg/logger/README.md b/pkg/logger/README.md index 6779a969238..1ed65d21221 100644 --- a/pkg/logger/README.md +++ b/pkg/logger/README.md @@ -164,6 +164,44 @@ var parseLog = logger.New("parse") var validateLog = logger.New("validate") ``` +## slog Integration + +The package includes a bridge to Go's standard `log/slog` library for libraries that expect a `slog.Logger` instead of the custom `Logger` type. + +### `SlogHandler` + +`SlogHandler` implements `slog.Handler` by delegating to an existing `Logger`. It respects the logger's enabled state, formats attributes as `key=value` pairs, and prefixes each message with the slog level (`[DEBUG]`, `[INFO]`, `[WARN]`, `[ERROR]`). + +### `NewSlogHandler(logger *Logger) *SlogHandler` + +Creates a new `slog.Handler` wrapping the provided `Logger`. + +```go +import "github.com/github/gh-aw/pkg/logger" + +var log = logger.New("myapp:feature") +handler := logger.NewSlogHandler(log) +slogLogger := slog.New(handler) +slogLogger.Info("using slog interface", "key", "value") +``` + +### `NewSlogLoggerWithHandler(logger *Logger) *slog.Logger` + +Convenience constructor that creates both the `SlogHandler` and the `slog.Logger` in one call. + +```go +var log = logger.New("myapp:feature") +slogLogger := logger.NewSlogLoggerWithHandler(log) +slogLogger.Warn("something unusual happened", "count", 42) +``` + +### Behavior + +- **Enabled check**: `SlogHandler.Enabled` returns `false` when the underlying `Logger` is disabled (i.e. 
the namespace does not match the `DEBUG` pattern). This prevents expensive attribute collection for disabled loggers. +- **Attribute formatting**: All record attributes are appended as `key=value` pairs after the message. +- **Groups and persistent attributes**: `WithAttrs` and `WithGroup` return the handler unchanged — attributes are not persisted across calls. This keeps the adapter lightweight. +- **Output destination**: All output goes to `stderr` via the underlying `Logger`. + ## Implementation Notes - The `DEBUG` environment variable is read once when the package is initialized diff --git a/pkg/repoutil/README.md b/pkg/repoutil/README.md new file mode 100644 index 00000000000..ff9d22454f3 --- /dev/null +++ b/pkg/repoutil/README.md @@ -0,0 +1,42 @@ +# repoutil Package + +The `repoutil` package provides utility functions for working with GitHub repository slugs. + +## Overview + +This package offers a single focused helper for parsing and validating `owner/repo` repository slug strings, which are used throughout the codebase wherever GitHub repositories are referenced. + +## Functions + +### `SplitRepoSlug(slug string) (owner, repo string, err error)` + +Splits a repository slug of the form `owner/repo` into its two components. Returns an error when the slug does not contain exactly one `/` separator or when either the owner or repository name is empty. 
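The validation rules above can be sketched as follows. `splitRepoSlug` is a hypothetical stand-in written for illustration, not the package's source:

```go
package main

import (
	"fmt"
	"strings"
)

// splitRepoSlug is an illustrative stand-in for repoutil.SplitRepoSlug.
// It accepts exactly one "/" with a non-empty part on each side.
func splitRepoSlug(slug string) (owner, repo string, err error) {
	parts := strings.Split(slug, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("invalid repo format: %s", slug)
	}
	return parts[0], parts[1], nil
}

func main() {
	for _, slug := range []string{"github/gh-aw", "github", "/gh-aw", "github/gh-aw/x"} {
		owner, repo, err := splitRepoSlug(slug)
		fmt.Printf("%q -> owner=%q repo=%q err=%v\n", slug, owner, repo, err)
	}
}
```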
+ +```go +import "github.com/github/gh-aw/pkg/repoutil" + +owner, repo, err := repoutil.SplitRepoSlug("github/gh-aw") +if err != nil { + return fmt.Errorf("invalid repository: %w", err) +} +// owner = "github", repo = "gh-aw" +``` + +**Error cases**: + +```go +// Missing separator +repoutil.SplitRepoSlug("github") // error: invalid repo format: github + +// Empty component +repoutil.SplitRepoSlug("/gh-aw") // error: invalid repo format: /gh-aw +repoutil.SplitRepoSlug("github/") // error: invalid repo format: github/ + +// Too many separators +repoutil.SplitRepoSlug("github/gh-aw/x") // error: invalid repo format: github/gh-aw/x +``` + +## Design Notes + +- All debug output uses `logger.New("repoutil:repoutil")` and is only emitted when `DEBUG=repoutil:*`. +- For paths that include sub-folders (e.g. GitHub Actions `uses:` fields such as `github/codeql-action/upload-sarif`), use `gitutil.ExtractBaseRepo` first to strip the sub-path before calling `SplitRepoSlug`. diff --git a/pkg/semverutil/README.md b/pkg/semverutil/README.md new file mode 100644 index 00000000000..ae547da0afe --- /dev/null +++ b/pkg/semverutil/README.md @@ -0,0 +1,103 @@ +# semverutil Package + +The `semverutil` package provides shared semantic versioning primitives used across `pkg/workflow` and `pkg/cli`. Centralizing these helpers ensures that semver parsing, comparison, and compatibility logic is defined in one place. + +## Overview + +This package wraps `golang.org/x/mod/semver` with additional helpers for: +- Normalizing version strings (adding the required `v` prefix). +- Validating GitHub Actions version tags (`vmajor`, `vmajor.minor`, `vmajor.minor.patch`). +- Parsing versions into a structured `SemanticVersion` type. +- Comparing and checking compatibility between version strings. + +## Types + +### `SemanticVersion` + +A parsed semantic version with individual numeric components. 
+ +```go +type SemanticVersion struct { + Major int + Minor int + Patch int + Pre string // Prerelease identifier without leading hyphen (e.g. "beta.1") + Raw string // Original version string without leading "v" +} +``` + +#### Methods + +| Method | Description | +|--------|-------------| +| `IsPreciseVersion() bool` | Returns `true` if the version has at least two dots (e.g. `v6.0.0` is precise, `v6` is not) | +| `IsNewer(other *SemanticVersion) bool` | Returns `true` if this version is newer than `other` | + +## Functions + +### `EnsureVPrefix(v string) string` + +Adds a leading `"v"` if `v` does not already have one. Required because `golang.org/x/mod/semver` demands the prefix. + +```go +semverutil.EnsureVPrefix("1.2.3") // → "v1.2.3" +semverutil.EnsureVPrefix("v1.2.3") // → "v1.2.3" +``` + +### `IsActionVersionTag(s string) bool` + +Reports whether `s` is a valid GitHub Actions version tag. Accepted forms: `vmajor`, `vmajor.minor`, `vmajor.minor.patch`. Prerelease and build-metadata suffixes are **not** accepted. + +```go +semverutil.IsActionVersionTag("v4") // true +semverutil.IsActionVersionTag("v4.1") // true +semverutil.IsActionVersionTag("v4.1.0") // true +semverutil.IsActionVersionTag("v4.1.0-rc") // false +``` + +### `IsValid(ref string) bool` + +Reports whether `ref` is a valid semantic version string (accepts any valid semver including prerelease/build-metadata, and bare versions without `"v"`). + +```go +semverutil.IsValid("1.2.3") // true +semverutil.IsValid("v1.2.3-beta") // true +semverutil.IsValid("not-a-ver") // false +``` + +### `ParseVersion(v string) *SemanticVersion` + +Parses `v` into a `SemanticVersion`. Returns `nil` if `v` is not a valid semver string. + +```go +ver := semverutil.ParseVersion("v1.2.3") +if ver != nil { + fmt.Println(ver.Major, ver.Minor, ver.Patch) // 1 2 3 +} +``` + +### `Compare(v1, v2 string) int` + +Compares two semantic versions using `golang.org/x/mod/semver`. 
Returns `1` if `v1 > v2`, `-1` if `v1 < v2`, or `0` if equal. Bare versions (without `"v"`) are accepted. + +```go +semverutil.Compare("v2.0.0", "v1.9.9") // 1 (v2 is newer) +semverutil.Compare("v1.0.0", "v1.0.0") // 0 (equal) +semverutil.Compare("v0.9.0", "v1.0.0") // -1 (v0.9 is older) +``` + +### `IsCompatible(pinVersion, requestedVersion string) bool` + +Reports whether `pinVersion` is semver-compatible with `requestedVersion`. Compatibility is defined as sharing the same major version. + +```go +semverutil.IsCompatible("v5.0.0", "v5") // true +semverutil.IsCompatible("v5.1.0", "v5.0.0") // true +semverutil.IsCompatible("v6.0.0", "v5") // false +``` + +## Design Notes + +- All debug output uses `logger.New("semverutil:semverutil")` and is only emitted when `DEBUG=semverutil:*`. +- The package intentionally delegates to `golang.org/x/mod/semver` for canonical semver logic rather than implementing its own parsing. +- `ParseVersion` uses `semver.Canonical` before splitting into components, ensuring correct handling of short forms like `v1` (canonicalized to `v1.0.0`). diff --git a/pkg/sliceutil/README.md b/pkg/sliceutil/README.md new file mode 100644 index 00000000000..ae472db5108 --- /dev/null +++ b/pkg/sliceutil/README.md @@ -0,0 +1,79 @@ +# sliceutil Package + +The `sliceutil` package provides generic utility functions for working with slices and maps. + +## Overview + +All functions in this package are pure: they never modify their input. They are generic and work with any element type using Go's type-parameter syntax. + +## Functions + +### `Filter[T any](slice []T, predicate func(T) bool) []T` + +Returns a new slice containing only elements for which `predicate` returns `true`. 
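As a rough illustration of the contract (a new slice is returned and the input is left untouched), a generic filter might look like the sketch below. The lowercase `filter` is a hypothetical stand-in, not the package's code:

```go
package main

import "fmt"

// filter is an illustrative generic filter. It allocates a new slice and
// never writes to the input slice.
func filter[T any](slice []T, predicate func(T) bool) []T {
	result := make([]T, 0, len(slice))
	for _, item := range slice {
		if predicate(item) {
			result = append(result, item)
		}
	}
	return result
}

func main() {
	numbers := []int{1, 2, 3, 4, 5}
	odds := filter(numbers, func(n int) bool { return n%2 == 1 })
	fmt.Println(odds)    // [1 3 5]
	fmt.Println(numbers) // [1 2 3 4 5], unchanged
}
```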
+ +```go +import "github.com/github/gh-aw/pkg/sliceutil" + +numbers := []int{1, 2, 3, 4, 5} +evens := sliceutil.Filter(numbers, func(n int) bool { return n%2 == 0 }) +// evens = [2, 4] +``` + +### `Map[T, U any](slice []T, transform func(T) U) []U` + +Applies `transform` to every element and returns the results as a new slice. + +```go +names := []string{"alice", "bob"} +upper := sliceutil.Map(names, strings.ToUpper) +// upper = ["ALICE", "BOB"] +``` + +### `MapToSlice[K comparable, V any](m map[K]V) []K` + +Converts the keys of a map into a slice. **Order is not guaranteed.** + +```go +m := map[string]int{"a": 1, "b": 2} +keys := sliceutil.MapToSlice(m) +// keys = ["a", "b"] (in some order) +``` + +### `FilterMapKeys[K comparable, V any](m map[K]V, predicate func(K, V) bool) []K` + +Returns the map keys for which `predicate(key, value)` is `true`. **Order is not guaranteed.** + +```go +scores := map[string]int{"alice": 90, "bob": 50, "carol": 80} +passed := sliceutil.FilterMapKeys(scores, func(name string, score int) bool { + return score >= 75 +}) +// passed = ["alice", "carol"] (in some order) +``` + +### `Any[T any](slice []T, predicate func(T) bool) bool` + +Returns `true` if at least one element in `slice` satisfies `predicate`. Returns `false` for nil or empty slices. + +```go +words := []string{"hello", "world"} +hasWorld := sliceutil.Any(words, func(w string) bool { return w == "world" }) +// hasWorld = true +``` + +### `Deduplicate[T comparable](slice []T) []T` + +Returns a new slice with duplicate elements removed, preserving the order of first occurrence. + +```go +items := []string{"a", "b", "a", "c", "b"} +unique := sliceutil.Deduplicate(items) +// unique = ["a", "b", "c"] +``` + +## Design Notes + +- `Any` is implemented via `slices.ContainsFunc` from the standard library. +- `Deduplicate` uses a `map[T]bool` for O(n) time complexity. 
+- None of these functions sort their output; callers that require sorted results should call `slices.Sort` on the returned slice. diff --git a/pkg/stringutil/README.md b/pkg/stringutil/README.md new file mode 100644 index 00000000000..38da4436558 --- /dev/null +++ b/pkg/stringutil/README.md @@ -0,0 +1,179 @@ +# stringutil Package + +The `stringutil` package provides utility functions for working with strings. It is organized into focused sub-files covering ANSI stripping, identifier normalization, sanitization, URL utilities, and PAT (Personal Access Token) validation. + +## Overview + +| Sub-file | Functions | +|----------|-----------| +| `stringutil.go` | General string helpers | +| `ansi.go` | ANSI escape-code stripping | +| `identifiers.go` | Workflow name and path normalization | +| `sanitize.go` | Security-sensitive string sanitization | +| `urls.go` | URL normalization and domain extraction | +| `pat_validation.go` | GitHub PAT classification and validation | + +## General Utilities (`stringutil.go`) + +### `Truncate(s string, maxLen int) string` + +Truncates `s` to at most `maxLen` characters, appending `"..."` when truncation occurs. For `maxLen ≤ 3` the string is truncated without an ellipsis. + +```go +stringutil.Truncate("hello world", 8) // "hello..." +stringutil.Truncate("hi", 8) // "hi" +``` + +### `NormalizeWhitespace(content string) string` + +Collapses multiple consecutive whitespace characters (spaces, tabs, newlines) into a single space and trims leading/trailing whitespace. + +### `ParseVersionValue(version any) string` + +Converts an `any`-typed version value (typically from YAML parsing, which may produce `int`, `float64`, or `string`) into a string. Returns an empty string for `nil`.
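The conversion can be pictured as a type switch over the concrete types a YAML parser tends to produce. The sketch below is an assumption about the general approach; `versionToString` is a hypothetical name and the real function may handle more (or fewer) types:

```go
package main

import (
	"fmt"
	"strconv"
)

// versionToString is a hypothetical stand-in for ParseVersionValue.
func versionToString(version any) string {
	switch v := version.(type) {
	case nil:
		return ""
	case string:
		return v
	case int:
		return strconv.Itoa(v)
	case int64:
		return strconv.FormatInt(v, 10)
	case float64:
		// Precision -1 uses the fewest digits needed, so 20.0 prints as "20".
		return strconv.FormatFloat(v, 'f', -1, 64)
	default:
		// Fallback for types this sketch does not special-case.
		return fmt.Sprint(v)
	}
}

func main() {
	for _, v := range []any{"20", 20, 20.0, nil} {
		fmt.Printf("%#v -> %q\n", v, versionToString(v))
	}
}
```

Note that a float has already lost any trailing zero by the time it reaches a function like this (a YAML `3.10` parsed as a number becomes `3.1`), so quoting version strings in YAML remains the safer choice.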
+ +```go +stringutil.ParseVersionValue("20") // "20" +stringutil.ParseVersionValue(20) // "20" +stringutil.ParseVersionValue(20.0) // "20" +``` + +### `IsPositiveInteger(s string) bool` + +Returns `true` if `s` is a non-empty string containing only digit characters (`0–9`). + +## ANSI Escape Code Stripping (`ansi.go`) + +### `StripANSI(s string) string` + +Removes all ANSI/VT100 escape sequences from `s`. Handles CSI sequences (e.g. `\x1b[31m` for colors) and other ESC-prefixed sequences. This function is used before writing text into YAML files to prevent invisible characters from corrupting workflow output. + +```go +colored := "\x1b[32mSuccess\x1b[0m" +plain := stringutil.StripANSI(colored) // "Success" +``` + +## Identifier Normalization (`identifiers.go`) + +### `NormalizeWorkflowName(name string) string` + +Removes `.md` and `.lock.yml` extensions from workflow names, returning the bare workflow identifier. + +```go +stringutil.NormalizeWorkflowName("weekly-research.md") // "weekly-research" +stringutil.NormalizeWorkflowName("weekly-research.lock.yml") // "weekly-research" +stringutil.NormalizeWorkflowName("weekly-research") // "weekly-research" +``` + +### `NormalizeSafeOutputIdentifier(identifier string) string` + +Converts dashes to underscores in safe-output identifiers, normalizing the user-facing `dash-separated` format to the internal `underscore_separated` format. + +```go +stringutil.NormalizeSafeOutputIdentifier("create-issue") // "create_issue" +``` + +### `MarkdownToLockFile(mdPath string) string` + +Converts a workflow markdown path (`.md`) to its compiled lock file path (`.lock.yml`). Returns the path unchanged if it already ends with `.lock.yml`. + +```go +stringutil.MarkdownToLockFile(".github/workflows/test.md") +// → ".github/workflows/test.lock.yml" +``` + +### `LockFileToMarkdown(lockPath string) string` + +Converts a compiled lock file path (`.lock.yml`) back to its markdown source path (`.md`). 
Returns the path unchanged if it already ends with `.md`. + +```go +stringutil.LockFileToMarkdown(".github/workflows/test.lock.yml") +// → ".github/workflows/test.md" +``` + +## Sanitization (`sanitize.go`) + +These functions remove sensitive information to prevent accidental leakage in logs or error messages. + +### `SanitizeErrorMessage(message string) string` + +Redacts potential secret key names from error messages. Matches uppercase `SNAKE_CASE` identifiers (e.g. `MY_SECRET_KEY`, `API_TOKEN`) and PascalCase identifiers ending with security-related suffixes (e.g. `GitHubToken`, `ApiKey`). Common GitHub Actions workflow keywords (`GITHUB`, `RUNNER`, `WORKFLOW`, etc.) are excluded from redaction. + +```go +stringutil.SanitizeErrorMessage("Error: MY_SECRET_TOKEN is invalid") +// → "Error: [REDACTED] is invalid" +``` + +### `SanitizeParameterName(name string) string` + +Sanitizes a parameter name for use as a GitHub Actions output or environment variable name. Replaces non-alphanumeric characters with underscores. + +### `SanitizePythonVariableName(name string) string` + +Sanitizes a string for use as a Python variable name. Similar to `SanitizeParameterName` but follows Python identifier rules. + +### `SanitizeToolID(toolID string) string` + +Sanitizes a tool identifier for safe use in generated code. Replaces characters that are not valid in identifiers with underscores. + +### `SanitizeForFilename(slug string) string` + +Converts a string into a filesystem-safe filename by lowercasing and replacing non-alphanumeric characters with hyphens. + +## URL Utilities (`urls.go`) + +### `NormalizeGitHubHostURL(rawHostURL string) string` + +Normalizes a GitHub host URL by ensuring it has an `https://` scheme and no trailing slash. Accepts bare hostnames, URLs with or without a scheme, and URLs with trailing slashes. 
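A minimal sketch of this normalization, under the assumption that an existing `http://` scheme is left alone. `normalizeHostURL` is a hypothetical stand-in, and the real function may treat edge cases differently:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeHostURL is a hypothetical stand-in for NormalizeGitHubHostURL.
func normalizeHostURL(raw string) string {
	host := raw
	// Add a scheme only when none is present (assumption: http:// is kept as-is).
	if !strings.HasPrefix(host, "https://") && !strings.HasPrefix(host, "http://") {
		host = "https://" + host
	}
	// Drop any trailing slashes so callers can append paths safely.
	return strings.TrimRight(host, "/")
}

func main() {
	for _, in := range []string{"github.example.com", "https://github.com/", "https://github.com"} {
		fmt.Printf("%q -> %q\n", in, normalizeHostURL(in))
	}
}
```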
+ +```go +stringutil.NormalizeGitHubHostURL("github.example.com") // "https://github.example.com" +stringutil.NormalizeGitHubHostURL("https://github.com/") // "https://github.com" +``` + +### `ExtractDomainFromURL(urlStr string) string` + +Extracts the hostname (without port) from a URL string. Falls back to simple string parsing when `url.Parse` cannot handle the input. + +```go +stringutil.ExtractDomainFromURL("https://api.github.com/repos") // "api.github.com" +``` + +## PAT Validation (`pat_validation.go`) + +### `PATType` + +A string type representing the category of a GitHub Personal Access Token. + +| Constant | Value | Prefix | +|----------|-------|--------| +| `PATTypeFineGrained` | `"fine-grained"` | `github_pat_` | +| `PATTypeClassic` | `"classic"` | `ghp_` | +| `PATTypeOAuth` | `"oauth"` | `gho_` | +| `PATTypeUnknown` | `"unknown"` | (other) | + +Methods: `String() string`, `IsFineGrained() bool`, `IsValid() bool` + +### `ClassifyPAT(token string) PATType` + +Determines the token type from its prefix. + +### `ValidateCopilotPAT(token string) error` + +Returns `nil` if the token is a fine-grained PAT; returns an actionable error message with a link to create the correct token type otherwise. + +```go +if err := stringutil.ValidateCopilotPAT(token); err != nil { + fmt.Fprintln(os.Stderr, console.FormatErrorMessage(err.Error())) +} +``` + +### `GetPATTypeDescription(token string) string` + +Returns a human-readable description of the token type (e.g. `"fine-grained personal access token"`). + +## Design Notes + +- All debug output uses namespace-prefixed loggers (`stringutil:identifiers`, `stringutil:sanitize`, `stringutil:urls`, `stringutil:pat_validation`) and is only emitted when `DEBUG=stringutil:*`. +- `SanitizeErrorMessage` is intentionally conservative: it excludes common GitHub Actions keywords to avoid over-redacting legitimate error messages. 
+- `StripANSI` handles both CSI sequences (`ESC[`) and other ESC-prefixed sequences to cover the full range of ANSI escape codes found in terminal output. diff --git a/pkg/styles/README.md b/pkg/styles/README.md new file mode 100644 index 00000000000..05468b69f55 --- /dev/null +++ b/pkg/styles/README.md @@ -0,0 +1,101 @@ +# styles Package + +The `styles` package provides centralized color constants, adaptive color variables, border definitions, and pre-configured `lipgloss` styles for consistent terminal output across the codebase. + +## Overview + +All colors use `compat.AdaptiveColor` to automatically choose between light and dark variants based on the terminal's background. The dark palette is inspired by the [Dracula theme](https://draculatheme.com/); the light palette uses darker, more saturated colors for good contrast on light backgrounds. + +## Adaptive Color Variables + +These variables provide `compat.AdaptiveColor` values that auto-select the correct shade at render time: + +| Variable | Semantic use | Light | Dark | +|----------|-------------|-------|------| +| `ColorError` | Error messages, critical issues | `#D73737` | `#FF5555` | +| `ColorWarning` | Warnings, cautionary information | `#E67E22` | `#FFB86C` | +| `ColorSuccess` | Success messages, confirmations | `#27AE60` | `#50FA7B` | +| `ColorInfo` | Informational messages | `#2980B9` | `#8BE9FD` | +| `ColorPurple` | File paths, commands, highlights | `#8E44AD` | `#BD93F9` | +| `ColorYellow` | Progress, attention-grabbing content | `#B7950B` | `#F1FA8C` | +| `ColorComment` | Secondary/muted information, line numbers | `#6C7A89` | `#6272A4` | +| `ColorForeground` | Primary text content | `#2C3E50` | `#F8F8F2` | +| `ColorBackground` | Highlighted backgrounds | `#ECF0F1` | `#282A36` | +| `ColorBorder` | Table borders and dividers | `#BDC3C7` | `#44475A` | +| `ColorTableAltRow` | Alternating table row backgrounds | `#F5F5F5` | `#1A1A1A` | + +## Border Definitions + +| Variable | Style | Usage | 
+|----------|-------|-------| +| `RoundedBorder` | `╭╮╰╯` rounded corners | Tables, boxes, panels (primary) | +| `NormalBorder` | Straight lines | Left-side emphasis, subtle dividers | +| `ThickBorder` | Thick lines | Reserved for maximum visual emphasis | + +## Pre-configured Styles + +These `lipgloss.Style` values are ready to use directly: + +| Variable | Color | Usage | +|----------|-------|-------| +| `Error` | Red, bold | Error messages | +| `Warning` | Orange, bold | Warning messages | +| `Success` | Green, bold | Success confirmations | +| `Info` | Cyan, bold | Informational messages | +| `FilePath` | Purple | File paths | +| `LineNumber` | Comment/muted | Line numbers in diffs | +| `ContextLine` | Foreground | Context lines in diffs | +| `Highlight` | Yellow, bold | Highlighted text | +| `Location` | Purple, bold | Location references | +| `Command` | Purple | CLI commands | +| `Progress` | Yellow | Progress indicators | +| `Prompt` | Cyan | Interactive prompts | +| `Count` | Yellow, bold | Numeric counts | +| `Verbose` | Comment/muted | Verbose/debug output | +| `ListHeader` | Purple, bold | List section headers | +| `ListItem` | Foreground | List items | +| `TableHeader` | Purple, bold | Table column headers | +| `TableCell` | Foreground | Table cell content | +| `TableTotal` | Yellow, bold | Table total/summary rows | +| `TableTitle` | Purple, bold | Table titles | +| `TableBorder` | Border color | Table border lines | +| `ServerName` | Purple, bold | MCP server names | +| `ServerType` | Comment/muted | MCP server type labels | +| `ErrorBox` | Error color, rounded border | Error message boxes | +| `Header` | Foreground, bold, border | Section headers | +| `TreeEnumerator` | Comment/muted | Tree branch characters | +| `TreeNode` | Foreground | Tree node text | + +## Usage + +```go +import "github.com/github/gh-aw/pkg/styles" + +// Use pre-configured styles +fmt.Println(styles.Error.Render("Something went wrong")) 
+fmt.Println(styles.Success.Render("Operation completed")) +fmt.Println(styles.Command.Render("gh aw compile")) + +// Use adaptive colors for custom styles +customStyle := lipgloss.NewStyle(). + Foreground(styles.ColorInfo). + Bold(true) +fmt.Println(customStyle.Render("Custom styled text")) +``` + +## Huh Theme + +The package also exports `HuhTheme`, a `huh.ThemeFunc` that applies the same Dracula-inspired color palette to interactive forms rendered with the [huh](https://github.com/charmbracelet/huh) library. + +```go +import "github.com/github/gh-aw/pkg/styles" + +form := huh.NewForm(...).WithTheme(styles.HuhTheme) +``` + +## Design Notes + +- Colors are defined with both light and dark hex constants (`hexColor*Light`, `hexColor*Dark`) so tests can assert exact color values without depending on the `lipgloss` type system. +- The package uses `charm.land/lipgloss/v2` and `charm.land/lipgloss/v2/compat` for adaptive color support. +- For visual examples and detailed usage guidelines, see `scratchpad/styles-guide.md`. +- The pre-configured styles are exported as `lipgloss.Style` values (not functions). Because each style method returns a modified copy, they can be extended by method chaining without altering the shared value: `styles.Error.Underline(true)`. diff --git a/pkg/testutil/README.md b/pkg/testutil/README.md new file mode 100644 index 00000000000..c458334fdaa --- /dev/null +++ b/pkg/testutil/README.md @@ -0,0 +1,62 @@ +# testutil Package + +The `testutil` package provides shared test helpers for isolating test artifacts and capturing output. + +## Overview + +This package is imported only in test files (`_test.go`). It provides: +- A shared, isolated temporary directory for each test run (outside the git repository). +- Per-test subdirectories that are cleaned up automatically. +- Helpers for capturing `os.Stderr` output during tests. +- A helper for stripping YAML comment headers from compiled workflow output.
+ +## Functions + +### `GetTestRunDir() string` + +Returns the path to the unique top-level directory for the current test run. It is created once per process under `$TMPDIR/gh-aw-test-runs/-`. Using a directory outside the repository prevents `git` commands from interfering with test artifacts. + +```go +dir := testutil.GetTestRunDir() +// e.g. /tmp/gh-aw-test-runs/20240101-120000-12345 +``` + +### `TempDir(t *testing.T, pattern string) string` + +Creates a temporary subdirectory inside the test run directory matching `pattern`. The directory is automatically removed when the test completes via `t.Cleanup`. + +```go +func TestCompile(t *testing.T) { + dir := testutil.TempDir(t, "compile-*") + // Use dir for test artifacts; cleaned up automatically +} +``` + +### `CaptureStderr(t *testing.T, fn func()) string` + +Runs `fn` and returns everything written to `os.Stderr` during its execution. `os.Stderr` is restored automatically via `t.Cleanup`. + +```go +func TestWarningMessage(t *testing.T) { + output := testutil.CaptureStderr(t, func() { + myFunction() // writes to os.Stderr + }) + assert.Contains(t, output, "expected warning") +} +``` + +### `StripYAMLCommentHeader(yamlContent string) string` + +Removes the leading comment block from a generated YAML file and returns only the non-comment content. Useful for tests that need to verify compiled output without matching the auto-generated header. + +```go +raw, _ := os.ReadFile("workflow.lock.yml") +yaml := testutil.StripYAMLCommentHeader(string(raw)) +assert.Contains(t, yaml, "runs-on: ubuntu-latest") +``` + +## Design Notes + +- `GetTestRunDir` uses `sync.Once` so the directory is created exactly once per process even when multiple test packages run concurrently. +- `TempDir` delegates to `os.MkdirTemp` to generate unique subdirectory names. +- Test artifacts placed in the test run directory are outside any git repository, which prevents `git` commands executed by tests from picking them up as untracked files. 
diff --git a/pkg/timeutil/README.md b/pkg/timeutil/README.md new file mode 100644 index 00000000000..4537652be2e --- /dev/null +++ b/pkg/timeutil/README.md @@ -0,0 +1,62 @@ +# timeutil Package + +The `timeutil` package provides human-readable duration formatting utilities. + +## Overview + +This package contains helpers for converting `time.Duration` values and raw numeric durations (milliseconds, nanoseconds) into compact, readable strings. The primary formatting style follows the [debug npm package](https://www.npmjs.com/package/debug) conventions used by the `logger` package. + +## Functions + +### `FormatDuration(d time.Duration) string` + +Formats a `time.Duration` for display. Provides granular output from nanoseconds to hours. + +| Range | Example output | +|-------|---------------| +| `< 1µs` | `"500ns"` | +| `1µs – < 1ms` | `"250µs"` | +| `1ms – < 1s` | `"750ms"` | +| `1s – < 1min` | `"2.5s"` | +| `1min – < 1h` | `"1.3m"` | +| `≥ 1h` | `"2.0h"` | + +```go +import "github.com/github/gh-aw/pkg/timeutil" + +timeutil.FormatDuration(500 * time.Millisecond) // "500ms" +timeutil.FormatDuration(2500 * time.Millisecond) // "2.5s" +timeutil.FormatDuration(90 * time.Second) // "1.5m" +``` + +### `FormatDurationMs(ms int) string` + +Formats a duration given in **milliseconds** as a human-readable string. + +| Range | Example | +|-------|---------| +| `< 1000ms` | `"500ms"` | +| `1000ms – < 60s` | `"1.5s"` | +| `≥ 60s` | `"1m30s"` | + +```go +timeutil.FormatDurationMs(500) // "500ms" +timeutil.FormatDurationMs(1500) // "1.5s" +timeutil.FormatDurationMs(90000) // "1m30s" +``` + +### `FormatDurationNs(ns int64) string` + +Formats a duration given in **nanoseconds** as a human-readable string. Returns `"—"` for zero or negative values. Uses Go's standard `time.Duration.Round(time.Second)` for output. 
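To make the rounding behavior concrete, here is a small sketch built directly on `time.Duration.Round`. `formatNs` is a hypothetical stand-in that follows the description above; the real function may differ in detail:

```go
package main

import (
	"fmt"
	"time"
)

// formatNs is a hypothetical stand-in for FormatDurationNs.
func formatNs(ns int64) string {
	if ns <= 0 {
		return "—"
	}
	// Round to the nearest second; halfway values round away from zero.
	return time.Duration(ns).Round(time.Second).String()
}

func main() {
	fmt.Println(formatNs(1_400_000_000))  // 1s
	fmt.Println(formatNs(1_500_000_000))  // 2s (halfway rounds up)
	fmt.Println(formatNs(90_000_000_000)) // 1m30s
	fmt.Println(formatNs(400_000_000))    // 0s (positive, but under half a second)
}
```

One consequence of rounding to whole seconds is that short positive durations print as `0s` rather than the placeholder, which is reserved for zero or negative input.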
+ +```go +timeutil.FormatDurationNs(0) // "—" +timeutil.FormatDurationNs(2_500_000_000) // "3s" (halfway values round away from zero) +timeutil.FormatDurationNs(90_000_000_000) // "1m30s" +``` + +## Design Notes + +- `FormatDuration` is used by the `logger` package to display the time difference between consecutive log calls (the `+500ms` suffix in debug output). +- `FormatDurationMs` is used for workflow run duration display, where GitHub Actions reports durations in milliseconds. +- `FormatDurationNs` is used for job duration display, where GitHub Actions reports billing durations in nanoseconds. diff --git a/pkg/tty/README.md b/pkg/tty/README.md new file mode 100644 index 00000000000..6b758d08240 --- /dev/null +++ b/pkg/tty/README.md @@ -0,0 +1,40 @@ +# tty Package + +The `tty` package provides TTY (terminal) detection utilities. + +## Overview + +This package exposes two simple functions for checking whether the standard output or error streams are connected to a real terminal. The detection uses `golang.org/x/term`, which is the same library used by the spinner and progress-bar components in this codebase. + +On WebAssembly targets (`js/wasm`) the package provides stub implementations that always return `false`, since WASM environments do not have real TTY file descriptors. + +## Functions + +### `IsStdoutTerminal() bool` + +Returns `true` if `stdout` (`os.Stdout`) is connected to a terminal. + +```go +import "github.com/github/gh-aw/pkg/tty" + +if tty.IsStdoutTerminal() { + // Safe to emit colored or animated output to stdout +} +``` + +### `IsStderrTerminal() bool` + +Returns `true` if `stderr` (`os.Stderr`) is connected to a terminal. + +```go +if tty.IsStderrTerminal() { + // Safe to emit colored or animated output to stderr +} +``` + +## Design Notes + +- Terminal detection is evaluated at call time, not cached. This is intentional: the streams could be redirected between calls in some testing scenarios.
+- The WASM stub (`tty_wasm.go`) always returns `false` so that components built for the browser never attempt to use ANSI escape codes. +- Prefer this package over calling `term.IsTerminal` directly to keep the TTY detection logic centralized and easily testable. +- Components that need to adapt output for terminals (spinners, progress bars, colored messages) should call `IsStderrTerminal()` rather than checking `os.Stderr` directly. diff --git a/pkg/types/README.md b/pkg/types/README.md new file mode 100644 index 00000000000..28fd060c788 --- /dev/null +++ b/pkg/types/README.md @@ -0,0 +1,109 @@ +# types Package + +The `types` package provides shared type definitions used across multiple `gh-aw` packages to avoid circular dependencies. + +## Overview + +This package defines common data structures that are shared between the `parser` and `workflow` packages. Centralizing these types here allows both packages to reference the same definitions without creating import cycles. + +## Types + +### `BaseMCPServerConfig` + +The foundational configuration structure for MCP (Model Context Protocol) servers. This type is embedded by both `parser.MCPServerConfig` and `workflow.MCPServerConfig`. + +MCP servers can run as: +- **stdio processes**: `Command` + `Args`, launched as a child process. +- **HTTP endpoints**: `URL` + optional `Headers` and `Auth`, reached over HTTP/HTTPS. +- **Container services**: `Container` image + optional `Mounts`, run inside a container. 
+ +```go +import "github.com/github/gh-aw/pkg/types" + +// Stdio MCP server +stdioCfg := types.BaseMCPServerConfig{ + Type: "stdio", + Command: "npx", + Args: []string{"-y", "@modelcontextprotocol/server-filesystem"}, + Env: map[string]string{ + "ALLOWED_PATHS": "/workspace", + }, +} + +// HTTP MCP server with OIDC auth +httpCfg := types.BaseMCPServerConfig{ + Type: "http", + URL: "https://my-mcp-server.example.com", + Auth: &types.MCPAuthConfig{ + Type: "github-oidc", + Audience: "https://my-mcp-server.example.com", + }, +} +``` + +#### Fields + +| Field | Type | Description | +|-------|------|-------------| +| `Type` | `string` | Server type: `"stdio"`, `"http"`, `"local"`, or `"remote"` | +| `Command` | `string` | Executable to launch (stdio mode) | +| `Args` | `[]string` | Arguments passed to the command | +| `Env` | `map[string]string` | Environment variables injected into the process | +| `Version` | `string` | Optional version or tag | +| `URL` | `string` | HTTP endpoint URL (HTTP mode) | +| `Headers` | `map[string]string` | Additional HTTP headers (HTTP mode) | +| `Auth` | `*MCPAuthConfig` | Upstream authentication (HTTP mode only) | +| `Container` | `string` | Container image (container mode) | +| `Entrypoint` | `string` | Optional entrypoint override for the container | +| `EntrypointArgs` | `[]string` | Arguments passed to the container entrypoint | +| `Mounts` | `[]string` | Volume mounts in `"source:dest:mode"` format | + +### `MCPAuthConfig` + +Authentication configuration for HTTP MCP servers. When configured, the MCP gateway dynamically acquires tokens and injects them as `Authorization` headers on each outgoing request.
+ +```go +auth := &types.MCPAuthConfig{ + Type: "github-oidc", // Currently the only supported type + Audience: "https://my-service.example.com", +} +``` + +| Field | Type | Description | +|-------|------|-------------| +| `Type` | `string` | Auth type; currently only `"github-oidc"` is supported | +| `Audience` | `string` | OIDC token audience (`aud` claim); defaults to the server URL if omitted | + +### `TokenWeights` + +Defines custom model cost information for effective token computation. Specified under `engine.token-weights` in workflow frontmatter and stored in `aw_info.json` at runtime. + +```go +weights := types.TokenWeights{ + Multipliers: map[string]float64{ + "gpt-4o": 2.5, + }, + TokenClassWeights: &types.TokenClassWeights{ + Input: 1.0, + Output: 3.0, + }, +} +``` + +### `TokenClassWeights` + +Per-token-class weights for effective token computation. Each field corresponds to one token class; a zero value means "use the default weight". + +| Field | Token class | +|-------|-------------| +| `Input` | Standard input tokens | +| `CachedInput` | Cache-hit input tokens | +| `Output` | Generated output tokens | +| `Reasoning` | Internal reasoning tokens | +| `CacheWrite` | Cache-write tokens | + +## Design Notes + +- This package has no dependencies on other `gh-aw` packages, making it safe to import from anywhere. +- All struct fields use both `json` and `yaml` struct tags so they can be round-tripped through both serialization formats. +- `BaseMCPServerConfig` is designed to be embedded — packages add domain-specific fields and validation on top of the shared base. diff --git a/pkg/typeutil/README.md b/pkg/typeutil/README.md new file mode 100644 index 00000000000..c4c6712470c --- /dev/null +++ b/pkg/typeutil/README.md @@ -0,0 +1,77 @@ +# typeutil Package + +The `typeutil` package provides general-purpose type conversion utilities for working with heterogeneous `any` values, particularly those arising from JSON and YAML parsing. 
+ +## Overview + +JSON and YAML parsers produce `any` values whose concrete type varies at runtime (`int`, `float64`, `string`, etc.). This package provides safe, well-documented conversion functions that handle the common cases without requiring callers to write their own type switches. + +## Functions + +### Strict Conversions + +#### `ParseIntValue(value any) (int, bool)` + +Strictly parses numeric types (`int`, `int64`, `uint64`, `float64`) to `int`. Returns `(value, true)` on success and `(0, false)` for any unrecognized or non-numeric type. + +Use this when the caller **must distinguish** a missing or invalid value from a legitimate zero (e.g. YAML config field parsing where the YAML library has already produced a typed numeric value). + +```go +v, ok := typeutil.ParseIntValue(someYAMLField) +if !ok { + return errors.New("field is missing or not an integer") +} +``` + +#### `ParseBool(m map[string]any, key string) bool` + +Extracts a boolean value from a `map[string]any` by key. Returns `false` if the map is `nil`, the key is absent, or the value is not a `bool`. + +```go +enabled := typeutil.ParseBool(config, "enabled") +``` + +### Safe Overflow Conversions + +#### `SafeUint64ToInt(u uint64) int` + +Converts `uint64` to `int`, returning `0` if the value would overflow `int`. + +#### `SafeUintToInt(u uint) int` + +Converts `uint` to `int`, returning `0` if the value would overflow `int`. Thin wrapper around `SafeUint64ToInt`. + +### Lenient Conversions + +#### `ConvertToInt(val any) int` + +Leniently converts any value to `int`, returning `0` on failure. Unlike `ParseIntValue`, this function also handles string inputs via `strconv.Atoi`, making it suitable for heterogeneous sources such as JSON metrics, log-parsed data, or user-provided configuration where a zero default on failure is acceptable. 
+ +```go +// Works with int, int64, float64, and string inputs +count := typeutil.ConvertToInt(jsonData["count"]) +``` + +#### `ConvertToFloat(val any) float64` + +Safely converts any value (`float64`, `int`, `int64`, `string`) to `float64`, returning `0` on failure. + +```go +ratio := typeutil.ConvertToFloat(jsonData["ratio"]) +``` + +## Choosing the Right Function + +| Situation | Function to use | +|-----------|----------------| +| YAML/Go-typed numeric field; must detect missing vs zero | `ParseIntValue` | +| JSON / log-parsed metric; zero default on failure is fine | `ConvertToInt` | +| Boolean flag in a `map[string]any` | `ParseBool` | +| Casting `uint64` counter to `int` | `SafeUint64ToInt` | +| Numeric value from any source as float | `ConvertToFloat` | + +## Design Notes + +- All debug output uses `logger.New("typeutil:convert")` and is only emitted when `DEBUG=typeutil:*`. +- `float64 → int` truncation is logged at debug level when the fractional part is lost. +- `uint64 → int` overflow returns `0` rather than panicking, following the defensive convention used elsewhere in the codebase.