Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
80 changes: 80 additions & 0 deletions docs/adr/28452-workflow-exclusion-filter-for-logs-command.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# ADR-28452: Pre-Download Workflow Exclusion Filter for the Logs Command

**Date**: 2026-04-25
**Status**: Draft
**Deciders**: pelikhan, Copilot

---

## Part 1 — Narrative (Human-Friendly)

### Context

The `gh aw logs` command downloads workflow run artifacts and logs for analysis. As the number of workflows in a repository grows, users frequently need to download logs for only a subset of workflows — for example, skipping noisy CI workflows that are not relevant to a particular investigation. Without an exclusion mechanism, every `logs` invocation downloads artifacts for all matching runs, consuming unnecessary GitHub API quota and network bandwidth. There is already a `--workflow` flag for positive selection, but no complement for negative selection.

### Decision

We will add a `--exclude` (StringSlice) flag to the `logs` CLI command and a corresponding `exclude_workflows` field to the MCP `logs` tool struct. Excluded workflows are resolved to their canonical display names via lock files and then filtered out of each batch of runs **before** `downloadRunArtifactsConcurrent` is called. This pre-download filtering ensures no API requests or download traffic are incurred for excluded workflows. Matching is case-insensitive to tolerate capitalization differences between user input and stored workflow names.

### Alternatives Considered

#### Alternative 1: Post-Download Filtering

Filter the downloaded run list after artifacts have already been fetched. This approach is simpler to implement because it does not require threading the exclude list through the batch loop. It was rejected because it defeats the stated goal of saving API requests and download bandwidth — the excluded artifacts would still be fetched before being discarded.

#### Alternative 2: Deny-List Configuration File

Store excluded workflows in a persistent configuration file (e.g., `.github/aw/logs-config.yml`) rather than as a command-line flag. This would allow users to set standing exclusions without repeating the flag on every invocation. It was not chosen for this PR because it introduces config file discovery, parsing, and merge-with-CLI-flag semantics that add scope beyond the immediate need. A config file could be added in a future iteration.

#### Alternative 3: Negate the Existing `--workflow` Positive Filter

Extend the existing `--workflow` flag with a negation prefix (e.g., `--workflow !ci-tests`). This was considered because it avoids adding a new flag. It was not chosen because mixing positive and negative selectors in a single flag is less ergonomic and more error-prone than two separate, clearly named flags (`--workflow` / `--exclude`).

### Consequences

#### Positive
- Excluded runs are skipped before any artifact download, preserving GitHub API quota and reducing network traffic proportionally to the number of excluded runs.
- Case-insensitive matching and lock-file resolution make the flag resilient to minor naming discrepancies between user input and stored display names.
- The MCP `logs` tool gains the same capability through a new `exclude_workflows` array field, keeping CLI and MCP interfaces in parity.

#### Negative
- `DownloadWorkflowLogs` gains yet another parameter, worsening an already long function signature. This may be a motivation for future refactoring of the signature into a struct.
- If lock files are absent or out of date, `resolveExcludeWorkflows` silently falls back to raw-value matching, which may produce unexpected results without a clear error to the user.
- The exclusion filter is applied per batch, not globally across the full pagination history, which means the effective run count returned may be less than `--count` when many runs are excluded in a batch.

#### Neutral
- All existing callers of `DownloadWorkflowLogs` must pass a new `nil` argument for `excludeWorkflows`, requiring updates across test files.
- Verbose mode (`--verbose`) now prints a summary of active exclude filters and individual skipped run IDs to stderr.

---

## Part 2 — Normative Specification (RFC 2119)

> The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this section are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

### Exclusion Flag and Parameter

1. The `logs` CLI command **MUST** expose an `--exclude` flag of type `StringSlice` that accepts one or more workflow names or IDs, either comma-separated or via repeated flags.
2. The MCP `logs` tool **MUST** expose an `exclude_workflows` array field in its argument struct with a JSON schema description.
3. Implementations **MUST** pass the exclude list to `DownloadWorkflowLogs` as the `excludeWorkflows []string` parameter.

### Resolution of Workflow Names

1. Implementations **MUST** attempt to resolve each entry in the exclude list to a canonical display name via lock files before applying the filter.
2. Implementations **MUST** fall back to the raw value for case-insensitive matching when lock-file resolution fails.
3. Implementations **MUST NOT** abort or return an error when a workflow name cannot be resolved; the raw value **MUST** be used as the fallback.

### Pre-Download Filtering

1. Implementations **MUST** apply the exclusion filter to each batch of `WorkflowRun` objects **before** calling `downloadRunArtifactsConcurrent`.
2. Implementations **MUST NOT** download any artifacts or logs for runs whose `WorkflowName` matches an entry in the resolved exclude list.
3. Matching **MUST** be performed case-insensitively (i.e., `strings.ToLower` on both sides of the comparison).
4. Implementations **SHOULD** log each skipped run at the debug level, and **SHOULD** print each skipped run to stderr when verbose mode is active.

### Conformance

An implementation is considered conformant with this ADR if it satisfies all **MUST** and **MUST NOT** requirements above. Failure to meet any **MUST** or **MUST NOT** requirement constitutes non-conformance.

---

*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/24930818842) workflow. The PR author must review, complete, and finalize this document before the PR can merge.*
4 changes: 2 additions & 2 deletions pkg/cli/context_cancellation_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,7 @@ func TestDownloadWorkflowLogsWithCancellation(t *testing.T) {
cancel()

// Try to download logs with a cancelled context
err := DownloadWorkflowLogs(ctx, "", 10, "", "", "/tmp/test-logs", "", "", 0, 0, "", false, false, false, false, false, false, false, 0, "", "", false, false, "", nil)
err := DownloadWorkflowLogs(ctx, "", 10, "", "", "/tmp/test-logs", "", "", 0, 0, "", false, false, false, false, false, false, false, 0, "", "", false, false, "", nil, nil)

// Should return context.Canceled error
assert.ErrorIs(t, err, context.Canceled, "Should return context.Canceled error when context is cancelled")
Expand Down Expand Up @@ -111,7 +111,7 @@ func TestDownloadWorkflowLogsTimeoutRespected(t *testing.T) {

start := time.Now()
// Use a workflow name that doesn't exist to avoid actual network calls
_ = DownloadWorkflowLogs(ctx, "nonexistent-workflow-12345", 100, "", "", "/tmp/test-logs", "", "", 0, 0, "", false, false, false, false, false, false, false, 1, "", "", false, false, "", nil)
_ = DownloadWorkflowLogs(ctx, "nonexistent-workflow-12345", 100, "", "", "/tmp/test-logs", "", "", 0, 0, "", false, false, false, false, false, false, false, 1, "", "", false, false, "", nil, nil)
elapsed := time.Since(start)

// Should complete within reasonable time (give 5 seconds buffer for test overhead)
Expand Down
1 change: 1 addition & 0 deletions pkg/cli/logs_ci_scenario_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,7 @@ func TestLogsJSONOutputWithNoRuns(t *testing.T) {
false, // train
"", // format
nil, // artifactSets
nil, // excludeWorkflows
)

// Restore stdout and read output
Expand Down
34 changes: 33 additions & 1 deletion pkg/cli/logs_command.go
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,10 @@ Examples:
` + string(constants.CLIExtensionPrefix) + ` logs --filtered-integrity # Filter logs containing items that were filtered by gateway integrity checks
` + string(constants.CLIExtensionPrefix) + ` logs --no-staged # Exclude staged workflow runs from results

# Excluding specific workflows
` + string(constants.CLIExtensionPrefix) + ` logs --exclude ci-tests # Exclude a specific workflow from results
` + string(constants.CLIExtensionPrefix) + ` logs --exclude ci-tests,nightly-build # Exclude multiple workflows

# Run ID range filtering
` + string(constants.CLIExtensionPrefix) + ` logs --after-run-id 1000 # Filter runs after run ID 1000
` + string(constants.CLIExtensionPrefix) + ` logs --before-run-id 2000 # Filter runs before run ID 2000
Expand Down Expand Up @@ -155,6 +159,7 @@ Examples:
train, _ := cmd.Flags().GetBool("train")
format, _ := cmd.Flags().GetString("format")
artifacts, _ := cmd.Flags().GetStringSlice("artifacts")
excludeWorkflows, _ := cmd.Flags().GetStringSlice("exclude")

// Resolve relative dates to absolute dates for GitHub CLI
now := time.Now()
Expand Down Expand Up @@ -189,7 +194,33 @@ Examples:

logsCommandLog.Printf("Executing logs download: workflow=%s, count=%d, engine=%s, train=%v", workflowName, count, engine, train)

return DownloadWorkflowLogs(cmd.Context(), workflowName, count, startDate, endDate, outputDir, engine, ref, beforeRunID, afterRunID, repoOverride, verbose, toolGraph, noStaged, firewallOnly, noFirewall, parse, jsonOutput, timeout, summaryFile, safeOutputType, filteredIntegrity, train, format, artifacts)
return DownloadWorkflowLogsWithOptions(cmd.Context(), DownloadWorkflowLogsOptions{
WorkflowName: workflowName,
Count: count,
StartDate: startDate,
EndDate: endDate,
OutputDir: outputDir,
Engine: engine,
Ref: ref,
BeforeRunID: beforeRunID,
AfterRunID: afterRunID,
RepoOverride: repoOverride,
Verbose: verbose,
ToolGraph: toolGraph,
NoStaged: noStaged,
FirewallOnly: firewallOnly,
NoFirewall: noFirewall,
Parse: parse,
JSONOutput: jsonOutput,
Timeout: timeout,
SummaryFile: summaryFile,
SafeOutputType: safeOutputType,
FilteredIntegrity: filteredIntegrity,
Train: train,
Format: format,
ArtifactSets: artifacts,
ExcludeWorkflows: excludeWorkflows,
})
},
}

Expand Down Expand Up @@ -217,6 +248,7 @@ Examples:
logsCmd.Flags().String("format", "", "Output format for cross-run audit report: pretty, markdown (generates security audit report instead of default metrics table)")
logsCmd.Flags().Int("last", 0, "Alias for --count: number of recent runs to download")
logsCmd.Flags().StringSlice("artifacts", nil, "Artifact sets to download (default: all). Valid sets: "+strings.Join(ValidArtifactSetNames(), ", "))
logsCmd.Flags().StringSlice("exclude", nil, "Workflow names or IDs to exclude from results (comma-separated or repeated flag). Excluded runs are skipped before artifact download, saving API resources.")
logsCmd.MarkFlagsMutuallyExclusive("firewall", "no-firewall")

// Register completions for logs command
Expand Down
77 changes: 77 additions & 0 deletions pkg/cli/logs_command_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -264,3 +264,80 @@ func TestLogsCommandHelpText(t *testing.T) {
assert.Contains(t, safeOutputFlag.Usage, "noop", "safe-output flag help should mention noop")
assert.Contains(t, safeOutputFlag.Usage, "report-incomplete", "safe-output flag help should mention report-incomplete")
}

func TestLogsCommandExcludeFlag(t *testing.T) {
cmd := NewLogsCommand()
flags := cmd.Flags()

// Verify the --exclude flag exists
excludeFlag := flags.Lookup("exclude")
require.NotNil(t, excludeFlag, "Should have 'exclude' flag")
assert.Equal(t, "stringSlice", excludeFlag.Value.Type(), "exclude should be stringSlice type")
assert.Equal(t, "[]", excludeFlag.DefValue, "exclude should default to empty slice")
assert.Contains(t, excludeFlag.Usage, "exclude", "exclude flag usage should describe its purpose")
}

func TestIsWorkflowExcluded(t *testing.T) {
tests := []struct {
name string
workflowName string
excludes []string
expected bool
}{
{
name: "exact match",
workflowName: "Weekly Research",
excludes: []string{"Weekly Research"},
expected: true,
},
{
name: "case-insensitive match",
workflowName: "Weekly Research",
excludes: []string{"weekly research"},
expected: true,
},
{
name: "slug match - id matches display name",
workflowName: "Weekly Research",
excludes: []string{"weekly-research"},
expected: true,
},
{
name: "slug match - multi-word id",
workflowName: "CI Failure Doctor",
excludes: []string{"ci-failure-doctor"},
expected: true,
},
{
name: "no match",
workflowName: "Weekly Research",
excludes: []string{"Nightly Build"},
expected: false,
},
{
name: "empty excludes",
workflowName: "Weekly Research",
excludes: nil,
expected: false,
},
{
name: "multiple excludes, one matches",
workflowName: "CI Tests",
excludes: []string{"Weekly Research", "CI Tests", "Nightly Build"},
expected: true,
},
{
name: "multiple excludes, none match",
workflowName: "My Workflow",
excludes: []string{"Weekly Research", "CI Tests"},
expected: false,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := isWorkflowExcluded(tt.workflowName, tt.excludes)
assert.Equal(t, tt.expected, result, "isWorkflowExcluded(%q, %v) should be %v", tt.workflowName, tt.excludes, tt.expected)
})
}
}
4 changes: 2 additions & 2 deletions pkg/cli/logs_download_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ func TestDownloadWorkflowLogs(t *testing.T) {
// Test the DownloadWorkflowLogs function
// This should either fail with auth error (if not authenticated)
// or succeed with no results (if authenticated but no workflows match)
err := DownloadWorkflowLogs(context.Background(), "", 1, "", "", "./test-logs", "", "", 0, 0, "", false, false, false, false, false, false, false, 0, "summary.json", "", false, false, "", nil)
err := DownloadWorkflowLogs(context.Background(), "", 1, "", "", "./test-logs", "", "", 0, 0, "", false, false, false, false, false, false, false, 0, "summary.json", "", false, false, "", nil, nil)

// If GitHub CLI is authenticated, the function may succeed but find no results
// If not authenticated, it should return an auth error
Expand Down Expand Up @@ -393,7 +393,7 @@ func TestDownloadWorkflowLogsWithEngineFilter(t *testing.T) {
if !tt.expectError {
// For valid engines, test that the function can be called without panic
// It may still fail with auth errors, which is expected
err := DownloadWorkflowLogs(context.Background(), "", 1, "", "", "./test-logs", tt.engine, "", 0, 0, "", false, false, false, false, false, false, false, 0, "summary.json", "", false, false, "", nil)
err := DownloadWorkflowLogs(context.Background(), "", 1, "", "", "./test-logs", tt.engine, "", 0, 0, "", false, false, false, false, false, false, false, 0, "summary.json", "", false, false, "", nil, nil)

// Clean up any created directories
os.RemoveAll("./test-logs")
Expand Down
2 changes: 2 additions & 0 deletions pkg/cli/logs_json_stderr_order_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,7 @@ func TestLogsJSONOutputBeforeStderr(t *testing.T) {
false, // train
"", // format
nil, // artifactSets
nil, // excludeWorkflows
)

// Close writers first
Expand Down Expand Up @@ -187,6 +188,7 @@ func TestLogsJSONAndStderrRedirected(t *testing.T) {
false, // train
"", // format
nil, // artifactSets
nil, // excludeWorkflows
)

// Close the writer
Expand Down
Loading