1 change: 1 addition & 0 deletions .github/aw/debug-agentic-workflow.md
Original file line number Diff line number Diff line change
@@ -92,6 +92,7 @@ Report back with specific findings and actionable fixes.
> - `compile` tool → equivalent to `gh aw compile`
> - `logs` tool → equivalent to `gh aw logs`
> - `audit` tool → equivalent to `gh aw audit`
> - `checks` tool → equivalent to `gh aw checks`
> - `update` tool → equivalent to `gh aw update`
> - `add` tool → equivalent to `gh aw add`
> - `mcp-inspect` tool → equivalent to `gh aw mcp inspect`
16 changes: 16 additions & 0 deletions docs/src/content/docs/reference/gh-aw-as-mcp-server.md
@@ -154,6 +154,22 @@ Investigate a workflow run, job, or specific step and generate a detailed report

Returns JSON with `overview`, `metrics`, `jobs`, `downloaded_files`, `missing_tools`, `mcp_failures`, `errors`, `warnings`, `tool_usage`, and `firewall_analysis`.

### `checks`

Classify CI check state for a pull request and return a normalized result.

- `pr_number` (required): Pull request number to classify CI checks for
- `repo` (optional): Repository in `owner/repo` format (defaults to current repository)

Returns JSON with:
- `state`: Aggregate check state across all check runs and commit statuses
- `required_state`: State derived from check runs and policy commit statuses only (ignores optional third-party statuses like Vercel/Netlify deployments)
- `pr_number`, `head_sha`, `check_runs`, `statuses`, `total_count`

Normalized states: `success`, `failed`, `pending`, `no_checks`, `policy_blocked`.

Use `required_state` as the authoritative CI verdict in repos with optional deployment integrations.
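A caller might consume this result as sketched below. This is a minimal sketch, not code from the PR: the `checksResult` struct and the `verdict` helper are hypothetical names, only the two documented state keys are decoded, and the sample payload is illustrative rather than real tool output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checksResult decodes only the two state keys documented above.
// The Go type and field names are assumptions made for this sketch.
type checksResult struct {
	State         string `json:"state"`
	RequiredState string `json:"required_state"`
}

// verdict returns the state a caller should act on. It prefers
// required_state and falls back to state when required_state is absent.
func verdict(payload []byte) (string, error) {
	var r checksResult
	if err := json.Unmarshal(payload, &r); err != nil {
		return "", fmt.Errorf("decode checks result: %w", err)
	}
	if r.RequiredState != "" {
		return r.RequiredState, nil
	}
	return r.State, nil
}

func main() {
	// Illustrative payload: an optional deployment status failed,
	// but the required checks passed.
	payload := []byte(`{"state":"failed","required_state":"success"}`)
	v, err := verdict(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("CI verdict:", v)
}
```

Falling back to `state` is a design choice of this sketch; a stricter caller could treat a missing `required_state` as an error instead.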

### `mcp-inspect`

Inspect MCP servers in workflows and list available tools, resources, and roots.
1 change: 1 addition & 0 deletions pkg/cli/mcp_server.go
@@ -70,6 +70,7 @@ func createMCPServer(cmdPath string, actor string, validateActor bool) *mcp.Serv
}

// Register remaining read-only tools
registerChecksTool(server)
registerMCPInspectTool(server, execCmd)

Comment on lines 72 to 75
Copilot AI Apr 5, 2026

In createMCPServer, checks is registered after logs/audit registration. Since createMCPServer returns early if registerLogsTool/registerAuditTool fail (e.g., schema generation errors), this read-only tool may never be available even though it doesn’t depend on those privileged tools. Consider registering checks alongside the other read-only tools before any early-return points.
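The ordering concern can be illustrated with a toy model. This is only a sketch under stated assumptions: `registry`, `setup`, and `failing` are hypothetical stand-ins, not the `mcp.Server` API or the actual `createMCPServer` logic.

```go
package main

import (
	"errors"
	"fmt"
)

// registry is a hypothetical stand-in for a server's tool table.
type registry struct{ tools []string }

func (r *registry) add(name string) { r.tools = append(r.tools, name) }

// failing simulates a privileged registration step that errors out,
// for example because schema generation failed.
func failing(*registry) error {
	return errors.New("schema generation failed")
}

// setup registers the independent read-only tools first, then runs the
// step that can fail. An early return no longer hides the read-only tools.
func setup(r *registry, privileged func(*registry) error) error {
	r.add("checks")
	r.add("mcp-inspect")
	if err := privileged(r); err != nil {
		return err
	}
	return nil
}

func main() {
	r := &registry{}
	err := setup(r, failing)
	fmt.Println("registered:", r.tools)
	fmt.Println("error:", err)
}
```

With the read-only registrations moved after the fallible step, the same failure would leave the registry empty, which is the situation the comment describes.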

// Register workflow management tools
1 change: 1 addition & 0 deletions pkg/cli/mcp_server_command.go
@@ -35,6 +35,7 @@ The server provides the following tools:
- compile - Compile Markdown workflows to GitHub Actions YAML
- logs - Download and analyze workflow logs (requires write+ access)
- audit - Investigate a workflow run, job, or step and generate a report (requires write+ access)
- checks - Classify CI check state for a pull request
- mcp-inspect - Inspect MCP servers in workflows and list available tools
- add - Add workflows from remote repositories to .github/workflows
- update - Update workflows from their source repositories
92 changes: 92 additions & 0 deletions pkg/cli/mcp_server_json_integration_test.go
@@ -8,6 +8,7 @@ import (
"os"
"os/exec"
"path/filepath"
"strings"
"testing"
"time"

@@ -367,6 +368,90 @@ func TestMCPServer_LogsToolReturnsValidJSON(t *testing.T) {
}
}

// TestMCPServer_ChecksToolReturnsValidJSON tests that the checks tool returns valid JSON
// (or a well-formed MCP error when GitHub credentials are unavailable in test environments).
func TestMCPServer_ChecksToolReturnsValidJSON(t *testing.T) {
// Skip if the binary doesn't exist
binaryPath := "../../gh-aw"
if _, err := os.Stat(binaryPath); os.IsNotExist(err) {
t.Skip("Skipping test: gh-aw binary not found. Run 'make build' first.")
}

session, _, ctx, cancel := setupMCPServerTest(t, binaryPath)
defer cancel()
defer session.Close()

t.Run("missing pr_number returns MCP error", func(t *testing.T) {
params := &mcp.CallToolParams{
Name: "checks",
Arguments: map[string]any{},
}
_, err := session.CallTool(ctx, params)
if err == nil {
t.Error("Expected MCP error when pr_number is missing")
} else {
t.Logf("Checks tool correctly returned error for missing pr_number: %v", err)
}
})

t.Run("valid pr_number returns JSON or auth error", func(t *testing.T) {
params := &mcp.CallToolParams{
Name: "checks",
Arguments: map[string]any{
"pr_number": "1",
Copilot AI Apr 5, 2026

This test calls checks with only pr_number while running in a temporary git repo that has no remotes configured. Even if credentials are available, gh api typically can’t resolve the current repository without a remote/--repo, so the “verify JSON structure” branch is effectively unreachable. To actually validate JSON when creds are present, either set a repo remote in setupMCPServerTest or pass a known public repo argument here.

Suggested change
- "pr_number": "1",
+ "pr_number": "1",
+ "repo": "cli/cli",

},
}
result, err := session.CallTool(ctx, params)
if err != nil {
// Expected: GitHub credentials are not available in the test environment
t.Logf("Checks tool correctly returned error (expected without GitHub credentials): %v", err)
return
}

if len(result.Content) == 0 {
t.Fatal("Expected non-empty result from checks tool")
}

textContent, ok := result.Content[0].(*mcp.TextContent)
if !ok {
t.Fatal("Expected text content from checks tool")
}

if textContent.Text == "" {
t.Fatal("Expected non-empty text content from checks tool")
}

// In test environments without GitHub credentials, an error message is returned
if strings.HasPrefix(textContent.Text, "Error:") {
t.Logf("Checks tool returned error message (expected in test environment without GitHub credentials)")
return
}

// If credentials are available, verify JSON structure
Comment on lines +420 to +430
Copilot AI Apr 5, 2026

After session.CallTool succeeds, this test checks for an "Error:" prefix in the returned content. But the checks tool returns failures as an MCP (jsonrpc) error (i.e., via the err return), so this branch should never be hit and may confuse future readers. Consider removing it or adjusting it to match the tool’s actual error-returning behavior.

Suggested change
- if textContent.Text == "" {
- t.Fatal("Expected non-empty text content from checks tool")
- }
- // In test environments without GitHub credentials, an error message is returned
- if strings.HasPrefix(textContent.Text, "Error:") {
- t.Logf("Checks tool returned error message (expected in test environment without GitHub credentials)")
- return
- }
- // If credentials are available, verify JSON structure
+ if strings.TrimSpace(textContent.Text) == "" {
+ t.Fatal("Expected non-empty text content from checks tool")
+ }
+ // On success, the checks tool should return structured output; failures are
+ // reported by session.CallTool via err above.

jsonOutput := extractJSONFromOutput(textContent.Text)
if !isValidJSON(jsonOutput) {
t.Errorf("Checks tool did not return valid JSON. Output: %s", textContent.Text)
return
}

var checksData map[string]any
if err := json.Unmarshal([]byte(jsonOutput), &checksData); err != nil {
t.Errorf("Failed to unmarshal checks JSON: %v", err)
return
}

// Fields mirror the ChecksResult struct JSON tags defined in checks_command.go.
expectedFields := []string{"state", "required_state", "pr_number", "head_sha", "check_runs", "statuses", "total_count"}
for _, field := range expectedFields {
if _, ok := checksData[field]; !ok {
t.Errorf("Expected field '%s' not found in checks output", field)
}
}

t.Logf("Checks tool returned valid JSON with state=%v", checksData["state"])
})
}

// TestMCPServer_AllToolsReturnContent tests that all tools return non-empty content
func TestMCPServer_AllToolsReturnContent(t *testing.T) {
// Skip if the binary doesn't exist
@@ -417,6 +502,13 @@ func TestMCPServer_AllToolsReturnContent(t *testing.T) {
expectJSON: false, // May return error message in test environment
mayFailInTest: true, // Expected to fail without workflow runs
},
{
name: "checks",
toolName: "checks",
args: map[string]any{"pr_number": "1"},
expectJSON: false, // May return error in test environment without GitHub credentials
mayFailInTest: true, // Expected to fail without GitHub credentials
},
{
name: "mcp-inspect",
toolName: "mcp-inspect",
70 changes: 70 additions & 0 deletions pkg/cli/mcp_tools_readonly.go
@@ -299,6 +299,76 @@ Returns formatted text output showing:
})
}

// registerChecksTool registers the checks tool with the MCP server.
// The checks tool is read-only and idempotent.
func registerChecksTool(server *mcp.Server) {
type checksArgs struct {
PRNumber string `json:"pr_number" jsonschema:"Pull request number to classify CI checks for"`
Repo string `json:"repo,omitempty" jsonschema:"Repository in owner/repo format (defaults to current repository)"`
}

mcp.AddTool(server, &mcp.Tool{
Name: "checks",
Annotations: &mcp.ToolAnnotations{
ReadOnlyHint: true,
IdempotentHint: true,
OpenWorldHint: boolPtr(true),
},
Description: `Classify CI check state for a pull request and return a normalized result.

Maps PR check rollups to one of the following normalized states:
success - all checks passed
failed - one or more checks failed
pending - checks are still running or queued
no_checks - no checks configured or triggered
policy_blocked - policy or account gates are blocking the PR

Returns JSON with two state fields:
state - aggregate state across all check runs and commit statuses
required_state - state derived from check runs and policy commit statuses only;
ignores optional third-party commit statuses (e.g. Vercel,
Netlify deployments) but still surfaces policy_blocked when
branch-protection or account-gate statuses fail

Use required_state as the authoritative CI verdict in repos that have optional
deployment integrations posting commit statuses alongside required CI checks.

Also returns pr_number, head_sha, check_runs, statuses, and total_count.`,
Icons: []mcp.Icon{
{Source: "✅"},
},
}, func(ctx context.Context, req *mcp.CallToolRequest, args checksArgs) (*mcp.CallToolResult, any, error) {
// Check for cancellation before starting
select {
case <-ctx.Done():
return nil, nil, newMCPError(jsonrpc.CodeInternalError, "request cancelled", ctx.Err().Error())
default:
}

if args.PRNumber == "" {
return nil, nil, newMCPError(jsonrpc.CodeInvalidParams, "missing required parameter: pr_number", nil)
}

mcpLog.Printf("Executing checks tool: pr_number=%s, repo=%s", args.PRNumber, args.Repo)

result, err := FetchChecksResult(args.Repo, args.PRNumber)
if err != nil {
return nil, nil, newMCPError(jsonrpc.CodeInternalError, "failed to fetch checks", map[string]any{"error": err.Error()})
Comment on lines +352 to +356
Copilot AI Apr 5, 2026

The PR description claims this tool avoids subprocesses, but FetchChecksResult ultimately shells out to the gh CLI via workflow.ExecGH (checks_command.go:186). If the goal is specifically to avoid calling gh aw checks that’s fine, but the description (and/or tool docs) should be clarified to avoid implying a pure in-process implementation.

}

jsonBytes, err := json.Marshal(result)
if err != nil {
return nil, nil, newMCPError(jsonrpc.CodeInternalError, "failed to marshal checks result", map[string]any{"error": err.Error()})
}

return &mcp.CallToolResult{
Content: []mcp.Content{
&mcp.TextContent{Text: string(jsonBytes)},
},
}, nil, nil
})
}
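The handler's guard sequence (a non-blocking cancellation check, then parameter validation) can be exercised in isolation. The sketch below is an assumption-laden illustration, not code from the PR: `guard`, `describe`, and `demo` are hypothetical helpers, and the numeric check on `pr_number` is an extra step that the handler above does not perform.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strconv"
)

// guard is a hypothetical pre-flight check mirroring the handler's order:
// cancellation first, then the required parameter.
func guard(ctx context.Context, prNumber string) error {
	// Non-blocking check: the default case runs when ctx is still live.
	select {
	case <-ctx.Done():
		return fmt.Errorf("request cancelled: %w", ctx.Err())
	default:
	}
	if prNumber == "" {
		return errors.New("missing required parameter: pr_number")
	}
	// Assumption: PR numbers are positive integers. The PR itself passes
	// the raw string through without this check.
	if n, err := strconv.Atoi(prNumber); err != nil || n <= 0 {
		return fmt.Errorf("invalid pr_number %q", prNumber)
	}
	return nil
}

// describe turns a guard result into printable text.
func describe(err error) string {
	if err == nil {
		return "ok"
	}
	return err.Error()
}

// demo runs guard against a live context and then a cancelled one.
func demo() []string {
	ctx, cancel := context.WithCancel(context.Background())
	out := []string{
		describe(guard(ctx, "42")),
		describe(guard(ctx, "")),
		describe(guard(ctx, "abc")),
	}
	cancel()
	return append(out, describe(guard(ctx, "42")))
}

func main() {
	for _, line := range demo() {
		fmt.Println(line)
	}
}
```

Checking cancellation before validation means a cancelled request is reported as cancelled even when its arguments are also invalid, which matches the order used in the handler.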

// buildDockerErrorResults builds a []ValidationResult with a config_error for each target
// workflow. It is used when Docker is unavailable so the compile tool returns consistent
// structured JSON instead of a protocol-level error.