diff --git a/README.md b/README.md index 82a78e35..bc2c89ad 100644 --- a/README.md +++ b/README.md @@ -147,10 +147,25 @@ See **[docs/CONFIGURATION.md](docs/CONFIGURATION.md)** for the complete mapping Supported MCP methods: `tools/list`, `tools/call`, and any other method (forwarded as-is). +## Proxy Mode + +The gateway can also run as an HTTP forward proxy (`awmg proxy`) that intercepts GitHub API requests from tools like the `gh` CLI and applies the same DIFC filtering: + +```bash +awmg proxy \ + --guard-wasm guards/github-guard/github_guard.wasm \ + --policy '{"allow-only":{"repos":["org/repo"],"min-integrity":"approved"}}' \ + --github-token "$GITHUB_TOKEN" \ + --listen localhost:8080 +``` + +This maps ~25 REST URL patterns and GraphQL queries to guard tool names, then runs the same 6-phase DIFC pipeline used by the MCP gateway. See [docs/PROXY_MODE.md](docs/PROXY_MODE.md) for full documentation. + ## Further Reading | Topic | Link | |-------|------| +| **Proxy Mode** | [docs/PROXY_MODE.md](docs/PROXY_MODE.md) — HTTP forward proxy for DIFC filtering of `gh` CLI and REST/GraphQL requests | | **Configuration Reference** | [docs/CONFIGURATION.md](docs/CONFIGURATION.md) — Server fields, TOML/JSON formats, guard-policy details, custom schemas, gateway fields, validation rules | | **Environment Variables** | [docs/ENVIRONMENT_VARIABLES.md](docs/ENVIRONMENT_VARIABLES.md) — All env vars for production, development, Docker, and guard configuration | | **Full Specification** | [MCP Gateway Configuration Reference](https://github.com/github/gh-aw/blob/main/docs/src/content/docs/reference/mcp-gateway.md) — Upstream spec with complete validation rules | diff --git a/docs/PROXY_MODE.md b/docs/PROXY_MODE.md new file mode 100644 index 00000000..8a366da9 --- /dev/null +++ b/docs/PROXY_MODE.md @@ -0,0 +1,145 @@ +# Proxy Mode + +Proxy mode (`awmg proxy`) is an HTTP forward proxy that intercepts GitHub API requests and applies DIFC (Decentralized Information Flow Control) filtering using the
same guard WASM module as the MCP gateway. + +## Motivation + +The MCP gateway enforces DIFC on MCP tool calls, but tools that call the GitHub API directly — such as `gh api`, `gh issue list`, or raw `curl` — bypass it entirely. Proxy mode closes this gap by sitting between the HTTP client and `api.github.com`, applying guard policies to REST and GraphQL requests. + +## Quick Start + +```bash +# Start the proxy +awmg proxy \ + --guard-wasm guards/github-guard/github_guard.wasm \ + --policy '{"allow-only":{"repos":["org/repo"],"min-integrity":"approved"}}' \ + --github-token "$GITHUB_TOKEN" \ + --listen localhost:8080 + +# Point the gh CLI at the proxy (note: gh forces HTTPS for GH_HOST; see Known Limitations) +GH_HOST=localhost:8080 GH_TOKEN="$GITHUB_TOKEN" gh issue list -R org/repo + +# Or use curl directly +curl -H "Authorization: token $GITHUB_TOKEN" \ + http://localhost:8080/api/v3/repos/org/repo/issues +``` + +## How It Works + +``` +HTTP client → awmg proxy (localhost:8080) → api.github.com + ↓ + 6-phase DIFC pipeline + (same guard WASM module) +``` + +1. The proxy receives an HTTP request (REST GET or GraphQL POST) +2. It maps the URL/query to a guard tool name (e.g., `/repos/:owner/:repo/issues` → `list_issues`) +3. The guard WASM module evaluates access based on the configured policy +4. If allowed, the request is forwarded to `api.github.com` +5. The response is filtered per-item based on secrecy/integrity labels +6. The filtered response is returned to the client + +Write operations (PUT, POST, DELETE, PATCH) pass through unmodified.
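The route-matching in step 2 can be sketched as follows. This is an illustrative simplification, not the actual `internal/proxy/router.go` implementation (which is not shown in this diff); the two pattern/tool pairs are examples, with the single-issue tool name `issue_read` following this PR's `TestMatchRoute` expectations.

```go
package main

import (
	"fmt"
	"regexp"
)

// route pairs a URL pattern with the guard tool name it maps to.
type route struct {
	pattern  *regexp.Regexp
	toolName string
}

// Two illustrative entries; the real router covers ~25 patterns.
// More specific patterns come first so /issues/42 is not swallowed by /issues.
var routes = []route{
	{regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/issues/\d+$`), "issue_read"},
	{regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/issues$`), "list_issues"},
}

// matchRoute returns the guard tool name and owner/repo args for a path,
// or "" when nothing matches (the proxy fails closed in that case).
func matchRoute(path string) (string, map[string]string) {
	for _, r := range routes {
		if m := r.pattern.FindStringSubmatch(path); m != nil {
			return r.toolName, map[string]string{"owner": m[1], "repo": m[2]}
		}
	}
	return "", nil
}

func main() {
	tool, args := matchRoute("/repos/octocat/hello-world/issues")
	fmt.Println(tool, args["owner"], args["repo"]) // list_issues octocat hello-world
}
```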
+ +## Flags + +| Flag | Default | Description | +|------|---------|-------------| +| `--guard-wasm` | *(required)* | Path to the guard WASM module | +| `--policy` | | Guard policy JSON (e.g., `{"allow-only":{"repos":["org/repo"]}}`) | +| `--github-token` | `$GITHUB_TOKEN` | GitHub API token for upstream requests | +| `--listen` / `-l` | `127.0.0.1:8080` | HTTP listen address | +| `--log-dir` | `/tmp/gh-aw/mcp-logs` | Log file directory | +| `--guards-mode` | `filter` | DIFC mode: `strict`, `filter`, or `propagate` | +| `--github-api-url` | `https://api.github.com` | Upstream GitHub API URL | + +## DIFC Pipeline + +The proxy reuses the same 6-phase pipeline as the MCP gateway, with Phase 3 adapted for HTTP forwarding: + +| Phase | Description | Shared with Gateway? | +|-------|-------------|---------------------| +| **0** | Extract agent labels from registry | ✅ | +| **1** | `Guard.LabelResource()` — coarse access check | ✅ | +| **2** | `Evaluator.Evaluate()` — secrecy/integrity evaluation | ✅ | +| **3** | Forward request to GitHub API | ❌ Proxy-specific | +| **4** | `Guard.LabelResponse()` — per-item labeling | ✅ | +| **5** | `Evaluator.FilterCollection()` — fine-grained filtering | ✅ | + +## REST Route Mapping + +The proxy maps ~25 GitHub REST API URL patterns to guard tool names: + +| URL Pattern | Guard Tool | +|-------------|-----------| +| `/repos/:owner/:repo/issues` | `list_issues` | +| `/repos/:owner/:repo/issues/:number` | `issue_read` | +| `/repos/:owner/:repo/pulls` | `list_pull_requests` | +| `/repos/:owner/:repo/pulls/:number` | `pull_request_read` | +| `/repos/:owner/:repo/commits` | `list_commits` | +| `/repos/:owner/:repo/commits/:sha` | `get_commit` | +| `/repos/:owner/:repo/contents/:path` | `get_file_contents` | +| `/repos/:owner/:repo/branches` | `list_branches` | +| `/repos/:owner/:repo/releases` | `list_releases` | +| `/search/issues` | `search_issues` | +| `/search/code` | `search_code` | +| `/search/repositories` | `search_repositories` | +|
`/user` | `get_me` | +| ... | See `internal/proxy/router.go` for full list | + +Unrecognized read URLs are blocked (fail closed) rather than forwarded, so unmapped endpoints cannot leak unfiltered data. + +## GraphQL Support + +GraphQL queries to `/graphql` are parsed to extract the operation type and owner/repo context: + +- **Repository-scoped queries** (issues, pull requests, projects, repository info) — mapped to corresponding tool names +- **Search queries** — mapped to `search_issues` +- **Viewer queries** — intentionally unmapped; treated as unknown +- **Unknown queries** — blocked (fail closed) rather than passed through unfiltered + +Owner and repo are extracted from GraphQL variables (`$owner`, `$name`/`$repo`) or inline string arguments. + +## Policy Notes + +- **Repo names must be lowercase** in policies (e.g., `octocat/hello-world` not `octocat/Hello-World`). The guard matches case-insensitively against repository names returned by GitHub, but the policy entries themselves must be lowercase. +- All policy formats supported by the MCP gateway work identically in proxy mode: + - Specific repos: `{"allow-only":{"repos":["org/repo"]}}` + - Owner wildcards: `{"allow-only":{"repos":["org/*"]}}` + - Multiple repos: `{"allow-only":{"repos":["org/repo1","org/repo2"]}}` + - Integrity filtering: `{"allow-only":{"repos":["org/repo"],"min-integrity":"approved"}}` + +## Container Usage + +The proxy is included in the same container image as the MCP gateway: + +```bash +docker run --rm \ + --entrypoint /app/awmg \ + -p 8080:8080 \ + -e GITHUB_TOKEN \ + ghcr.io/github/gh-aw-mcpg:latest \ + proxy \ + --guard-wasm /guards/github/00-github-guard.wasm \ + --policy '{"allow-only":{"repos":["org/repo"],"min-integrity":"none"}}' \ + --github-token "$GITHUB_TOKEN" \ + --listen 0.0.0.0:8080 \ + --guards-mode filter +``` + +Note: The container entrypoint defaults to `run_containerized.sh` (MCP gateway mode). Use `--entrypoint /app/awmg` to run proxy mode directly.
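The owner/repo extraction described in the GraphQL Support section (variables first, then a fallback scan of the query text) can be sketched as follows. This is a self-contained illustration, not the actual `internal/proxy/graphql.go` code:

```go
package main

import (
	"fmt"
	"regexp"
)

// queryRepoPattern matches repository(owner: "X", name: "Y"), accepting
// either a string literal or a $variable in each argument position.
var queryRepoPattern = regexp.MustCompile(
	`(?i)repository\s*\(\s*owner\s*:\s*(?:"([^"]+)"|\$\w+)\s*,?\s*name\s*:\s*(?:"([^"]+)"|\$\w+)`)

// extractOwnerRepo prefers GraphQL variables, falling back to inline
// string arguments found in the query text.
func extractOwnerRepo(vars map[string]interface{}, query string) (string, string) {
	var owner, repo string
	if v, ok := vars["owner"].(string); ok {
		owner = v
	}
	if v, ok := vars["name"].(string); ok {
		repo = v
	}
	if owner == "" || repo == "" {
		if m := queryRepoPattern.FindStringSubmatch(query); m != nil {
			if owner == "" {
				owner = m[1] // empty when the argument was a $variable
			}
			if repo == "" {
				repo = m[2]
			}
		}
	}
	return owner, repo
}

func main() {
	o, r := extractOwnerRepo(nil,
		`query { repository(owner: "octocat", name: "hello-world") { issues(first: 10) { nodes { title } } } }`)
	fmt.Println(o, r) // octocat hello-world
}
```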
+ +## Guards Mode + +| Mode | Behavior | +|------|----------| +| `strict` | Blocks entire response if any items are filtered | +| `filter` | Removes filtered items, returns remaining (default) | +| `propagate` | Labels accumulate on the agent; no filtering | + +## Known Limitations + +- **gh CLI HTTPS requirement**: `gh` forces HTTPS when connecting to `GH_HOST`. The proxy serves plain HTTP, so direct `gh` CLI interception requires a TLS-terminating reverse proxy in front. Use `curl` or `gh api --hostname` with HTTP for testing. +- **GraphQL nested filtering**: Deeply nested GraphQL response structures depend on guard support for item-level labeling. +- **Read-only filtering**: Only GET requests and GraphQL POST queries are filtered. Write operations pass through unmodified. diff --git a/internal/cmd/proxy.go b/internal/cmd/proxy.go new file mode 100644 index 00000000..5742d847 --- /dev/null +++ b/internal/cmd/proxy.go @@ -0,0 +1,141 @@ +package cmd + +import ( + "fmt" + "log" + "net" + "net/http" + "os" + "os/signal" + "syscall" + + "github.com/github/gh-aw-mcpg/internal/logger" + "github.com/github/gh-aw-mcpg/internal/proxy" + "github.com/spf13/cobra" +) + +// Proxy subcommand flag variables +var ( + proxyGuardWasm string + proxyPolicy string + proxyToken string + proxyListen string + proxyLogDir string + proxyDIFCMode string + proxyAPIURL string +) + +func init() { + rootCmd.AddCommand(newProxyCmd()) +} + +func newProxyCmd() *cobra.Command { + cmd := &cobra.Command{ + Use: "proxy", + Short: "Run as a GitHub API filtering proxy", + Long: `Run the gateway in proxy mode — an HTTP forward proxy that intercepts +gh CLI requests and applies DIFC filtering using the same guard WASM module. 
+ +Usage with the gh CLI: + + # Start the proxy + awmg proxy \ + --guard-wasm guards/github-guard/github_guard.wasm \ + --policy '{"allow-only":{"repos":["org/repo"],"min-integrity":"approved"}}' \ + --github-token "$GITHUB_TOKEN" \ + --listen localhost:8080 + + # Point gh at the proxy + GH_HOST=localhost:8080 GH_TOKEN="$GITHUB_TOKEN" gh issue list -R org/repo`, + SilenceUsage: true, + RunE: runProxy, + } + + cmd.Flags().StringVar(&proxyGuardWasm, "guard-wasm", "", "Path to the guard WASM module (required)") + cmd.Flags().StringVar(&proxyPolicy, "policy", getDefaultGuardPolicyJSON(), "Guard policy JSON") + cmd.Flags().StringVar(&proxyToken, "github-token", os.Getenv("GITHUB_TOKEN"), "GitHub API token") + cmd.Flags().StringVarP(&proxyListen, "listen", "l", "127.0.0.1:8080", "HTTP proxy listen address") + cmd.Flags().StringVar(&proxyLogDir, "log-dir", getDefaultLogDir(), "Log file directory") + cmd.Flags().StringVar(&proxyDIFCMode, "guards-mode", "filter", "DIFC enforcement mode: strict, filter, propagate") + cmd.Flags().StringVar(&proxyAPIURL, "github-api-url", proxy.DefaultGitHubAPIBase, "Upstream GitHub API URL") + + cmd.MarkFlagRequired("guard-wasm") + + return cmd +} + +func runProxy(cmd *cobra.Command, args []string) error { + ctx, cancel := signal.NotifyContext(cmd.Context(), os.Interrupt, syscall.SIGTERM) + defer cancel() + + // Initialize loggers + if err := logger.InitFileLogger(proxyLogDir, "proxy.log"); err != nil { + log.Printf("Warning: Failed to initialize file logger: %v", err) + } + if err := logger.InitJSONLLogger(proxyLogDir, "proxy-rpc.jsonl"); err != nil { + log.Printf("Warning: Failed to initialize JSONL logger: %v", err) + } + + logger.LogInfo("startup", "MCPG Proxy starting: listen=%s, guard=%s, mode=%s", proxyListen, proxyGuardWasm, proxyDIFCMode) + + // Resolve GitHub token + token := proxyToken + if token == "" { + token = os.Getenv("GH_TOKEN") + } + if token == "" { + token = os.Getenv("GITHUB_PERSONAL_ACCESS_TOKEN") + } + + // Create the 
proxy server + proxySrv, err := proxy.New(ctx, proxy.Config{ + WasmPath: proxyGuardWasm, + Policy: proxyPolicy, + GitHubToken: token, + GitHubAPIURL: proxyAPIURL, + DIFCMode: proxyDIFCMode, + }) + if err != nil { + return fmt.Errorf("failed to create proxy server: %w", err) + } + + // Create and start the HTTP server + httpServer := &http.Server{ + Addr: proxyListen, + Handler: proxySrv.Handler(), + } + + // Start HTTP server in background + go func() { + listener, err := net.Listen("tcp", proxyListen) + if err != nil { + log.Printf("Failed to listen on %s: %v", proxyListen, err) + cancel() + return + } + + actualAddr := listener.Addr().String() + log.Printf("MCPG Proxy listening on %s", actualAddr) + logger.LogInfo("startup", "Proxy listening on %s", actualAddr) + + // Print connection info + fmt.Fprintf(os.Stderr, "\nMCPG GitHub API Proxy\n") + fmt.Fprintf(os.Stderr, " Listening: %s\n", actualAddr) + fmt.Fprintf(os.Stderr, " Mode: %s\n", proxyDIFCMode) + fmt.Fprintf(os.Stderr, " Guard: %s\n", proxyGuardWasm) + fmt.Fprintf(os.Stderr, "\nConnect with:\n") + fmt.Fprintf(os.Stderr, " GH_HOST=%s GH_TOKEN= gh ...\n\n", actualAddr) + + if err := httpServer.Serve(listener); err != nil && err != http.ErrServerClosed { + log.Printf("HTTP server error: %v", err) + cancel() + } + }() + + // Wait for shutdown signal + <-ctx.Done() + log.Println("Shutting down proxy...") + logger.LogInfo("shutdown", "Proxy shutting down") + + return httpServer.Close() +} diff --git a/internal/proxy/graphql.go b/internal/proxy/graphql.go new file mode 100644 index 00000000..9ce3edb7 --- /dev/null +++ b/internal/proxy/graphql.go @@ -0,0 +1,167 @@ +package proxy + +import ( + "encoding/json" + "regexp" + "strings" + + "github.com/github/gh-aw-mcpg/internal/logger" +) + +var logGraphQL = logger.New("proxy:graphql") + +// GraphQLRequest represents a parsed GraphQL request body. 
+type GraphQLRequest struct { + Query string `json:"query"` + Variables map[string]interface{} `json:"variables,omitempty"` +} + +// GraphQLRouteMatch contains the result of matching a GraphQL query to a guard tool name. +type GraphQLRouteMatch struct { + ToolName string + Owner string + Repo string + Args map[string]interface{} +} + +// graphqlPattern maps operation name patterns to guard tool names. +type graphqlPattern struct { + // namePattern matches the GraphQL operation name (case-insensitive) + namePattern *regexp.Regexp + // queryPattern matches content within the query string + queryPattern *regexp.Regexp + toolName string +} + +// graphqlPatterns is the ordered list of GraphQL operation → tool name mappings. +var graphqlPatterns = []graphqlPattern{ + // Issue operations (singular before plural — more specific first) + {queryPattern: regexp.MustCompile(`(?i)repository\s*\([^)]*\)\s*\{[^}]*\bissue\s*\(`), toolName: "issue_read"}, + {queryPattern: regexp.MustCompile(`(?i)repository\s*\([^)]*\)\s*\{[^}]*\bissues\s*[\({]`), toolName: "list_issues"}, + + // PR operations (singular before plural) + {queryPattern: regexp.MustCompile(`(?i)repository\s*\([^)]*\)\s*\{[^}]*\bpullRequest\s*\(`), toolName: "pull_request_read"}, + {queryPattern: regexp.MustCompile(`(?i)repository\s*\([^)]*\)\s*\{[^}]*\bpullRequests\s*[\({]`), toolName: "list_pull_requests"}, + + // Search operations + {queryPattern: regexp.MustCompile(`(?i)\bsearch\s*\(`), toolName: "search_issues"}, + + // Project operations + {queryPattern: regexp.MustCompile(`(?i)projectV2`), toolName: "list_projects"}, + + // Repository info + {queryPattern: regexp.MustCompile(`(?i)\brepository\s*\(`), toolName: "get_file_contents"}, + + // viewer { ... } is intentionally not mapped — the guard does not recognize a tool name + // with equivalent semantics for user/account data, and it may include private fields. + // Unknown GraphQL queries are blocked by the handler. 
+} + +// Patterns extracting owner and repo from GraphQL variables or query text. +var ( + varOwnerPattern = regexp.MustCompile(`(?i)"owner"\s*:\s*"([^"]+)"`) + varRepoPattern = regexp.MustCompile(`(?i)"(?:name|repo)"\s*:\s*"([^"]+)"`) + // Matches: repository(owner: "X", name: "Y") or repository(owner: $owner, name: $name) + queryRepoPattern = regexp.MustCompile(`(?i)repository\s*\(\s*owner\s*:\s*(?:"([^"]+)"|\$\w+)\s*,?\s*name\s*:\s*(?:"([^"]+)"|\$\w+)`) +) + +// MatchGraphQL matches a GraphQL request body to a guard tool name. +func MatchGraphQL(body []byte) *GraphQLRouteMatch { + var gql GraphQLRequest + if err := json.Unmarshal(body, &gql); err != nil { + logGraphQL.Printf("failed to parse GraphQL request: %v", err) + return nil + } + + if gql.Query == "" { + logGraphQL.Printf("empty GraphQL query") + return nil + } + + // Match the query against known patterns + var toolName string + for _, p := range graphqlPatterns { + if p.namePattern != nil { + // Operation-name matching is not implemented yet; such patterns are skipped + continue + } + if p.queryPattern != nil && p.queryPattern.MatchString(gql.Query) { + toolName = p.toolName + break + } + } + + if toolName == "" { + logGraphQL.Printf("no GraphQL pattern match for query: %.100s", gql.Query) + return nil + } + + // Extract owner/repo from variables + owner, repo := extractOwnerRepo(gql.Variables, gql.Query) + + args := map[string]interface{}{} + if owner != "" { + args["owner"] = owner + } + if repo != "" { + args["repo"] = repo + } + + logGraphQL.Printf("matched GraphQL → tool=%s owner=%s repo=%s", toolName, owner, repo) + return &GraphQLRouteMatch{ + ToolName: toolName, + Owner: owner, + Repo: repo, + Args: args, + } +} + +// extractOwnerRepo extracts owner and repo from GraphQL variables and query text.
+func extractOwnerRepo(variables map[string]interface{}, query string) (string, string) { + var owner, repo string + + // Try variables first + if variables != nil { + if v, ok := variables["owner"].(string); ok { + owner = v + } + if v, ok := variables["name"].(string); ok { + repo = v + } + if v, ok := variables["repo"].(string); ok && repo == "" { + repo = v + } + } + + // Fall back to parsing the query string + if owner == "" || repo == "" { + if m := queryRepoPattern.FindStringSubmatch(query); m != nil { + if m[1] != "" && owner == "" { + owner = m[1] + } + if m[2] != "" && repo == "" { + repo = m[2] + } + } + } + + // Try parsing raw variable JSON embedded in query (some gh commands inline variables) + if owner == "" { + if m := varOwnerPattern.FindStringSubmatch(query); m != nil { + owner = m[1] + } + } + if repo == "" { + if m := varRepoPattern.FindStringSubmatch(query); m != nil { + repo = m[1] + } + } + + return owner, repo +} + +// IsGraphQLPath returns true if the request path is the GraphQL endpoint. +func IsGraphQLPath(path string) bool { + cleaned := strings.TrimSuffix(path, "/") + return cleaned == "/graphql" || cleaned == "/api/v3/graphql" +} diff --git a/internal/proxy/handler.go b/internal/proxy/handler.go new file mode 100644 index 00000000..47a7118d --- /dev/null +++ b/internal/proxy/handler.go @@ -0,0 +1,340 @@ +package proxy + +import ( + "bytes" + "encoding/json" + "fmt" + "io" + "log" + "net/http" + + "github.com/github/gh-aw-mcpg/internal/difc" + "github.com/github/gh-aw-mcpg/internal/logger" +) + +var logHandler = logger.New("proxy:handler") + +// proxyHandler implements http.Handler and runs the DIFC pipeline on proxied requests. 
+type proxyHandler struct { + server *Server +} + +func (h *proxyHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) { + // Strip the /api/v3 prefix that GH_HOST adds + rawPath := StripGHHostPrefix(r.URL.Path) + // Preserve query string for upstream forwarding + fullPath := rawPath + if r.URL.RawQuery != "" { + fullPath = rawPath + "?" + r.URL.RawQuery + } + + logHandler.Printf("incoming %s %s", r.Method, rawPath) + + // Health check endpoint + if rawPath == "/health" || rawPath == "/healthz" { + w.Header().Set("Content-Type", "application/json") + w.WriteHeader(http.StatusOK) + json.NewEncoder(w).Encode(map[string]string{"status": "ok"}) + return + } + + // Only filter read operations (GET + GraphQL POST to /graphql) + isGraphQL := IsGraphQLPath(rawPath) + isRead := r.Method == http.MethodGet || (r.Method == http.MethodPost && isGraphQL) + if !isRead { + // Pass through write operations unmodified + h.passthrough(w, r, fullPath) + return + } + + // Route the request to a guard tool name + var toolName string + var args map[string]interface{} + var graphQLBody []byte + + if isGraphQL { + // Read and parse the GraphQL body + var err error + graphQLBody, err = io.ReadAll(r.Body) + r.Body.Close() + if err != nil { + http.Error(w, "failed to read request body", http.StatusBadRequest) + return + } + + match := MatchGraphQL(graphQLBody) + if match == nil { + // Unknown GraphQL query — fail closed: deny rather than risk leaking unfiltered data + logHandler.Printf("unknown GraphQL query, blocking request") + w.Header().Set("Content-Type", "application/json") + w.WriteHeader(http.StatusForbidden) + json.NewEncoder(w).Encode(map[string]interface{}{ + "errors": []map[string]string{{"message": "access denied: unrecognized GraphQL operation"}}, + "data": nil, + }) + return + } + toolName = match.ToolName + args = match.Args + } else { + match := MatchRoute(rawPath) + if match == nil { + // Unknown REST endpoint — fail closed: deny rather than risk leaking unfiltered data 
+ logHandler.Printf("unknown REST endpoint %s, blocking request", rawPath) + http.Error(w, "access denied: unrecognized endpoint", http.StatusForbidden) + return + } + toolName = match.ToolName + args = match.Args + } + + // Run the DIFC pipeline + h.handleWithDIFC(w, r, fullPath, toolName, args, graphQLBody) +} + +// handleWithDIFC runs the 6-phase DIFC pipeline on a request. +func (h *proxyHandler) handleWithDIFC(w http.ResponseWriter, r *http.Request, path, toolName string, args map[string]interface{}, graphQLBody []byte) { + ctx := r.Context() + s := h.server + backend := &stubBackendCaller{} + + if !s.guardInitialized { + log.Printf("[proxy] WARNING: guard not initialized, blocking request") + http.Error(w, "proxy enforcement not configured", http.StatusServiceUnavailable) + return + } + + // **Phase 0: Get agent labels** + agentLabels := s.agentRegistry.GetOrCreate("proxy") + logHandler.Printf("[DIFC] Phase 0: agent secrecy=%v integrity=%v", + agentLabels.GetSecrecyTags(), agentLabels.GetIntegrityTags()) + + // **Phase 1: Guard labels the resource** + resource, operation, err := s.guard.LabelResource(ctx, toolName, args, backend, s.capabilities) + if err != nil { + logHandler.Printf("[DIFC] Phase 1 failed: %v", err) + // On labeling failure, fail closed to prevent enforcement bypass + http.Error(w, "resource labeling failed", http.StatusBadGateway) + return + } + + logHandler.Printf("[DIFC] Phase 1: resource=%s op=%s secrecy=%v integrity=%v", + resource.Description, operation, + resource.Secrecy.Label.GetTags(), resource.Integrity.Label.GetTags()) + + // **Phase 2: Coarse-grained access check** + evalResult := s.evaluator.Evaluate(agentLabels.Secrecy, agentLabels.Integrity, resource, operation) + + if !evalResult.IsAllowed() { + if operation == difc.OperationRead { + // Read in filter mode: skip coarse block, proceed to fine-grained filtering + logHandler.Printf("[DIFC] Phase 2: coarse check failed for read, proceeding to Phase 3") + } else { + // Write 
blocked + logHandler.Printf("[DIFC] Phase 2: BLOCKED %s %s — %s", r.Method, path, evalResult.Reason) + w.Header().Set("Content-Type", "application/json") + w.WriteHeader(http.StatusForbidden) + json.NewEncoder(w).Encode(map[string]string{ + "message": fmt.Sprintf("DIFC policy violation: %s", evalResult.Reason), + }) + return + } + } + + // **Phase 3: Forward to upstream GitHub API** + var resp *http.Response + if graphQLBody != nil { + resp, err = s.forwardToGitHub(ctx, http.MethodPost, "/graphql", bytes.NewReader(graphQLBody), "application/json") + } else { + resp, err = s.forwardToGitHub(ctx, r.Method, path, nil, "") + } + if err != nil { + logHandler.Printf("[DIFC] Phase 3 failed: %v", err) + http.Error(w, "upstream request failed", http.StatusBadGateway) + return + } + defer resp.Body.Close() + + // Read the response body + respBody, err := io.ReadAll(resp.Body) + if err != nil { + http.Error(w, "failed to read upstream response", http.StatusBadGateway) + return + } + + // For non-200 responses, pass through as-is + if resp.StatusCode >= 300 { + h.writeResponse(w, resp, respBody) + return + } + + // Parse the response as JSON for DIFC filtering + var responseData interface{} + if err := json.Unmarshal(respBody, &responseData); err != nil { + // Non-JSON response — pass through + logHandler.Printf("[DIFC] response is not JSON, passing through") + h.writeResponse(w, resp, respBody) + return + } + + // **Phase 4: Guard labels the response** + labeledData, err := s.guard.LabelResponse(ctx, toolName, responseData, backend, s.capabilities) + if err != nil { + logHandler.Printf("[DIFC] Phase 4 failed: %v", err) + // On labeling failure, use coarse-grained result + if evalResult.IsAllowed() { + h.writeResponse(w, resp, respBody) + } else { + h.writeEmptyResponse(w, resp, responseData) + } + return + } + + // **Phase 5: Fine-grained filtering** + var finalData interface{} + if labeledData != nil { + if collection, ok := labeledData.(*difc.CollectionLabeledData); ok { + 
filtered := s.evaluator.FilterCollection( + agentLabels.Secrecy, agentLabels.Integrity, collection, operation) + + logHandler.Printf("[DIFC] Phase 5: %d/%d items accessible", + filtered.GetAccessibleCount(), filtered.TotalCount) + + // Log filtered items + if filtered.GetFilteredCount() > 0 { + logHandler.Printf("[DIFC] Filtered %d items", filtered.GetFilteredCount()) + logger.LogInfo("proxy", "DIFC filtered %d/%d items for %s %s (tool=%s)", + filtered.GetFilteredCount(), filtered.TotalCount, r.Method, path, toolName) + } + + // Strict mode: block entire response if any item filtered + if s.enforcementMode == difc.EnforcementStrict && filtered.GetFilteredCount() > 0 { + logHandler.Printf("[DIFC] STRICT: blocking response — %d filtered items", filtered.GetFilteredCount()) + w.Header().Set("Content-Type", "application/json") + w.WriteHeader(http.StatusForbidden) + json.NewEncoder(w).Encode(map[string]string{ + "message": fmt.Sprintf("DIFC policy violation: %d of %d items not accessible", + filtered.GetFilteredCount(), filtered.TotalCount), + }) + return + } + + finalData, err = filtered.ToResult() + if err != nil { + logHandler.Printf("[DIFC] Phase 5 ToResult failed: %v", err) + h.writeEmptyResponse(w, resp, responseData) + return + } + } else { + // Simple labeled data — already passed coarse check + finalData, err = labeledData.ToResult() + if err != nil { + logHandler.Printf("[DIFC] Phase 5 ToResult failed: %v", err) + h.writeEmptyResponse(w, resp, responseData) + return + } + } + } else { + // No fine-grained labels — use coarse result + if evalResult.IsAllowed() { + finalData = responseData + } else { + h.writeEmptyResponse(w, resp, responseData) + return + } + } + + // **Phase 6: Label accumulation (propagate mode)** + if s.enforcementMode == difc.EnforcementPropagate && labeledData != nil { + overall := labeledData.Overall() + agentLabels.AccumulateFromRead(overall) + logHandler.Printf("[DIFC] Phase 6: accumulated labels") + } + + // Write the filtered 
response + filteredJSON, err := json.Marshal(finalData) + if err != nil { + http.Error(w, "failed to serialize filtered response", http.StatusInternalServerError) + return + } + + // Copy response headers + copyResponseHeaders(w, resp) + w.Header().Set("Content-Type", "application/json") + w.WriteHeader(resp.StatusCode) + w.Write(filteredJSON) +} + +// passthrough forwards a request to the upstream GitHub API without DIFC filtering. +func (h *proxyHandler) passthrough(w http.ResponseWriter, r *http.Request, path string) { + logHandler.Printf("passthrough %s %s", r.Method, path) + + var body io.Reader + if r.Body != nil { + body = r.Body + defer r.Body.Close() + } + + resp, err := h.server.forwardToGitHub(r.Context(), r.Method, path, body, r.Header.Get("Content-Type")) + if err != nil { + http.Error(w, "upstream request failed", http.StatusBadGateway) + return + } + defer resp.Body.Close() + + respBody, err := io.ReadAll(resp.Body) + if err != nil { + http.Error(w, "failed to read upstream response", http.StatusBadGateway) + return + } + + h.writeResponse(w, resp, respBody) +} + +// writeResponse writes an upstream response to the client. +func (h *proxyHandler) writeResponse(w http.ResponseWriter, resp *http.Response, body []byte) { + copyResponseHeaders(w, resp) + w.WriteHeader(resp.StatusCode) + w.Write(body) +} + +// writeEmptyResponse writes an empty JSON response matching the shape of the original data. +// originalData should be the parsed upstream response; nil or unrecognized types fall back to "[]". +// For JSON arrays it writes "[]", for GraphQL objects with a "data" key it writes {"data":null}, +// and for other JSON objects it writes "{}". 
+func (h *proxyHandler) writeEmptyResponse(w http.ResponseWriter, resp *http.Response, originalData interface{}) { + copyResponseHeaders(w, resp) + w.Header().Set("Content-Type", "application/json") + w.WriteHeader(resp.StatusCode) + + var empty string + switch obj := originalData.(type) { + case []interface{}: + empty = "[]" + case map[string]interface{}: + // GraphQL responses wrap their payload in a "data" key + if _, ok := obj["data"]; ok { + empty = `{"data":null}` + } else { + empty = "{}" + } + default: + empty = "[]" // safe default for nil or unknown types + } + w.Write([]byte(empty)) +} + +// copyResponseHeaders copies relevant headers from upstream to the client response. +func copyResponseHeaders(w http.ResponseWriter, resp *http.Response) { + // Copy rate limit headers + for _, h := range []string{ + "X-RateLimit-Limit", "X-RateLimit-Remaining", "X-RateLimit-Reset", + "X-RateLimit-Resource", "X-RateLimit-Used", + "Link", // pagination + "X-GitHub-Request-Id", + } { + if v := resp.Header.Get(h); v != "" { + w.Header().Set(h, v) + } + } +} diff --git a/internal/proxy/proxy.go b/internal/proxy/proxy.go new file mode 100644 index 00000000..2c2e71c2 --- /dev/null +++ b/internal/proxy/proxy.go @@ -0,0 +1,216 @@ +// Package proxy implements a filtering HTTP proxy for the GitHub API. +// It intercepts gh CLI requests (via GH_HOST redirect) and applies +// the same DIFC enforcement pipeline as the MCP gateway, reusing the +// guard WASM module, evaluator, and agent registry. +package proxy + +import ( + "context" + "crypto/tls" + "encoding/json" + "fmt" + "io" + "log" + "net/http" + "strings" + "time" + + "github.com/github/gh-aw-mcpg/internal/config" + "github.com/github/gh-aw-mcpg/internal/difc" + "github.com/github/gh-aw-mcpg/internal/guard" + "github.com/github/gh-aw-mcpg/internal/logger" +) + +var logProxy = logger.New("proxy:proxy") + +const ( + // DefaultGitHubAPIBase is the upstream GitHub API URL. 
+ DefaultGitHubAPIBase = "https://api.github.com" + + // ghHostPathPrefix is the /api/v3/ prefix that gh adds when using GH_HOST. + ghHostPathPrefix = "/api/v3" +) + +// Server is a filtering HTTP forward proxy for the GitHub REST/GraphQL API. +// It loads the same WASM guard used by the MCP gateway and runs the 6-phase +// DIFC pipeline on every proxied response. +type Server struct { + guard guard.Guard + evaluator *difc.Evaluator + agentRegistry *difc.AgentRegistry + capabilities *difc.Capabilities + + githubToken string + githubAPIURL string // upstream base URL (no trailing slash) + + httpClient *http.Client + + // guardInitialized tracks whether LabelAgent has been called + guardInitialized bool + enforcementMode difc.EnforcementMode +} + +// Config holds the configuration for creating a proxy Server. +type Config struct { + // WasmPath is the file path to the guard WASM module. + WasmPath string + + // Policy is the guard policy JSON (e.g. {"allow-only":{...}}). + Policy string + + // GitHubToken is the token forwarded to the upstream GitHub API. + GitHubToken string + + // GitHubAPIURL overrides the upstream API base URL (default: https://api.github.com). + GitHubAPIURL string + + // DIFCMode is the enforcement mode (strict, filter, propagate). + DIFCMode string +} + +// New creates a new proxy Server from the given Config. 
+func New(ctx context.Context, cfg Config) (*Server, error) { + if cfg.WasmPath == "" { + return nil, fmt.Errorf("guard WASM path is required") + } + if cfg.GitHubToken == "" { + return nil, fmt.Errorf("GitHub token is required") + } + + apiURL := cfg.GitHubAPIURL + if apiURL == "" { + apiURL = DefaultGitHubAPIBase + } + apiURL = strings.TrimRight(apiURL, "/") + + // Parse enforcement mode + difcMode, err := difc.ParseEnforcementMode(cfg.DIFCMode) + if err != nil { + if cfg.DIFCMode != "" { + log.Printf("[proxy] WARNING: invalid DIFC mode %q, defaulting to filter", cfg.DIFCMode) + } + difcMode = difc.EnforcementFilter // default to filter for proxy + } + + // Load the WASM guard + g, err := guard.NewWasmGuard(ctx, "github", cfg.WasmPath, nil) + if err != nil { + return nil, fmt.Errorf("failed to load WASM guard from %s: %w", cfg.WasmPath, err) + } + + s := &Server{ + guard: g, + evaluator: difc.NewEvaluatorWithMode(difcMode), + agentRegistry: difc.NewAgentRegistryWithDefaults(nil, nil), + capabilities: difc.NewCapabilities(), + githubToken: cfg.GitHubToken, + githubAPIURL: apiURL, + enforcementMode: difcMode, + httpClient: &http.Client{ + Timeout: 60 * time.Second, + Transport: &http.Transport{ + TLSClientConfig: &tls.Config{MinVersion: tls.VersionTLS12}, + }, + }, + } + + // Initialize guard policy (LabelAgent) + if cfg.Policy != "" { + if err := s.initGuardPolicy(ctx, cfg.Policy); err != nil { + return nil, fmt.Errorf("failed to initialize guard policy: %w", err) + } + } + + return s, nil +} + +// initGuardPolicy calls LabelAgent with the provided policy JSON. 
+func (s *Server) initGuardPolicy(ctx context.Context, policyJSON string) error { + var policy interface{} + if err := json.Unmarshal([]byte(policyJSON), &policy); err != nil { + return fmt.Errorf("invalid policy JSON: %w", err) + } + + // Validate the policy structure + policyMap, ok := policy.(map[string]interface{}) + if !ok { + return fmt.Errorf("policy must be a JSON object") + } + guardPolicy := &config.GuardPolicy{} + if ao, hasAO := policyMap["allow-only"]; hasAO { + aoBytes, _ := json.Marshal(ao) + var allowOnly config.AllowOnlyPolicy + if err := json.Unmarshal(aoBytes, &allowOnly); err != nil { + return fmt.Errorf("invalid allow-only policy: %w", err) + } + guardPolicy.AllowOnly = &allowOnly + } + if err := config.ValidateGuardPolicy(guardPolicy); err != nil { + return fmt.Errorf("policy validation failed: %w", err) + } + + backend := &stubBackendCaller{} + result, err := s.guard.LabelAgent(ctx, policy, backend, s.capabilities) + if err != nil { + return fmt.Errorf("LabelAgent failed: %w", err) + } + + // Apply agent labels + agentLabels := s.agentRegistry.GetOrCreate("proxy") + for _, tag := range result.Agent.Secrecy { + agentLabels.AddSecrecyTag(difc.Tag(tag)) + } + for _, tag := range result.Agent.Integrity { + agentLabels.AddIntegrityTag(difc.Tag(tag)) + } + + // Parse enforcement mode from guard response + if result.DIFCMode != "" { + mode, err := difc.ParseEnforcementMode(result.DIFCMode) + if err == nil { + s.enforcementMode = mode + s.evaluator.SetMode(mode) + } + } + + s.guardInitialized = true + log.Printf("[proxy] Guard initialized: mode=%s, secrecy=%v, integrity=%v", + s.enforcementMode, result.Agent.Secrecy, result.Agent.Integrity) + + return nil +} + +// Handler returns an http.Handler for the proxy server. +func (s *Server) Handler() http.Handler { + return &proxyHandler{server: s} +} + +// stubBackendCaller is a no-op BackendCaller for the proxy. 
+// The guard receives the full API response in LabelResponse, so it +// does not need to make recursive backend calls. +type stubBackendCaller struct{} + +func (s *stubBackendCaller) CallTool(_ context.Context, toolName string, _ interface{}) (interface{}, error) { + logProxy.Printf("stub BackendCaller: ignoring CallTool(%s) — proxy provides full responses", toolName) + return nil, fmt.Errorf("CallTool not supported in proxy mode") +} + +// forwardToGitHub sends a request to the upstream GitHub API and returns the response body. +func (s *Server) forwardToGitHub(ctx context.Context, method, path string, body io.Reader, contentType string) (*http.Response, error) { + url := s.githubAPIURL + path + logProxy.Printf("forwarding %s %s → %s", method, path, url) + + req, err := http.NewRequestWithContext(ctx, method, url, body) + if err != nil { + return nil, fmt.Errorf("failed to create upstream request: %w", err) + } + + req.Header.Set("Authorization", "token "+s.githubToken) + req.Header.Set("Accept", "application/vnd.github+json") + req.Header.Set("User-Agent", "awmg-proxy/1.0") + if contentType != "" { + req.Header.Set("Content-Type", contentType) + } + + return s.httpClient.Do(req) +} diff --git a/internal/proxy/proxy_test.go b/internal/proxy/proxy_test.go new file mode 100644 index 00000000..ef61aed2 --- /dev/null +++ b/internal/proxy/proxy_test.go @@ -0,0 +1,311 @@ +package proxy + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +func TestMatchRoute(t *testing.T) { + tests := []struct { + name string + path string + wantTool string + wantArgs map[string]interface{} + wantNil bool + }{ + // Issues + { + name: "list issues", + path: "/repos/octocat/hello-world/issues", + wantTool: "list_issues", + wantArgs: map[string]interface{}{"owner": "octocat", "repo": "hello-world"}, + }, + { + name: "get issue", + path: "/repos/octocat/hello-world/issues/42", + wantTool: "issue_read", + wantArgs: 
map[string]interface{}{"owner": "octocat", "repo": "hello-world", "issue_number": "42"}, + }, + { + name: "issue comments", + path: "/repos/octocat/hello-world/issues/42/comments", + wantTool: "issue_read", + wantArgs: map[string]interface{}{"owner": "octocat", "repo": "hello-world", "issue_number": "42", "method": "get_comments"}, + }, + { + name: "issue labels", + path: "/repos/octocat/hello-world/issues/42/labels", + wantTool: "issue_read", + wantArgs: map[string]interface{}{"owner": "octocat", "repo": "hello-world", "issue_number": "42", "method": "get_labels"}, + }, + + // Pull Requests + { + name: "list PRs", + path: "/repos/github/gh-aw/pulls", + wantTool: "list_pull_requests", + wantArgs: map[string]interface{}{"owner": "github", "repo": "gh-aw"}, + }, + { + name: "get PR", + path: "/repos/github/gh-aw/pulls/123", + wantTool: "pull_request_read", + wantArgs: map[string]interface{}{"owner": "github", "repo": "gh-aw", "pullNumber": "123", "method": "get"}, + }, + { + name: "PR files", + path: "/repos/github/gh-aw/pulls/123/files", + wantTool: "pull_request_read", + wantArgs: map[string]interface{}{"owner": "github", "repo": "gh-aw", "pullNumber": "123", "method": "get_files"}, + }, + { + name: "PR reviews", + path: "/repos/github/gh-aw/pulls/123/reviews", + wantTool: "pull_request_read", + wantArgs: map[string]interface{}{"owner": "github", "repo": "gh-aw", "pullNumber": "123", "method": "get_reviews"}, + }, + + // Commits + { + name: "list commits", + path: "/repos/org/repo/commits", + wantTool: "list_commits", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo"}, + }, + { + name: "get commit", + path: "/repos/org/repo/commits/abc123", + wantTool: "get_commit", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo", "sha": "abc123"}, + }, + + // Branches + { + name: "list branches", + path: "/repos/org/repo/branches", + wantTool: "list_branches", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo"}, + }, + + // Tags + { 
+ name: "list tags", + path: "/repos/org/repo/tags", + wantTool: "list_tags", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo"}, + }, + + // Releases + { + name: "list releases", + path: "/repos/org/repo/releases", + wantTool: "list_releases", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo"}, + }, + { + name: "latest release", + path: "/repos/org/repo/releases/latest", + wantTool: "get_latest_release", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo"}, + }, + { + name: "release by tag", + path: "/repos/org/repo/releases/tags/v1.0.0", + wantTool: "get_release_by_tag", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo", "tag": "v1.0.0"}, + }, + + // Contents + { + name: "file contents", + path: "/repos/org/repo/contents/README.md", + wantTool: "get_file_contents", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo", "path": "README.md"}, + }, + { + name: "nested file contents", + path: "/repos/org/repo/contents/src/main.go", + wantTool: "get_file_contents", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo", "path": "src/main.go"}, + }, + + // Labels + { + name: "get label", + path: "/repos/org/repo/labels/bug", + wantTool: "get_label", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo", "name": "bug"}, + }, + + // Search + { + name: "search code", + path: "/search/code", + wantTool: "search_code", + wantArgs: map[string]interface{}{}, + }, + { + name: "search issues", + path: "/search/issues", + wantTool: "search_issues", + wantArgs: map[string]interface{}{}, + }, + + // User — not mapped; unknown paths are blocked (fail closed) + { + name: "get me", + path: "/user", + wantNil: true, + }, + + // Query string stripping + { + name: "path with query string", + path: "/repos/org/repo/issues?state=open&per_page=10", + wantTool: "list_issues", + wantArgs: map[string]interface{}{"owner": "org", "repo": "repo"}, + }, + } + + for _, tt := range tests { + 
t.Run(tt.name, func(t *testing.T) { + match := MatchRoute(tt.path) + if tt.wantNil { + assert.Nil(t, match) + return + } + require.NotNil(t, match, "expected route match for %s", tt.path) + assert.Equal(t, tt.wantTool, match.ToolName) + assert.Equal(t, tt.wantArgs, match.Args) + }) + } +} + +func TestStripGHHostPrefix(t *testing.T) { + tests := []struct { + input string + want string + }{ + {"/api/v3/repos/org/repo/issues", "/repos/org/repo/issues"}, + {"/api/v3/user", "/user"}, + {"/api/v3/graphql", "/graphql"}, + {"/repos/org/repo/issues", "/repos/org/repo/issues"}, + {"/user", "/user"}, + } + + for _, tt := range tests { + t.Run(tt.input, func(t *testing.T) { + assert.Equal(t, tt.want, StripGHHostPrefix(tt.input)) + }) + } +} + +func TestMatchGraphQL(t *testing.T) { + tests := []struct { + name string + body string + wantTool string + wantNil bool + }{ + { + name: "issue list query", + body: `{"query":"query { repository(owner: \"octocat\", name: \"hello-world\") { issues(first: 10) { nodes { title } } } }"}`, + wantTool: "list_issues", + }, + { + name: "single issue query", + body: `{"query":"query { repository(owner: \"octocat\", name: \"hello-world\") { issue(number: 1) { title body } } }"}`, + wantTool: "issue_read", + }, + { + name: "PR list query", + body: `{"query":"query { repository(owner: \"org\", name: \"repo\") { pullRequests(first: 10) { nodes { title } } } }"}`, + wantTool: "list_pull_requests", + }, + { + name: "single PR query", + body: `{"query":"query { repository(owner: \"org\", name: \"repo\") { pullRequest(number: 1) { title } } }"}`, + wantTool: "pull_request_read", + }, + { + name: "search query", + body: `{"query":"query { search(query: \"is:issue\", type: ISSUE, first: 10) { nodes { ... 
on Issue { title } } } }"}`, + wantTool: "search_issues", + }, + { + name: "viewer query", + body: `{"query":"query { viewer { login name email } }"}`, + wantNil: true, + }, + { + name: "empty query", + body: `{"query":""}`, + wantNil: true, + }, + { + name: "invalid JSON", + body: `not json`, + wantNil: true, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + match := MatchGraphQL([]byte(tt.body)) + if tt.wantNil { + assert.Nil(t, match) + return + } + require.NotNil(t, match, "expected GraphQL match") + assert.Equal(t, tt.wantTool, match.ToolName) + }) + } +} + +func TestMatchGraphQL_ExtractsOwnerRepo(t *testing.T) { + tests := []struct { + name string + body string + wantOwner string + wantRepo string + }{ + { + name: "inline owner/name", + body: `{"query":"query { repository(owner: \"github\", name: \"copilot\") { issues { nodes { title } } } }"}`, + wantOwner: "github", + wantRepo: "copilot", + }, + { + name: "variables owner/name", + body: `{"query":"query($owner: String!, $name: String!) 
{ repository(owner: $owner, name: $name) { issues { nodes { title } } } }","variables":{"owner":"github","name":"copilot"}}`, + wantOwner: "github", + wantRepo: "copilot", + }, + { + name: "variables with repo key", + body: `{"query":"query { repository(owner: $owner, name: $name) { issues { nodes { title } } } }","variables":{"owner":"org","repo":"myrepo"}}`, + wantOwner: "org", + wantRepo: "myrepo", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + match := MatchGraphQL([]byte(tt.body)) + require.NotNil(t, match) + assert.Equal(t, tt.wantOwner, match.Owner) + assert.Equal(t, tt.wantRepo, match.Repo) + }) + } +} + +func TestIsGraphQLPath(t *testing.T) { + assert.True(t, IsGraphQLPath("/graphql")) + assert.True(t, IsGraphQLPath("/graphql/")) + assert.True(t, IsGraphQLPath("/api/v3/graphql")) + assert.True(t, IsGraphQLPath("/api/v3/graphql/")) + assert.False(t, IsGraphQLPath("/repos/org/repo")) + assert.False(t, IsGraphQLPath("/user")) +} diff --git a/internal/proxy/router.go b/internal/proxy/router.go new file mode 100644 index 00000000..a9108d1b --- /dev/null +++ b/internal/proxy/router.go @@ -0,0 +1,290 @@ +package proxy + +import ( + "regexp" + "strings" + + "github.com/github/gh-aw-mcpg/internal/logger" +) + +var logRouter = logger.New("proxy:router") + +// RouteMatch contains the result of matching a REST API path to a guard tool name. +type RouteMatch struct { + ToolName string + Owner string + Repo string + Args map[string]interface{} // Arguments to pass to LabelResource +} + +// route defines a pattern → tool name mapping. +type route struct { + pattern *regexp.Regexp + toolName string + // extractArgs is called with submatches to build the args map + extractArgs func(matches []string) map[string]interface{} +} + +// repoArgs builds the standard owner/repo args map. 
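+// For example, repoArgs("octocat", "hello-world") returns
+// map[string]interface{}{"owner": "octocat", "repo": "hello-world"}.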
+func repoArgs(owner, repo string) map[string]interface{} { + return map[string]interface{}{ + "owner": owner, + "repo": repo, + } +} + +// routes is the ordered list of REST URL patterns mapped to guard tool names. +// Patterns are tried in order; first match wins. +var routes = []route{ + // Issues + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/issues/(\d+)/comments$`), + toolName: "issue_read", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "issue_number": m[3], "method": "get_comments"} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/issues/(\d+)/labels$`), + toolName: "issue_read", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "issue_number": m[3], "method": "get_labels"} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/issues/(\d+)$`), + toolName: "issue_read", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "issue_number": m[3]} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/issues$`), + toolName: "list_issues", + extractArgs: func(m []string) map[string]interface{} { + return repoArgs(m[1], m[2]) + }, + }, + + // Pull Requests + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/pulls/(\d+)/files$`), + toolName: "pull_request_read", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "pullNumber": m[3], "method": "get_files"} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/pulls/(\d+)/reviews$`), + toolName: "pull_request_read", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "pullNumber": m[3], "method": "get_reviews"} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/pulls/(\d+)/comments$`), + 
toolName: "pull_request_read", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "pullNumber": m[3], "method": "get_review_comments"} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/pulls/(\d+)$`), + toolName: "pull_request_read", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "pullNumber": m[3], "method": "get"} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/pulls$`), + toolName: "list_pull_requests", + extractArgs: func(m []string) map[string]interface{} { + return repoArgs(m[1], m[2]) + }, + }, + + // Commits + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/commits/([^/]+)$`), + toolName: "get_commit", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "sha": m[3]} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/commits$`), + toolName: "list_commits", + extractArgs: func(m []string) map[string]interface{} { + return repoArgs(m[1], m[2]) + }, + }, + + // Branches and Tags + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/branches$`), + toolName: "list_branches", + extractArgs: func(m []string) map[string]interface{} { + return repoArgs(m[1], m[2]) + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/git/ref/tags/(.+)$`), + toolName: "get_tag", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "tag": m[3]} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/tags$`), + toolName: "list_tags", + extractArgs: func(m []string) map[string]interface{} { + return repoArgs(m[1], m[2]) + }, + }, + + // Releases + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/releases/latest$`), + toolName: "get_latest_release", + extractArgs: func(m []string) map[string]interface{} { + return 
repoArgs(m[1], m[2]) + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/releases/tags/(.+)$`), + toolName: "get_release_by_tag", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "tag": m[3]} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/releases$`), + toolName: "list_releases", + extractArgs: func(m []string) map[string]interface{} { + return repoArgs(m[1], m[2]) + }, + }, + + // Contents + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/contents/(.+)$`), + toolName: "get_file_contents", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "path": m[3]} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/git/trees/(.+)$`), + toolName: "get_file_contents", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "path": m[3]} + }, + }, + + // Labels + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/labels/(.+)$`), + toolName: "get_label", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "name": m[3]} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/labels$`), + toolName: "list_labels", + extractArgs: func(m []string) map[string]interface{} { + return repoArgs(m[1], m[2]) + }, + }, + + // Actions (Workflows) + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/actions/workflows$`), + toolName: "actions_list", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "method": "list_workflows"} + }, + }, + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)/actions/runs$`), + toolName: "actions_list", + extractArgs: func(m []string) map[string]interface{} { + return map[string]interface{}{"owner": m[1], "repo": m[2], "method": 
"list_workflow_runs"} + }, + }, + + // Search APIs + { + pattern: regexp.MustCompile(`^/search/code$`), + toolName: "search_code", + extractArgs: func(_ []string) map[string]interface{} { + return map[string]interface{}{} + }, + }, + { + pattern: regexp.MustCompile(`^/search/issues$`), + toolName: "search_issues", + extractArgs: func(_ []string) map[string]interface{} { + return map[string]interface{}{} + }, + }, + { + pattern: regexp.MustCompile(`^/search/repositories$`), + toolName: "search_repositories", + extractArgs: func(_ []string) map[string]interface{} { + return map[string]interface{}{} + }, + }, + + // User API (/user) is intentionally not mapped — it cannot be correctly labeled + // by the guard (no recognized tool name with equivalent semantics) and may contain + // private account data (e.g., email). Unknown paths are blocked by the handler. + + // Generic repo-scoped fallback (must be last) + { + pattern: regexp.MustCompile(`^/repos/([^/]+)/([^/]+)(?:/.*)?$`), + toolName: "get_file_contents", + extractArgs: func(m []string) map[string]interface{} { + return repoArgs(m[1], m[2]) + }, + }, +} + +// MatchRoute matches a REST API path to a guard tool name. +// The path should NOT include the /api/v3 prefix. +func MatchRoute(path string) *RouteMatch { + // Strip query string + if idx := strings.IndexByte(path, '?'); idx >= 0 { + path = path[:idx] + } + + for _, r := range routes { + matches := r.pattern.FindStringSubmatch(path) + if matches != nil { + args := r.extractArgs(matches) + m := &RouteMatch{ + ToolName: r.toolName, + Args: args, + } + if owner, ok := args["owner"].(string); ok { + m.Owner = owner + } + if repo, ok := args["repo"].(string); ok { + m.Repo = repo + } + logRouter.Printf("matched %s → tool=%s owner=%s repo=%s", path, m.ToolName, m.Owner, m.Repo) + return m + } + } + + logRouter.Printf("no route match for %s", path) + return nil +} + +// StripGHHostPrefix removes the /api/v3 prefix that gh adds when using GH_HOST. 
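+//
+// For example:
+//
+//	StripGHHostPrefix("/api/v3/repos/org/repo/issues") // returns "/repos/org/repo/issues"
+//	StripGHHostPrefix("/repos/org/repo/issues")        // no prefix, returned unchanged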
+func StripGHHostPrefix(path string) string { + if strings.HasPrefix(path, ghHostPathPrefix) { + return strings.TrimPrefix(path, ghHostPathPrefix) + } + return path +} diff --git a/test/integration/binary_test.go b/test/integration/binary_test.go index 58f184be..d040ec8a 100644 --- a/test/integration/binary_test.go +++ b/test/integration/binary_test.go @@ -589,15 +589,27 @@ func createMinimalMockMCPBackend(t *testing.T) *httptest.Server { return httptest.NewServer(mux) } -// findBinary locates the awmg binary +// findBinary locates the awmg binary. +// Supports AWMG_BINARY_PATH env var override for container/CI environments. func findBinary(t *testing.T) string { t.Helper() + if p := os.Getenv("AWMG_BINARY_PATH"); p != "" { + absPath, err := filepath.Abs(p) + if err == nil { + if _, err := os.Stat(absPath); err == nil { + return absPath + } + } + t.Fatalf("AWMG_BINARY_PATH=%q does not exist", p) + } + // Look for binary in common locations locations := []string{ "./awmg", // Current directory "../../awmg", // From test/integration "../../../awmg", // Alternative path + "/app/awmg", // Container image path } // Also check in PATH diff --git a/test/integration/proxy_test.go b/test/integration/proxy_test.go new file mode 100644 index 00000000..8a70449a --- /dev/null +++ b/test/integration/proxy_test.go @@ -0,0 +1,586 @@ +package integration + +import ( + "bytes" + "context" + "encoding/json" + "io" + "net/http" + "os" + "os/exec" + "strings" + "testing" + "time" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +// skipIfNoGitHubToken skips the test if no GitHub token is available. 
+func skipIfNoGitHubToken(t *testing.T) string { + t.Helper() + // Try several token sources in order + for _, env := range []string{"GITHUB_TOKEN", "GH_TOKEN", "GITHUB_PERSONAL_ACCESS_TOKEN"} { + if tok := os.Getenv(env); tok != "" { + return tok + } + } + // Try gh auth token + out, err := exec.Command("gh", "auth", "token").Output() + if err == nil { + tok := strings.TrimSpace(string(out)) + if tok != "" { + return tok + } + } + t.Skip("Skipping proxy integration test: no GitHub token available (set GITHUB_TOKEN or run `gh auth login`)") + return "" +} + +// findWasmGuard locates the GitHub guard WASM binary. +// Supports AWMG_WASM_GUARD_PATH env var override for container/CI environments. +func findWasmGuard(t *testing.T) string { + t.Helper() + if p := os.Getenv("AWMG_WASM_GUARD_PATH"); p != "" { + if _, err := os.Stat(p); err == nil { + return p + } + t.Fatalf("AWMG_WASM_GUARD_PATH=%q does not exist", p) + } + locations := []string{ + "../../guards/github-guard/rust-guard/target/wasm32-wasip1/release/github_guard.wasm", + "guards/github-guard/rust-guard/target/wasm32-wasip1/release/github_guard.wasm", + "/guards/github/00-github-guard.wasm", // container image path + } + for _, loc := range locations { + if _, err := os.Stat(loc); err == nil { + return loc + } + } + t.Skip("Skipping proxy integration test: WASM guard not found (run `make build` in guards/github-guard)") + return "" +} + +// proxyTestEnv holds the running proxy server info for tests. +type proxyTestEnv struct { + cmd *exec.Cmd + port string + baseURL string + token string + cancel context.CancelFunc + logDir string + stdout bytes.Buffer + stderr bytes.Buffer +} + +// startProxy starts the awmg proxy with the given policy and returns the test env. 
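+//
+// Callers are responsible for cleanup, for example:
+//
+//	env := startProxy(t, `{"allow-only":{"repos":["octocat/hello-world"],"min-integrity":"none"}}`, "18901")
+//	defer env.stop(t)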
+func startProxy(t *testing.T, policyJSON string, port string) *proxyTestEnv { + t.Helper() + + binaryPath := findBinary(t) + wasmPath := findWasmGuard(t) + token := skipIfNoGitHubToken(t) + + logDir, err := os.MkdirTemp("", "awmg-proxy-test-*") + require.NoError(t, err) + + ctx, cancel := context.WithTimeout(context.Background(), 120*time.Second) + + listenAddr := "127.0.0.1:" + port + + args := []string{ + "proxy", + "--guard-wasm", wasmPath, + "--policy", policyJSON, + "--github-token", token, + "--listen", listenAddr, + "--log-dir", logDir, + "--guards-mode", "filter", + } + + cmd := exec.CommandContext(ctx, binaryPath, args...) + + env := &proxyTestEnv{ + cmd: cmd, + port: port, + baseURL: "http://" + listenAddr, + token: token, + cancel: cancel, + logDir: logDir, + } + + cmd.Stdout = &env.stdout + cmd.Stderr = &env.stderr + + err = cmd.Start() + require.NoError(t, err, "Failed to start proxy") + + // Wait for the proxy to be healthy + healthURL := env.baseURL + "/api/v3/health" + if !waitForServer(t, healthURL, 15*time.Second) { + t.Logf("STDOUT: %s", env.stdout.String()) + t.Logf("STDERR: %s", env.stderr.String()) + t.Fatal("Proxy did not start in time") + } + + t.Logf("✓ Proxy started at %s with policy: %s", listenAddr, policyJSON) + return env +} + +// stop cleans up the proxy process and temp files. +func (e *proxyTestEnv) stop(t *testing.T) { + t.Helper() + if e.cmd.Process != nil { + e.cmd.Process.Kill() + } + e.cancel() + os.RemoveAll(e.logDir) +} + +// ghAPI calls a GitHub REST API endpoint through the proxy using raw HTTP. 
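+// For example:
+//
+//	status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/issues?per_page=5")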
+func (e *proxyTestEnv) ghAPI(t *testing.T, method, path string) (int, []byte) { + t.Helper() + + url := e.baseURL + "/api/v3" + path + req, err := http.NewRequest(method, url, nil) + require.NoError(t, err) + + req.Header.Set("Authorization", "token "+e.token) + req.Header.Set("Accept", "application/vnd.github+json") + + client := &http.Client{Timeout: 30 * time.Second} + resp, err := client.Do(req) + require.NoError(t, err) + defer resp.Body.Close() + + body, err := io.ReadAll(resp.Body) + require.NoError(t, err) + + return resp.StatusCode, body +} + +// ghGraphQL sends a GraphQL query through the proxy. +func (e *proxyTestEnv) ghGraphQL(t *testing.T, query string, variables map[string]interface{}) (int, []byte) { + t.Helper() + + payload := map[string]interface{}{"query": query} + if variables != nil { + payload["variables"] = variables + } + body, err := json.Marshal(payload) + require.NoError(t, err) + + url := e.baseURL + "/api/v3/graphql" + req, err := http.NewRequest("POST", url, bytes.NewReader(body)) + require.NoError(t, err) + + req.Header.Set("Authorization", "token "+e.token) + req.Header.Set("Accept", "application/vnd.github+json") + req.Header.Set("Content-Type", "application/json") + + client := &http.Client{Timeout: 30 * time.Second} + resp, err := client.Do(req) + require.NoError(t, err) + defer resp.Body.Close() + + respBody, err := io.ReadAll(resp.Body) + require.NoError(t, err) + + return resp.StatusCode, respBody +} + +// ghCLI runs a gh CLI command through the proxy using GH_HOST. +func (e *proxyTestEnv) ghCLI(t *testing.T, args ...string) (string, string, error) { + t.Helper() + + ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) + defer cancel() + + cmd := exec.CommandContext(ctx, "gh", args...) 
+ cmd.Env = append(os.Environ(), + "GH_HOST="+e.baseURL[len("http://"):], // strip scheme — gh adds it + "GH_TOKEN="+e.token, + // Disable gh's own TLS since we're using plain HTTP + "GH_PROTOCOL=http", + ) + + var stdout, stderr bytes.Buffer + cmd.Stdout = &stdout + cmd.Stderr = &stderr + + err := cmd.Run() + return stdout.String(), stderr.String(), err +} + +// parseJSONArray parses a JSON array response body. +func parseJSONArray(t *testing.T, body []byte) []interface{} { + t.Helper() + var arr []interface{} + if err := json.Unmarshal(body, &arr); err != nil { + // May be an object (e.g., GraphQL response), not an array + return nil + } + return arr +} + +// parseJSONObject parses a JSON object response body. +func parseJSONObject(t *testing.T, body []byte) map[string]interface{} { + t.Helper() + var obj map[string]interface{} + if err := json.Unmarshal(body, &obj); err != nil { + t.Logf("Warning: failed to parse JSON object: %v (body: %.200s)", err, string(body)) + return nil + } + return obj +} + +// ============================================================================ +// Test Suite: Repo-Scoped AllowOnly Policy +// ============================================================================ + +// TestProxyRepoScope validates that a repo-scoped allow-only policy correctly +// allows access to the scoped repo and blocks access to other repos. +// Note: Policy repos must be lowercase per guard validation rules. 
+func TestProxyRepoScope(t *testing.T) {
+	if testing.Short() {
+		t.Skip("Skipping proxy integration test in short mode")
+	}
+
+	// Policy: only allow access to octocat/hello-world (lowercase per guard validation)
+	policy := `{"allow-only":{"repos":["octocat/hello-world"],"min-integrity":"none"}}`
+	env := startProxy(t, policy, "18901")
+	defer env.stop(t)
+
+	// --- Scoped repo: octocat/Hello-World (should be ALLOWED) ---
+
+	t.Run("ScopedRepo/ListIssues", func(t *testing.T) {
+		status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/issues?per_page=5&state=all")
+		assert.Equal(t, 200, status, "Expected 200 for scoped repo issues")
+		t.Logf("Scoped issues response (%.300s)", string(body))
+	})
+
+	t.Run("ScopedRepo/GetContents", func(t *testing.T) {
+		status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/contents/README")
+		assert.Equal(t, 200, status, "Expected 200 for scoped repo contents")
+		t.Logf("Scoped contents response (%.300s)", string(body))
+	})
+
+	t.Run("ScopedRepo/ListCommits", func(t *testing.T) {
+		status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/commits?per_page=5")
+		assert.Equal(t, 200, status, "Expected 200 for scoped repo commits")
+		t.Logf("Scoped commits: %.300s", string(body))
+	})
+
+	t.Run("ScopedRepo/ListBranches", func(t *testing.T) {
+		status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/branches?per_page=10")
+		assert.Equal(t, 200, status, "Expected 200 for scoped repo branches")
+		t.Logf("Scoped branches: %.300s", string(body))
+	})
+
+	// --- Out-of-scope repo: cli/cli (should be BLOCKED or filtered empty) ---
+
+	t.Run("OutOfScope/ListIssues", func(t *testing.T) {
+		status, body := env.ghAPI(t, "GET", "/repos/cli/cli/issues?per_page=5")
+		// The proxy should either return 403, or return 200 with every item filtered out
+		if status == 200 {
+			arr := parseJSONArray(t, body)
+			assert.Empty(t, arr, "Out-of-scope repo should return empty issues array")
+		} else {
+			assert.Equal(t, 403, status, "Expected 403 when results are not filtered to empty")
+		}
+		t.Logf("Out-of-scope issues: status=%d body=%.200s", status, string(body))
+	})
+
+	t.Run("OutOfScope/GetContents", func(t *testing.T) {
+		status, body := env.ghAPI(t, "GET", "/repos/cli/cli/contents/README.md")
+		// Out-of-scope: expect blocked (403) or empty
+		t.Logf("Out-of-scope contents: status=%d body=%.200s", status, string(body))
+		if status == 200 {
+			arr := parseJSONArray(t, body)
+			if arr != nil {
+				assert.Empty(t, arr, "Out-of-scope contents should be empty")
+			}
+		}
+	})
+
+	// --- Global APIs (should be BLOCKED or empty) ---
+
+	t.Run("Global/User", func(t *testing.T) {
+		status, body := env.ghAPI(t, "GET", "/user")
+		t.Logf("GET /user: status=%d body=%.200s", status, string(body))
+	})
+
+	t.Run("Global/SearchIssues", func(t *testing.T) {
+		status, body := env.ghAPI(t, "GET", "/search/issues?q=repo:octocat/Hello-World+is:issue&per_page=5")
+		t.Logf("Search issues: status=%d body=%.200s", status, string(body))
+		assert.Equal(t, 200, status)
+	})
+}
+
+// ============================================================================
+// Test Suite: Owner-Scoped AllowOnly Policy
+// ============================================================================
+
+// TestProxyOwnerScope validates that an owner-scoped allow-only policy allows
+// access to any repo under the specified owner.
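+// For example, {"allow-only":{"repos":["octocat/*"]}} admits octocat/Hello-World
+// and octocat/Spoon-Knife but not cli/cli.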
+func TestProxyOwnerScope(t *testing.T) { + if testing.Short() { + t.Skip("Skipping proxy integration test in short mode") + } + + // Policy: allow all repos under 'octocat' owner + policy := `{"allow-only":{"repos":["octocat/*"],"min-integrity":"none"}}` + env := startProxy(t, policy, "18902") + defer env.stop(t) + + t.Run("ScopedOwner/HelloWorld/ListIssues", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/issues?per_page=5&state=all") + assert.Equal(t, 200, status) + t.Logf("octocat/Hello-World issues: status=%d body=%.300s", status, string(body)) + }) + + t.Run("ScopedOwner/Spoon-Knife/ListCommits", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/repos/octocat/Spoon-Knife/commits?per_page=5") + assert.Equal(t, 200, status) + t.Logf("octocat/Spoon-Knife commits: status=%d body=%.300s", status, string(body)) + }) + + t.Run("OutOfScope/CliCli/ListIssues", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/repos/cli/cli/issues?per_page=5") + if status == 200 { + arr := parseJSONArray(t, body) + assert.Empty(t, arr, "Out-of-scope repo should return empty array") + } + t.Logf("cli/cli issues: status=%d body=%.200s", status, string(body)) + }) +} + +// ============================================================================ +// Test Suite: Integrity Filtering +// ============================================================================ + +// TestProxyIntegrityFiltering validates that min-integrity filtering works — +// items authored by non-collaborators are filtered out when min-integrity is set. +func TestProxyIntegrityFiltering(t *testing.T) { + if testing.Short() { + t.Skip("Skipping proxy integration test in short mode") + } + + // Policy: allow octocat/hello-world but require approved integrity. 
+ policy := `{"allow-only":{"repos":["octocat/hello-world"],"min-integrity":"approved"}}` + env := startProxy(t, policy, "18903") + defer env.stop(t) + + t.Run("ApprovedIntegrity/ListIssues", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/issues?per_page=30&state=all") + assert.Equal(t, 200, status) + arr := parseJSONArray(t, body) + // With approved integrity, many community issues should be filtered + t.Logf("Issues returned with min-integrity=approved: %d (from 30 requested)", len(arr)) + }) + + t.Run("ApprovedIntegrity/ListCommits", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/commits?per_page=10") + assert.Equal(t, 200, status) + arr := parseJSONArray(t, body) + t.Logf("Commits returned with min-integrity=approved: %d", len(arr)) + }) +} + +// ============================================================================ +// Test Suite: GraphQL via Proxy +// ============================================================================ + +// TestProxyGraphQL validates that GraphQL queries are correctly routed and filtered. +func TestProxyGraphQL(t *testing.T) { + if testing.Short() { + t.Skip("Skipping proxy integration test in short mode") + } + + policy := `{"allow-only":{"repos":["octocat/hello-world"],"min-integrity":"none"}}` + env := startProxy(t, policy, "18904") + defer env.stop(t) + + t.Run("ScopedRepo/IssueList", func(t *testing.T) { + query := `query { + repository(owner: "octocat", name: "Hello-World") { + issues(first: 5) { + nodes { + title + number + author { login } + } + } + } + }` + status, body := env.ghGraphQL(t, query, nil) + assert.Equal(t, 200, status) + t.Logf("GraphQL issues response: %.500s", string(body)) + + obj := parseJSONObject(t, body) + if obj != nil { + assert.NotContains(t, obj, "errors", "Should not have GraphQL errors") + } + }) + + t.Run("ScopedRepo/WithVariables", func(t *testing.T) { + query := `query($owner: String!, $name: String!) 
{ + repository(owner: $owner, name: $name) { + name + description + defaultBranchRef { name } + } + }` + vars := map[string]interface{}{ + "owner": "octocat", + "name": "Hello-World", + } + status, body := env.ghGraphQL(t, query, vars) + assert.Equal(t, 200, status) + t.Logf("GraphQL repo info: %.500s", string(body)) + }) + + t.Run("OutOfScope/DifferentRepo", func(t *testing.T) { + query := `query { + repository(owner: "cli", name: "cli") { + issues(first: 5) { + nodes { + title + number + } + } + } + }` + status, body := env.ghGraphQL(t, query, nil) + t.Logf("Out-of-scope GraphQL: status=%d body=%.500s", status, string(body)) + // Should be blocked or return empty/error + }) + + t.Run("Global/Viewer", func(t *testing.T) { + query := `query { viewer { login name } }` + status, body := env.ghGraphQL(t, query, nil) + t.Logf("Viewer query: status=%d body=%.200s", status, string(body)) + }) +} + +// ============================================================================ +// Test Suite: gh CLI Through Proxy +// ============================================================================ + +// TestProxyGhCLI validates that the actual gh CLI works through the proxy. +// This is the highest-fidelity test — it uses real gh commands. 
+func TestProxyGhCLI(t *testing.T) {
+	if testing.Short() {
+		t.Skip("Skipping proxy gh CLI integration test in short mode")
+	}
+
+	// Check that gh is available
+	if _, err := exec.LookPath("gh"); err != nil {
+		t.Skip("Skipping: gh CLI not found in PATH")
+	}
+
+	policy := `{"allow-only":{"repos":["octocat/hello-world"],"min-integrity":"none"}}`
+	env := startProxy(t, policy, "18905")
+	defer env.stop(t)
+
+	t.Run("ScopedRepo/ApiRepoInfo", func(t *testing.T) {
+		stdout, stderr, err := env.ghCLI(t, "api", "/repos/octocat/Hello-World")
+		if err != nil {
+			t.Logf("gh api failed: %v\nstderr: %s", err, stderr)
+			// gh may not support GH_PROTOCOL=http well; log and skip this subtest
+			t.Skip("gh CLI may not support plain HTTP proxy")
+		}
+		assert.Contains(t, stdout, "Hello-World", "Should return Hello-World repo info")
+		t.Logf("gh api response: %.200s", stdout)
+	})
+
+	t.Run("ScopedRepo/ApiIssues", func(t *testing.T) {
+		stdout, stderr, err := env.ghCLI(t, "api", "/repos/octocat/Hello-World/issues?per_page=3")
+		if err != nil {
+			t.Logf("gh api issues failed: %v\nstderr: %s", err, stderr)
+			t.Skip("gh CLI may not support plain HTTP proxy")
+		}
+		t.Logf("gh api issues: %.200s", stdout)
+	})
+}
+
+// ============================================================================
+// Test Suite: Proxy Health and Basic Operation
+// ============================================================================
+
+// TestProxyHealthAndPassthrough validates basic proxy operation.
+func TestProxyHealthAndPassthrough(t *testing.T) { + if testing.Short() { + t.Skip("Skipping proxy integration test in short mode") + } + + // Open policy — allow everything (for testing passthrough) + policy := `{"allow-only":{"repos":"public","min-integrity":"none"}}` + env := startProxy(t, policy, "18906") + defer env.stop(t) + + t.Run("HealthCheck", func(t *testing.T) { + resp, err := http.Get(env.baseURL + "/api/v3/health") + require.NoError(t, err) + defer resp.Body.Close() + assert.Equal(t, 200, resp.StatusCode) + }) + + t.Run("Passthrough/GetUser", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/user") + assert.Equal(t, 200, status) + // Note: the guard may transform the response format (wrap objects in arrays) + t.Logf("Authenticated user response: status=%d body=%.300s", status, string(body)) + }) + + t.Run("Passthrough/GetPublicRepo", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World") + assert.Equal(t, 200, status) + t.Logf("Public repo response: status=%d body=%.300s", status, string(body)) + }) + + t.Run("Passthrough/ListIssuesWithData", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/issues?per_page=5&state=all") + assert.Equal(t, 200, status) + t.Logf("Issues response: status=%d body=%.300s", status, string(body)) + }) +} + +// ============================================================================ +// Test Suite: Multiple Repos in Policy +// ============================================================================ + +// TestProxyMultiRepoPolicy validates that policies with multiple repo patterns work. 
+func TestProxyMultiRepoPolicy(t *testing.T) { + if testing.Short() { + t.Skip("Skipping proxy integration test in short mode") + } + + // Policy: allow two specific repos + policy := `{"allow-only":{"repos":["octocat/hello-world","octocat/spoon-knife"],"min-integrity":"none"}}` + env := startProxy(t, policy, "18907") + defer env.stop(t) + + t.Run("FirstRepo/HelloWorld", func(t *testing.T) { + status, _ := env.ghAPI(t, "GET", "/repos/octocat/Hello-World/commits?per_page=3") + assert.Equal(t, 200, status) + }) + + t.Run("SecondRepo/SpoonKnife", func(t *testing.T) { + status, _ := env.ghAPI(t, "GET", "/repos/octocat/Spoon-Knife/commits?per_page=3") + assert.Equal(t, 200, status) + }) + + t.Run("OutOfScope/CliCli", func(t *testing.T) { + status, body := env.ghAPI(t, "GET", "/repos/cli/cli/commits?per_page=3") + if status == 200 { + arr := parseJSONArray(t, body) + assert.Empty(t, arr, "Out-of-scope repo commits should be filtered") + } + t.Logf("Out-of-scope commits: status=%d", status) + }) +} + +// ============================================================================ +// Helpers (proxy-specific) +// ============================================================================