Conversation

Copilot AI commented Feb 4, 2026

The nightly MCP stress test workflow was blocked with 0% test coverage because it attempted to launch MCP server containers from within AWF, where Docker-in-Docker was removed in v0.9.1.

Changes

Workflow configuration (.github/workflows/nightly-mcp-stress-test.md)

  • Add sandbox.mcp configuration with gateway container and 20 MCP servers
  • Remove Go setup step (no longer building gateway in AWF)
  • Restrict filesystem mount to /tmp/mcp-test-fs subdirectory
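
The mount restriction can be checked mechanically. The following is a minimal sketch, not part of this PR: `mount_is_restricted` is a hypothetical helper, and it assumes mount specs use the `host:container:mode` form. It only compares path prefixes and does not resolve `..` segments or symlinks.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the host side of a
# "host:container:mode" mount spec sits strictly below a base directory.
mount_is_restricted() {
  local spec="$1" base="${2:-/tmp}"
  local host="${spec%%:*}"     # text before the first colon
  case "$host" in
    "$base"/?*) return 0 ;;    # e.g. /tmp/mcp-test-fs
    *)          return 1 ;;    # the base itself, or anything outside it
  esac
}

if mount_is_restricted "/tmp/mcp-test-fs:/workspace:rw"; then
  echo "mount is confined to a subdirectory of /tmp"
fi
```

A spec such as `/tmp:/workspace:rw` fails the check, which is the case the restriction is meant to rule out.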

Agent instructions (.github/agentics/nightly-mcp-stress-test.md)

  • Remove gateway build/launch commands (make build, ./awmg)
  • Update test approach to use MCP tools through pre-configured infrastructure
  • Remove gateway lifecycle management steps

Architecture

Before:

AWF Container
└─ Agent → build gateway → launch containers → ❌ Docker unavailable

After:

MCP Gateway Container (outside AWF)
└─ Launches 20 MCP servers via Docker ✅

AWF Container
└─ Agent → HTTP/MCP → Gateway ✅

The gateway now runs as a trusted external service where Docker is available, while the agent communicates with it via HTTP from within AWF's security boundary.

Original prompt

This section details the original issue you should resolve

<issue_title>[mcp-stress-test] Nightly MCP Stress Test Blocked: Docker-in-Docker Not Available in AWF Environment</issue_title>
<issue_description>## Critical Blocker for Nightly Stress Test Workflow

The nightly MCP server stress test workflow cannot execute due to a fundamental environment constraint: Docker-in-Docker support is not available in the AWF firewall container.

Test Session Details

  • Test Session: stress-test-20260204-033819
  • Test Date: 2026-02-04T03:42:00Z
  • Workflow: .github/workflows/nightly-mcp-stress-test.md
  • Status: BLOCKED - Cannot Execute
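
A session ID in the same shape as the one above can be generated with `date`. This is a sketch only: `new_session_id` is a hypothetical name, and using UTC is an assumption based on the `Z` suffix of the test date.

```bash
# Build a session ID in the stress-test-YYYYMMDD-HHMMSS form.
# Assumption: timestamps are taken in UTC.
new_session_id() {
  date -u +"stress-test-%Y%m%d-%H%M%S"
}

session_id="$(new_session_id)"
echo "Session: ${session_id}"
```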

Problem Summary

The stress test attempts to launch 20 MCP servers as Docker containers, but all 20 servers fail immediately because Docker commands are blocked by AWF.

Error Message from MCP Gateway:

ERROR: Docker-in-Docker support was removed in AWF v0.9.1

Docker commands are no longer available inside the firewall container.

If you need to:
- Use MCP servers: Migrate to stdio-based MCP servers (see docs)
- Run Docker: Execute Docker commands outside AWF wrapper
- Build images: Run Docker build before invoking AWF

See PR github/gh-aw-firewall#205: https://github.com/github/gh-aw-firewall/pull/205

Root Cause

  1. AWF Security Policy: Docker-in-Docker explicitly disabled in AWF v0.9.1 (PR #205 in github/gh-aw-firewall)
  2. Test Design: All 20 MCP servers configured as container: "mcp/*" or container: "ghcr.io/*"
  3. Gateway Behavior: Gateway uses docker run to launch container-based servers
  4. Environment: Workflow runs inside AWF firewall container with no Docker access
  5. Result: Zero servers can launch → zero servers can be tested
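
The chain above suggests a preflight probe before any server launch is attempted. Below is a minimal sketch under stated assumptions: `can_run` is a hypothetical helper, and it probes with `docker info` instead of only looking for a `docker` binary on `PATH`, since a binary could be present and still refuse to run.

```bash
# Run the given command quietly and return its exit status.
can_run() {
  "$@" >/dev/null 2>&1
}

# Probe Docker itself; a missing binary simply yields a non-zero status.
if can_run docker info; then
  echo "Docker is usable: container-based MCP servers can be launched"
else
  echo "Docker is not usable here: container-based MCP servers will fail"
fi
```

Failing fast at this point would report one clear blocker instead of 20 identical launch errors.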

Impact

Test Coverage: 0/20 servers tested (0%)

All 20 attempted servers failed with identical Docker availability errors:

  • github (ghcr.io/github/github-mcp-server:v0.30.2)
  • filesystem (mcp/filesystem)
  • memory (mcp/memory)
  • sqlite (mcp/sqlite)
  • postgres (mcp/postgres)
  • brave-search (mcp/brave-search)
  • fetch (mcp/fetch)
  • puppeteer (mcp/puppeteer)
  • slack (mcp/slack)
  • gdrive (mcp/gdrive)
  • google-maps (mcp/google-maps)
  • everart (mcp/everart)
  • sequential-thinking (mcp/sequential-thinking)
  • aws-kb-retrieval (mcp/aws-kb-retrieval)
  • linear (mcp/linear)
  • sentry (mcp/sentry)
  • raygun (mcp/raygun)
  • git (mcp/git)
  • time (mcp/time)
  • axiom (mcp/axiom)
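
For reference, the coverage figure can be computed as follows. This is a small sketch with a hypothetical `coverage_summary` helper, using integer arithmetic only.

```bash
# Print "tested/total (pct%)"; guard against a zero total.
coverage_summary() {
  local tested="$1" total="$2"
  if [ "$total" -eq 0 ]; then
    echo "0/0 (n/a)"
    return 0
  fi
  echo "${tested}/${total} ($(( tested * 100 / total ))%)"
}

coverage_summary 0 20   # the blocked run described in this issue
```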

What Actually Worked ✅

The MCP Gateway behaved correctly:

  • Binary compiled successfully
  • Configuration parsed correctly (20 servers loaded)
  • Server started and bound to port 3000
  • Detected AWF environment correctly
  • Provided clear, actionable error messages

This is not a gateway bug - it's an environment incompatibility between the test design and AWF constraints.

Resolution Options

Option 1: Run Workflow Outside AWF (Recommended)

Pros:

  • No code changes needed
  • Tests gateway as designed (with container launching)
  • Quick to implement

Cons:

  • Less security isolation
  • May require different workflow runner

Implementation:

  • Modify workflow to run on standard GitHub runner (not AWF container)
  • OR: Run workflow on self-hosted runner with Docker access

Option 2: Use HTTP-Based MCP Servers

Pros:

  • Servers run outside workflow (no Docker needed)
  • Tests gateway's HTTP proxy capabilities
  • Maintains security boundary

Cons:

  • Requires pre-deployed MCP servers
  • Doesn't test gateway's container launching
  • Complex infrastructure setup

Implementation:

  • Deploy MCP servers externally (e.g., cloud instances)
  • Configure stress test with type: "http" and url instead of container

Option 3: Use Stdio-Based Non-Container Servers

Pros:

  • Can run inside AWF
  • Tests gateway stdio capabilities
  • No Docker dependency

Cons:

  • Requires rewriting/rebuilding MCP servers as binaries
  • Most MCP servers distributed as containers only
  • Significant development effort

Implementation:

  • Build or find stdio-compatible MCP server binaries
  • Deploy binaries into workflow environment
  • Configure with command instead of container

Option 4: Hybrid Approach

Pros:

  • Partial test coverage better than none
  • Incremental improvement possible
  • Flexible

Cons:

  • Incomplete coverage
  • Maintains complexity

Implementation:

  • Identify which servers can run as stdio processes
  • Test subset (e.g., 5-10 servers)
  • Document remaining servers as "requires Docker"

Option 5: Disable Stress Test

Pros:

  • Acknowledges limitation clearly
  • Frees up workflow resources
  • Simple

Cons:

  • Zero multi-server test coverage
  • No regression detection for scaling issues

Implementation:

  • Disable .github/workflows/nightly-mcp-stress-test.md workflow
  • Document as known limitation in README

Recommendations

Immediate Actions

  1. Document blocker (this issue)
  2. 🔲 Disable workflow until resol...

Custom agent used: agentic-workflows
GitHub Agentic Workflows (gh-aw) - Create, debug, and upgrade AI-powered workflows with intelligent prompt routing



Copilot AI and others added 3 commits February 4, 2026 03:53

- Add sandbox.mcp configuration with 20 MCP servers
- Remove gateway build/launch from agent instructions
- Update test approach to use MCP tools directly through gateway
- Remove unnecessary Go setup step
- Simplify test instructions to use pre-configured infrastructure

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

- Change filesystem mount from /tmp to /tmp/mcp-test-fs
- Prevents filesystem server from accessing sensitive host files
- Maintains required write access in a restricted path

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

- Use specific format 'stress-test-YYYYMMDD-HHMMSS' instead of generic '{timestamp}'
- Improves clarity for agents following the instructions

Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix Docker-in-Docker support for nightly MCP stress test" to "Fix nightly MCP stress test: use external gateway instead of Docker-in-Docker" Feb 4, 2026
Copilot AI requested a review from lpcox February 4, 2026 03:57
@lpcox lpcox marked this pull request as ready for review February 4, 2026 04:10
Copilot AI review requested due to automatic review settings February 4, 2026 04:10
@lpcox lpcox merged commit e0a0306 into main Feb 4, 2026
@lpcox lpcox deleted the copilot/fix-docker-in-docker-support branch February 4, 2026 04:10
Copilot AI left a comment

Pull request overview

This PR fixes the nightly MCP stress test workflow, which was blocked because Docker-in-Docker was removed in AWF v0.9.1. The solution runs the MCP Gateway as an external service (outside AWF) where Docker is available, while the agent communicates with it via MCP tools from within the secure AWF environment.

Changes:

  • Added sandbox.mcp configuration with the MCP Gateway container and 20 MCP server definitions to the workflow
  • Removed Go setup step and gateway build/launch commands from agent instructions
  • Updated testing approach to use pre-configured MCP infrastructure instead of manually building and launching the gateway

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| .github/workflows/nightly-mcp-stress-test.md | Added sandbox.mcp configuration with gateway container (v0.0.94) and 20 MCP servers including GitHub, filesystem, memory, sqlite, postgres, and 15 other services; removed Go setup step |
| .github/agentics/nightly-mcp-stress-test.md | Removed gateway build/launch instructions, updated testing approach to use pre-configured MCP servers, streamlined test execution steps |

Comments suppressed due to low confidence (1)

.github/agentics/nightly-mcp-stress-test.md:160

  • Incomplete merge or editing error in Step 4 section. Lines 148-158 contain orphaned content from the old version including:
  • Incomplete sentence at line 148 ("Create a comprehensive test report documenting your findings.")
  • Orphaned bash command with unclosed code block (lines 150-152)
  • Numbered step "3. Analyze gateway performance" (lines 154-158) without corresponding steps 1 and 2
  • Duplicate heading "## Step 5: Generate Test Report" at line 160 (with emoji 📊), when Step 4 already has the same title with emoji 📝 at line 146

This appears to be leftover content from the old version that should have been removed. The section should flow directly from Step 4's title to the "Summary Statistics" subsection without these orphaned fragments.

## Step 4: Generate Test Report 📝

Create a comprehensive test report documenting your findings.
   
   # Parse for errors
   grep -i error /tmp/mcp-stress-test/logs/*.log > /tmp/mcp-stress-results/errors.txt
  1. Analyze gateway performance:
    • Check for memory leaks
    • Measure startup time for each server
    • Count total requests and failures
    • Identify slowest servers

Step 5: Generate Test Report 📊

</details>




Comment on lines +114 to +124
### Example Test Pattern

**Authentication Required:**
- Error message contains "authentication", "unauthorized", "token", "API key"
- HTTP 401 status code
- Tool invocation fails due to missing credentials
For the GitHub server (which has authentication configured):
```bash
# You can directly use MCP tools configured in the workflow
# The MCP gateway handles the routing automatically
# Example: Use bash to log your testing approach
echo "Testing github server..."
```

**Protocol Error:**
- Invalid JSON-RPC response
- MCP protocol violation
- Malformed request/response
Then attempt to use a GitHub MCP tool. If it works, record success. If it fails, record the error and category.
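
The category hints quoted in this hunk can be turned into a small classifier. The following is a sketch only: `categorize_error` and its labels (`auth`, `protocol`, `timeout`, `other`) are hypothetical, and the keyword matching is deliberately coarse.

```bash
# Map an error message to a coarse category using keyword matching.
# The category labels are illustrative, not defined by the workflow.
categorize_error() {
  local msg
  msg="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$msg" in
    *authentication*|*unauthorized*|*token*|*"api key"*|*401*) echo "auth" ;;
    *json-rpc*|*protocol*|*malformed*)                         echo "protocol" ;;
    *timeout*|*"timed out"*)                                   echo "timeout" ;;
    *)                                                         echo "other" ;;
  esac
}

categorize_error "HTTP 401 Unauthorized"
```

Because the first matching pattern wins, a message that mentions both a token and a protocol violation is recorded as `auth`; reorder the patterns if another priority is preferred.
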

Copilot AI Feb 4, 2026

The testing instructions lack concrete guidance on how to discover and invoke MCP tools from the configured servers. While lines 79-81 state what the agent should do (discover tools, invoke them, record results), and lines 87-92 suggest which tools to try, the instructions don't explain:

  1. How to programmatically discover what tools each server provides (is there a list_tools function? Should the agent try to introspect?)
  2. The exact syntax/mechanism for invoking MCP tools from bash scripts or directly
  3. How to capture and parse tool responses and errors systematically

Compare this to the removed old version which had explicit curl commands with JSON-RPC requests. The new approach assumes the agent knows how to interact with pre-configured MCP servers, but provides no concrete examples or API reference.

Consider adding a concrete example showing how to invoke at least one MCP tool and check its response, to serve as a pattern the agent can follow for the other 19 servers.

See below for a potential fix:

For the GitHub server (which has authentication configured), follow this concrete pattern:

1. **Set the MCP Gateway URL**

   ```bash
   # MCP gateway base URL (injected by the workflow or use a default)
   MCP_GATEWAY_URL="${MCP_GATEWAY_URL:-http://127.0.0.1:3000}"

   echo "Testing github server via MCP gateway at ${MCP_GATEWAY_URL}..."
   ```

2. **Discover available tools for the github server**

   This uses a JSON-RPC tools/list call to ask the gateway which tools are exposed for the github server.

   ```bash
   curl -sS "${MCP_GATEWAY_URL}" \
     -H 'Content-Type: application/json' \
     -d '{
       "jsonrpc": "2.0",
       "id": "list-github-tools",
       "method": "tools/list",
       "params": { "server": "github" }
     }' | tee github-tools-response.json

   echo "Discovered tools for github server:"
   jq '.result.tools // .error' github-tools-response.json
   ```

   • If .result.tools is present, the server responded successfully with a tool list.
   • If .error is present instead, record the error message and categorize it (e.g., auth, timeout, protocol).

3. **Invoke a specific GitHub MCP tool**

   After listing tools, pick a simple, read-only tool (replace get_repo and its params with a real tool and arguments from the previous step):

   ```bash
   curl -sS "${MCP_GATEWAY_URL}" \
     -H 'Content-Type: application/json' \
     -d '{
       "jsonrpc": "2.0",
       "id": "call-github-tool",
       "method": "tools/call",
       "params": {
         "server": "github",
         "tool": "get_repo",
         "arguments": {
           "owner": "octocat",
           "repo": "hello-world"
         }
       }
     }' | tee github-tool-call-response.json

   echo "Tool call result (or error) for github server:"
   jq '{result: .result, error: .error}' github-tool-call-response.json
   ```

   • If .result is non-null, record this server as Success and capture a brief summary of the output.
   • If .error is non-null, record this server as Failure and categorize the error (auth, protocol, timeout, etc.).

Then repeat this pattern for each of the other MCP servers:

  • Use tools/list with server: "<server-name>" to discover available tools.
  • Select a simple, read-only tool from the list.
  • Use tools/call with the chosen tool name and minimal arguments.
  • Parse responses and errors with jq and record the outcome for your final report.

For servers without authentication:

  • Attempt to use a tool with tools/call as above.
  • If it fails due to missing authentication, document the required token or permission in your notes.
  • Move to the next server and continue testing.
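
Repeating the pattern per server can be sketched as a loop that builds one request body per server name. This is illustrative only: `tools_list_request` is a hypothetical helper, and the `"server"` parameter simply mirrors the suggestion above; it is an assumption, not a verified gateway API. The helper does no JSON escaping, so it is only suitable for plain names like the ones in this workflow.

```bash
# Build a JSON-RPC body for listing one server's tools.
# Assumption: the "server" param follows the suggested pattern above.
tools_list_request() {
  local server="$1"
  printf '{"jsonrpc":"2.0","id":"list-%s-tools","method":"tools/list","params":{"server":"%s"}}\n' \
    "$server" "$server"
}

# Print the request bodies for a few of the configured servers.
for server in github filesystem memory; do
  tools_list_request "$server"
done
```

Each printed body could then be passed to `curl -d` as in the steps above, once the gateway's actual request shape has been confirmed.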

Copilot uses AI. Check for mistakes.
@@ -18,18 +18,80 @@ network:
- defaults
- go

Copilot AI Feb 4, 2026

The network configuration still includes "go", but the Go setup step was removed from the workflow. Since the workflow no longer builds the gateway locally (it uses a pre-built container), the "go" network allowlist entry is unnecessary. The agent won't be running any Go commands or accessing Go module repositories.

Consider removing "go" from the network allowed list as it's no longer needed after removing the Go setup step.

Suggested change
- go

Comment on lines +38 to +39
mounts:
- "/tmp/mcp-test-fs:/workspace:rw"

Copilot AI Feb 4, 2026

The mounts field is used in the filesystem server configuration but is not documented in the MCP server schema definition (agentic-workflow.json:5740-5880 stdio_mcp_tool). The field is supported by the gateway's StdinServerConfig (internal/config/config_stdin.go:54) and used in the example config.json file, but missing from the JSON schema that validates workflow configurations.

While this won't cause runtime errors (the gateway supports it), it creates a documentation gap where the schema doesn't reflect the actual supported fields. Consider adding a mounts field definition to the stdio_mcp_tool schema in .github/aw/schemas/agentic-workflow.json.

Development

Successfully merging this pull request may close these issues.

[mcp-stress-test] Nightly MCP Stress Test Blocked: Docker-in-Docker Not Available in AWF Environment

2 participants