Skip to content

MCP Gateway: port 80 health check fails with no retry on transient container startup delay #26696

@bryanchen-d

Description

@bryanchen-d

Summary

The "Start MCP Gateway" step performs a single-shot port 80 connectivity check after launching the MCP gateway container. If the container takes slightly longer to bind port 80, the check fails immediately with Port 80 does not appear to be listening and the entire agent job fails. There is no retry or backoff.

Failing run

Timeline

01:46:17.941Z  launcher: Launching new server: serverID=github, command=docker, inContainer=true
01:46:17.943Z  Command: docker
01:46:17.981Z  Checking network connectivity to gateway port...
01:46:18.016Z  Port 80 does not appear to be listening
01:46:18.020Z  ##[error]Process completed with exit code 1.

Note that MCP server connections succeeded (5 tools registered from vscode-errors-gh) — only the gateway port 80 health check failed. The time between container launch (docker command at .943Z) and the health check (.981Z) is ~38ms — very tight for a Docker container to bind a port.

Expected behavior

The port 80 health check should retry with exponential backoff (e.g. 3-5 attempts over a few seconds) before failing the step. A transient container startup delay should not kill the entire workflow run.

Additional context

This is a workflow_dispatch fan-out workflow that dispatches ~8 runs simultaneously. The failure is transient — other runs dispatched at the same time succeeded. The only difference is container startup timing on the runner.

Metadata

Metadata

Labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions