Summary
The Codex smoke test (smoke-codex.lock.yml) was failing at "Validate safe outputs were invoked" with dynamic_tool_count=0. After extensive debugging across multiple CI runs, we identified two root causes and fixed them in PR github/gh-aw-firewall#2123.
Root Cause 1: Gateway auth header mismatch
The MCP gateway generates its own session-based HMAC-signed auth tokens (output in gateway-output.json), which differ from the raw MCP_GATEWAY_API_KEY / gateway-api-key output value.
The compiled lock file was using bearer_token_env_var = "AWF_GATEWAY_TOKEN" in the Codex config.toml, which sends the raw API key as a Bearer token. The gateway rejects this with 401 {"error":"unauthorized","message":"invalid API key"}.
Evidence: The check_mcp_servers.sh pre-check passes because it reads headers.Authorization from the gateway output file (which contains the correct HMAC-signed token). But Codex was sending the raw key directly.
The rmcp error "data did not match any variant of untagged enum JsonRpcMessage" seen in the Codex logs was a consequence of this: rmcp tried to parse the 401 JSON error body as a JsonRpcMessage and failed.
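To illustrate why the 401 body cannot be parsed: every JSON-RPC 2.0 message carries a "jsonrpc": "2.0" member, and the gateway's error body has none. The following is a rough sketch only. The helper name looks_like_jsonrpc is hypothetical, and it uses a crude textual match rather than a real JSON parser.

```shell
# Hypothetical helper: succeed if the given body looks like a JSON-RPC 2.0 message.
# Crude textual check only; it does not validate the JSON itself.
looks_like_jsonrpc() {
  printf '%s' "$1" | grep -Eq '"jsonrpc"[[:space:]]*:[[:space:]]*"2\.0"'
}

# The 401 body returned by the gateway (copied from the error above).
body='{"error":"unauthorized","message":"invalid API key"}'

if looks_like_jsonrpc "$body"; then
  echo "body looks like a JSON-RPC message"
else
  echo "body is not a JSON-RPC message; check the HTTP status and auth header first"
fi
```

When this parse error shows up in client logs, it may be worth inspecting the HTTP status of the gateway response before suspecting the MCP server itself.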
Root Cause 2: Config path mismatch
convert_gateway_config_codex.cjs writes the correct Codex config (with gateway-generated auth headers and 172.30.0.1-resolved URLs) to:
${RUNNER_TEMP}/gh-aw/mcp-config/config.toml
But CODEX_HOME is set to:
These are different paths, so the converter's output never reaches Codex. The converter's config is correct — it uses http_headers = { Authorization = "..." } with the gateway-generated auth — but Codex never sees it.
Note: We can't simply set CODEX_HOME to ${RUNNER_TEMP}/gh-aw/mcp-config because RUNNER_TEMP is mounted read-only inside the AWF container, and Codex needs to write logs/cache/shell_snapshots to CODEX_HOME.
Fix
Instead of manually constructing config.toml with bearer_token_env_var, we:
- Copy the converter's output from ${RUNNER_TEMP}/gh-aw/mcp-config/config.toml to $CODEX_HOME/config.toml
- Prepend the [shell_environment_policy] section (which the converter doesn't generate)
This gives Codex the correct gateway-generated auth headers while keeping CODEX_HOME writable.
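The two steps above can be sketched as a small shell function. This is a minimal sketch under stated assumptions, not the exact change made in the PR: the function name install_codex_config and the policy body passed to it are hypothetical, and only the source and destination paths come from the description above.

```shell
# Hypothetical helper: build <dest_dir>/config.toml from the converter's output.
#   $1  path to the converter's config.toml
#   $2  writable directory used as CODEX_HOME
#   $3  body of the [shell_environment_policy] table (placeholder, supplied by the caller)
install_codex_config() {
  local src="$1" dest_dir="$2" policy_body="$3"

  if [ ! -f "$src" ]; then
    echo "converter output not found: $src" >&2
    return 1
  fi

  mkdir -p "$dest_dir" || return 1

  # Write the policy table first, then append the converter's output unchanged,
  # so the gateway-generated http_headers entries are preserved as-is.
  {
    printf '[shell_environment_policy]\n%s\n\n' "$policy_body"
    cat "$src"
  } > "$dest_dir/config.toml"
}

# Example call (assumed invocation; the policy body shown is a placeholder):
# install_codex_config "${RUNNER_TEMP}/gh-aw/mcp-config/config.toml" "${CODEX_HOME}" 'inherit = "all"'
```

Prepending a table is only safe if the converter's output begins with a table header. Any bare top-level keys at the start of that file would otherwise be absorbed into [shell_environment_policy].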
Suggestions for gh-aw
- Consider having convert_gateway_config_codex.cjs write directly to CODEX_HOME (or accept an output path argument), so downstream consumers don't need to know about the RUNNER_TEMP intermediate path.
- Document that gateway-api-key output is not directly usable as a Bearer token for the gateway. The raw key and the gateway's session auth tokens are different things. Tools like check_mcp_servers.sh correctly use the gateway output file, but it's easy to assume the raw key works.
- Consider outputting the gateway auth header (not just the raw key) as a step output, so lock files can use it directly without reading the gateway output JSON.
Affected Versions
- gh-aw-actions/setup with mcpg v0.2.26
- Codex CLI 0.118.0
- AWF firewall (smoke-codex workflow)
Related