
feat: fail fast when DOCKER_HOST points to an external daemon (workflow-scope DinD)#1909

Merged
lpcox merged 4 commits into main from
copilot/fix-dind-container-networking
Apr 11, 2026

Contributor

Copilot AI commented Apr 11, 2026

Workflow-scope DinD (DOCKER_HOST=tcp://localhost:2375) silently breaks AWF because Docker Compose routes container creation through the DinD daemon's network namespace, where AWF's fixed subnet (172.30.0.0/24) and iptables DNAT rules are unreachable from the runner.

Changes

  • Fail-fast check (src/cli.ts): checkDockerHost() inspects DOCKER_HOST at startup. Any value other than the two canonical local sockets (unix:///var/run/docker.sock, unix:///run/docker.sock) exits immediately with a clear error:

    ❌ DOCKER_HOST is set to an external daemon (tcp://localhost:2375). AWF requires the
    local Docker daemon (default socket). Workflow-scope DinD is incompatible with AWF's
    network isolation model. See the "Workflow-Scope DinD Incompatibility" section in
    docs/usage.md for details and workarounds.
    
  • Documentation (docs/usage.md): New Workflow-Scope DinD Incompatibility subsection under Limitations — explains root cause, the error message, and the --enable-dind workaround for agents that genuinely need Docker access.

  • Tests (src/cli.test.ts): 7 unit tests for checkDockerHost() covering absent/undefined env var, both valid local socket paths, TCP, TLS TCP, and non-standard unix socket.
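
For readers without the diff open, the fail-fast check described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the name `checkDockerHostStrict`, the explicit `env` parameter, and treating an empty string as unset are assumptions. The two socket paths, the `{ valid, error }` result shape, and the error text are taken from this PR.

```typescript
// Hypothetical sketch of the strict check described in this PR.
// Names and the env-parameter signature are assumptions, not the merged API.
type DockerHostCheck = { valid: true } | { valid: false; error: string };

// The two canonical local sockets named in the PR description.
const LOCAL_DOCKER_SOCKETS = [
  "unix:///var/run/docker.sock",
  "unix:///run/docker.sock",
];

function checkDockerHostStrict(
  env: Record<string, string | undefined>,
): DockerHostCheck {
  const dockerHost = env.DOCKER_HOST;

  // Unset (or empty, by assumption) means the default local socket is used.
  if (dockerHost === undefined || dockerHost === "") {
    return { valid: true };
  }

  // Only the canonical local sockets are accepted.
  if (LOCAL_DOCKER_SOCKETS.indexOf(dockerHost) !== -1) {
    return { valid: true };
  }

  // Everything else (tcp://, ssh://, non-standard unix sockets) is rejected.
  return {
    valid: false,
    error:
      `DOCKER_HOST is set to an external daemon (${dockerHost}). ` +
      "AWF requires the local Docker daemon (default socket). " +
      "Workflow-scope DinD is incompatible with AWF's network isolation model. " +
      'See the "Workflow-Scope DinD Incompatibility" section in docs/usage.md for details and workarounds.',
  };
}
```

A caller would presumably print `error` and exit with a non-zero status when `valid` is false, before any containers are created.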

Add checkDockerHost() to src/cli.ts that inspects DOCKER_HOST on startup.
If it points at a non-default socket (e.g. tcp://localhost:2375 for a DinD
sidecar), AWF exits immediately with a clear error explaining why it is
incompatible and pointing at the new docs section.

Also add a "Workflow-Scope DinD Incompatibility" section to docs/usage.md
documenting the root cause, the error message users will see, and the
--enable-dind workaround for agents that genuinely need Docker access.

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/d99ee10d-b3d6-4811-a197-9eb8bb15da2a
Copilot AI changed the title from "[WIP] Fix Docker-in-Docker workflow scope for AWF" to "feat: fail fast when DOCKER_HOST points to an external daemon (workflow-scope DinD)" Apr 11, 2026
Copilot AI requested a review from lpcox April 11, 2026 15:21
@lpcox lpcox marked this pull request as ready for review April 11, 2026 15:38
@lpcox lpcox requested a review from Mossaka as a code owner April 11, 2026 15:38
Copilot AI review requested due to automatic review settings April 11, 2026 15:38
Contributor

Copilot AI left a comment


Pull request overview

Adds an early-startup guard to prevent AWF from running when DOCKER_HOST is configured to use workflow-scope DinD (or any non-default Docker socket), since that breaks AWF’s network/iptables isolation assumptions.

Changes:

  • Add checkDockerHost() and invoke it in the main CLI action to fail fast on unsupported DOCKER_HOST values.
  • Add unit tests covering allowed/blocked DOCKER_HOST cases.
  • Document the workflow-scope DinD incompatibility and the --enable-dind workaround.
Show a summary per file

| File | Description |
|---|---|
| src/cli.ts | Introduces checkDockerHost() and a startup fail-fast check before running the main workflow. |
| src/cli.test.ts | Adds targeted unit tests for the new checkDockerHost() validation behavior. |
| docs/usage.md | Documents why workflow-scope DinD breaks AWF and provides guidance/workarounds. |

Copilot's findings


  • Files reviewed: 3/3 changed files
  • Comments generated: 2

Comment thread src/cli.ts
Comment on lines +907 to +914
    return {
      valid: false,
      error:
        `DOCKER_HOST is set to an external daemon (${dockerHost}). ` +
        'AWF requires the local Docker daemon (default socket). ' +
        'Workflow-scope DinD is incompatible with AWF\'s network isolation model. ' +
        'See the "Workflow-Scope DinD Incompatibility" section in docs/usage.md for details and workarounds.',
    };

Copilot AI Apr 11, 2026


checkDockerHost() rejects non-default unix:// sockets (e.g. unix:///tmp/custom-docker.sock) but the returned error always says "external daemon", which is inaccurate/misleading for local unix sockets. Either (a) change the wording to something like "unsupported DOCKER_HOST value" and explicitly list the only supported socket values, or (b) loosen the check to allow any unix:// socket and only block remote schemes (tcp://, ssh://, etc.).
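
Option (b) could look roughly like the sketch below. The follow-up title quoted in the smoke-test results ("fix: loosen checkDockerHost to accept any unix:// socket") suggests this direction was taken, but the merged code is not shown in this thread, so the function name `checkDockerHostLoose`, the `env` parameter, and the empty-string handling are all assumptions.

```typescript
// Hypothetical sketch of review option (b): accept any unix:// socket and
// block only non-unix schemes. Not the merged implementation.
type LooseDockerHostCheck = { valid: true } | { valid: false; error: string };

function checkDockerHostLoose(
  env: Record<string, string | undefined>,
): LooseDockerHostCheck {
  const dockerHost = env.DOCKER_HOST;

  // Unset (or empty, by assumption) falls back to the default local socket.
  if (dockerHost === undefined || dockerHost === "") {
    return { valid: true };
  }

  // Any unix:// socket is treated as a local daemon.
  if (/^unix:\/\//.test(dockerHost)) {
    return { valid: true };
  }

  // With unix sockets allowed, "external daemon" is now an accurate label
  // for what remains (tcp://, ssh://, and similar).
  return {
    valid: false,
    error:
      `DOCKER_HOST is set to an external daemon (${dockerHost}). ` +
      "AWF requires the local Docker daemon (default socket).",
  };
}
```

The trade-off is that a unix socket can still front a daemon running in a different network namespace, which this check cannot detect; it only removes the misleading wording problem raised above.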

Comment thread src/cli.test.ts Outdated
}
});

it('should return invalid for a TLS TCP daemon', () => {

Copilot AI Apr 11, 2026


This test name implies TLS based only on port 2376, but TLS is controlled by DOCKER_TLS_VERIFY/cert env vars, not the port number. Consider renaming to avoid encoding an assumption (e.g., "should return invalid for a TCP daemon on a non-default port").

Suggested change

    - it('should return invalid for a TLS TCP daemon', () => {
    + it('should return invalid for a TCP daemon on a non-default port', () => {
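
To illustrate the point about port versus TLS, here is a small sketch that reports the two separately. It assumes the Docker CLI convention that a non-empty `DOCKER_TLS_VERIFY` requests TLS verification; the helper name `describeDockerEndpoint` and its return shape are hypothetical and not part of this PR.

```typescript
// Hypothetical helper: the port comes from DOCKER_HOST, while the TLS request
// comes from DOCKER_TLS_VERIFY (assumed: any non-empty value enables it).
interface DockerEndpointInfo {
  scheme: string;
  port: number | undefined;
  tlsVerifyRequested: boolean;
}

function describeDockerEndpoint(
  env: Record<string, string | undefined>,
): DockerEndpointInfo | undefined {
  const host = env.DOCKER_HOST;
  if (!host) {
    return undefined; // nothing to describe when DOCKER_HOST is unset
  }

  // Simplified parse: scheme://host[:port]. IPv6 literals are not handled.
  const match = /^([a-z][a-z0-9+.-]*):\/\/[^/:]*(?::(\d+))?/i.exec(host);
  if (!match) {
    return undefined;
  }

  const tlsVerify = env.DOCKER_TLS_VERIFY;
  return {
    scheme: match[1].toLowerCase(),
    port: match[2] ? Number(match[2]) : undefined,
    tlsVerifyRequested: tlsVerify !== undefined && tlsVerify !== "",
  };
}
```

Under these assumptions, `tcp://localhost:2376` with no TLS variables set is reported as a plain TCP endpoint on port 2376, which is why a test name mentioning TLS would overstate what the test input encodes.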

@github-actions
Contributor

Documentation Preview

Documentation build failed for this PR. View logs.

Built from commit da58570

@github-actions
Contributor

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 85.33% | 85.35% | 📈 +0.02% |
| Statements | 85.18% | 85.20% | 📈 +0.02% |
| Functions | 87.45% | 87.50% | 📈 +0.05% |
| Branches | 77.69% | 77.67% | 📉 -0.02% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/cli.ts | 61.0% → 61.1% (+0.10%) | 61.4% → 61.5% (+0.09%) |
| src/docker-manager.ts | 85.9% → 86.2% (+0.32%) | 85.5% → 85.8% (+0.31%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts
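
The report flags a regression even though three of four metrics improved, which is consistent with a rule of "any metric that drops counts as a regression". The sketch below reproduces the deltas from the overall table under that assumed rule. Whether scripts/ci/compare-coverage.ts actually works this way is not shown in this thread, and the names `CoverageRow`, `coverageDelta`, and `regressedMetrics` are hypothetical.

```typescript
// Hypothetical sketch: recompute the deltas from the overall coverage table
// and flag any metric that decreased. Not the real compare-coverage.ts logic.
interface CoverageRow {
  metric: string;
  base: number; // percent on the base branch
  pr: number; // percent on the PR branch
}

// Round to two decimals so floating-point noise does not leak into the delta.
function coverageDelta(row: CoverageRow): number {
  return Math.round((row.pr - row.base) * 100) / 100;
}

// Assumed rule: a single negative delta is enough to report a regression.
function regressedMetrics(rows: CoverageRow[]): string[] {
  return rows
    .filter((row) => coverageDelta(row) < 0)
    .map((row) => row.metric);
}

// Values copied from the "Overall Coverage" table in this comment.
const overallCoverage: CoverageRow[] = [
  { metric: "Lines", base: 85.33, pr: 85.35 },
  { metric: "Statements", base: 85.18, pr: 85.2 },
  { metric: "Functions", base: 87.45, pr: 87.5 },
  { metric: "Branches", base: 77.69, pr: 77.67 },
];

const regressed = regressedMetrics(overallCoverage);
```

With the table's numbers, only the Branches metric has a negative delta, so it alone would account for the warning under this rule.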


@lpcox lpcox merged commit 8c9668c into main Apr 11, 2026
30 of 32 checks passed
@lpcox lpcox deleted the copilot/fix-dind-container-networking branch April 11, 2026 16:52
@github-actions
Contributor

Smoke Test Results — run 24287094163

✅ GitHub MCP: "fix: remove duplicate paragraph..." / "fix: loosen checkDockerHost..."
✅ Playwright: github.com title contains "GitHub"
✅ File write: /tmp/gh-aw/agent/smoke-test-claude-24287094163.txt created and verified
✅ Bash: file content confirmed

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude

@github-actions
Contributor

🤖 Smoke Test Results

Test Status
GitHub MCP (#1913 fix: remove duplicate paragraph...)
GitHub.com connectivity (HTTP 200)
File write/read (smoke-test-copilot-24287094158.txt)

Overall: PASS · PR by @app/copilot-swe-agent · Assignees: @lpcox @Copilot

📰 BREAKING: Report filed by Smoke Copilot

@github-actions
Contributor

Smoke Test Results
PR Titles: feat: fail fast when DOCKER_HOST points to an external daemon (workflow-scope DinD); fix: loosen checkDockerHost to accept any unix:// socket; fix misleading test name

  1. GitHub MCP last 2 merged PRs ✅
  2. safeinputs-gh PR query ❌
  3. Playwright github.com title contains GitHub ✅
  4. Tavily search returned results ❌
  5. File write in /tmp/gh-aw/agent ✅
  6. Bash cat verification ✅
  7. Discussion query + mystical comment ❌
  8. npm ci && npm run build ✅
    Overall: FAIL

🔮 The oracle has spoken through Smoke Codex

@github-actions
Contributor

Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
|---|---|---|---|
| Python | Python 3.12.13 | Python 3.12.3 | ❌ NO |
| Node.js | v24.14.1 | v20.20.2 | ❌ NO |
| Go | go1.22.12 | go1.22.12 | ✅ YES |

Overall: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environment.

Tested by Smoke Chroot

@github-actions
Contributor

Smoke Test: GitHub Actions Services Connectivity ✅

All checks passed:

| Check | Result |
|---|---|
| Redis PING (host.docker.internal:6379) | PONG |
| PostgreSQL ready (host.docker.internal:5432) | accepting connections |
| PostgreSQL SELECT 1 (db: smoketest, user: postgres) | ✅ returned 1 |

Note: redis-cli was not pre-installed; Redis was tested via nc (raw RESP protocol).

🔌 Service connectivity validated by Smoke Services

@github-actions
Contributor

🏗️ Build Test Suite Results

| Ecosystem | Project | Build/Install / Tests | Status |
|---|---|---|---|
| Bun | elysia | 1/1 passed | ✅ PASS |
| Bun | hono | 1/1 passed | ✅ PASS |
| C++ | fmt | N/A | ✅ PASS |
| C++ | json | N/A | ✅ PASS |
| Deno | oak | N/A 1/1 passed | ✅ PASS |
| Deno | std | N/A 1/1 passed | ✅ PASS |
| .NET | hello-world | N/A | ✅ PASS |
| .NET | json-parse | N/A | ✅ PASS |
| Go | color | 1/1 passed | ✅ PASS |
| Go | env | 1/1 passed | ✅ PASS |
| Go | uuid | 1/1 passed | ✅ PASS |
| Java | gson | 1/1 passed | ✅ PASS |
| Java | caffeine | 1/1 passed | ✅ PASS |
| Node.js | clsx | all passed | ✅ PASS |
| Node.js | execa | all passed | ✅ PASS |
| Node.js | p-limit | all passed | ✅ PASS |
| Rust | fd | 1/1 passed | ✅ PASS |
| Rust | zoxide | 1/1 passed | ✅ PASS |

Overall: 8/8 ecosystems passed — ✅ PASS

Notes
  • Java: Maven required -Dmaven.repo.local=/tmp/gh-aw/agent/m2repo workaround because the default ~/.m2/repository directory was not writable in this runner environment. Both projects compiled and tested successfully once the alternate repo path was used.

Generated by Build Test Suite for issue #1909 · ● 644.3K ·


Development

Successfully merging this pull request may close these issues.

[awf] docker-manager: workflow-scope DinD (DOCKER_HOST) breaks AWF container networking

3 participants