
feat(docker-manager): add --diagnostic-logs flag for container failure diagnostics#1906

Merged
lpcox merged 8 commits into main from copilot/awf-implement-diagnostic-logs-flag
Apr 11, 2026


Copilot AI commented Apr 11, 2026

When AWF containers crash before reaching steady state (e.g. Squid exits with code 1 in DinD environments), application-level logs are never written, leaving engineers with no visibility into the root cause. This PR adds opt-in diagnostic collection that is triggered on a non-zero exit.

Changes

New: --diagnostic-logs flag

  • Off by default; enabled via --diagnostic-logs CLI flag or features.awf-diagnostic-logs: true workflow frontmatter
  • Collected before docker compose down -v so container state is still accessible
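
A minimal sketch of how the two opt-in paths might be combined, assuming the frontmatter `features` map is available as a plain object. The function name `isDiagnosticLogsEnabled` and its parameters are hypothetical and not taken from the PR; the real wiring may instead translate the frontmatter feature into the CLI flag.

```typescript
// Hypothetical helper (not from the PR): diagnostics stay off unless the CLI
// flag or the `awf-diagnostic-logs` frontmatter feature is explicitly true.
function isDiagnosticLogsEnabled(
  cliFlag: boolean | undefined,
  features: Record<string, unknown> = {},
): boolean {
  return cliFlag === true || features['awf-diagnostic-logs'] === true;
}

console.log(isDiagnosticLogsEnabled(undefined));                              // false
console.log(isDiagnosticLogsEnabled(true));                                   // true
console.log(isDiagnosticLogsEnabled(false, { 'awf-diagnostic-logs': true })); // true
```

Comparing against `true` strictly keeps the feature off by default even if the frontmatter contains a non-boolean value.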

collectDiagnosticLogs(workDir) (src/docker-manager.ts)

Runs against awf-squid, awf-agent, awf-api-proxy, awf-iptables-init (containers that never started are silently skipped):

  • <container>.log — stdout+stderr via docker logs
  • <container>.state — exit code + error string via docker inspect --format '{{.State.ExitCode}} {{.State.Error}}'
  • <container>.mounts.json — mount metadata via docker inspect --format '{{json .Mounts}}'
  • docker-compose.yml — sanitized copy with env var values matching \w*(?:TOKEN|KEY|SECRET)\w* redacted
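
The redaction rule in the last bullet could be sketched roughly as follows. This is an illustration, not the PR's code: the helper name `redactComposeSecrets`, the `[REDACTED]` placeholder, and the line-based approach are all assumptions, and it only handles simple `NAME: value` and `- NAME=value` entries. The name pattern is the one quoted above and is case-sensitive as written.

```typescript
// Name pattern quoted in the PR description, anchored to the whole variable name.
const SECRET_NAME = /^\w*(?:TOKEN|KEY|SECRET)\w*$/;

// Hypothetical helper: replace the value of any env var whose name matches
// SECRET_NAME. Works line by line, so multi-line YAML scalars are not handled.
function redactComposeSecrets(composeYaml: string): string {
  return composeYaml
    .split('\n')
    .map((line) => {
      // Groups: indentation (plus optional "- "), variable name, separator, value.
      const m = line.match(/^(\s*(?:-\s+)?)(\w+)(\s*[:=]\s*)(.+)$/);
      if (!m) return line;
      const [, prefix, name, sep] = m;
      return SECRET_NAME.test(name) ? `${prefix}${name}${sep}[REDACTED]` : line;
    })
    .join('\n');
}

const sample = [
  'services:',
  '  agent:',
  '    image: example/agent:latest',
  '    environment:',
  '      GITHUB_TOKEN: ghp_example',
  '      - API_KEY=abc123',
].join('\n');

console.log(redactComposeSecrets(sample));
```

Only the two secret-bearing lines change; keys such as `image` and section headers such as `environment:` pass through untouched.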

Explicitly NOT collected: raw env vars, full docker inspect output, host filesystem contents.
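
The per-container collection described above, including the silent skip for containers that never started and the narrow `--format` templates, might look roughly like the sketch below. It is an approximation under stated assumptions: the PR uses `execa`, while this version swaps in Node's built-in `spawnSync` behind an injectable runner so it can be exercised without Docker. The names `Runner`, `gatherDiagnostics`, and `writeDiagnostics` are hypothetical.

```typescript
import * as fs from 'fs';
import * as path from 'path';
import { spawnSync } from 'child_process';

type RunResult = { exitCode: number; stdout: string; stderr: string };
type Runner = (args: string[]) => RunResult;

const CONTAINERS = ['awf-squid', 'awf-agent', 'awf-api-proxy', 'awf-iptables-init'];

// Default runner: calls the docker CLI synchronously (assumption; the PR uses execa).
const dockerRunner: Runner = (args) => {
  const r = spawnSync('docker', args, { encoding: 'utf8' });
  return { exitCode: r.status ?? 1, stdout: r.stdout ?? '', stderr: r.stderr ?? '' };
};

// Returns file name -> content for every container that exists.
function gatherDiagnostics(run: Runner = dockerRunner): Record<string, string> {
  const files: Record<string, string> = {};
  for (const name of CONTAINERS) {
    // Only the exit code and error string are requested, never the full inspect output.
    const state = run(['inspect', '--format', '{{.State.ExitCode}} {{.State.Error}}', name]);
    if (state.exitCode !== 0) continue; // container never started: skip silently
    files[`${name}.state`] = state.stdout.trim() + '\n';

    const logs = run(['logs', name]);
    const combined = [logs.stdout, logs.stderr].filter(Boolean).join('\n').trim();
    if (combined) files[`${name}.log`] = combined + '\n';

    const mounts = run(['inspect', '--format', '{{json .Mounts}}', name]);
    if (mounts.exitCode === 0) files[`${name}.mounts.json`] = mounts.stdout.trim() + '\n';
  }
  return files;
}

// Writes the gathered files into the diagnostics directory.
function writeDiagnostics(diagnosticsDir: string, files: Record<string, string>): void {
  fs.mkdirSync(diagnosticsDir, { recursive: true });
  Object.keys(files).forEach((fileName) => {
    fs.writeFileSync(path.join(diagnosticsDir, fileName), files[fileName]);
  });
}
```

Separating gathering from writing is a design choice of this sketch, made so the Docker calls can be faked in a unit test; the PR's `collectDiagnosticLogs(workDir)` may well do both in one function.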

Preservation (cleanup())

  • With --audit-dir: diagnostics land at <audit-dir>/diagnostics/ (single upload path for CI artifacts)
  • Without: moved to /tmp/awf-diagnostics-<timestamp>/
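
The destination choice above can be sketched as a small pure function. The name `resolveDiagnosticsDestination` is hypothetical, and the timestamp format (epoch milliseconds) is an assumption, since the PR description only shows `<timestamp>`.

```typescript
import * as path from 'path';

// Hypothetical helper: pick where diagnostics are preserved during cleanup.
// With an audit dir they sit beside the other audit artifacts; otherwise they
// go to a timestamped directory under /tmp.
function resolveDiagnosticsDestination(
  auditDir?: string,
  timestamp: number = Date.now(), // assumed format: epoch milliseconds
): string {
  return auditDir
    ? path.posix.join(auditDir, 'diagnostics')
    : path.posix.join('/tmp', `awf-diagnostics-${timestamp}`);
}

console.log(resolveDiagnosticsDestination('/var/awf-audit'));
// /var/awf-audit/diagnostics
console.log(resolveDiagnosticsDestination(undefined, 1775900000000));
// /tmp/awf-diagnostics-1775900000000
```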

Workflow integration (src/cli-workflow.ts)

collectDiagnosticLogs is added as an optional WorkflowDependencies field and is called between runAgentCommand and performCleanup, only when diagnosticLogs && exitCode !== 0.
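
A hedged sketch of that ordering and gating follows. The `SketchDependencies` shape is a simplified stand-in for the real `WorkflowDependencies`, whose actual signatures are not shown in this PR description. The function name `runWorkflowSketch` and the best-effort `try`/`catch` are assumptions.

```typescript
// Simplified, hypothetical stand-in for WorkflowDependencies.
interface SketchDependencies {
  runAgentCommand: () => Promise<{ exitCode: number }>;
  performCleanup: () => Promise<void>;
  collectDiagnosticLogs?: (workDir: string) => Promise<void>; // optional field
}

interface SketchConfig {
  workDir: string;
  diagnosticLogs?: boolean;
}

async function runWorkflowSketch(deps: SketchDependencies, config: SketchConfig): Promise<number> {
  const { exitCode } = await deps.runAgentCommand();

  // Collect only when opted in and the run failed, and always before cleanup,
  // so the containers still exist when they are inspected.
  if (config.diagnosticLogs && exitCode !== 0 && deps.collectDiagnosticLogs) {
    try {
      await deps.collectDiagnosticLogs(config.workDir);
    } catch {
      // Assumption: a diagnostics failure should not mask the original exit code.
    }
  }

  await deps.performCleanup();
  return exitCode;
}
```

Keeping `collectDiagnosticLogs` optional means existing callers and tests that build the dependency object without it continue to work unchanged.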

# Example usage in workflow frontmatter
features:
  awf-diagnostic-logs: true
# CLI usage
awf --diagnostic-logs --allow-domains example.com -- my-agent-command
# On failure: [INFO] Diagnostic logs collected at: /tmp/awf-<ts>/diagnostics/

Copilot AI changed the title from "[WIP] Implement --diagnostic-logs flag for container failure diagnostics" to "feat(docker-manager): add --diagnostic-logs flag for container failure diagnostics" Apr 11, 2026
Copilot AI requested a review from lpcox April 11, 2026 15:29
@lpcox lpcox marked this pull request as ready for review April 11, 2026 15:37
@lpcox lpcox requested a review from Mossaka as a code owner April 11, 2026 15:37
Copilot AI review requested due to automatic review settings April 11, 2026 15:37

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds an opt-in diagnostic logging feature to capture container failure details (logs/state/mounts + sanitized compose) before teardown, improving debuggability when containers fail to reach steady state.

Changes:

  • Introduces --diagnostic-logs / features.awf-diagnostic-logs config to trigger diagnostics on non-zero exit.
  • Implements collectDiagnosticLogs(workDir) and preserves diagnostics during cleanup() (to audit dir or /tmp).
  • Adds workflow and CLI wiring plus unit tests for diagnostics collection and preservation.
Show a summary per file
| File | Description |
| --- | --- |
| src/types.ts | Documents and exposes diagnosticLogs config flag on WrapperConfig. |
| src/docker-manager.ts | Implements diagnostics collection and preservation during cleanup. |
| src/docker-manager.test.ts | Adds unit tests covering diagnostics collection + cleanup preservation behavior. |
| src/cli.ts | Adds --diagnostic-logs option and passes config/dependency through. |
| src/cli-workflow.ts | Calls diagnostics collection before cleanup on non-zero exit. |
| src/cli-workflow.test.ts | Tests call ordering and conditional invocation of diagnostics collection. |

Copilot's findings


  • Files reviewed: 6/6 changed files
  • Comments generated: 5

Comment thread src/docker-manager.ts Outdated
// Capture stdout and stderr; reject: false keeps a failed `docker logs` from throwing.
const result = await execa('docker', ['logs', container], { reject: false });
const combined = [result.stdout, result.stderr].filter(Boolean).join('\n').trim();
if (combined) {
  fs.writeFileSync(path.join(diagnosticsDir, `${container}.log`), combined + '\n');
}
Comment thread src/cli-workflow.ts Outdated
lpcox and others added 4 commits April 11, 2026 08:48
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@github-actions

github-actions Bot commented Apr 11, 2026

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 85.85% | 85.85% | ➡️ +0.00% |
| Statements | 85.76% | 85.73% | 📉 -0.03% |
| Functions | 87.54% | 87.70% | 📈 +0.16% |
| Branches | 78.56% | 78.60% | 📈 +0.04% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/docker-manager.ts | 86.3% → 86.2% (-0.08%) | 85.9% → 85.7% (-0.13%) |
| src/cli-workflow.ts | 92.0% → 92.6% (+0.59%) | 92.0% → 92.6% (+0.59%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 85.85% | 85.86% | 📈 +0.01% |
| Statements | 85.76% | 85.74% | 📉 -0.02% |
| Functions | 87.54% | 87.70% | 📈 +0.16% |
| Branches | 78.56% | 78.60% | 📈 +0.04% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/docker-manager.ts | 86.3% → 86.2% (-0.05%) | 85.9% → 85.8% (-0.10%) |
| src/cli-workflow.ts | 92.0% → 92.6% (+0.59%) | 92.0% → 92.6% (+0.59%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions

⚠️ Coverage Regression Detected

This PR decreases test coverage. Please add tests to maintain coverage levels.

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 85.85% | 85.84% | ➡️ -0.01% |
| Statements | 85.76% | 85.72% | 📉 -0.04% |
| Functions | 87.54% | 87.70% | 📈 +0.16% |
| Branches | 78.56% | 78.60% | 📈 +0.04% |

📁 Per-file Coverage Changes (2 files)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/cli-workflow.ts | 92.0% → 89.7% (-2.35%) | 92.0% → 89.7% (-2.35%) |
| src/docker-manager.ts | 86.3% → 86.2% (-0.05%) | 85.9% → 85.8% (-0.10%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

@github-actions

🔥 Smoke Test Results

Test Status
GitHub MCP (list_pull_requests#1904 "perf: optimize firewall-issue-dispatcher token usage")
GitHub.com connectivity (HTTP 200)
File write/read (smoke-test-copilot-24285974367.txt)

Overall: PASS

Author: @app/copilot-swe-agent | Assignees: @lpcox @Copilot

📰 BREAKING: Report filed by Smoke Copilot

@github-actions

Smoke Test Results

  • ✅ GitHub MCP: perf: optimize firewall-issue-dispatcher token usage / fix: enable bash tool and add GraphQL pagination in firewall-issue-dispatcher
  • ✅ Playwright: github.com title contains "GitHub"
  • ✅ File write: /tmp/gh-aw/agent/smoke-test-claude-24285974365.txt created
  • ✅ Bash: file content verified

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude

@github-actions

PR titles (latest merged):

  • fix: improve issue URL format in firewall-issue-dispatcher prompt
  • perf: optimize firewall-issue-dispatcher token usage

Results: 1✅ 2❌ 3✅ 4❌ 5✅ 6✅ 7❌ 8✅
Overall status: FAIL

🔮 The oracle has spoken through Smoke Codex

@github-actions

Chroot Version Comparison Results

| Runtime | Host Version | Chroot Version | Match? |
| --- | --- | --- | --- |
| Python | Python 3.12.13 | Python 3.12.3 | ❌ |
| Node.js | v24.14.1 | v20.20.2 | ❌ |
| Go | go1.22.12 | go1.22.12 | ✅ |

Result: ❌ Not all tests passed — Python and Node.js versions differ between host and chroot environments.

Tested by Smoke Chroot

@github-actions

Smoke Test: GitHub Actions Services Connectivity ✅

All checks passed:

| Check | Result |
| --- | --- |
| Redis PING (host.docker.internal:6379) | PONG |
| PostgreSQL pg_isready (host.docker.internal:5432) | ✅ accepting connections |
| PostgreSQL SELECT 1 (db: smoketest, user: postgres) | ✅ returned 1 |

Note: redis-cli was unavailable (no sudo), so Redis was tested via TCP socket (nc).

🔌 Service connectivity validated by Smoke Services

@github-actions

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color 1/1 passed ✅ PASS
Go env 1/1 passed ✅ PASS
Go uuid 1/1 passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx all passed ✅ PASS
Node.js execa all passed ✅ PASS
Node.js p-limit all passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #1906 · ● 1.4M ·

@github-actions github-actions Bot mentioned this pull request Apr 11, 2026
@lpcox lpcox merged commit 4da2ae8 into main Apr 11, 2026
52 of 55 checks passed
@lpcox lpcox deleted the copilot/awf-implement-diagnostic-logs-flag branch April 11, 2026 16:23

Development

Successfully merging this pull request may close these issues.

[awf] docker-manager: implement --diagnostic-logs flag for container failure diagnostics

3 participants