Skip to content

fix: default sandbox agent type to AWF when not explicit; reject empty type in strict mode#29663

Merged
pelikhan merged 3 commits intomainfrom
copilot/investigate-smoke-gemini-issues
May 2, 2026
Merged

fix: default sandbox agent type to AWF when not explicit; reject empty type in strict mode#29663
pelikhan merged 3 commits intomainfrom
copilot/investigate-smoke-gemini-issues

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented May 2, 2026

Summary

Fixes the bug where sandbox.agent specified as an object without id or type (e.g. { version: "v0.25.29" }) caused isSandboxEnabled() to return false, silently disabling the AWF firewall and causing the agent to run on the host runner — breaking MCP connectivity.

Root Cause

getAgentType() returns "" when neither ID nor Type is set on AgentSandboxConfig. isSupportedSandboxType("") returns false, so isSandboxEnabled() returned false even though the user clearly intended to use AWF (e.g. by pinning a specific AWF version). applySandboxDefaults only defaulted to AWF for a nil agent, not for a non-nil agent with an empty type.

Changes

pkg/workflow/sandbox.go

applySandboxDefaults: after handling the Agent == nil case, also default Agent.Type to awf when the agent object is present but has no id/type. This covers patterns like:

sandbox:
  agent:
    version: "v0.25.29"   # was: silently treated as disabled; now: defaults to awf

pkg/workflow/strict_mode_sandbox_validation.go

validateStrictSandboxCustomization: in strict mode, reject any non-disabled sandbox.agent that has no explicit id. Ambiguous configurations are not acceptable in strict mode — users must write id: awf explicitly.

.github/workflows/hourly-ci-cleaner.md

Added the missing id: awf to the sandbox.agent block (was using mounts without an explicit id).

Tests

  • TestApplySandboxDefaults: added cases for version-only, empty-ID, and disabled agent configs
  • TestValidateStrictSandboxCustomization: added cases for agent without id and empty agent in strict mode; confirmed disabled agent is exempt

Copilot AI and others added 2 commits May 2, 2026 01:21
…y type in strict mode

When `sandbox.agent` is specified as an object without `id` or `type` (e.g., only
`version`), `getAgentType()` returned `""` and `isSandboxEnabled()` returned false,
silently disabling the AWF firewall.

Two changes:

1. `applySandboxDefaults` (sandbox.go): after setting a nil agent to AWF, also default
   the type to AWF when the agent exists but has no id/type. This covers patterns like
   `sandbox: { agent: { version: "v0.25.29" } }` where the user intended AWF.

2. `validateStrictSandboxCustomization` (strict_mode_sandbox_validation.go): in strict
   mode, reject `sandbox.agent` objects that have no explicit `id`. Ambiguous
   configurations are not accepted when strict: true.

Also fixes `.github/workflows/hourly-ci-cleaner.md` (which used `sandbox.agent.mounts`
without `id: awf`) and adds test coverage for both new behaviours.

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/5fc3fd85-6e8d-44a9-a20b-95bf0cd1689f

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI requested a review from pelikhan May 2, 2026 01:28
@pelikhan pelikhan marked this pull request as ready for review May 2, 2026 01:28
Copilot AI review requested due to automatic review settings May 2, 2026 01:28
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a sandbox enablement bug where sandbox.agent specified as an object without id/type could be treated as “sandbox disabled”, unintentionally running the agent on the host runner and breaking MCP connectivity.

Changes:

  • Default sandbox.agent to AWF when the agent object is present but no id/type is provided.
  • Add strict-mode validation intended to reject ambiguous sandbox.agent configurations.
  • Update workflow config and add/extend unit tests covering the new behaviors.
Show a summary per file
File Description
pkg/workflow/sandbox.go Extends defaulting logic to treat “version-only agent object” as AWF-enabled.
pkg/workflow/sandbox_test.go Adds tests for version-only / empty agent object defaulting and disabled preservation.
pkg/workflow/strict_mode_sandbox_validation.go Adds strict-mode rejection for ambiguous sandbox.agent configurations.
pkg/workflow/strict_mode_sandbox_validation_test.go Adds strict-mode tests for missing-id / empty agent object and disabled exception.
.github/workflows/hourly-ci-cleaner.md Adds explicit id: awf under sandbox.agent.
.github/workflows/hourly-ci-cleaner.lock.yml Regenerates lock metadata to reflect updated workflow content.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 6/6 changed files
  • Comments generated: 2

Comment thread pkg/workflow/sandbox.go
Comment on lines +177 to +181
// the type to awf so the sandbox is always enabled. This prevents a bare
// sandbox.agent object from silently disabling the firewall by leaving the type empty.
// Note: this block is only reached when Agent != nil and Disabled == false (the
// Disabled case returned early above).
if !isSupportedSandboxType(getAgentType(sandboxConfig.Agent)) {
Comment on lines +43 to +55
// In strict mode, sandbox.agent must carry an explicit type/id so the sandbox
// configuration is unambiguous. A bare object (e.g. { version: "v0.25.29" }
// with no id) would silently default to AWF in non-strict builds but that
// implicit defaulting is not acceptable in strict mode.
if !agent.Disabled {
if !isSupportedSandboxType(getAgentType(agent)) {
return fmt.Errorf(
"strict mode: 'sandbox.agent' must specify an explicit 'id' (e.g., id: awf). " +
"A sandbox agent without an 'id' is ambiguous and not allowed in strict mode. " +
"Add 'id: awf' to your sandbox.agent configuration. " +
"See: https://github.github.com/gh-aw/reference/sandbox/",
)
}
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 2, 2026

🧪 Test Quality Sentinel Report

Test Quality Score: 80/100

Excellent

Metric Value
New/modified tests analyzed 6 (table rows)
✅ Design tests (behavioral contracts) 6 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 6 (100%)
Duplicate test clusters 0
Test inflation detected ⚠️ Yes — sandbox_test.go (+51 lines vs +13 production lines, ratio 3.9:1)
🚨 Coding-guideline violations None

Test Classification Details

View all 6 test cases
Test File Classification Issues Detected
TestApplySandboxDefaults / "version-only agent defaults to AWF" pkg/workflow/sandbox_test.go ✅ Design None
TestApplySandboxDefaults / "empty ID agent defaults to AWF" pkg/workflow/sandbox_test.go ✅ Design None
TestApplySandboxDefaults / "disabled agent is preserved" pkg/workflow/sandbox_test.go ✅ Design None
TestValidateStrictSandboxCustomization / "sandbox.agent without id is rejected in strict mode" pkg/workflow/strict_mode_sandbox_validation_test.go ✅ Design None
TestValidateStrictSandboxCustomization / "empty sandbox.agent is rejected in strict mode" pkg/workflow/strict_mode_sandbox_validation_test.go ✅ Design None
TestValidateStrictSandboxCustomization / "disabled sandbox.agent is not rejected here..." pkg/workflow/strict_mode_sandbox_validation_test.go ✅ Design None

Test Inflation Note

sandbox_test.go gained 51 lines against 13 lines of new production code (ratio 3.9:1). This triggered the mechanical inflation penalty but is not a quality concern here — the production change is dense logic (defaulting algorithm) requiring multiple table rows to fully specify its behavior. The extra test lines are all meaningful scenario coverage, not duplication.


Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 6 table-driven sub-cases across 2 files — all unit (//go:build !integration) ✅

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). Build tags are correct, no mock libraries, assertions include descriptive messages. The 3 new rows in TestApplySandboxDefaults directly verify the AWF-defaulting behavioral contract introduced by this fix, and the 3 new rows in TestValidateStrictSandboxCustomization verify the strict-mode rejection contract for ambiguous agent configs.


📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §25240361044

🧪 Test quality analysis by Test Quality Sentinel · ● 532K ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 80/100. Test quality is excellent — 0% of new tests are implementation tests (threshold: 30%). All 6 new table-driven test cases verify behavioral contracts directly. No coding-guideline violations detected.

…ct mode explicit id

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 2, 2026

Commit pushed: d17923a

🏗️ ADR gate enforced by Design Decision Gate 🏗️

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 2, 2026

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (115 new lines in pkg/) but does not have a linked Architecture Decision Record (ADR).

The Design Decision Gate has analyzed the PR diff and generated a draft ADR to help you get started:

📄 Draft ADR: docs/adr/29663-sandbox-agent-default-awf-and-strict-mode-explicit-id.md

Decision Summary (from the draft)

The draft captures two tightly coupled decisions introduced by this PR:

  1. Fail-safe default: When sandbox.agent is a non-nil, non-disabled object with no explicit id/type, applySandboxDefaults now defaults Agent.Type to awf. This prevents the silent firewall bypass where a version-only agent config (e.g. { version: "v0.25.29" }) caused isSandboxEnabled() to return false.

  2. Strict mode tightening: validateStrictSandboxCustomization now rejects any non-disabled sandbox.agent that lacks an explicit id, requiring authors to write id: awf unambiguously in strict-mode workflows.

What to do next

  1. Review the draft ADR committed to your branch at docs/adr/29663-sandbox-agent-default-awf-and-strict-mode-explicit-id.md
  2. Complete any missing context — add details the AI couldn't infer (e.g., the specific smoke-gemini incident trigger, timeline, affected workflows)
  3. Finalize and commit the ADR on your branch
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-29663: Default Sandbox Agent Type to AWF for Ambiguous Configurations and Require Explicit ID in Strict Mode

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

Why ADRs Matter

"AI made me procrastinate on key design decisions. Because refactoring was cheap, I could always say 'I'll deal with this later.' Deferring decisions corroded my ability to think clearly."

ADRs create a searchable, permanent record of why the codebase looks the way it does — especially for security-boundary decisions like this one. Future contributors need to understand why the sandbox defaults the way it does.


📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 29663-sandbox-agent-default-awf.md for PR #29663).

🔒 This PR cannot merge until an ADR is linked in the PR body.

References: §25240361053

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · ● 120.4K ·

@pelikhan pelikhan merged commit 4b203c7 into main May 2, 2026
@pelikhan pelikhan deleted the copilot/investigate-smoke-gemini-issues branch May 2, 2026 01:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants