Skip to content

fix: sync test expectations with COPILOT_API_KEY injection and BYOK default model#28136

Merged
pelikhan merged 2 commits intomainfrom
copilot/fix-github-actions-workflow-test
Apr 23, 2026
Merged

fix: sync test expectations with COPILOT_API_KEY injection and BYOK default model#28136
pelikhan merged 2 commits intomainfrom
copilot/fix-github-actions-workflow-test

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 23, 2026

Two recent production changes to the Copilot engine execution path broke unit and golden tests that weren't updated alongside them:

  1. COPILOT_API_KEY: dummy-byok-key-for-offline-mode is now unconditionally injected into the agent env (after all other env merges) to trigger AWF's runtime BYOK detection path
  2. COPILOT_MODEL fallback changed from || ''|| 'claude-sonnet-4.6' (CopilotBYOKDefaultModel), since BYOK providers require a non-empty model

Changes

  • threat_detection_test.go: Update 2 TestCopilotDetectionDefaultModel cases that asserted || '' to use constants.CopilotBYOKDefaultModel instead
  • Golden files (basic-copilot.golden, with-imports.golden): Regenerate to reflect both new env vars in the compiled output:
    + COPILOT_API_KEY: dummy-byok-key-for-offline-mode
      COPILOT_GITHUB_TOKEN: ${{ secrets.COPILOT_GITHUB_TOKEN }}
    - COPILOT_MODEL: ${{ vars.GH_AW_MODEL_DETECTION_COPILOT || '' }}
    + COPILOT_MODEL: ${{ vars.GH_AW_MODEL_DETECTION_COPILOT || 'claude-sonnet-4.6' }}

…d model defaults

- Update TestCopilotDetectionDefaultModel to expect `|| 'claude-sonnet-4.6'` instead of `|| ''`
- Update golden files (basic-copilot, with-imports) to include COPILOT_API_KEY env var
  and updated COPILOT_MODEL fallback default

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/77b5bfb4-f827-4c3e-8e8c-137dfb3332fb

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@pelikhan pelikhan marked this pull request as ready for review April 23, 2026 16:37
Copilot AI review requested due to automatic review settings April 23, 2026 16:37
@pelikhan pelikhan merged commit 7dc3c76 into main Apr 23, 2026
28 of 38 checks passed
@pelikhan pelikhan deleted the copilot/fix-github-actions-workflow-test branch April 23, 2026 16:37
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 23, 2026

Smoke CI completed successfully!

@github-actions
Copy link
Copy Markdown
Contributor

✅ smoke-ci: safeoutputs CLI comment + comment-memory run (24846997117)

Generated by Smoke CI for issue #28136 ·

@github-actions
Copy link
Copy Markdown
Contributor

Comment Memory

`````` CI lights the path\nGreen checks bloom at dawn\nQuiet bots still sing ``````

Note

This comment is managed by comment memory.

What this comment does

It stores persistent context for this thread in the <gh-aw-comment-memory> block at the top of this comment.
Edit only the text in that block; workflow metadata and the footer are regenerated automatically.

Generated by Smoke CI for issue #28136 ·

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates unit and golden tests to match recent Copilot execution-path changes: unconditional COPILOT_API_KEY injection for offline/BYOK detection and a non-empty default COPILOT_MODEL fallback (claude-sonnet-4.6).

Changes:

  • Updated TestCopilotDetectionDefaultModel expectations to use constants.CopilotBYOKDefaultModel in the env-var fallback expression.
  • Regenerated/updated golden workflow outputs to include COPILOT_API_KEY and the new COPILOT_MODEL fallback behavior.
Show a summary per file
File Description
pkg/workflow/threat_detection_test.go Aligns threat-detection unit test expectations with the new default model fallback expression.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/basic-copilot.golden Updates compiled YAML golden output to include injected COPILOT_API_KEY and updated COPILOT_MODEL fallback.
pkg/workflow/testdata/TestWasmGolden_CompileFixtures/with-imports.golden Updates compiled YAML golden output (with imports) to include injected COPILOT_API_KEY and updated COPILOT_MODEL fallback.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 3/3 changed files
  • Comments generated: 2

Comment on lines 1088 to +1090
// Detection uses env var fallback (same pattern as main agent), allowing
// the Copilot CLI to pick its native default (currently claude-sonnet-4.6)
expectedModel: "${{ vars." + constants.EnvVarModelDetectionCopilot + " || '' }}",
expectedModel: "${{ vars." + constants.EnvVarModelDetectionCopilot + " || '" + constants.CopilotBYOKDefaultModel + "' }}",
Copy link

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment/docs around this assertion still say detection uses an empty-string fallback so the CLI picks its native default, but the expectation was changed to an explicit fallback model (CopilotBYOKDefaultModel). Please update the nearby comment text to match the new behavior (and ideally reference the constant rather than hard-coding a model name) to avoid misleading future readers.

Copilot uses AI. Check for mistakes.
Comment on lines 411 to +414
COPILOT_AGENT_RUNNER_TYPE: STANDALONE
COPILOT_API_KEY: dummy-byok-key-for-offline-mode
COPILOT_GITHUB_TOKEN: ${{ secrets.COPILOT_GITHUB_TOKEN }}
COPILOT_MODEL: ${{ vars.GH_AW_MODEL_DETECTION_COPILOT || '' }}
COPILOT_MODEL: ${{ vars.GH_AW_MODEL_DETECTION_COPILOT || 'claude-sonnet-4.6' }}
Copy link

Copilot AI Apr 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wasm golden test compiles all fixtures under testdata/wasm_golden/fixtures (including smoke-copilot.md). The corresponding golden output (pkg/workflow/testdata/TestWasmGolden_CompileFixtures/smoke-copilot.golden) still has the old COPILOT_MODEL fallback (|| '') and is missing COPILOT_API_KEY, so TestWasmGolden_CompileFixtures is likely to keep failing. Regenerate/update that golden file as well to reflect the new injected env var and default model fallback.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 90/100

Excellent test quality

Metric Value
New/modified tests analyzed 2 (table rows within TestCopilotDetectionDefaultModel)
✅ Design tests (behavioral contracts) 2 (100%)
⚠️ Implementation tests (low value) 0 (0%)
Tests with error/edge cases 2 (100%)
Duplicate test clusters 0
Test inflation detected No
🚨 Coding-guideline violations None

Test Classification Details

Test File Classification Issues Detected
copilot engine without model uses native CLI default via env var pkg/workflow/threat_detection_test.go:1082 ✅ Design None — verifies correct COPILOT_MODEL env var value in generated steps
copilot engine with threat detection engine config without model uses native CLI default via env var pkg/workflow/threat_detection_test.go:1124 ✅ Design None — verifies BYOK default model propagation in detection step

Analysis Summary

This PR makes a focused, surgical fix: two expectedModel values within the table-driven TestCopilotDetectionDefaultModel function are updated from an empty-string fallback (|| '') to the correct BYOK default model (|| 'claude-sonnet-4.6' via constants.CopilotBYOKDefaultModel).

The golden test data files (basic-copilot.golden, with-imports.golden) are also updated to reflect two corresponding changes: the addition of COPILOT_API_KEY: dummy-byok-key-for-offline-mode and the corrected COPILOT_MODEL fallback.

Strengths:

  • TestCopilotDetectionDefaultModel is a well-structured table-driven test with 5 cases covering: no model specified, explicit model, detection-specific engine config with/without model, and non-Copilot engine — comprehensive behavioral coverage
  • The file has the required //go:build !integration build tag ✅
  • No mock libraries used ✅
  • Error messages in t.Errorf are descriptive and include generated output for debugging ✅
  • Changes use named constants (constants.CopilotBYOKDefaultModel, constants.EnvVarModelDetectionCopilot) for maintainability ✅

Minor note (-10 pts): The test assertions use t.Errorf directly rather than testify's assert.* with a descriptive message parameter (project guideline encourages consistent use of testify assertions). This is a pre-existing pattern in this function, not introduced by this PR.


Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 2 modified table rows in 1 file — unit (//go:build !integration)
  • 🟨 JavaScript (*.test.cjs, *.test.js): 0 tests

Verdict

Check passed. 0% of new/modified tests are implementation tests (threshold: 30%). The changes correctly sync test expectations with the production behavior change (BYOK default model and API key injection).


📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted
  • Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct
  • Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §24847007611

🧪 Test quality analysis by Test Quality Sentinel · ● 1.1M ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Test Quality Sentinel: 90/100. Test quality is excellent — 0% of modified tests are implementation tests (threshold: 30%). The changes correctly sync two table-row expectations in TestCopilotDetectionDefaultModel to match the new BYOK default model behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants