Fix non-deterministic engine validation error message by Copilot · Pull Request #14313 · github/gh-aw

Copilot · 2026-02-07T08:08:08Z

TestValidateEngine was failing intermittently because the error message format depended on Go map iteration order, which is non-deterministic. The test expected 'claude', 'codex', 'copilot', or 'custom' but the actual output could be any permutation like 'codex', 'copilot', 'custom', 'claude'.

Changes

Sort engine names before formatting to ensure deterministic output
Add formatListWithOr helper for proper grammatical conjunction (handles 0, 1, 2, and 3+ items, with an Oxford comma)
Update test assertion to use prefix matching instead of exact equality, allowing "Did you mean?" suggestions

Result

Error messages now consistently display as:

invalid engine value 'gpt4'. Must be 'claude', 'codex', 'copilot', or 'custom'

The formatListWithOr helper can be reused for other CLI list formatting needs.
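
As a rough illustration of the behaviour described above, the helper might look something like the sketch below. This is not the merged implementation: the signature, the choice to add single quotes inside the helper, and the empty-string result for an empty list are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatListWithOr is a hypothetical reconstruction of the helper named in
// this PR. It quotes each item and joins the items with commas, placing
// "or" (with an Oxford comma) before the last item. Quoting inside the
// helper is an assumption; the real helper may expect pre-quoted input.
func formatListWithOr(items []string) string {
	quoted := make([]string, len(items))
	for i, item := range items {
		quoted[i] = "'" + item + "'"
	}
	switch len(quoted) {
	case 0:
		return ""
	case 1:
		return quoted[0]
	case 2:
		return quoted[0] + " or " + quoted[1]
	default:
		last := len(quoted) - 1
		return strings.Join(quoted[:last], ", ") + ", or " + quoted[last]
	}
}

func main() {
	// The order seen in the failed CI log; sorting makes it stable.
	engines := []string{"codex", "copilot", "custom", "claude"}
	sort.Strings(engines)
	fmt.Printf("invalid engine value '%s'. Must be %s\n", "gpt4", formatListWithOr(engines))
}
```

Sorting happens at the call site here so that the helper stays a pure formatter; whether the real code sorts inside or outside the helper is not stated in this PR.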

Original prompt

This section details the original issue you should resolve

<issue_title>[CI Failure Doctor] CMD Tests fail because validateEngine list formatting is nondeterministic</issue_title>
<issue_description># 🏥 CI Failure Investigation - Run github/gh-aw#34185

Summary

The Integration: CMD Tests job running go test ./cmd/gh-aw failed due to TestValidateEngine asserting that the validation error string matches a hard-coded list of engine IDs, but the CLI code formatted that list in a nondeterministic order that changed when the registry map iteration order shifted.

Failure Details

Run: 21775107387

Commit: 5c5b3b0

Trigger: push

Root Cause Analysis

validateEngine called workflow.GetGlobalEngineRegistry().GetSupportedEngines() and printed the result via strings.Join without sorting or inserting an "or" before the final entry. Because GetSupportedEngines iterates a map, the engine names sometimes come back in a different order (the failed log shows codex, copilot, custom, claude), which no longer matched the string that TestValidateEngine was hard-coded to expect ('claude', 'codex', 'copilot', or 'custom'). The exact string comparison therefore started failing even though the underlying validation logic was still correct.

Failed Jobs and Errors

Integration: CMD Tests (job 62830258415) – TestValidateEngine failed with invalid engine value 'gpt4'. Must be 'codex', 'copilot', 'custom', 'claude', but the test asserted the message had to say 'claude', 'codex', 'copilot', or 'custom'.

Investigation Findings

Logs show TestValidateEngine failing in cmd/gh-aw/main_entry_test.go because err.Error() no longer matched the literal string the test constructs.

The engine registry list is generated from a map, so iteration order is not stable. The error message also lacked the conjunction "or", so the test could not pass even when the names were the same but ordered differently.

Manual go test ./cmd/gh-aw invocation currently hits proxy.golang.org and fails with 403 Forbidden while downloading modules, so the job cannot be rerun locally until the proxy access issue is resolved.

Recommended Actions

Sort the supported engine list and humanize the joined string (including an "or" before the final entry) before embedding it in the error message so that the message is deterministic and matches what TestValidateEngine expects. (Fix applied in cmd/gh-aw/main.go.)

Once the environment can reach proxy.golang.org, rerun go test ./cmd/gh-aw (or the full Integration: CMD Tests job) to confirm the fix and that no additional regressions occur.

Prevention Strategies

Format validation error messages that depend on registry contents with deterministic sorting and natural-language connectors instead of relying on map ordering; tests that compare exact strings can remain stable if the formatter is predictable.

AI Team Self-Improvement

When comparing error output to golden strings, always sort any slice derived from a map and render it with human-readable punctuation (e.g., add "or" before the final entry) so the text stays stable even if the map order changes.

Historical Context

TestValidateEngine has previously failed whenever the engine registry order shifted, so this failure matches a known pattern: the tests assert a hard-coded string while the CLI code reconstructs that string from a map. This investigation reinforces that the formatting must be deterministic rather than relying on Go map iteration order.

AI generated by CI Failure Doctor

To add this workflow in your repository, run gh aw add githubnext/agentics/workflows/ci-doctor.md@ea350161ad5dcc9624cf510f134c6a9e39a6f94d. See usage guide.

expires on Feb 8, 2026, 6:02 AM UTC

Comments on the Issue (you are @copilot in this section)

Fixes [CI Failure Doctor] CMD Tests fail because validateEngine list formatting is nondeterministic #14295


- Add formatListWithOr helper function to format lists with "or" conjunction
- Sort engine names before formatting error message for deterministic output
- Update validateEngine to use sorted engines with proper "or" formatting
- Update test to check for prefix match instead of exact match to allow "Did you mean" suggestions
- Add comprehensive tests for formatListWithOr helper function

Fixes issue where TestValidateEngine failed due to non-deterministic map iteration order when listing supported engines.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

github-actions · 2026-02-07T12:19:21Z

🔍 PR Triage Results

Category: bug | Risk: low | Priority: 42/100

Scores Breakdown

Impact: 20/50 - Fixes test flakiness in CLI command validation. Low direct user impact but improves CI reliability and developer experience by eliminating non-deterministic test failures.
Urgency: 8/30 - Test stability fix (4.1 hours old). Addresses intermittent CI failures. Currently unstable. Draft status.
Quality: 14/20 - Good description with clear root cause analysis. Adds reusable formatListWithOr helper function. Draft status. CI needs stabilization.

📋 Recommended Action: batch_review

This test stability fix should be reviewed together with PRs #14312 and #14301 as part of the code quality improvements batch (batch-code-quality-001). The PR adds a reusable helper for formatting lists that could benefit other parts of the codebase.

Batch Context: Group with other test fixes and code quality improvements for efficient review.

Triaged by PR Triage Agent on 2026-02-07

AI generated by PR Triage Agent

Copilot

Pull request overview

This PR fixes intermittent failures in the cmd/gh-aw CLI integration tests by making the engine validation error message deterministic, eliminating nondeterministic Go map iteration ordering from user-facing output.

Changes:

Sort supported engine IDs before rendering them in validateEngine error messages.
Add a formatListWithOr helper to produce grammatically correct lists (including Oxford comma).
Update TestValidateEngine to assert using prefix matching so optional “Did you mean” suggestions don’t break the test.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
cmd/gh-aw/main.go	Sorts engine list and uses `formatListWithOr` to build deterministic, readable validation errors.
cmd/gh-aw/main_entry_test.go	Changes the assertion to prefix-match the base error message to allow appended suggestions.
cmd/gh-aw/format_list_test.go	Adds unit tests for `formatListWithOr` list formatting behavior.

Comments suppressed due to low confidence (1)

cmd/gh-aw/main.go:73

return fmt.Errorf("%s", errMsg) is equivalent to return errors.New(errMsg); the extra formatting call adds no value. Consider using errors.New(errMsg) to make it clear that the message is already fully formatted and to avoid the redundant formatting call.

		errMsg := fmt.Sprintf("invalid engine value '%s'. Must be %s", engine, formattedList)

		if len(suggestions) > 0 {
			errMsg = fmt.Sprintf("invalid engine value '%s'. Must be %s.\n\nDid you mean: %s?",
				engine, formattedList, suggestions[0])
		}

		return fmt.Errorf("%s", errMsg)
	}


Copilot · 2026-02-07T13:16:34Z

cmd/gh-aw/main_entry_test.go

+				// Check that error message contains the expected format
+				// Error may include "Did you mean" suggestions, so we check if it starts with the base message
+				expectedMsg := fmt.Sprintf("invalid engine value '%s'. Must be 'claude', 'codex', 'copilot', or 'custom'", tt.engine)
+				if tt.errMessage != "" && !strings.HasPrefix(err.Error(), expectedMsg) {
+					t.Errorf("validateEngine(%q) error message = %v, want to start with %v", tt.engine, err.Error(), expectedMsg)
 				}


tt.errMessage is no longer used to form the expected assertion (it only gates the check). This makes the table data misleading and harder to maintain. Consider either (a) removing errMessage from the test cases and keying off expectErr, or (b) using tt.errMessage to build the expected prefix (e.g., include the full base message there) so each case asserts what it declares.

Initial plan

5b8ec73

Copilot AI assigned Copilot and pelikhan Feb 7, 2026

Copilot started work on behalf of pelikhan February 7, 2026 08:08 View session

Copilot AI changed the title from "[WIP] Fix nondeterministic list formatting in validateEngine" to "Fix non-deterministic engine validation error message" Feb 7, 2026

Copilot AI requested a review from pelikhan February 7, 2026 08:18

Copilot finished work on behalf of pelikhan February 7, 2026 08:18

github-actions bot mentioned this pull request Feb 7, 2026

[PR Triage Report] PR Triage Report - February 7, 2026 #14326

Closed

github-actions bot added pr-action:batch-review pr-priority:medium pr-risk:low pr-type:bug pr-batch:batch-code-quality-001 labels Feb 7, 2026

github-actions bot mentioned this pull request Feb 7, 2026

Implement Phase 2 validation helpers to consolidate duplicate patterns #14312

Merged

pelikhan approved these changes Feb 7, 2026

pelikhan marked this pull request as ready for review February 7, 2026 13:13

Copilot AI review requested due to automatic review settings February 7, 2026 13:13

pelikhan merged commit e64069c into main Feb 7, 2026
1 check passed

pelikhan deleted the copilot/fix-cmd-tests-nondeterminism branch February 7, 2026 13:13

Copilot started reviewing on behalf of pelikhan February 7, 2026 13:13 View session

Copilot AI reviewed Feb 7, 2026
pelikhan merged 2 commits into main from copilot/fix-cmd-tests-nondeterminism