feat: run all integration tests + clipboard copy tests#739
Conversation
- Remove category filter from CI — runs ALL integration tests - Add ClipboardCopyTests.cs to main Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Design-Level Findings (outside the diff)These findings relate to code not changed in this PR but affected by the changes:
|
There was a problem hiding this comment.
Code Review — PR #739
Findings
1. 🟡 MODERATE — || true now masks ALL test failures, including previously-enforced ones
📄 .github/workflows/polypilot-integration.yml, line 568
Before this PR, the macOS job ran two dotnet test invocations: the first (fast ScheduledTasks tests) ran without || true, meaning real regressions would fail CI. The second (slow ScheduledTaskExecution) used || true intentionally because those are timing-sensitive.
After this PR, ALL tests run in a single invocation with || true, so a genuine regression in the fast, reliable ScheduledTasks tests will now be silently swallowed. This defeats the purpose of having a CI gate.
Failing scenario: A future commit introduces a bug that breaks all ScheduledTasks tests. CI still reports green. Nobody notices until a user hits the bug.
Suggestion: Separate the || true so that only the known-flaky categories are masked:
# Reliable tests — fail CI if broken
POLYPILOT_AGENT_PORT=$PORT dotnet test PolyPilot.IntegrationTests \
--filter "Category!=ScheduledTaskExecution&Category!=ClipboardCopy" \
--nologo --verbosity normal 2>&1
# Flaky/environment-dependent tests — informational only
POLYPILOT_AGENT_PORT=$PORT dotnet test PolyPilot.IntegrationTests \
--filter "Category=ScheduledTaskExecution|Category=ClipboardCopy" \
--nologo --verbosity normal 2>&1 || true2. 🟡 MODERATE — Clipboard tests silently pass with no assertions when no messages exist
📄 PolyPilot.IntegrationTests/ClipboardCopyTests.cs, lines 59–63 (also lines 96–100, 124–129)
Tests 2–4 (CopyButton_ClickShowsSuccessIndicator, CopyButton_SuccessIndicatorResetsAfterDelay, CopyButton_ShowsCheckmarkSvgWhenCopied) all skip silently via return when no .copy-icon-btn exists. In CI with a fresh app (no chat history), these tests will always silently pass. This means the clipboard copy feature has zero effective CI coverage.
Combined with finding #1, a broken clipboard feature won't produce any test failure signal.
Suggestion: Use Skip (xUnit v3) or Assert.Skip(...) instead of return so these show as skipped in the test report rather than passing silently. Alternatively, create a test fixture that sends a message first to guarantee copy buttons exist.
3. 🟢 MINOR — CopyButton_ExistsOnMessages asserts nothing
📄 PolyPilot.IntegrationTests/ClipboardCopyTests.cs, line 37
Assert.True(true, "App loaded successfully with copy button support");This unconditionally passes. The test title claims it checks whether copy buttons "exist on messages", but it never actually asserts that. The hasCopyComponent check on line 33 also has a logic bug (see finding #5).
Suggestion: Either make this test assert something meaningful (e.g., that the page loads without errors), or remove it — a test that always passes provides false confidence.
4. 🟡 MODERATE — Cross-platform inconsistency: only macOS runs all tests
📄 .github/workflows/polypilot-integration.yml
The diff only modifies the macOS (Mac Catalyst) job step. The Linux job (line 263–275) and Windows job (line 697–710) still use --filter "Category=ScheduledTasks". This means the new ClipboardCopy tests will only run on macOS, not on Linux or Windows.
If this is intentional (clipboard behavior differs per platform), it should be documented. If not, the Linux and Windows steps should be updated to match.
5. 🟢 MINOR — typeof document.querySelector(...) is always 'object', never 'undefined'
📄 PolyPilot.IntegrationTests/ClipboardCopyTests.cs, lines 32–34
var hasCopyComponent = await CdpEvalAsync(
"typeof document.querySelector('.copy-icon-btn') !== 'undefined' ? 'true' : 'false'");document.querySelector() returns either null or an Element. typeof null === 'object' and typeof Element === 'object'. So typeof ... !== 'undefined' is always true, making this check meaningless. It will always return 'true' regardless of whether the element exists.
Fix: document.querySelector('.copy-icon-btn') !== null ? 'true' : 'false'
6. 🟢 MINOR — Tight polling window in CopyButton_ClickShowsSuccessIndicator risks flakiness
📄 PolyPilot.IntegrationTests/ClipboardCopyTests.cs, lines 71–81
The polling loop runs 5 iterations × 200ms = ~1 second total. But each ExistsAsync call makes an HTTP POST → CDP eval round trip, so actual wall time per iteration may be 200ms+. Since the copied class is only applied for 1.2 seconds, the race window is very tight: if the first couple of polls catch pre-transition state, you may exhaust the window before seeing copied.
Suggestion: Increase to ~10 iterations or 500ms intervals to provide more headroom.
7. 🟢 MINOR — Unpinned action SHA in agentics-maintenance.yml
📄 .github/workflows/agentics-maintenance.yml, line 120
uses: github/gh-aw-actions/setup-cli@v0.62.2All other action references in this file are pinned to commit SHAs (e.g., actions/checkout@de0fac2e...). This one uses a mutable tag, which is a supply-chain inconsistency. Since the file is auto-generated, this may be upstream's responsibility, but worth flagging.
Summary
| # | Severity | Issue |
|---|---|---|
| 1 | 🟡 MODERATE | || true masks all test failures including previously-enforced ones |
| 2 | 🟡 MODERATE | Clipboard tests silently pass as green when they actually skip |
| 3 | 🟢 MINOR | CopyButton_ExistsOnMessages unconditionally passes |
| 4 | 🟡 MODERATE | Only macOS runs all tests; Linux/Windows still filtered |
| 5 | 🟢 MINOR | typeof querySelector() check is always true |
| 6 | 🟢 MINOR | Tight polling loop risks flaky test |
| 7 | 🟢 MINOR | Unpinned action SHA (auto-generated file) |
The biggest concern is the combination of findings #1 and #2: the new clipboard tests will silently pass (no messages in fresh CI app) AND even if they did fail, || true would mask the failure. This means this PR adds tests that provide no effective CI signal.
Generated by Expert Code Review (auto) for issue #739 · ● 10.2M
| # Run ALL integration tests (all categories) | ||
| POLYPILOT_AGENT_PORT=$PORT dotnet test PolyPilot.IntegrationTests \ | ||
| --filter "Category=ScheduledTaskExecution" \ | ||
| --nologo --verbosity normal 2>&1 || true |
There was a problem hiding this comment.
🔴 CRITICAL — || true now masks all integration test failures (Flagged by: 3/3 reviewers)
Previously, the fast ScheduledTasks step ran without || true and would fail CI on regressions. The slow ScheduledTaskExecution step used || true as an intentional best-effort gate. This PR collapses both into a single || true step, meaning every test failure across all categories — including the new ClipboardCopy tests — is silently swallowed. A completely broken test suite reports green.
Suggested fix: Split into a reliable step (no || true) and a best-effort step:
# Fast/reliable tests — must pass
POLYPILOT_AGENT_PORT=$PORT dotnet test PolyPilot.IntegrationTests \
--filter "Category!=ScheduledTaskExecution" \
--nologo --verbosity normal 2>&1
# Slow execution tests — best-effort
POLYPILOT_AGENT_PORT=$PORT dotnet test PolyPilot.IntegrationTests \
--filter "Category=ScheduledTaskExecution" \
--nologo --verbosity normal 2>&1 || true| await ClickAsync(".copy-icon-btn"); | ||
|
|
||
| // Wait for copied state | ||
| await WaitForAsync(".copy-icon-btn.copied", TimeSpan.FromSeconds(3)); |
There was a problem hiding this comment.
🟡 MODERATE — WaitForAsync return value discarded; reset test always passes (Flagged by: 2/3 reviewers)
WaitForAsync returns bool but the result is ignored. If the clipboard action fails (e.g., native API throws in headless CI), the .copied class never appears, WaitForAsync times out silently, stillCopied is false, and Assert.False(stillCopied) trivially passes. The test asserts "the indicator reset" without ever verifying it appeared.
Suggested fix:
var appeared = await WaitForAsync(".copy-icon-btn.copied", TimeSpan.FromSeconds(3));
Assert.True(appeared, "Copy button should enter 'copied' state after click");
await Task.Delay(2000);
var stillCopied = await ExistsAsync(".copy-icon-btn.copied");
Assert.False(stillCopied, "Copy success indicator should reset after ~1.2 seconds");| await main(); | ||
|
|
||
| - name: Install gh-aw | ||
| uses: github/gh-aw-actions/setup-cli@v0.62.2 |
There was a problem hiding this comment.
🟡 MODERATE — setup-cli not SHA-pinned unlike all other actions (Flagged by: 3/3 reviewers)
Every other action in this file is pinned to a full commit SHA (actions/checkout@de0fac2e..., actions/github-script@ed597411..., github/gh-aw-actions/setup@20045bbd...). This one uses a mutable tag. The run_operation job carries actions: write, contents: write, and pull-requests: write permissions. If the v0.62.2 tag is force-pushed, injected code runs with repo-write capability.
Note: This file is auto-generated by gh aw compile, so the fix should go upstream in the generator.
Suggested fix: Pin to the commit SHA:
uses: github/gh-aw-actions/setup-cli@<full-commit-sha> # v0.62.2|
|
||
| // Click copy | ||
| await ClickAsync(".copy-icon-btn"); | ||
| await Task.Delay(300); |
There was a problem hiding this comment.
🟢 MINOR — Fixed 300ms delay is a race condition (Flagged by: 2/3 reviewers)
The chain of async operations (CDP HTTP → JS click → MAUI native clipboard call → Blazor StateHasChanged → render → another CDP round-trip) can exceed 300ms on a loaded CI runner, causing spurious failures.
Suggested fix: Replace the fixed delay with WaitForAsync:
await ClickAsync(".copy-icon-btn");
await WaitForAsync(".copy-icon-btn.copied", TimeSpan.FromSeconds(2));
var afterSvg = await CdpEvalAsync(
"document.querySelector('.copy-icon-btn.copied svg polyline') ? 'checkmark' : 'no-checkmark'");| if (!hasCopyBtn) | ||
| { | ||
| Output.WriteLine("No messages with copy buttons found — skipping click test"); | ||
| return; // Skip gracefully if no messages exist |
There was a problem hiding this comment.
🟡 MODERATE — Tests silently pass when no copy buttons exist (Flagged by: 3/3 reviewers)
All 4 clipboard tests guard with if (!hasCopyBtn) { return; }, which xUnit treats as a pass. In a fresh CI environment without pre-existing chat messages, every test will hit this path and pass while validating nothing. Combined with Finding 1 (|| true), there is zero effective CI signal from these tests.
Additionally, tests share the AppFixture via [Collection("PolyPilot")] but never navigate to a chat view. If ScheduledTaskTests runs first and leaves the app on the scheduled-tasks page, copy buttons won't be found.
Suggested fix: Either deterministically create a message before testing, or use Assert.Skip() (xUnit v3) / Skip.If() so skipped tests are clearly visible in results rather than reported as passes. At minimum, navigate to the chat page at the start of each test.
| var hasCopyComponent = await CdpEvalAsync( | ||
| "typeof document.querySelector('.copy-icon-btn') !== 'undefined' ? 'true' : 'false'"); | ||
| Output.WriteLine($"Copy component available: {hasCopyComponent}"); | ||
|
|
||
| // This test passes if the app loaded — the copy buttons appear when messages exist | ||
| Assert.True(true, "App loaded successfully with copy button support"); |
There was a problem hiding this comment.
🟢 MINOR — typeof check is always true + Assert.True(true) is vacuous (Flagged by: 3/3 reviewers after follow-up)
Two issues in CopyButton_ExistsOnMessages:
-
typeof querySelectorbug (line 33):typeof document.querySelector(...)returns'object'both when the element is found and when it returnsnull(sincetypeof null === 'object'). The expression!== 'undefined'is alwaystrue, makinghasCopyComponentmeaningless. Should bedocument.querySelector('.copy-icon-btn') !== null. -
Vacuous assertion (line 37):
Assert.True(true, ...)cannot fail under any circumstance. The test provides zero coverage.
Suggested fix: Either assert something meaningful (e.g., verify the copy button CSS exists in the stylesheet) or remove this test to avoid false coverage confidence.
Removes the ScheduledTasks-only filter and adds ClipboardCopyTests.