feat: scheduled task integration tests and UI improvements#587
feat: scheduled task integration tests and UI improvements#587
Conversation
da50ce2 to
c28de97
Compare
|
/review |
|
✅ Expert Code Review completed successfully! |
🔍 Multi-Model Code Review — PR #587feat: scheduled task integration tests and UI improvements CI Status:
|
| Finding | Verdict | Reason |
|---|---|---|
HandleScheduleCommand missing MessageCount update |
✅ False positive | Switch case uses break, falls through to common session.MessageCount = session.History.Count at line 2307 |
DemoService.cs MessageCount removal breaks demo mode |
✅ Safe removal | CompleteResponse is called via OnTurnEnd callback (line 1325-1333), which adds the assistant message and updates MessageCount. The removed line was redundant. |
HandleScheduleCommand History.Add without HistoryLock |
✅ Not a bug | UI thread writes are serialized by Blazor's SynchronizationContext per documented convention. All existing slash command handlers follow the same pattern. |
Regex ReDoS risk in SlashCommandPattern |
✅ Safe | Patterns are anchored (^...$) and use linear alternation — no catastrophic backtracking vectors. |
/schedule 0m creates zero-interval task |
✅ Handled | TryParseSlashFrequency checks value <= 0, correctly rejecting 0. |
✅ What's Good
- Thorough test coverage — 4 new unit tests for slash command parsing with edge cases, 1 integration test for existing-session execution, 5 new scenario definitions
- Proper test isolation —
SetTasksFilePathForTesting+finallycleanup in the new integration test - Clean UI refactor — Overflow menu migrated from CSS
:hoverto explicit state management with proper backdrop dismiss, improving accessibility (aria-haspopup="menu",aria-expanded) and mobile usability - Slash command implementation — Complete lifecycle: parsing, validation, creation, listing, help text, autocomplete registration
- Overflow protection —
checked()arithmetic inTryParseSlashFrequencyfor large intervals
Recommendation: ✅ Approve
The two MODERATE findings are about the robustness of a best-effort sequencing mechanism in WsBridgeServer. No data loss or stuck sessions are possible — the SessionBusyException catch + re-queue provides a safety net. The suggestions above would improve observability and reduce unnecessary churn under load, but are not blocking.
The PR is well-structured, well-tested, and correctly handles all the tricky conventions (MessageCount fallthrough, History threading, file system isolation).
Reviewed by 3 independent AI reviewers with adversarial consensus. Single-reviewer findings were cross-validated against the codebase before inclusion or discarding.
There was a problem hiding this comment.
PR #587 Expert Review — 3 Independent Reviewers · Adversarial Consensus
Methodology: 3 independent reviewers applied all 12 review dimensions against the full diff. Findings adjudicated by adversarial consensus (3/3 → include; 2/3 → include at median severity; 1/3 → validated by the other 2 reviewers before inclusion/discard).
Dimension Results
| # | Dimension | Verdict |
|---|---|---|
| 1 | IsProcessing State Safety | ✅ LGTM |
| 2 | SDK Event Handling | ✅ LGTM |
| 3 | Thread Safety & Concurrency | 🟡 MODERATE — see inline |
| 4 | Multi-Agent Orchestration | ✅ LGTM |
| 5 | Session Persistence & Data Safety | ✅ LGTM |
| 6 | Bridge Protocol & Remote Mode | 🟡 MODERATE — see inline |
| 7 | MAUI / Blazor Platform Safety | ✅ LGTM |
| 8 | Test Coverage & Quality | �� MODERATE — see inline |
| 9 | Performance & Render Pipeline | ✅ LGTM |
| 10 | Watchdog & Timeout Logic | ✅ LGTM |
| 11 | Connection & Server Management | ✅ LGTM |
| 12 | Code Quality & Conventions | ✅ LGTM |
Consensus Findings
Finding 1 (Dim 8, 3/3 reviewers): Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage reads SessionSidebar.razor via a hardcoded 4-level relative path from AppDomain.CurrentDomain.BaseDirectory. Breaks in publish/CI layouts with a raw FileNotFoundException. Also violates the "Behavioral Tests Over Structural" principle — all other scenario coverage tests in this PR use Assert.True(true, "See …"). See inline on ScenarioReferenceTests.cs:257.
Finding 2 (Dims 3+6, 2/3 reviewers): WaitForBridgeSendToStartAsync polls IsProcessing 10 × 10 ms and returns void — the caller (DrainPendingPromptsAsync) cannot distinguish "confirmed started" from "timed out", defeating the send-sequencing guard under UI-thread pressure. CancellationToken not forwarded. See inline on WsBridgeServer.cs:337.
Discarded (Single Reviewer Only)
Assert.True(true, …)stubs always pass — flagged by Reviewer 3; Reviewers 1+2 confirmed this is the established cross-reference documentation convention throughout the test suite. → Discarded.DemoService.MessageCountremoval side effect — flagged by Reviewer 1; Reviewer 2 tracedCopilotService.OnTurnEndas the correct owner of that update (confirmed by diff). → Discarded.
Notable Positives
TryCreateFromSlashCommand: 3 unit tests (Theory + 2 Fact) with good boundary coverage ✅checked(value * 60)overflow protection in frequency parsing ✅Service_ExecuteTask_ExistingSession: integration test usesConnectionMode.Demo+SetTasksFilePathForTestingisolation ✅- Click-to-open overflow menu correctly closes on
LocationChanged✅ NavigateHeaderOverflowbundles close +SaveUiState+ navigate cleanly ✅SessionBusyExceptionguard is architecturally correct (UI-thread atomic check) ✅static readonly Regexfields inScheduledTask.csare safe —Regexis a pure BCL type, not a platform API ✅/scheduleautocomplete added to both command list andwithArgslist ✅BridgePromptQueueTests.WaitForAsynccancellation fix is correct ✅
3 independent reviewers with adversarial consensus. CI check status not available at review time.
Generated by Expert Code Review for issue #587
| { | ||
| var sidebarPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "..", "..", "..", "..", | ||
| "PolyPilot", "Components", "Layout", "SessionSidebar.razor"); | ||
| var markup = File.ReadAllText(sidebarPath); |
There was a problem hiding this comment.
[🟡 MODERATE] Test Coverage & Quality — Fragile Structural Test
File.ReadAllText(sidebarPath) uses a hardcoded 4-level .. ascent from AppDomain.CurrentDomain.BaseDirectory. This holds for standard Debug builds (PolyPilot.Tests/bin/Debug/net10.0/) but silently breaks in other layouts (dotnet publish, CI artifact staging, non-standard cwd), producing an opaque FileNotFoundException with no indication of which invariant failed.
Additionally, this is a structural test (grepping source for string content) used as primary scenario coverage, conflicting with the "Behavioral Tests Over Structural" principle. All other scenario coverage tests in this PR correctly use the Assert.True(true, "See …") cross-reference pattern already established throughout ScenarioReferenceTests.cs.
Evidence: Path.Combine(BaseDirectory, "..", "..", "..", "..", "PolyPilot", ...) resolves correctly for bin/Debug/net10.0/ but fails in publish/artifact layouts — xunit surfaces a raw FileNotFoundException instead of a readable assertion failure.
Recommendation: Use the established cross-reference pattern:
[Fact]
public void Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage()
{
Assert.True(true, "See SessionSidebar.razor: href=\"/scheduled-tasks\" confirmed in NavigateHeaderOverflow routing");
}Or, if source-read is intentional: Assert.True(File.Exists(sidebarPath), $"Source file not found: {sidebarPath}");
Flagged by: 3/3 reviewers
| } | ||
| } | ||
|
|
||
| private async Task WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default) |
There was a problem hiding this comment.
[🟡 MODERATE] Bridge Protocol & Thread Safety — Silent Timeout in WaitForBridgeSendToStartAsync
This method polls IsProcessing 10 × 10 ms = 100 ms total, then returns void regardless of whether IsProcessing was confirmed true or the poll simply timed out. Two issues:
1. Silent timeout — caller cannot distinguish confirmed vs. timed-out:
DrainPendingPromptsAsync uses this call to prevent back-to-back sends. IsProcessing is set inside SendPromptAsync via InvokeOnUIAsync (UI thread dispatch). Under any transient UI-thread pressure, all 10 polls can read false before the write is visible — the drain proceeds to the next queued message, defeating the guard.
2. CancellationToken not forwarded:
The call site passes no token (WaitForBridgeSendToStartAsync(pending.SessionName)), leaving the poll uninterruptible when the drain is cancelled.
Recommendation: Return bool and accept the cancellation token:
private async Task(bool) WaitForBridgeSendToStartAsync(
string sessionName, CancellationToken ct = default)
{
for (var i = 0; i < 20; i++) // 200 ms total
{
if (_copilot?.GetSession(sessionName)?.IsProcessing == true)
return true;
await Task.Delay(10, ct).ConfigureAwait(false);
}
return false; // timeout — log a warning at the call site
}Flagged by: 2/3 reviewers
|
/review |
|
✅ Expert Code Review completed successfully! |
There was a problem hiding this comment.
Expert Code Review - PR #587
3 independent reviewers with adversarial consensus. Findings at 2/3+ agreement.
| Dimension | Verdict |
|---|---|
| IsProcessing State Safety | LGTM |
| SDK Event Handling | LGTM |
| Thread Safety | LGTM |
| Multi-Agent Orchestration | LGTM |
| Session Persistence | LGTM |
| Bridge Protocol | MODERATE x1 |
| MAUI Platform Safety | LGTM |
| Test Coverage | MAJOR x1 + MODERATE x1 |
| Performance | LGTM |
| Watchdog | LGTM |
| Connection Management | LGTM |
| Code Quality | LGTM |
Checklist:
- IsProcessing State Safety
- SDK Event Handling
- Thread Safety
- Multi-Agent Orchestration
- Session Persistence
- Bridge Protocol - WaitForBridgeSendToStartAsync 100ms silent timeout
- MAUI Platform Safety
- Test Coverage - 4 no-op stubs + fragile path
- Performance, Watchdog, Connection, Code Quality
MAJOR - No-op Scenario Stubs (inline on ScenarioReferenceTests.cs:291)
Four new Scenario_*_HasUnitTestCoverage methods call Assert.True(true) unconditionally. Flagged 2/3.
MODERATE - WaitForBridgeSendToStartAsync Silent Timeout (inline on WsBridgeServer.cs:339)
100ms polling window expires silently; drain loop may overlap next prompt. Flagged 2/3.
MODERATE - Fragile Relative Path (inline on ScenarioReferenceTests.cs:255)
Four levels of .. vs established 3-level pattern. Flagged 2/3.
Key LGTM Decisions
IsProcessing: DemoService targets internal AgentSessionInfo; ClearProcessingState() fires via OnTurnEnd->CompleteResponse. 3/3.
Platform Safety: static readonly Regex is pure .NET BCL, no platform API dependency. 3/3.
Thread Safety: HandleScheduleCommand on Blazor UI thread; History.Add() follows established pattern. 2/3.
No BLOCKING issues. CI status: not available.
Generated by Expert Code Review for issue #587
| [Fact] | ||
| public void Scenario_ScheduledTaskTargetExistingSession_HasUnitTestCoverage() | ||
| { | ||
| Assert.True(true, "See ScheduledTaskTests.Service_EvaluateTasksAsync_ExecutesDueTasks and Service_ExecuteTask_ExistingSession_UsesThatSessionWithoutCreatingAnotherSession"); |
There was a problem hiding this comment.
[🟠 MAJOR] Test Coverage — No-op Scenario Cross-Reference Stubs
Flagged by: 2/3 reviewers
This method (and three others added in this PR: Scenario_ScheduledTaskSlashCommand_HasUnitTestCoverage, Scenario_ScheduledTaskPersistsAfterRelaunch_HasUnitTestCoverage, Scenario_ScheduledTaskCloseSessionDisablesTask_HasUnitTestCoverage) are permanent no-ops: Assert.True(true, ...) passes unconditionally regardless of whether the referenced test methods exist or pass. If Service_ExecuteTask_ExistingSession_UsesThatSessionWithoutCreatingAnotherSession is renamed or deleted, this test remains green — silently hiding lost coverage.
This violates the project's Behavioral Tests Over Structural principle. The cross-reference system exists precisely to catch when scenario JSON and unit tests diverge.
Recommendation: Replace with a reflection-based method-existence check:
[Fact]
public void Scenario_ScheduledTaskTargetExistingSession_HasUnitTestCoverage()
{
Assert.NotNull(typeof(ScheduledTaskTests)
.GetMethod("Service_ExecuteTask_ExistingSession_UsesThatSessionWithoutCreatingAnotherSession"));
}This fails if the referenced test is renamed or deleted, restoring the linkage intent. Alternatively, remove these stubs entirely — four permanently-green tests that verify nothing add noise, not safety.
|
|
||
| private async Task WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default) | ||
| { | ||
| for (var attempt = 0; attempt < 10; attempt++) |
There was a problem hiding this comment.
[🟡 MODERATE] Bridge Protocol — WaitForBridgeSendToStartAsync Silent Timeout
Flagged by: 2/3 reviewers
WaitForBridgeSendToStartAsync polls IsProcessing == true for a maximum of 10 × 10ms = 100ms. When the poll expires, it returns silently — no log, no error — and the drain loop immediately dispatches the next queued prompt.
Concrete scenario: Bridge drain replays 3 queued messages. First DispatchBridgePromptAsync returns before SendPromptAsync sets IsProcessing (possible under load or with slow SDK init). WaitForBridgeSendToStartAsync polls for 100ms and gives up. Drain loop calls DispatchBridgePromptAsync again. If IsProcessing still is not set, the if (info?.IsProcessing == true) throw new SessionBusyException check passes — both prompts execute concurrently, corrupting replay ordering.
Recommendation: Add a diagnostic log when the poll expires so timing regressions are observable in production:
// After the loop:
BridgeLog($"[BRIDGE] WaitForBridgeSendToStartAsync timed out for '{sessionName}' — next drain may overlap");Consider a longer timeout (500ms with shorter poll interval) or tying the wait to the SendPromptAsync task itself for a stronger guarantee.
| [Fact] | ||
| public void Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage() | ||
| { | ||
| var sidebarPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "..", "..", "..", "..", |
There was a problem hiding this comment.
[🟡 MODERATE] Test Coverage — Fragile Relative Path in Markup Coverage Test
Flagged by: 2/3 reviewers
This test reads SessionSidebar.razor via four levels of .. traversal from AppDomain.CurrentDomain.BaseDirectory — one more than the established ScenariosDir pattern (three levels, defined in TestSetup.cs line 19). If the test output directory depth changes (target framework rename such as net10.0 → net10.0-maccatalyst, a flat CI publish, or a new test runner), File.ReadAllText throws FileNotFoundException with no helpful diagnostic message. Passing CI then depends on invisible layout assumptions.
The test also asserts source-file content (the link href string exists in markup) rather than runtime behavior — a /scheduled-tasks link that routes to a blank page still passes.
Recommendation: Align with the established ScenariosDir pattern or define a shared SourceDir constant:
// Consistent with TestSetup.cs ScenariosDir (3 levels up):
private static string SourceDir => Path.GetFullPath(
Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "..", "..", "..", ".."));
private static string SidebarPath => Path.Combine(
SourceDir, "PolyPilot", "Components", "Layout", "SessionSidebar.razor");This centralizes the depth assumption and makes failures easy to diagnose.
|
/review |
|
❌ Expert Code Review failed. Please review the logs for details. |
|
/review |
|
✅ Expert Code Review completed successfully! |
|
/review |
|
✅ Expert Code Review completed successfully! |
|
/review |
|
✅ Expert Code Review completed successfully! |
This comment has been minimized.
This comment has been minimized.
|
/review |
|
✅ Expert Code Review completed successfully! |
There was a problem hiding this comment.
Expert Code Review — PR #587
feat: add scheduled task integration tests and UI scaffolding
Methodology
3 independent reviewers conducted parallel analysis, with adversarial consensus applied to all findings. Findings required agreement from ≥2 reviewers to be included. 2 single-reviewer findings were discarded after follow-up consensus rounds.
Findings Summary
| # | Severity | File | Finding |
|---|---|---|---|
| 1 | 🟡 MODERATE | WsBridgeServer.cs:337 |
WaitForBridgeSendToStartAsync is dead code in the drain path — adds ~100ms latency with no sync benefit |
| 2 | 🟡 MODERATE | ScenarioReferenceTests.cs:253 |
Structural test as sole scenario coverage with fragile 4-level path traversal |
| 3 | 🟡 MODERATE | ScheduledTaskTests.cs:839 |
Hardcoded path restore in finally block — should capture/restore pre-test value |
Discarded Findings (single reviewer only)
- Regex
\\d+NIT inSlashCommandAutocompleteTests.cs— 1/3 only, both follow-up reviewers disagreed - Dual hover/click
@@button NIT — 1/3 only, not validated by others
Assessment
Overall: This is a solid PR that adds comprehensive scheduled task integration tests (23 tests), UI scaffolding for the /schedule slash command, overflow menu improvements, and bridge server drain-loop fixes. The test coverage is thorough and the scenarios JSON is well-structured.
The 3 findings above are improvement suggestions around test isolation patterns and a minor dead-code path — none are correctness bugs. The core functionality is sound.
CI Status: Not checked (gh CLI not authenticated in this environment).
Verdict: REQUEST_CHANGES per review protocol (🟡 MODERATE findings present). All findings are addressable with minor changes.
Warning
⚠️ Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
192.0.2.1
To allow these domains, add them to the network.allowed list in your workflow frontmatter:
network:
allowed:
- defaults
- "192.0.2.1"See Network Configuration for more information.
Generated by Expert Code Review for issue #587
| } | ||
| } | ||
|
|
||
| private async Task WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default) |
There was a problem hiding this comment.
[🟡 MODERATE] Bridge Protocol — Dead-code poll loop in drain path
Flagged by: 2/3 reviewers
WaitForBridgeSendToStartAsync polls 10×10ms for IsProcessing == true, but in the drain path the preceding await DispatchBridgePromptAsync already awaits full turn completion (SDK sessions await ResponseCompletion.Task). By the time this poll runs, IsProcessing has already cycled true→false. This adds ~100ms latency per drained prompt with no synchronization benefit.
Additionally, the method exits silently with no diagnostic log tag when the wait expires — per PolyPilot conventions, all processing-state-related transitions should have a log entry.
Recommendation: Either remove from the drain loop (the sequential await already guarantees ordering), or add a bool return + [BRIDGE-WAIT-TIMEOUT] diagnostic log when the wait expires. If kept for the live send handler path, document the intent.
| /// Structural coverage: the desktop sidebar keeps the Scheduled Tasks overflow link wired to /scheduled-tasks. | ||
| /// </summary> | ||
| [Fact] | ||
| public void Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage() |
There was a problem hiding this comment.
[🟡 MODERATE] Test Coverage — Structural test as sole scenario coverage
Flagged by: 3/3 reviewers (after adversarial consensus)
This test reads raw .razor markup from disk and asserts on string literals — a structural test pattern. Per PolyPilot conventions (copilot-instructions § "Behavioral Tests Over Structural"), structural tests should be supplementary guards, not primary coverage. Every other scenario cross-reference test in this file uses the Assert.True(true, "See ...") documentation pattern pointing to behavioral tests.
The 4-level ../../../.. path traversal from AppDomain.CurrentDomain.BaseDirectory is also fragile — it differs from the 3-level ScenariosDir pattern used elsewhere and may break in published/CI layouts with different output directory depths.
Recommendation: Either (a) convert to the documentation pattern (Assert.True(true, "See ...")) backed by a behavioral test, or (b) anchor the file path to the established ScenariosDir base and mark as supplementary coverage alongside a behavioral test.
| finally | ||
| { | ||
| try { File.Delete(tempFile); } catch { } | ||
| ScheduledTaskService.SetTasksFilePathForTesting( |
There was a problem hiding this comment.
[🟡 MODERATE] Test Quality — Hardcoded path restore in finally block
Flagged by: 2/3 reviewers
The finally block restores SetTasksFilePathForTesting to a hardcoded Path.Combine(TestSetup.TestBaseDir, "scheduled-tasks.json") rather than capturing and restoring the pre-test value. If a test fixture or earlier test sets a different override path, this silently clobbers it.
Note: One reviewer observed this is a systemic pattern across all scheduled task tests, not unique to this one — so a fix here should ideally be applied uniformly.
Recommendation: Capture the prior path at test entry and restore it:
var prior = ScheduledTaskService.GetTasksFilePathForTesting(); // add getter if needed
try { ... }
finally { ScheduledTaskService.SetTasksFilePathForTesting(prior); }Alternatively, use TestSetup.TestBaseDir with a GUID-based unique filename to stay consistent.
c28de97 to
1f644b1
Compare
|
/review |
|
✅ Expert Code Review completed successfully! |
|
/review |
|
✅ Expert Code Review completed successfully! |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Expert Review — PR #587: Scheduled Task Integration Tests & UI Scaffolding
Methodology: 3 independent reviewers with adversarial consensus (Reviewer 3 timed out after 25min; findings validated across Reviewers 1 and 2 with targeted follow-up challenges).
Findings by Severity
| # | Severity | File | Finding | Consensus |
|---|---|---|---|---|
| 1 | 🟡 MODERATE | SessionSidebar.razor.css |
Missing .sidebar-header:has(.header-overflow.open) z-index elevation — click-opened panel can render behind toolbar on touch devices |
2/2 |
| 2 | 🟡 MODERATE | WsBridgeServer.cs |
WaitForBridgeSendToStartAsync return value discarded + missing cancellation token propagation during drain/shutdown |
2/2 |
| 3 | 🟡 MODERATE | ScheduledTask.cs |
/schedule 2d produces "Every 2880 minutes" display; daily/weekly stores dead IntervalMinutes enshrined in tests |
2/2 |
| 4 | 🟢 MINOR | ScheduledTaskTests.cs |
Null-forgiving previousPath! masks type-contract mismatch between getter (string?) and setter (string) |
2/2 |
Verified Safe (No Action Needed)
- ✅ DemoService
MessageCountremoval —FlushCurrentResponseviaCompleteResponsealready handles this (2/2 verified) - ✅
SessionBusyExceptionguard inDispatchBridgePromptAsync— redundant but harmless pre-check, no TOCTOU race (both run atomically on UI thread) - ✅
/scheduleslash command control flow —HandleScheduleCommandreturns correctly propagate through thebreakin the switch case - ✅ Regex patterns —
^...$-anchored with bounded alternations, no catastrophic backtracking risk, overflow caught bycheckedarithmetic
Discarded (Single Reviewer Only)
BuildSuggestedTaskNameuses interpreted regex instead of compiled (1/2 — negligible perf impact, .NET static cache handles it)Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverageis structural no-opAssert.True(true)(1/2 — consistent with some existing scenario reference tests)
CI & Test Assessment
- New tests are well-structured with proper isolation (
SetTasksFilePathForTestingcapture/restore pattern) - Slash command parsing has good edge case coverage (valid inputs, invalid inputs,
everyprefix variant) - Scenario JSON definitions cover the new UI flows comprehensively
BridgePromptQueueTeststimeout increased from 8s→15s with proper cancellation — good robustness improvement
Overall
Solid PR with good test coverage. The three moderate findings are all addressable without structural changes — the CSS z-index fix is a one-liner, the bridge cancellation token is a simple parameter pass-through, and the interval display is a ScheduleDescription improvement. No critical issues found.
Warning
⚠️ Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
192.0.2.1
To allow these domains, add them to the network.allowed list in your workflow frontmatter:
network:
allowed:
- defaults
- "192.0.2.1"See Network Configuration for more information.
Generated by Expert Code Review for issue #587
| try | ||
| { | ||
| await DispatchBridgePromptAsync(pending.SessionName, pending.Message, pending.AgentMode); | ||
| await WaitForBridgeSendToStartAsync(pending.SessionName); |
There was a problem hiding this comment.
🟡 MODERATE — WaitForBridgeSendToStartAsync return value discarded + missing cancellation token (Flagged by 2/2 reviewers)
Two issues with this call:
-
Return value discarded:
WaitForBridgeSendToStartAsyncreturnsTask(bool)(falseon timeout), but the result is never checked. Consider changing the return type toTask(since no caller uses the bool) or adding a comment documenting the intent. -
Missing cancellation token: This calls with
CancellationToken.None(the default). During shutdown/reconnect, the method polls for the full 200ms per queued prompt. With multiple queued prompts, this blocks drain teardown unnecessarily.
Suggested fix:
await WaitForBridgeSendToStartAsync(pending.SessionName, _cts?.Token ?? default);| { | ||
| schedule = ScheduleType.Daily; | ||
| intervalMinutes = 24 * 60; | ||
| } | ||
| else | ||
| { | ||
| intervalMinutes = checked(value * 24 * 60); |
There was a problem hiding this comment.
🟡 MODERATE — Multi-day/week intervals produce confusing user-visible display (Flagged by 2/2 reviewers)
/schedule 2d Send recap takes this branch: Schedule = Interval with intervalMinutes = 2880. ScheduleDescription renders this as "Every 2880 minutes" — confusing for users. Same issue for 2w → "Every 20160 minutes".
Additionally, the daily/weekly keyword cases (lines 353-360) set IntervalMinutes = 1440/10080, but GetNextRunTimeUtc ignores IntervalMinutes entirely for Daily/Weekly schedule types — this is dead data that the tests then enshrine as a contract.
Suggested fixes:
- For
2d/2w: useScheduleType.Daily/Weeklywith a count multiplier, or improveScheduleDescriptionto show "Every 2 days" for large interval values - For
daily/weeklykeywords: don't setIntervalMinutesto a misleading value, or add a comment documenting that it is unused for these schedule types
| finally | ||
| { | ||
| try { File.Delete(tempFile); } catch { } | ||
| ScheduledTaskService.SetTasksFilePathForTesting(previousPath!); |
There was a problem hiding this comment.
🟢 MINOR — Null-forgiving operator masks type-contract mismatch (Flagged by 2/2 reviewers)
GetTasksFilePathForTesting() returns string? but SetTasksFilePathForTesting() takes string (non-nullable). The ! suppresses the warning. Safe in practice because [ModuleInitializer] always runs first, but the type-contract gap could cause issues if SetTasksFilePathForTesting ever needs to handle null explicitly.
Suggestion: Change SetTasksFilePathForTesting parameter to string? to match the getter's return type.
57faad0 to
9bb955a
Compare
|
/review |
|
✅ Expert Code Review completed successfully! |
🔍 Design-Level Observations (MINOR)These were flagged during the multi-model review but deemed low-severity after adversarial consensus.
Discarded Findings (1/3 only)
|
There was a problem hiding this comment.
Expert Code Review — PR #587
Methodology: 3 independent reviewers with adversarial consensus. Findings only included when 2+ reviewers agreed (after follow-up validation for contested items). 4 findings discarded as single-reviewer-only.
Findings by Severity
| # | Severity | File | Finding | Consensus |
|---|---|---|---|---|
| 1 | 🟡 MODERATE | WsBridgeServer.cs:342 |
WaitForBridgeSendToStartAsync polls 20×10ms but IsProcessing may already be cleared by the time polling starts — adds unnecessary 200ms latency per drained prompt on fast/error paths |
2/3 initial |
| 2 | 🟡 MODERATE | WsBridgeServer.cs:319 |
DrainPendingPromptsAsync dequeues before checking cancellation — a prompt can be silently lost if CT fires between dequeue and dispatch |
3/3 after follow-up |
| 3 | 🟢 MINOR | BridgePromptQueueTests.cs:35 |
ct.ThrowIfCancellationRequested() can throw even when condition was met (tiny race window, test-only code) |
3/3 after follow-up |
| 4 | 🟢 MINOR | Dashboard.razor |
HandleScheduleCommand does sync disk I/O via AddTask→SaveTasks on UI thread (follows existing codebase pattern) |
3/3 after follow-up |
| 5 | �� MINOR | ScheduledTask.cs |
1d→Daily vs 2d→Interval schedule type asymmetry (intentional design, but undocumented UX boundary) |
2/3 after follow-up |
Verified Safe ✅
- DemoService
MessageCountremoval — Dead-code cleanup;CopilotServicemanagesMessageCountindependently SessionBusyExceptionguard — No TOCTOU race (both check and send run in sameInvokeOnUIAsynccallback on UI thread)HandleScheduleCommandpost-switch flow — Usesbreak(notreturn), soMessageCountupdate,_needsScrollToBottom, andSafeRefreshAsyncalways execute- Thread safety of
History.Add— Runs on UI thread via Blazor event handler, consistent with existing patterns static readonly Regexfields — No MAUI platform API calls; safe on all platforms.RegexOptions.Compiledfalls back to interpreted on iOS/AOT- CancellationToken propagation in drain loop — Correctly re-throws
OperationCanceledException, propagates CT to all sub-calls - Header overflow menu — Backdrop pattern,
@onclick:stopPropagation, andLocationChangedcleanup are correct - Slash command autocomplete —
/schedulecorrectly added in alphabetical order withhasArgs: true
Test Coverage Assessment
New tests provide good behavioral coverage: slash command parsing (4 [InlineData] + 2 [Fact]), interval formatting (3 [Fact]), existing session execution, scenario references. The Service_ExecuteTask_ExistingSession test properly uses capture-restore for file isolation.
CI Status
CI checks not inspectable from this environment. Verify tests pass before merging.
Prior Review Status
This is the 4th commit iteration addressing previous review feedback (bridge timeout observability, test path capture-restore, z-index/CancellationToken/interval display fixes). The iteration history shows responsive and thorough addressal of prior findings.
Verdict: No critical issues found. Two moderate findings in WsBridgeServer are low practical impact (one theoretical with no real CT callers, one performance nit). Three minor findings are informational. Overall this is a well-structured PR with good test coverage.
Generated by Expert Code Review for issue #587
| } | ||
| } | ||
|
|
||
| private async Task<bool> WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default) |
There was a problem hiding this comment.
🟡 MODERATE — WaitForBridgeSendToStartAsync may false-timeout on fast/failed sessions, adding 200ms latency per drained prompt
Flagged by: 2/3 reviewers
DispatchBridgePromptAsync awaits InvokeOnUIAsync, which fully awaits SendPromptAsync before returning. By the time this method polls IsProcessing, the turn may have already completed (especially in demo mode or fast SDK responses), causing all 20 polls to miss and logging a misleading "timed out" message. With N queued prompts that complete quickly, drain latency grows by N × 200ms.
Suggested fix: Since DispatchBridgePromptAsync returns after SendPromptAsync sets IsProcessing (or errored), reduce to 3–5 attempts (30–50ms total):
for (var attempt = 0; attempt < 5; attempt++)
{
if (_copilot?.GetSession(sessionName)?.IsProcessing == true) return true;
await Task.Delay(10, ct).ConfigureAwait(false);
}| { | ||
| while (_pendingBridgePrompts.TryDequeue(out var pending)) | ||
| { | ||
| ct.ThrowIfCancellationRequested(); |
There was a problem hiding this comment.
🟡 MODERATE — Dequeue-before-cancel-check can silently lose a prompt on cancellation
Flagged by: 3/3 reviewers (after follow-up)
The loop dequeues first via TryDequeue, then checks ct.ThrowIfCancellationRequested(). If cancellation fires in that window, the prompt is permanently lost — dequeued from the queue but never dispatched or re-enqueued.
In practice, current call sites pass default cancellation tokens so this cannot fire today. But if a future caller (e.g., shutdown, reconnect) passes a real CT, message loss becomes possible.
Suggested fix: Check cancellation before dequeue, or catch OperationCanceledException after dequeue and re-enqueue the pending item:
while (_pendingBridgePrompts.TryDequeue(out var pending))
{
try
{
ct.ThrowIfCancellationRequested();
// ... dispatch ...
}
catch (OperationCanceledException)
{
_pendingBridgePrompts.Enqueue(pending); // re-enqueue
throw;
}
}| while (!condition() && sw.ElapsedMilliseconds < maxMs) | ||
| while (!condition() && sw.ElapsedMilliseconds < maxMs && !ct.IsCancellationRequested) | ||
| await Task.Delay(pollMs, ct); | ||
| ct.ThrowIfCancellationRequested(); |
There was a problem hiding this comment.
🟢 MINOR — ct.ThrowIfCancellationRequested() can throw even when the condition was met
Flagged by: 3/3 reviewers (after follow-up)
If cancellation fires in the same poll cycle that condition() becomes true, the loop exits (because !condition() is false), but this line unconditionally throws OperationCanceledException — failing the test even though the condition was satisfied. The race window is tiny (nanoseconds) but could cause flaky CI failures under load.
Suggested fix: Check the condition before throwing:
while (!condition() && sw.ElapsedMilliseconds < maxMs && !ct.IsCancellationRequested)
await Task.Delay(pollMs, ct);
if (condition()) return; // success — don't throw on coincidental cancel
ct.ThrowIfCancellationRequested();
throw new TimeoutException(...);
🔍 Multi-Model Code Review — PR #587 (Round 3)Scheduled Task Integration Tests & UI Improvements Validation
Previous Findings Status (Round 1→2)All 6 original findings from Round 1 were addressed in Round 2:
Round 2 regression (duplicate prompt on cancel) was also fixed with Round 3 Findings🟢 MINOR — FIFO ordering violated on cancel re-enqueue (known/accepted)File:
Investigated & Dismissed
Test Coverage
Recommendation✅ Approve — All previous findings addressed. No new bugs found in Round 3. The one minor FIFO ordering note was previously accepted. 3,526 tests passing. The untested |
- Add ScheduledTask model with cron-like scheduling support - Add scheduled task scenarios JSON for integration testing - Add ScheduledTaskTests with comprehensive coverage - Add Dashboard UI scaffolding for scheduled tasks - Add WsBridgeServer scheduled task command handling - Update SessionSidebar with scheduled task indicators - Update ScenarioReferenceTests for new scenarios - Minor test fixes in BridgePromptQueueTests and SlashCommandAutocompleteTests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…robustness - WsBridgeServer: WaitForBridgeSendToStartAsync now returns bool, polls 20×10ms (200ms), logs diagnostic warning on timeout - ScenarioReferenceTests: Replace fragile 4-level relative path test with established Assert.True cross-reference pattern - ScenarioReferenceTests: Upgrade 4 no-op Assert.True stubs to reflection- based method-existence checks (Assert.NotNull + typeof().GetMethod) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address review finding: hardcoded path restore in finally block now uses capture-restore pattern via new GetTasksFilePathForTesting() getter. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nterval display, null safety) - CSS: add .header-overflow.open to sidebar-header z-index elevation for touch devices that use click instead of hover - WsBridgeServer: propagate CancellationToken through drain loop, use WaitForBridgeSendToStartAsync return value, log on timeout - ScheduledTask: add FormatInterval() for friendly display — '2d' now shows 'Every 2 days' instead of 'Every 2880 minutes' - Tests: replace null-forgiving previousPath! with proper null check, add 3 tests for hours/days/weeks interval formatting Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- /schedule list now shows numbered indices (1, 2, 3...) - /schedule delete <#> removes a task by its list index - Updated help text and usage to document delete subcommand - Added test for usage text including delete subcommand Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- DrainPendingPromptsAsync: re-enqueue dequeued prompt on cancellation to prevent data loss - WaitForBridgeSendToStartAsync: document 200ms poll ceiling and non-fatal timeout - /schedule list: show short GUID prefix for stable task identification (TOCTOU mitigation) - HandleScheduleCommand: add TODO for session-name correlation key fragility - ScheduledTaskTests: dispose ScheduledTaskService to prevent timer leak across tests - BridgePromptQueueTests: swap CT/condition check order to preserve timeout diagnostics Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- DrainPendingPromptsAsync: track dispatched flag, only re-enqueue if dispatch didn't complete (prevents duplicate delivery when OCE from WaitForBridgeSendToStartAsync) - /schedule delete: echo short GUID prefix in delete confirmation message Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
b79d443 to
fafdc31
Compare
The `/fix` workflow now dispatches both `verify-build` AND `polypilot-integration` after pushing fixes. **polypilot-integration** now accepts `pr_number` and `ref` inputs, checks out the PR branch, and posts a results table back to the PR: - Linux/GTK: build → launch under xvfb → MauiDevFlow agent → smoke tests - Mac Catalyst: build verification - Windows: build → launch → DevFlow → smoke tests Testing this against PR #587 (scheduled task integration tests). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cross-Platform Verification — PR #587Build Results
❌ Platform verification failed
|
Cross-Platform Verification — PR #587Build Results
❌ Platform verification failed
|
Cross-Platform Verification — PR #587Build Results
❌ Platform verification failed
|
Adds end-to-end integration tests for the scheduled tasks feature (PR #587). **Test script** (`.github/integration-tests/scheduled-tasks.sh`): 1. Navigate to `/scheduled-tasks` page 2. Create a new interval task via the form 3. Verify task card appears with correct schedule/prompt 4. Toggle disable/enable 5. Validate invalid cron rejection 6. Delete the task and verify removal **Usage:** ```bash gh workflow run polypilot-integration.yml -F scenario=scheduled-tasks -F pr_number=587 -F ref=scheduled-task-integration-tests ``` Runs on both Linux/GTK and Windows via MauiDevFlow CDP. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Extends the scheduled tasks feature (PR #380) with additional integration test scenarios, UI scaffolding, and bridge support.
Changes
Tests:
scheduled-task-scenarios.json(desktop entrypoint, target existing session, slash command, persistence after relaunch, session close disables task)ScheduledTaskTests.cscovering the added scenariosScenarioReferenceTestsupdated with structural coverage for all new scenariosBridgePromptQueueTestshardened with cancellation token support and increased timeoutSlashCommandAutocompleteTestsminor fixesUI:
SessionSidebar.razor— scheduled task indicators and overflow menu link to /scheduled-tasksSessionSidebar.razor.css— styles for scheduled task indicatorsDashboard.razor— scheduled task UI scaffolding, injectedScheduledTaskServiceBridge:
WsBridgeServer.cs— scheduled task command handling with session busy checkModel:
ScheduledTask.cs— additional model support (149 lines)Misc:
DemoService.cs— removed unused importsindex.html— minor updateTesting
All 3,351 tests pass ✅