Skip to content

feat: scheduled task integration tests and UI improvements#587

Merged
PureWeen merged 7 commits intomainfrom
scheduled-task-integration-tests
Apr 22, 2026
Merged

feat: scheduled task integration tests and UI improvements#587
PureWeen merged 7 commits intomainfrom
scheduled-task-integration-tests

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

Summary

Extends the scheduled tasks feature (PR #380) with additional integration test scenarios, UI scaffolding, and bridge support.

Changes

Tests:

  • New integration test scenarios in scheduled-task-scenarios.json (desktop entrypoint, target existing session, slash command, persistence after relaunch, session close disables task)
  • New unit tests in ScheduledTaskTests.cs covering the added scenarios
  • ScenarioReferenceTests updated with structural coverage for all new scenarios
  • BridgePromptQueueTests hardened with cancellation token support and increased timeout
  • SlashCommandAutocompleteTests minor fixes

UI:

  • SessionSidebar.razor — scheduled task indicators and overflow menu link to /scheduled-tasks
  • SessionSidebar.razor.css — styles for scheduled task indicators
  • Dashboard.razor — scheduled task UI scaffolding, injected ScheduledTaskService

Bridge:

  • WsBridgeServer.cs — scheduled task command handling with session busy check

Model:

  • ScheduledTask.cs — additional model support (149 lines)

Misc:

  • DemoService.cs — removed unused imports
  • index.html — minor update

Testing

All 3,351 tests pass ✅

@PureWeen PureWeen force-pushed the scheduled-task-integration-tests branch from da50ce2 to c28de97 Compare April 17, 2026 14:10
@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 17, 2026

Expert Code Review completed successfully!

@PureWeen
Copy link
Copy Markdown
Owner Author

🔍 Multi-Model Code Review — PR #587

feat: scheduled task integration tests and UI improvements
559 additions, 13 deletions across 12 files


CI Status: ⚠️ No checks reported

No CI checks are configured for this branch. Author reports all 3,351 tests pass locally.

Prior Reviews: None (first review)


Findings

🟡 MODERATE

1. WsBridgeServer.csWaitForBridgeSendToStartAsync polls IsProcessing from background thread without memory barrier
Flagged by: 3/3 reviewers

private async Task WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default)
{
    for (var attempt = 0; attempt < 10; attempt++)
    {
        if (_copilot?.GetSession(sessionName)?.IsProcessing == true)  // background thread read
            return;
        await Task.Delay(10, ct).ConfigureAwait(false);
    }
}

IsProcessing is set on the UI thread via InvokeOnUI(). This method reads it from the drain loop's background thread. There is no explicit memory barrier guaranteeing the background thread observes the UI thread's write within the 100ms polling window. The Task.Delay calls likely provide implicit barriers, but this is an implementation detail — not a guaranteed contract.

Suggestion: Consider using InvokeOnUIAsync(() => state.Info.IsProcessing) for the read, or a TaskCompletionSource signaled from SendPromptAsync. At minimum, add a warning log when the wait expires so this is observable in diagnostics.


2. WsBridgeServer.cs — Silent timeout in WaitForBridgeSendToStartAsync causes SessionBusyException re-queue churn during drain
Flagged by: 2/3 reviewers

When WaitForBridgeSendToStartAsync times out after 100ms (e.g., UI thread busy during startup/resume), DrainPendingPromptsAsync proceeds to the next queued prompt. The next DispatchBridgePromptAsync call hits the new IsProcessing == true guard and throws SessionBusyException, re-queuing the prompt. This creates a dequeue→dispatch→busy→re-queue cycle for all remaining prompts until the first completes.

No data loss occurs (the catch block re-queues), but under load with a slow UI thread, every queued prompt after the first churns through this cycle. The silent timeout makes this invisible in logs.

// DrainPendingPromptsAsync:
await DispatchBridgePromptAsync(pending.SessionName, pending.Message, pending.AgentMode);
await WaitForBridgeSendToStartAsync(pending.SessionName);  // silently returns after 100ms
// next iteration: dispatch → busy → re-queue

Suggestion: Return a bool from WaitForBridgeSendToStartAsync indicating success, and break the drain loop when it returns false (defer remaining prompts to the next drain trigger). Log a warning on timeout.


🟢 MINOR

3. ScenarioReferenceTests.cs — Structural test reads .razor source file with fragile relative path
Flagged by: 3/3 reviewers

var sidebarPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory,
    "..", "..", "..", "..",
    "PolyPilot", "Components", "Layout", "SessionSidebar.razor");
var markup = File.ReadAllText(sidebarPath);

The ../../../../ traversal assumes a specific bin/Debug/net10.0/ output structure. This will break with custom output directories, CI publish pipelines, or dotnet test --output. Per project conventions, structural tests (grepping source code) should be supplementary, not primary coverage. This is already supplementary (the scenario JSON provides the real coverage), but the fragile path could cause false failures.

Suggestion: Guard with File.Exists(sidebarPath) and Skip.IfNot() to prevent hard failures in non-standard environments.


🟢 MINOR — Test Coverage Gap

4. No new tests for WsBridgeServer changes

The new WaitForBridgeSendToStartAsync method and the SessionBusyException throw in DispatchBridgePromptAsync lack test coverage. The existing BridgePromptQueueTests cover the SessionBusyException type and catch block, but don't exercise the new wait-then-busy interaction pattern.


ℹ️ Discarded After Adversarial Review

These findings were flagged by 1-2 reviewers but resolved through adversarial rounds and codebase verification:

Finding Verdict Reason
HandleScheduleCommand missing MessageCount update ✅ False positive Switch case uses break, falls through to common session.MessageCount = session.History.Count at line 2307
DemoService.cs MessageCount removal breaks demo mode ✅ Safe removal CompleteResponse is called via OnTurnEnd callback (line 1325-1333), which adds the assistant message and updates MessageCount. The removed line was redundant.
HandleScheduleCommand History.Add without HistoryLock ✅ Not a bug UI thread writes are serialized by Blazor's SynchronizationContext per documented convention. All existing slash command handlers follow the same pattern.
Regex ReDoS risk in SlashCommandPattern ✅ Safe Patterns are anchored (^...$) and use linear alternation — no catastrophic backtracking vectors.
/schedule 0m creates zero-interval task ✅ Handled TryParseSlashFrequency checks value <= 0, correctly rejecting 0.

✅ What's Good

  • Thorough test coverage — 4 new unit tests for slash command parsing with edge cases, 1 integration test for existing-session execution, 5 new scenario definitions
  • Proper test isolationSetTasksFilePathForTesting + finally cleanup in the new integration test
  • Clean UI refactor — Overflow menu migrated from CSS :hover to explicit state management with proper backdrop dismiss, improving accessibility (aria-haspopup="menu", aria-expanded) and mobile usability
  • Slash command implementation — Complete lifecycle: parsing, validation, creation, listing, help text, autocomplete registration
  • Overflow protectionchecked() arithmetic in TryParseSlashFrequency for large intervals

Recommendation: ✅ Approve

The two MODERATE findings are about the robustness of a best-effort sequencing mechanism in WsBridgeServer. No data loss or stuck sessions are possible — the SessionBusyException catch + re-queue provides a safety net. The suggestions above would improve observability and reduce unnecessary churn under load, but are not blocking.

The PR is well-structured, well-tested, and correctly handles all the tricky conventions (MessageCount fallthrough, History threading, file system isolation).

Reviewed by 3 independent AI reviewers with adversarial consensus. Single-reviewer findings were cross-validated against the codebase before inclusion or discarding.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #587 Expert Review — 3 Independent Reviewers · Adversarial Consensus

Methodology: 3 independent reviewers applied all 12 review dimensions against the full diff. Findings adjudicated by adversarial consensus (3/3 → include; 2/3 → include at median severity; 1/3 → validated by the other 2 reviewers before inclusion/discard).


Dimension Results

# Dimension Verdict
1 IsProcessing State Safety ✅ LGTM
2 SDK Event Handling ✅ LGTM
3 Thread Safety & Concurrency 🟡 MODERATE — see inline
4 Multi-Agent Orchestration ✅ LGTM
5 Session Persistence & Data Safety ✅ LGTM
6 Bridge Protocol & Remote Mode 🟡 MODERATE — see inline
7 MAUI / Blazor Platform Safety ✅ LGTM
8 Test Coverage & Quality �� MODERATE — see inline
9 Performance & Render Pipeline ✅ LGTM
10 Watchdog & Timeout Logic ✅ LGTM
11 Connection & Server Management ✅ LGTM
12 Code Quality & Conventions ✅ LGTM

Consensus Findings

Finding 1 (Dim 8, 3/3 reviewers): Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage reads SessionSidebar.razor via a hardcoded 4-level relative path from AppDomain.CurrentDomain.BaseDirectory. Breaks in publish/CI layouts with a raw FileNotFoundException. Also violates the "Behavioral Tests Over Structural" principle — all other scenario coverage tests in this PR use Assert.True(true, "See …"). See inline on ScenarioReferenceTests.cs:257.

Finding 2 (Dims 3+6, 2/3 reviewers): WaitForBridgeSendToStartAsync polls IsProcessing 10 × 10 ms and returns void — the caller (DrainPendingPromptsAsync) cannot distinguish "confirmed started" from "timed out", defeating the send-sequencing guard under UI-thread pressure. CancellationToken not forwarded. See inline on WsBridgeServer.cs:337.


Discarded (Single Reviewer Only)

  • Assert.True(true, …) stubs always pass — flagged by Reviewer 3; Reviewers 1+2 confirmed this is the established cross-reference documentation convention throughout the test suite. → Discarded.
  • DemoService.MessageCount removal side effect — flagged by Reviewer 1; Reviewer 2 traced CopilotService.OnTurnEnd as the correct owner of that update (confirmed by diff). → Discarded.

Notable Positives

  • TryCreateFromSlashCommand: 3 unit tests (Theory + 2 Fact) with good boundary coverage ✅
  • checked(value * 60) overflow protection in frequency parsing ✅
  • Service_ExecuteTask_ExistingSession: integration test uses ConnectionMode.Demo + SetTasksFilePathForTesting isolation ✅
  • Click-to-open overflow menu correctly closes on LocationChanged
  • NavigateHeaderOverflow bundles close + SaveUiState + navigate cleanly ✅
  • SessionBusyException guard is architecturally correct (UI-thread atomic check) ✅
  • static readonly Regex fields in ScheduledTask.cs are safe — Regex is a pure BCL type, not a platform API ✅
  • /schedule autocomplete added to both command list and withArgs list ✅
  • BridgePromptQueueTests.WaitForAsync cancellation fix is correct ✅

3 independent reviewers with adversarial consensus. CI check status not available at review time.

Generated by Expert Code Review for issue #587

{
var sidebarPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "..", "..", "..", "..",
"PolyPilot", "Components", "Layout", "SessionSidebar.razor");
var markup = File.ReadAllText(sidebarPath);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟡 MODERATE] Test Coverage & Quality — Fragile Structural Test

File.ReadAllText(sidebarPath) uses a hardcoded 4-level .. ascent from AppDomain.CurrentDomain.BaseDirectory. This holds for standard Debug builds (PolyPilot.Tests/bin/Debug/net10.0/) but silently breaks in other layouts (dotnet publish, CI artifact staging, non-standard cwd), producing an opaque FileNotFoundException with no indication of which invariant failed.

Additionally, this is a structural test (grepping source for string content) used as primary scenario coverage, conflicting with the "Behavioral Tests Over Structural" principle. All other scenario coverage tests in this PR correctly use the Assert.True(true, "See …") cross-reference pattern already established throughout ScenarioReferenceTests.cs.

Evidence: Path.Combine(BaseDirectory, "..", "..", "..", "..", "PolyPilot", ...) resolves correctly for bin/Debug/net10.0/ but fails in publish/artifact layouts — xunit surfaces a raw FileNotFoundException instead of a readable assertion failure.

Recommendation: Use the established cross-reference pattern:

[Fact]
public void Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage()
{
    Assert.True(true, "See SessionSidebar.razor: href=\"/scheduled-tasks\" confirmed in NavigateHeaderOverflow routing");
}

Or, if source-read is intentional: Assert.True(File.Exists(sidebarPath), $"Source file not found: {sidebarPath}");

Flagged by: 3/3 reviewers

Comment thread PolyPilot/Services/WsBridgeServer.cs Outdated
}
}

private async Task WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟡 MODERATE] Bridge Protocol & Thread Safety — Silent Timeout in WaitForBridgeSendToStartAsync

This method polls IsProcessing 10 × 10 ms = 100 ms total, then returns void regardless of whether IsProcessing was confirmed true or the poll simply timed out. Two issues:

1. Silent timeout — caller cannot distinguish confirmed vs. timed-out:
DrainPendingPromptsAsync uses this call to prevent back-to-back sends. IsProcessing is set inside SendPromptAsync via InvokeOnUIAsync (UI thread dispatch). Under any transient UI-thread pressure, all 10 polls can read false before the write is visible — the drain proceeds to the next queued message, defeating the guard.

2. CancellationToken not forwarded:
The call site passes no token (WaitForBridgeSendToStartAsync(pending.SessionName)), leaving the poll uninterruptible when the drain is cancelled.

Recommendation: Return bool and accept the cancellation token:

private async Task(bool) WaitForBridgeSendToStartAsync(
    string sessionName, CancellationToken ct = default)
{
    for (var i = 0; i < 20; i++) // 200 ms total
    {
        if (_copilot?.GetSession(sessionName)?.IsProcessing == true)
            return true;
        await Task.Delay(10, ct).ConfigureAwait(false);
    }
    return false; // timeout — log a warning at the call site
}

Flagged by: 2/3 reviewers

@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 17, 2026

Expert Code Review completed successfully!

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expert Code Review - PR #587

3 independent reviewers with adversarial consensus. Findings at 2/3+ agreement.

Dimension Verdict
IsProcessing State Safety LGTM
SDK Event Handling LGTM
Thread Safety LGTM
Multi-Agent Orchestration LGTM
Session Persistence LGTM
Bridge Protocol MODERATE x1
MAUI Platform Safety LGTM
Test Coverage MAJOR x1 + MODERATE x1
Performance LGTM
Watchdog LGTM
Connection Management LGTM
Code Quality LGTM

Checklist:

  • IsProcessing State Safety
  • SDK Event Handling
  • Thread Safety
  • Multi-Agent Orchestration
  • Session Persistence
  • Bridge Protocol - WaitForBridgeSendToStartAsync 100ms silent timeout
  • MAUI Platform Safety
  • Test Coverage - 4 no-op stubs + fragile path
  • Performance, Watchdog, Connection, Code Quality

MAJOR - No-op Scenario Stubs (inline on ScenarioReferenceTests.cs:291)

Four new Scenario_*_HasUnitTestCoverage methods call Assert.True(true) unconditionally. Flagged 2/3.

MODERATE - WaitForBridgeSendToStartAsync Silent Timeout (inline on WsBridgeServer.cs:339)

100ms polling window expires silently; drain loop may overlap next prompt. Flagged 2/3.

MODERATE - Fragile Relative Path (inline on ScenarioReferenceTests.cs:255)

Four levels of .. vs established 3-level pattern. Flagged 2/3.

Key LGTM Decisions

IsProcessing: DemoService targets internal AgentSessionInfo; ClearProcessingState() fires via OnTurnEnd->CompleteResponse. 3/3.
Platform Safety: static readonly Regex is pure .NET BCL, no platform API dependency. 3/3.
Thread Safety: HandleScheduleCommand on Blazor UI thread; History.Add() follows established pattern. 2/3.

No BLOCKING issues. CI status: not available.

Generated by Expert Code Review for issue #587

[Fact]
public void Scenario_ScheduledTaskTargetExistingSession_HasUnitTestCoverage()
{
Assert.True(true, "See ScheduledTaskTests.Service_EvaluateTasksAsync_ExecutesDueTasks and Service_ExecuteTask_ExistingSession_UsesThatSessionWithoutCreatingAnotherSession");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟠 MAJOR] Test Coverage — No-op Scenario Cross-Reference Stubs

Flagged by: 2/3 reviewers

This method (and three others added in this PR: Scenario_ScheduledTaskSlashCommand_HasUnitTestCoverage, Scenario_ScheduledTaskPersistsAfterRelaunch_HasUnitTestCoverage, Scenario_ScheduledTaskCloseSessionDisablesTask_HasUnitTestCoverage) are permanent no-ops: Assert.True(true, ...) passes unconditionally regardless of whether the referenced test methods exist or pass. If Service_ExecuteTask_ExistingSession_UsesThatSessionWithoutCreatingAnotherSession is renamed or deleted, this test remains green — silently hiding lost coverage.

This violates the project's Behavioral Tests Over Structural principle. The cross-reference system exists precisely to catch when scenario JSON and unit tests diverge.

Recommendation: Replace with a reflection-based method-existence check:

[Fact]
public void Scenario_ScheduledTaskTargetExistingSession_HasUnitTestCoverage()
{
    Assert.NotNull(typeof(ScheduledTaskTests)
        .GetMethod("Service_ExecuteTask_ExistingSession_UsesThatSessionWithoutCreatingAnotherSession"));
}

This fails if the referenced test is renamed or deleted, restoring the linkage intent. Alternatively, remove these stubs entirely — four permanently-green tests that verify nothing add noise, not safety.

Comment thread PolyPilot/Services/WsBridgeServer.cs Outdated

private async Task WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default)
{
for (var attempt = 0; attempt < 10; attempt++)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟡 MODERATE] Bridge Protocol — WaitForBridgeSendToStartAsync Silent Timeout

Flagged by: 2/3 reviewers

WaitForBridgeSendToStartAsync polls IsProcessing == true for a maximum of 10 × 10ms = 100ms. When the poll expires, it returns silently — no log, no error — and the drain loop immediately dispatches the next queued prompt.

Concrete scenario: Bridge drain replays 3 queued messages. First DispatchBridgePromptAsync returns before SendPromptAsync sets IsProcessing (possible under load or with slow SDK init). WaitForBridgeSendToStartAsync polls for 100ms and gives up. Drain loop calls DispatchBridgePromptAsync again. If IsProcessing still is not set, the if (info?.IsProcessing == true) throw new SessionBusyException check passes — both prompts execute concurrently, corrupting replay ordering.

Recommendation: Add a diagnostic log when the poll expires so timing regressions are observable in production:

// After the loop:
BridgeLog($"[BRIDGE] WaitForBridgeSendToStartAsync timed out for '{sessionName}' — next drain may overlap");

Consider a longer timeout (500ms with shorter poll interval) or tying the wait to the SendPromptAsync task itself for a stronger guarantee.

[Fact]
public void Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage()
{
var sidebarPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "..", "..", "..", "..",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟡 MODERATE] Test Coverage — Fragile Relative Path in Markup Coverage Test

Flagged by: 2/3 reviewers

This test reads SessionSidebar.razor via four levels of .. traversal from AppDomain.CurrentDomain.BaseDirectory — one more than the established ScenariosDir pattern (three levels, defined in TestSetup.cs line 19). If the test output directory depth changes (target framework rename such as net10.0net10.0-maccatalyst, a flat CI publish, or a new test runner), File.ReadAllText throws FileNotFoundException with no helpful diagnostic message. Passing CI then depends on invisible layout assumptions.

The test also asserts source-file content (the link href string exists in markup) rather than runtime behavior — a /scheduled-tasks link that routes to a blank page still passes.

Recommendation: Align with the established ScenariosDir pattern or define a shared SourceDir constant:

// Consistent with TestSetup.cs ScenariosDir (3 levels up):
private static string SourceDir => Path.GetFullPath(
    Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "..", "..", "..", ".."));

private static string SidebarPath => Path.Combine(
    SourceDir, "PolyPilot", "Components", "Layout", "SessionSidebar.razor");

This centralizes the depth assumption and makes failures easy to diagnose.

@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 18, 2026

Expert Code Review failed. Please review the logs for details.

@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 18, 2026

Expert Code Review completed successfully!

@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 18, 2026

Expert Code Review completed successfully!

@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 18, 2026

Expert Code Review completed successfully!

@github-actions

This comment has been minimized.

@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 18, 2026

Expert Code Review completed successfully!

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expert Code Review — PR #587

feat: add scheduled task integration tests and UI scaffolding

Methodology

3 independent reviewers conducted parallel analysis, with adversarial consensus applied to all findings. Findings required agreement from ≥2 reviewers to be included. 2 single-reviewer findings were discarded after follow-up consensus rounds.

Findings Summary

# Severity File Finding
1 🟡 MODERATE WsBridgeServer.cs:337 WaitForBridgeSendToStartAsync is dead code in the drain path — adds ~100ms latency with no sync benefit
2 🟡 MODERATE ScenarioReferenceTests.cs:253 Structural test as sole scenario coverage with fragile 4-level path traversal
3 🟡 MODERATE ScheduledTaskTests.cs:839 Hardcoded path restore in finally block — should capture/restore pre-test value

Discarded Findings (single reviewer only)

  • Regex \\d+ NIT in SlashCommandAutocompleteTests.cs — 1/3 only, both follow-up reviewers disagreed
  • Dual hover/click @@ button NIT — 1/3 only, not validated by others

Assessment

Overall: This is a solid PR that adds comprehensive scheduled task integration tests (23 tests), UI scaffolding for the /schedule slash command, overflow menu improvements, and bridge server drain-loop fixes. The test coverage is thorough and the scenarios JSON is well-structured.

The 3 findings above are improvement suggestions around test isolation patterns and a minor dead-code path — none are correctness bugs. The core functionality is sound.

CI Status: Not checked (gh CLI not authenticated in this environment).

Verdict: REQUEST_CHANGES per review protocol (🟡 MODERATE findings present). All findings are addressable with minor changes.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • 192.0.2.1

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "192.0.2.1"

See Network Configuration for more information.

Generated by Expert Code Review for issue #587

Comment thread PolyPilot/Services/WsBridgeServer.cs Outdated
}
}

private async Task WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟡 MODERATE] Bridge Protocol — Dead-code poll loop in drain path

Flagged by: 2/3 reviewers

WaitForBridgeSendToStartAsync polls 10×10ms for IsProcessing == true, but in the drain path the preceding await DispatchBridgePromptAsync already awaits full turn completion (SDK sessions await ResponseCompletion.Task). By the time this poll runs, IsProcessing has already cycled true→false. This adds ~100ms latency per drained prompt with no synchronization benefit.

Additionally, the method exits silently with no diagnostic log tag when the wait expires — per PolyPilot conventions, all processing-state-related transitions should have a log entry.

Recommendation: Either remove from the drain loop (the sequential await already guarantees ordering), or add a bool return + [BRIDGE-WAIT-TIMEOUT] diagnostic log when the wait expires. If kept for the live send handler path, document the intent.

/// Structural coverage: the desktop sidebar keeps the Scheduled Tasks overflow link wired to /scheduled-tasks.
/// </summary>
[Fact]
public void Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟡 MODERATE] Test Coverage — Structural test as sole scenario coverage

Flagged by: 3/3 reviewers (after adversarial consensus)

This test reads raw .razor markup from disk and asserts on string literals — a structural test pattern. Per PolyPilot conventions (copilot-instructions § "Behavioral Tests Over Structural"), structural tests should be supplementary guards, not primary coverage. Every other scenario cross-reference test in this file uses the Assert.True(true, "See ...") documentation pattern pointing to behavioral tests.

The 4-level ../../../.. path traversal from AppDomain.CurrentDomain.BaseDirectory is also fragile — it differs from the 3-level ScenariosDir pattern used elsewhere and may break in published/CI layouts with different output directory depths.

Recommendation: Either (a) convert to the documentation pattern (Assert.True(true, "See ...")) backed by a behavioral test, or (b) anchor the file path to the established ScenariosDir base and mark as supplementary coverage alongside a behavioral test.

Comment thread PolyPilot.Tests/ScheduledTaskTests.cs Outdated
finally
{
try { File.Delete(tempFile); } catch { }
ScheduledTaskService.SetTasksFilePathForTesting(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[🟡 MODERATE] Test Quality — Hardcoded path restore in finally block

Flagged by: 2/3 reviewers

The finally block restores SetTasksFilePathForTesting to a hardcoded Path.Combine(TestSetup.TestBaseDir, "scheduled-tasks.json") rather than capturing and restoring the pre-test value. If a test fixture or earlier test sets a different override path, this silently clobbers it.

Note: One reviewer observed this is a systemic pattern across all scheduled task tests, not unique to this one — so a fix here should ideally be applied uniformly.

Recommendation: Capture the prior path at test entry and restore it:

var prior = ScheduledTaskService.GetTasksFilePathForTesting(); // add getter if needed
try { ... }
finally { ScheduledTaskService.SetTasksFilePathForTesting(prior); }

Alternatively, use TestSetup.TestBaseDir with a GUID-based unique filename to stay consistent.

@PureWeen PureWeen force-pushed the scheduled-task-integration-tests branch from c28de97 to 1f644b1 Compare April 18, 2026 19:00
@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 18, 2026

Expert Code Review completed successfully!

@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 18, 2026

Expert Code Review completed successfully!

@github-actions

This comment has been minimized.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expert Review — PR #587: Scheduled Task Integration Tests & UI Scaffolding

Methodology: 3 independent reviewers with adversarial consensus (Reviewer 3 timed out after 25min; findings validated across Reviewers 1 and 2 with targeted follow-up challenges).

Findings by Severity

# Severity File Finding Consensus
1 🟡 MODERATE SessionSidebar.razor.css Missing .sidebar-header:has(.header-overflow.open) z-index elevation — click-opened panel can render behind toolbar on touch devices 2/2
2 🟡 MODERATE WsBridgeServer.cs WaitForBridgeSendToStartAsync return value discarded + missing cancellation token propagation during drain/shutdown 2/2
3 🟡 MODERATE ScheduledTask.cs /schedule 2d produces "Every 2880 minutes" display; daily/weekly stores dead IntervalMinutes enshrined in tests 2/2
4 🟢 MINOR ScheduledTaskTests.cs Null-forgiving previousPath! masks type-contract mismatch between getter (string?) and setter (string) 2/2

Verified Safe (No Action Needed)

  • DemoService MessageCount removalFlushCurrentResponse via CompleteResponse already handles this (2/2 verified)
  • SessionBusyException guard in DispatchBridgePromptAsync — redundant but harmless pre-check, no TOCTOU race (both run atomically on UI thread)
  • /schedule slash command control flowHandleScheduleCommand returns correctly propagate through the break in the switch case
  • Regex patterns^...$-anchored with bounded alternations, no catastrophic backtracking risk, overflow caught by checked arithmetic

Discarded (Single Reviewer Only)

  • BuildSuggestedTaskName uses interpreted regex instead of compiled (1/2 — negligible perf impact, .NET static cache handles it)
  • Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage is structural no-op Assert.True(true) (1/2 — consistent with some existing scenario reference tests)

CI & Test Assessment

  • New tests are well-structured with proper isolation (SetTasksFilePathForTesting capture/restore pattern)
  • Slash command parsing has good edge case coverage (valid inputs, invalid inputs, every prefix variant)
  • Scenario JSON definitions cover the new UI flows comprehensively
  • BridgePromptQueueTests timeout increased from 8s→15s with proper cancellation — good robustness improvement

Overall

Solid PR with good test coverage. The three moderate findings are all addressable without structural changes — the CSS z-index fix is a one-liner, the bridge cancellation token is a simple parameter pass-through, and the interval display is a ScheduleDescription improvement. No critical issues found.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • 192.0.2.1

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "192.0.2.1"

See Network Configuration for more information.

Generated by Expert Code Review for issue #587

Comment thread PolyPilot/Services/WsBridgeServer.cs Outdated
try
{
await DispatchBridgePromptAsync(pending.SessionName, pending.Message, pending.AgentMode);
await WaitForBridgeSendToStartAsync(pending.SessionName);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — WaitForBridgeSendToStartAsync return value discarded + missing cancellation token (Flagged by 2/2 reviewers)

Two issues with this call:

  1. Return value discarded: WaitForBridgeSendToStartAsync returns Task(bool) (false on timeout), but the result is never checked. Consider changing the return type to Task (since no caller uses the bool) or adding a comment documenting the intent.

  2. Missing cancellation token: This calls with CancellationToken.None (the default). During shutdown/reconnect, the method polls for the full 200ms per queued prompt. With multiple queued prompts, this blocks drain teardown unnecessarily.

Suggested fix:

await WaitForBridgeSendToStartAsync(pending.SessionName, _cts?.Token ?? default);

Comment on lines +380 to +386
{
schedule = ScheduleType.Daily;
intervalMinutes = 24 * 60;
}
else
{
intervalMinutes = checked(value * 24 * 60);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Multi-day/week intervals produce confusing user-visible display (Flagged by 2/2 reviewers)

/schedule 2d Send recap takes this branch: Schedule = Interval with intervalMinutes = 2880. ScheduleDescription renders this as "Every 2880 minutes" — confusing for users. Same issue for 2w → "Every 20160 minutes".

Additionally, the daily/weekly keyword cases (lines 353-360) set IntervalMinutes = 1440/10080, but GetNextRunTimeUtc ignores IntervalMinutes entirely for Daily/Weekly schedule types — this is dead data that the tests then enshrine as a contract.

Suggested fixes:

  • For 2d/2w: use ScheduleType.Daily/Weekly with a count multiplier, or improve ScheduleDescription to show "Every 2 days" for large interval values
  • For daily/weekly keywords: don't set IntervalMinutes to a misleading value, or add a comment documenting that it is unused for these schedule types

Comment thread PolyPilot.Tests/ScheduledTaskTests.cs Outdated
finally
{
try { File.Delete(tempFile); } catch { }
ScheduledTaskService.SetTasksFilePathForTesting(previousPath!);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — Null-forgiving operator masks type-contract mismatch (Flagged by 2/2 reviewers)

GetTasksFilePathForTesting() returns string? but SetTasksFilePathForTesting() takes string (non-nullable). The ! suppresses the warning. Safe in practice because [ModuleInitializer] always runs first, but the type-contract gap could cause issues if SetTasksFilePathForTesting ever needs to handle null explicitly.

Suggestion: Change SetTasksFilePathForTesting parameter to string? to match the getter's return type.

@PureWeen PureWeen force-pushed the scheduled-task-integration-tests branch from 57faad0 to 9bb955a Compare April 19, 2026 04:37
@PureWeen
Copy link
Copy Markdown
Owner Author

/review

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 19, 2026

Expert Code Review completed successfully!

@github-actions
Copy link
Copy Markdown
Contributor

🔍 Design-Level Observations (MINOR)

These were flagged during the multi-model review but deemed low-severity after adversarial consensus.

  • 🟢 HandleScheduleCommand sync disk I/O on UI thread (3/3 after follow-up) — ScheduledTaskService.AddTask() calls SaveTasks() with synchronous File.WriteAllText + File.Move on the Blazor render thread. For a small JSON file this is typically <5ms, and it follows the same pattern as SaveOrganization / SaveUiState. Not a regression, but worth noting if the task list grows significantly.

  • 🟢 1d vs 2d schedule type asymmetry (2/3 after follow-up) — /schedule 1d creates ScheduleType.Daily (fires at a fixed TimeOfDay), but /schedule 2d creates ScheduleType.Interval (fires every 48h from LastRunAt). The UX contract shifts at the boundary. Same for 1w/2w. This is arguably intentional ("daily" = calendar semantics, "every 2 days" = interval), but a user who expects "every 2 days at 9am" cannot express that via /schedule. Consider documenting this distinction or unifying the behavior.

Discarded Findings (1/3 only)

  • Test file isolation for _tasksFilePath in ScheduledTaskTestsTestSetup.cs already initializes the path via [ModuleInitializer], so previousPath is non-null in practice.
  • /schedule daily anchoring TimeOfDay to current wall-clock — reasonable default; could document.
  • Scenario_ScheduledTaskDesktopEntrypoint_HasMarkupCoverage provides zero behavioral coverage — follows existing codebase pattern for structural scenario references.
  • maxMs bump to 15000 lacks explanatory comment — minor documentation concern.

Generated by Expert Code Review for issue #587 ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expert Code Review — PR #587

Methodology: 3 independent reviewers with adversarial consensus. Findings only included when 2+ reviewers agreed (after follow-up validation for contested items). 4 findings discarded as single-reviewer-only.


Findings by Severity

# Severity File Finding Consensus
1 🟡 MODERATE WsBridgeServer.cs:342 WaitForBridgeSendToStartAsync polls 20×10ms but IsProcessing may already be cleared by the time polling starts — adds unnecessary 200ms latency per drained prompt on fast/error paths 2/3 initial
2 🟡 MODERATE WsBridgeServer.cs:319 DrainPendingPromptsAsync dequeues before checking cancellation — a prompt can be silently lost if CT fires between dequeue and dispatch 3/3 after follow-up
3 🟢 MINOR BridgePromptQueueTests.cs:35 ct.ThrowIfCancellationRequested() can throw even when condition was met (tiny race window, test-only code) 3/3 after follow-up
4 🟢 MINOR Dashboard.razor HandleScheduleCommand does sync disk I/O via AddTask→SaveTasks on UI thread (follows existing codebase pattern) 3/3 after follow-up
5 �� MINOR ScheduledTask.cs 1d→Daily vs 2d→Interval schedule type asymmetry (intentional design, but undocumented UX boundary) 2/3 after follow-up

Verified Safe ✅

  • DemoService MessageCount removal — Dead-code cleanup; CopilotService manages MessageCount independently
  • SessionBusyException guard — No TOCTOU race (both check and send run in same InvokeOnUIAsync callback on UI thread)
  • HandleScheduleCommand post-switch flow — Uses break (not return), so MessageCount update, _needsScrollToBottom, and SafeRefreshAsync always execute
  • Thread safety of History.Add — Runs on UI thread via Blazor event handler, consistent with existing patterns
  • static readonly Regex fields — No MAUI platform API calls; safe on all platforms. RegexOptions.Compiled falls back to interpreted on iOS/AOT
  • CancellationToken propagation in drain loop — Correctly re-throws OperationCanceledException, propagates CT to all sub-calls
  • Header overflow menu — Backdrop pattern, @onclick:stopPropagation, and LocationChanged cleanup are correct
  • Slash command autocomplete/schedule correctly added in alphabetical order with hasArgs: true

Test Coverage Assessment

New tests provide good behavioral coverage: slash command parsing (4 [InlineData] + 2 [Fact]), interval formatting (3 [Fact]), existing session execution, scenario references. The Service_ExecuteTask_ExistingSession test properly uses capture-restore for file isolation.

CI Status

CI checks not inspectable from this environment. Verify tests pass before merging.

Prior Review Status

This is the 4th commit iteration addressing previous review feedback (bridge timeout observability, test path capture-restore, z-index/CancellationToken/interval display fixes). The iteration history shows responsive and thorough addressal of prior findings.


Verdict: No critical issues found. Two moderate findings in WsBridgeServer are low practical impact (one theoretical with no real CT callers, one performance nit). Three minor findings are informational. Overall this is a well-structured PR with good test coverage.

Generated by Expert Code Review for issue #587

}
}

private async Task<bool> WaitForBridgeSendToStartAsync(string sessionName, CancellationToken ct = default)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATEWaitForBridgeSendToStartAsync may false-timeout on fast/failed sessions, adding 200ms latency per drained prompt

Flagged by: 2/3 reviewers

DispatchBridgePromptAsync awaits InvokeOnUIAsync, which fully awaits SendPromptAsync before returning. By the time this method polls IsProcessing, the turn may have already completed (especially in demo mode or fast SDK responses), causing all 20 polls to miss and logging a misleading "timed out" message. With N queued prompts that complete quickly, drain latency grows by N × 200ms.

Suggested fix: Since DispatchBridgePromptAsync returns after SendPromptAsync sets IsProcessing (or errored), reduce to 3–5 attempts (30–50ms total):

for (var attempt = 0; attempt < 5; attempt++)
{
    if (_copilot?.GetSession(sessionName)?.IsProcessing == true) return true;
    await Task.Delay(10, ct).ConfigureAwait(false);
}

Comment thread PolyPilot/Services/WsBridgeServer.cs Outdated
{
while (_pendingBridgePrompts.TryDequeue(out var pending))
{
ct.ThrowIfCancellationRequested();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Dequeue-before-cancel-check can silently lose a prompt on cancellation

Flagged by: 3/3 reviewers (after follow-up)

The loop dequeues first via TryDequeue, then checks ct.ThrowIfCancellationRequested(). If cancellation fires in that window, the prompt is permanently lost — dequeued from the queue but never dispatched or re-enqueued.

In practice, current call sites pass default cancellation tokens so this cannot fire today. But if a future caller (e.g., shutdown, reconnect) passes a real CT, message loss becomes possible.

Suggested fix: Check cancellation before dequeue, or catch OperationCanceledException after dequeue and re-enqueue the pending item:

while (_pendingBridgePrompts.TryDequeue(out var pending))
{
    try
    {
        ct.ThrowIfCancellationRequested();
        // ... dispatch ...
    }
    catch (OperationCanceledException)
    {
        _pendingBridgePrompts.Enqueue(pending); // re-enqueue
        throw;
    }
}

while (!condition() && sw.ElapsedMilliseconds < maxMs)
while (!condition() && sw.ElapsedMilliseconds < maxMs && !ct.IsCancellationRequested)
await Task.Delay(pollMs, ct);
ct.ThrowIfCancellationRequested();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINORct.ThrowIfCancellationRequested() can throw even when the condition was met

Flagged by: 3/3 reviewers (after follow-up)

If cancellation fires in the same poll cycle that condition() becomes true, the loop exits (because !condition() is false), but this line unconditionally throws OperationCanceledException — failing the test even though the condition was satisfied. The race window is tiny (nanoseconds) but could cause flaky CI failures under load.

Suggested fix: Check the condition before throwing:

while (!condition() && sw.ElapsedMilliseconds < maxMs && !ct.IsCancellationRequested)
    await Task.Delay(pollMs, ct);
if (condition()) return;  // success — don't throw on coincidental cancel
ct.ThrowIfCancellationRequested();
throw new TimeoutException(...);

@PureWeen
Copy link
Copy Markdown
Owner Author

PureWeen commented Apr 19, 2026

🔍 Multi-Model Code Review — PR #587 (Round 3)

Scheduled Task Integration Tests & UI Improvements
13 files changed · 3 independent reviewers · adversarial consensus


Validation

  • ✅ Local build passes (0 errors)
  • 3,526 tests pass (latest head fafdc31)
  • ⚠️ No CI checks configured for this branch

Previous Findings Status (Round 1→2)

All 6 original findings from Round 1 were addressed in Round 2:

  1. ✅ Drain cancellation data loss — re-enqueue on cancel
  2. ✅ TOCTOU /schedule delete — GUID prefix in list + delete confirmation
  3. WaitForBridgeSendToStartAsync 200ms cap — documented
  4. ✅ Session name correlation key — TODO comment
  5. ✅ Test ScheduledTaskService disposal — svc?.Dispose()
  6. WaitForAsync diagnostic masking — condition-before-CT order

Round 2 regression (duplicate prompt on cancel) was also fixed with dispatched flag.


Round 3 Findings

🟢 MINOR — FIFO ordering violated on cancel re-enqueue (known/accepted)

File: WsBridgeServer.csDrainPendingPromptsAsync
Flagged by: 2/3 reviewers

ConcurrentQueue.Enqueue appends to tail. If cancellation hits item A from queue [A,B,C], next drain becomes [B,C,A]. This was identified and accepted in Round 2 — cancellation during drain is rare, queues are typically 1–3 items, and prompts usually target different sessions where ordering doesn't matter. No action needed.


Investigated & Dismissed

Finding Flagged by Investigation Verdict
DemoService.MessageCount removal causes stale counts 2/3 Traced OnTurnEnd handler → calls CompleteResponse(state) (line 1338) → updates MessageCount (line 1709). Removal is safe. ❌ False positive
Id[..8] crash on short IDs 1/3 IDs use Guid.NewGuid().ToString("N") — always 32 hex chars ❌ Not possible
DeleteTask return value ignored 1/3 Delete targets by stable ID resolved from fresh snapshot; task existence validated by index bounds check ❌ Acceptable
IsProcessing read off UI thread 1/3 Best-effort poll; Task.Delay provides implicit barriers; already rejected in Round 1 adversarial ❌ Already rejected
Unused TimeOfDay/DaysOfWeek fields for interval schedules 1/3 Fields are ignored for ScheduleType.Interval; no behavioral impact ❌ Cosmetic

Test Coverage

  • ✅ Slash command parsing (Theory + Facts, 4 inline data + edge cases)
  • ✅ Interval formatting (hours, days, weeks)
  • ✅ Existing session execution
  • ✅ Autocomplete includes /schedule
  • ✅ Scenario reference cross-checks (6 new)
  • ⚠️ No test for /schedule delete or /schedule list handler (Dashboard.razor)
  • ⚠️ No test for BuildSuggestedTaskName truncation

Recommendation

Approve — All previous findings addressed. No new bugs found in Round 3. The one minor FIFO ordering note was previously accepted. 3,526 tests passing. The untested /schedule list/delete paths are low-risk UI handlers with straightforward logic — not blockers.

PureWeen and others added 7 commits April 21, 2026 11:33
- Add ScheduledTask model with cron-like scheduling support
- Add scheduled task scenarios JSON for integration testing
- Add ScheduledTaskTests with comprehensive coverage
- Add Dashboard UI scaffolding for scheduled tasks
- Add WsBridgeServer scheduled task command handling
- Update SessionSidebar with scheduled task indicators
- Update ScenarioReferenceTests for new scenarios
- Minor test fixes in BridgePromptQueueTests and SlashCommandAutocompleteTests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…robustness

- WsBridgeServer: WaitForBridgeSendToStartAsync now returns bool, polls
  20×10ms (200ms), logs diagnostic warning on timeout
- ScenarioReferenceTests: Replace fragile 4-level relative path test with
  established Assert.True cross-reference pattern
- ScenarioReferenceTests: Upgrade 4 no-op Assert.True stubs to reflection-
  based method-existence checks (Assert.NotNull + typeof().GetMethod)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address review finding: hardcoded path restore in finally block now
uses capture-restore pattern via new GetTasksFilePathForTesting() getter.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nterval display, null safety)

- CSS: add .header-overflow.open to sidebar-header z-index elevation
  for touch devices that use click instead of hover
- WsBridgeServer: propagate CancellationToken through drain loop, use
  WaitForBridgeSendToStartAsync return value, log on timeout
- ScheduledTask: add FormatInterval() for friendly display — '2d' now
  shows 'Every 2 days' instead of 'Every 2880 minutes'
- Tests: replace null-forgiving previousPath! with proper null check,
  add 3 tests for hours/days/weeks interval formatting

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- /schedule list now shows numbered indices (1, 2, 3...)
- /schedule delete <#> removes a task by its list index
- Updated help text and usage to document delete subcommand
- Added test for usage text including delete subcommand

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- DrainPendingPromptsAsync: re-enqueue dequeued prompt on cancellation to prevent data loss
- WaitForBridgeSendToStartAsync: document 200ms poll ceiling and non-fatal timeout
- /schedule list: show short GUID prefix for stable task identification (TOCTOU mitigation)
- HandleScheduleCommand: add TODO for session-name correlation key fragility
- ScheduledTaskTests: dispose ScheduledTaskService to prevent timer leak across tests
- BridgePromptQueueTests: swap CT/condition check order to preserve timeout diagnostics

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- DrainPendingPromptsAsync: track dispatched flag, only re-enqueue if dispatch
  didn't complete (prevents duplicate delivery when OCE from WaitForBridgeSendToStartAsync)
- /schedule delete: echo short GUID prefix in delete confirmation message

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen force-pushed the scheduled-task-integration-tests branch from b79d443 to fafdc31 Compare April 21, 2026 16:35
PureWeen added a commit that referenced this pull request Apr 22, 2026
The `/fix` workflow now dispatches both `verify-build` AND
`polypilot-integration` after pushing fixes.

**polypilot-integration** now accepts `pr_number` and `ref` inputs,
checks out the PR branch, and posts a results table back to the PR:
- Linux/GTK: build → launch under xvfb → MauiDevFlow agent → smoke tests
- Mac Catalyst: build verification
- Windows: build → launch → DevFlow → smoke tests

Testing this against PR #587 (scheduled task integration tests).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown
Contributor

Cross-Platform Verification — PR #587

Build Results

Platform Status
Tests (macOS) ✅ success
Mac Catalyst build ❌ failure
Windows build ✅ success

❌ Platform verification failed

⚠️ Failed Platforms Need Investigation

Check the workflow run for build logs.

Previous Review History

Found 27 automated review(s) on this PR. Build verification validates that all review-driven fixes compile and pass tests across platforms.


Triggered by: verify-build run

@github-actions
Copy link
Copy Markdown
Contributor

Cross-Platform Verification — PR #587

Build Results

Platform Status
Tests (macOS) ✅ success
Mac Catalyst build ❌ failure
Windows build ✅ success

❌ Platform verification failed

⚠️ Failed Platforms Need Investigation

Check the workflow run for build logs.

Previous Review History

Found 27 automated review(s) on this PR. Build verification validates that all review-driven fixes compile and pass tests across platforms.


Triggered by: verify-build run

@github-actions
Copy link
Copy Markdown
Contributor

Cross-Platform Verification — PR #587

Build Results

Platform Status
Tests (macOS) ❌ failure
Mac Catalyst build ❌ failure
Windows build ✅ success

❌ Platform verification failed

⚠️ Failed Platforms Need Investigation

Check the workflow run for build logs.

Previous Review History

Found 27 automated review(s) on this PR. Build verification validates that all review-driven fixes compile and pass tests across platforms.


Triggered by: verify-build run

PureWeen added a commit that referenced this pull request Apr 22, 2026
Adds end-to-end integration tests for the scheduled tasks feature (PR
#587).

**Test script** (`.github/integration-tests/scheduled-tasks.sh`):
1. Navigate to `/scheduled-tasks` page
2. Create a new interval task via the form
3. Verify task card appears with correct schedule/prompt
4. Toggle disable/enable
5. Validate invalid cron rejection
6. Delete the task and verify removal

**Usage:**
```bash
gh workflow run polypilot-integration.yml -F scenario=scheduled-tasks -F pr_number=587 -F ref=scheduled-task-integration-tests
```

Runs on both Linux/GTK and Windows via MauiDevFlow CDP.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen merged commit 98fe7eb into main Apr 22, 2026
@PureWeen PureWeen deleted the scheduled-task-integration-tests branch April 22, 2026 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant