feat: Scheduled tasks (cron jobs) for recurring prompt execution by Copilot · Pull Request #380 · PureWeen/PolyPilot

Copilot · 2026-03-15T23:43:32Z

Adds a cron-like scheduled tasks system for recurring prompt execution — daily stand-ups, periodic reviews, automated checks, etc.

Fixes #367

Features

Three schedule types: Interval (every N min), Daily (at time), Weekly (days + time)
Cron expression support: 5-field (min hour dom month dow) with wildcards, ranges, lists, and step values (e.g., 0 9 * * 1-5 for weekdays at 9am)
Full CRUD UI page at /scheduled-tasks with form validation
Task cards: toggle, edit, run-now, delete (with confirmation), and run history
Expandable run history showing last 10 executions with timestamps and error details
Background timer evaluates due tasks every 30s with interlocked overlap guard
Executes by sending prompt to existing session or creating a new one per run
Persists to ~/.polypilot/scheduled-tasks.json

Bug Fixes (from original Copilot bot PR)

Fix Interval snap-forward: missed intervals return the last boundary, not now
Fix Daily schedule: consistent local-time date comparison instead of mixed UTC/local
Fix SaveTasks race: RecordRunAndSave now holds lock through save
Fix CSS font-family enforcement: use var(--font-mono) not raw monospace
Fix .NET 10 Blazor type inference: use text input for time-of-day

Tests (71 total)

Model: serialization, schedule descriptions, time parsing, due detection
Cron: valid/invalid expressions, ranges, steps, lists, JsonRoundTrip, next-run
Validation: IsValidTimeOfDay, ScheduleType.Cron enum, max interval
Schedule edge cases: daily never-run, daily already-ran-today, interval snap-forward
Service: persistence, CRUD, execution, error handling, corrupt file handling

Files Changed

File	Description
`PolyPilot/Models/ScheduledTask.cs`	Task model with schedule calc, cron parser, validation
`PolyPilot/Services/ScheduledTaskService.cs`	Background service: timer, CRUD, execution, persistence
`PolyPilot/Components/Pages/ScheduledTasks.razor`	UI page with form, task cards, run history
`PolyPilot/Components/Pages/ScheduledTasks.razor.css`	Scoped styling
`PolyPilot/Components/Layout/SessionSidebar.razor`	Nav link (clock icon)
`PolyPilot/MauiProgram.cs`	DI registration
`PolyPilot.Tests/ScheduledTaskTests.cs`	71 tests
`PolyPilot.Tests/TestSetup.cs`	Test file path isolation
`PolyPilot.Tests/PolyPilot.Tests.csproj`	Compile includes

PureWeen · 2026-03-30T00:41:32Z

🤖 Multi-Model Code Review — PR #380

feat: Scheduled tasks (cron jobs) for recurring prompt execution
Reviewed by: Claude Opus 4.6, Claude Sonnet 4.6, GPT-5.3-Codex · 2-of-3 consensus applied
Diff: +2522/−6 across 10 files

🔴 CRITICAL — Scheduled task service never starts unless the page is visited

File: PolyPilot/MauiProgram.cs (DI registration) + PolyPilot/Services/ScheduledTaskService.cs (constructor)
Flagged by: Opus, Sonnet, Codex (unanimous)

ScheduledTaskService is registered as a DI singleton via AddSingleton<ScheduledTaskService>(), but singletons are lazily created on first injection. The only injection point is @inject ScheduledTaskService in ScheduledTasks.razor. If the user creates tasks, restarts the app, and never navigates to the Scheduled Tasks page, the service constructor never runs, the timer never starts, and no scheduled tasks ever execute.

Impact: The core feature (background recurring execution) silently does nothing for users who don't visit the page after every app launch.

Fix: Eagerly resolve the service after builder.Build():

var app = builder.Build();
app.Services.GetRequiredService<ScheduledTaskService>();
return app;

🔴 CRITICAL — Cron tasks never fire (off-by-one in time calculation)

File: PolyPilot/Models/ScheduledTask.cs, GetNextCronTimeUtc method (~line 313)
Flagged by: Opus, Codex

var candidate = new DateTime(local.Year, local.Month, local.Day,
    local.Hour, local.Minute, 0).AddMinutes(1);

The candidate always starts 1 minute in the future. IsDue() checks next <= utcNow. Since the returned time is always ≥1 minute ahead, it is never ≤ now. Example: at 09:00:15, cron 0 9 * * * → candidate starts at 09:01, minute 1 ∉ {0}, so it scans forward to tomorrow's 09:00. IsDue: tomorrow 09:00 ≤ today 09:00:15? No. The 09:00 slot is permanently skipped. This applies to all cron expressions.

Impact: Cron-type schedules never auto-execute.

Fix: Start from the current minute rather than next minute:

var candidate = new DateTime(local.Year, local.Month, local.Day,
    local.Hour, local.Minute, 0, DateTimeKind.Local);

Rely on LastRunAt to prevent re-firing within the same minute.

🔴 CRITICAL — Weekly schedule skips missed same-day slots

File: PolyPilot/Models/ScheduledTask.cs, Weekly case in GetNextRunTimeUtc (~lines 143-157)
Flagged by: Opus, Codex

if (candidateUtc <= now && i == 0) continue; // today's slot already passed

If today is a matching weekday and the scheduled time has passed, the slot is unconditionally skipped — even if the task hasn't run today. The Daily case correctly handles this with a lastRunLocal.Date < localNow.Date check, but the Weekly case does not.

Scenario: Weekly Mon-Fri 09:00, LastRunAt = Monday 09:00. It's now Tuesday 10:00. Today's 09:00 slot is skipped (candidateUtc <= now, i=0), so it returns Wednesday 09:00. Tuesday's run is lost.

Impact: Weekly tasks miss ~1 day of execution per occurrence when the timer fires after the slot time.

Fix: Mirror the Daily logic: check if LastRunAt was before today before skipping past slots.

🔴 CRITICAL — UI edit overwrites run history (data loss race)

File: PolyPilot/Components/Pages/ScheduledTasks.razor (~line 298, SaveTask) + PolyPilot/Services/ScheduledTaskService.cs (~line 81, UpdateTask)
Flagged by: Opus, Sonnet

EditTask() stores a clone of the task. When the user saves, UpdateTask replaces the canonical instance wholesale:

if (idx >= 0) _tasks[idx] = task; // replaces entire object

If the background timer executed the task between EditTask and SaveTask, the clone has stale LastRunAt and RecentRuns. The update silently erases all run history accumulated during the edit window.

Impact: Run history data loss whenever a task executes while the user has the edit form open.

Fix: UpdateTask should merge only user-editable fields (Name, Prompt, Schedule config, IsEnabled) onto the canonical instance rather than replacing the entire object. Never overwrite LastRunAt or RecentRuns.

🟡 MODERATE — Non-atomic file write risks total task loss

File: PolyPilot/Services/ScheduledTaskService.cs, SaveTasksLocked (~lines 276-280)
Flagged by: Opus, Codex

File.WriteAllText(TasksFilePath, json);

If the app crashes or is killed during WriteAllText, the file may be truncated or partially written. On next load, JsonSerializer.Deserialize throws, and the catch block returns an empty list — all tasks silently lost.

Fix: Write to a temp file, then atomic rename:

var tempPath = TasksFilePath + ".tmp";
File.WriteAllText(tempPath, json);
File.Move(tempPath, TasksFilePath, overwrite: true);

🟡 MODERATE — Disk I/O and serialization while holding lock

File: PolyPilot/Services/ScheduledTaskService.cs, SaveTasksLocked (~lines 268-282)
Flagged by: Opus, Sonnet

SaveTasks() acquires _lock, then calls SaveTasksLocked() which does JSON serialization + File.WriteAllText under the lock. All CRUD methods and the timer's RecordRunAndSave do this. Sonnet also notes that GetNextCronTimeUtc (up to 527K iterations scanning minute-by-minute for 366 days) runs inside IsDue() which is called from EvaluateTasksAsync's locked section.

Impact: UI thread freezes on GetTasks(), AddTask(), etc. while the timer thread holds the lock during I/O or cron scanning. Especially problematic on mobile/slow storage.

Fix: Snapshot _tasks.ToList() under the lock, then serialize and write outside it.

Notable Single-Model Findings (informational — did not meet 2-of-3 consensus)

Finding	Model	Severity
`EvaluateTasksAsync` foreach has no per-task exception isolation — one bad handler kills remaining due tasks	Opus	🟡
Tests leak timers (`CreateService()` never disposes, timer fires during/after tests)	Opus	🟡
`Start()` creates zombie timer after `Dispose()` — no `_disposed` guard	Sonnet	🟡
Daily/Weekly with `LastRunAt==null` fires immediately even if slot passed today	Sonnet	🟡
Cron DOM×DOW uses AND semantics; standard cron uses OR when both are restricted	Codex	🟡
Cron parser rejects `7` as Sunday (POSIX cron accepts both 0 and 7)	Opus	🟢
`DateTime` created with `Kind.Unspecified` in cron parser	Opus	🟢
`run.Success = true` means "prompt sent", not "AI responded"	Sonnet	🟢

📋 Summary

CI: ⚠️ No checks configured for this branch
Prior reviews: None (first review)
Test coverage: 71 tests — strong model/serialization coverage. Missing: service eager-start, edit-during-execution race, cron actually-fires integration test, weekly missed-day regression. The existing CronSchedule_IsDue_ReturnsTrueWhenCronMatches test has a comment acknowledging the off-by-one ("current minute won't match") and tests the broken behavior rather than the intended behavior.
Test isolation: SetTasksFilePathForTesting() in TestSetup.cs is correctly implemented ✅

Recommended action: ⚠️ Request changes

Four blockers:

🔴 Service never starts — the entire feature is a no-op after app restart unless the user visits the page
🔴 Cron tasks never fire — the AddMinutes(1) off-by-one makes all cron schedules permanently unfireable
🔴 Weekly skips days — missed-day logic doesn't check LastRunAt like Daily does
🔴 Edit overwrites run history — UpdateTask replaces entire object including stale RecentRuns/LastRunAt

Should also fix before merge:
5. 🟡 Non-atomic file write — crash during save loses all tasks
6. 🟡 Lock during I/O — potential UI freezes on mobile

Review findings addressed (all from multi-model code review): 🔴 CRITICAL fixes: 1. Service eager start — ScheduledTaskService now resolved eagerly in MauiProgram.cs so the background timer starts on app launch, not just when the user visits the Scheduled Tasks page. 2. Cron off-by-one — GetNextCronTimeUtc now starts from the current minute (not +1). LastRunAt prevents re-firing within the same minute. 3. Weekly missed-day — Weekly schedule now mirrors Daily logic: only skips today's passed slot if LastRunAt shows the task already ran today. 4. Edit preserves run history — UpdateTask now merges only user-editable fields onto the canonical instance. LastRunAt and RecentRuns are never overwritten by stale clones from the edit form. 🟡 MODERATE fixes: 5. Atomic file write — SaveTasks uses write-to-temp + File.Move to prevent data loss on crash during write. 6. I/O outside lock — SaveTasks snapshots tasks under lock, then does serialization and file I/O outside the lock to prevent UI freezes. Additional hardening: - Start() checks _disposed guard to prevent zombie timers after Dispose - Per-task exception isolation in EvaluateTasksAsync foreach loop - Fixed 5 flaky timing tests in TurnEndFallbackTests using TCS pattern - Fixed minute-boundary race in CronSchedule_IsDue_ReturnsFalseWhenAlreadyRanThisMinute - Added tests: UpdateTask preserves run history, atomic write verification, Start-after-Dispose guard All 3,088 tests pass (0 failures). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen · 2026-03-30T02:34:56Z

All 6 review findings addressed in commit ad10b28. See commit message for details. 3,088 tests pass, 0 failures.

PureWeen · 2026-03-30T03:10:12Z

🤖 Multi-Model Code Review (Round 2) — PR #380

feat: Scheduled tasks (cron jobs) for recurring prompt execution
Reviewed by: Claude Opus 4.6, Claude Sonnet 4.6, GPT-5.3-Codex · 2-of-3 consensus applied
Diff: +2784/−35 across 11 files · 5 commits since R1

R1 Findings — Fix Verification

All 6 findings from Round 1 are genuinely fixed with correct logic and good regression tests:

#	R1 Finding	Status	Verified by
1	🔴 Service never starts unless page visited	✅ FIXED	Opus, Sonnet, Codex
2	🔴 Cron tasks never fire (off-by-one)	✅ FIXED	Opus, Sonnet, Codex
3	🔴 Weekly schedule skips missed same-day slots	✅ FIXED	Opus, Sonnet, Codex
4	🔴 UI edit overwrites run history (data loss)	✅ FIXED	Opus, Sonnet, Codex
5	🟡 Non-atomic file write	✅ FIXED	Opus, Sonnet, Codex
6	🟡 Disk I/O under lock	✅ FIXED	Opus, Sonnet, Codex

Fix details:

R1-1: MauiProgram.cs now calls app.Services.GetRequiredService<ScheduledTaskService>() after Build() — timer starts at app launch
R1-2: GetNextCronTimeUtc starts candidate at current minute (not +1); LastRunAt prevents same-minute re-fire
R1-3: Weekly i==0 branch now mirrors Daily: only skips past slot if LastRunAt.Date >= localNow.Date
R1-4: UpdateTask merges only 11 editable fields; never touches LastRunAt, RecentRuns, or CreatedAt
R1-5: SaveTasks writes to .tmp then File.Move(..., overwrite: true)
R1-6: Snapshot via Clone() under lock, all File I/O outside lock

New Findings (Round 2)

🟡 MODERATE — Shared temp file path creates save race

File: PolyPilot/Services/ScheduledTaskService.cs, SaveTasks (~line 294)
Flagged by: Opus, Codex

var tempPath = TasksFilePath + ".tmp";

SaveTasks() is called from both the UI thread (AddTask/UpdateTask/DeleteTask) and the timer thread (RecordRunAndSave). Both use the same hardcoded .tmp path. Race scenario: UI save writes to .tmp → timer save overwrites .tmp → UI's File.Move moves the timer's content → timer's File.Move throws FileNotFoundException. The catch block prevents a crash, but one save's data is silently lost.

Fix: Use a unique temp file per call:

var tempPath = TasksFilePath + $".{Guid.NewGuid():N}.tmp";

🟢 MINOR — `ExecuteTaskAsync` doesn't re-check `IsEnabled` after snapshot

File: PolyPilot/Services/ScheduledTaskService.cs, EvaluateTasksAsync / ExecuteTaskAsync
Flagged by: Opus, Codex

Due task IDs are collected under lock, then executed one by one. If a user disables a task between collection and execution, the task still fires once. Impact: at most one extra execution per disable. Acceptable race window.

Notable Single-Model Findings (informational — did not meet 2-of-3 consensus)

Finding	Model	Severity
Impossible cron (`0 9 31 2 *`) holds `_lock` for ~50-500ms per eval cycle (527K iterations)	Sonnet	🔴
`AddTask` stores caller's reference directly — leaky abstraction	Sonnet	🟡
`Start()`/`Dispose()` race can create zombie timer (no memory barrier on `_disposed`)	Sonnet	🟡
`run.Success = true` means "prompt sent", not "AI responded"	Opus	🟡
Cron DOM×DOW uses AND semantics; standard cron uses OR when both restricted	Codex	🟢
DST transitions may shift cron fire times (~1 misfire/year)	Opus	🟢

✅ Positive Observations (all 3 models)

Test coverage is excellent: 75+ tests covering model, CRUD, serialization, clone isolation, cron parsing, run history preservation, atomic write, and exact R1 regression scenarios
Test isolation is proper: SetTasksFilePathForTesting in TestSetup.cs module initializer
Clone pattern is thorough: Deep copies DaysOfWeek + RecentRuns; GetTasks()/GetTask() always return clones
CSS follows conventions: All font-size values use var(--type-*) variables
Lazy property pattern: TasksFilePath uses ??= — no static readonly with platform API calls
TurnEndFallbackTests improvements: Replaced fragile Task.Delay timing with TaskCompletionSource synchronization

📋 Summary

CI: ⚠️ No checks configured for this branch
Prior comments: R1 review (6 findings) + author response (all addressed in commit ad10b28)
R1 findings: All 6 verified fixed ✅
New findings: 1 moderate (shared tmp path race), 1 minor (stale IsEnabled check)
Test coverage: Strong — regression tests for all R1 fixes, clone isolation, cron edge cases

Recommended action: ✅ Approve

All 4 critical and 2 moderate R1 findings are genuinely fixed with good test coverage. The one new moderate finding (shared .tmp path) is a real race but low-probability in practice (concurrent saves within the same millisecond window) and has a trivial one-line fix that could be addressed in a follow-up. The PR is ship-ready.

Review findings addressed (all from multi-model code review): 🔴 CRITICAL fixes: 1. Service eager start — ScheduledTaskService now resolved eagerly in MauiProgram.cs so the background timer starts on app launch, not just when the user visits the Scheduled Tasks page. 2. Cron off-by-one — GetNextCronTimeUtc now starts from the current minute (not +1). LastRunAt prevents re-firing within the same minute. 3. Weekly missed-day — Weekly schedule now mirrors Daily logic: only skips today's passed slot if LastRunAt shows the task already ran today. 4. Edit preserves run history — UpdateTask now merges only user-editable fields onto the canonical instance. LastRunAt and RecentRuns are never overwritten by stale clones from the edit form. 🟡 MODERATE fixes: 5. Atomic file write — SaveTasks uses write-to-temp + File.Move to prevent data loss on crash during write. 6. I/O outside lock — SaveTasks snapshots tasks under lock, then does serialization and file I/O outside the lock to prevent UI freezes. Additional hardening: - Start() checks _disposed guard to prevent zombie timers after Dispose - Per-task exception isolation in EvaluateTasksAsync foreach loop - Fixed 5 flaky timing tests in TurnEndFallbackTests using TCS pattern - Fixed minute-boundary race in CronSchedule_IsDue_ReturnsFalseWhenAlreadyRanThisMinute - Added tests: UpdateTask preserves run history, atomic write verification, Start-after-Dispose guard All 3,088 tests pass (0 failures). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- RunNow: fire-and-forget ExecuteTaskAsync to avoid blocking the UI - SetEnabled: guard SaveTasks/OnTasksChanged with null check on task - IsValidTimeOfDay: reject negative TimeSpan values (add >= 0 check) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen · 2026-04-01T06:24:49Z

🔍 Multi-Model Code Review — PR #380 (R3)

PR: feat: Scheduled tasks (cron jobs) for recurring prompt execution
Branch: fix/scheduled-tasks → main
Scope: 3120 lines, 7 commits (3 new since R2)
CI: ⚠️ No checks configured
Prior reviews: R1 found 6 issues (4 CRITICAL, 2 MODERATE). R2 confirmed all fixed. R3 is a fresh re-review after 3 new commits.

Review Method

3 parallel sub-agent reviews: Claude Opus 4.6 (architecture, logic), Claude Sonnet 4.6 (patterns, security), GPT-5.3-Codex (edge cases). Only findings flagged by 2+ models are included.

Consensus Findings

🟡 MODERATE-1: `SendPromptAsync` from ThreadPool violates `IsProcessing` UI-thread invariant

File: ScheduledTaskService.cs (timer callback → ExecuteTaskAsync → SendPromptAsync)
Flagged by: Opus, Codex

The Timer callback runs EvaluateTasksAsync on a ThreadPool thread, which calls SendPromptAsync. Per the project's critical invariant: "All mutations to state.Info.IsProcessing must be marshaled to the UI thread." SendPromptAsync sets IsProcessing = true on the caller's thread — here, the ThreadPool. This could cause missed UI updates or races with CompleteResponse (which runs on UI thread).

Suggested fix: Marshal the SendPromptAsync call to the UI thread via MainThread.InvokeOnMainThreadAsync().

🟡 MODERATE-2: Failed runs set `LastRunAt`, blocking Daily/Weekly retry until next cycle

File: ScheduledTask.cs — RecordRun / ScheduledTaskService.cs — RecordRunAndSave
Flagged by: Opus, Sonnet

RecordRun unconditionally sets LastRunAt = run.StartedAt even for failed runs. For a Daily task, if execution fails (e.g., CopilotService not initialized at startup), GetNextRunTimeUtc sees LastRunAt.Date == today and returns tomorrow. The task silently skips today with no retry.

Suggested fix: Either (a) don't set LastRunAt on failed runs, or (b) add a RetryCount/MaxRetries field so failed tasks re-evaluate on the next 30s tick.

🟡 MODERATE-3: Cron DOM+DOW uses AND semantics instead of standard OR

File: ScheduledTask.cs — GetNextCronTimeUtc
Flagged by: Opus, Sonnet, Codex (all 3)

if (cron.Months.Contains(candidate.Month) &&
    cron.DaysOfMonth.Contains(candidate.Day) &&
    cron.DaysOfWeek.Contains((int)candidate.DayOfWeek) &&

Standard POSIX cron: when both DOM and DOW are non-wildcard, they are OR'd. This uses AND — 0 9 15 * 1 fires only when the 15th is a Monday (~1-2×/year) instead of every Monday AND every 15th (~56×/year).

⚠️ Note: This was flagged in R1 as well. The R2 fix may have addressed a different aspect (off-by-one), but the AND/OR semantics issue persists.

🟡 MODERATE-4: `Start()`/`Dispose()` race — no synchronization on `_disposed`

File: ScheduledTaskService.cs
Flagged by: Opus, Sonnet

_disposed is a plain bool (not volatile, not lock-protected). If Dispose() runs on the UI thread while a ThreadPool thread re-enters Start(), a zombie timer can be created after disposal.

Suggested fix: Make _disposed volatile and use Interlocked for the timer swap, or protect both methods with _lock.

🟡 MODERATE-5: No tests for badge increment/clear/skip logic

File: CopilotService.cs — IncrementPendingCompletions / ClearPendingCompletions
Flagged by: Opus, Codex

The badge count logic has 3 untested behaviors: (1) active session is skipped, (2) worker sessions in multi-agent groups are skipped, (3) ClearPendingCompletions resets to 0. These are new code paths with conditional logic that could regress.

🟢 MINOR-1: `RunNow` fire-and-forget — unobserved exception if handler throws

File: ScheduledTasks.razor
Flagged by: Sonnet, Codex

_ = TaskService.ExecuteTaskAsync(task, DateTime.UtcNow);

If OnTasksChanged?.Invoke() throws, the exception becomes unobserved. Consider wrapping with try/catch in the continuation.

🟢 MINOR-2: Badge count grows unbounded for frequent interval tasks

File: CopilotService.Events.cs — IncrementPendingCompletions
Flagged by: Opus, Codex

A 1-minute interval task would produce badge count of 60/hour while in background. Consider capping at 99 or only badging for completions while the app is backgrounded.

🟢 MINOR-3: Cron brute-force O(527K) runs every 30s per cron task

File: ScheduledTask.cs — GetNextCronTimeUtc
Flagged by: Opus, Codex

Worst case iterates 366 × 24 × 60 = 527,040 times per cron task per tick. Fine for a handful of tasks, but consider caching the next fire time and only recomputing after a run.

Summary Table

#	Severity	Finding	Models
1	🟡 MODERATE	SendPromptAsync from ThreadPool violates IsProcessing UI-thread invariant	Opus, Codex
2	🟡 MODERATE	Failed runs block retry until next cycle	Opus, Sonnet
3	🟡 MODERATE	Cron DOM+DOW AND vs standard OR (persists from R1)	All 3
4	🟡 MODERATE	Start/Dispose race on non-volatile _disposed	Opus, Sonnet
5	🟡 MODERATE	No tests for badge logic	Opus, Codex
6	🟢 MINOR	RunNow fire-and-forget swallows exceptions	Sonnet, Codex
7	🟢 MINOR	Badge count grows unbounded	Opus, Codex
8	🟢 MINOR	Cron search O(527K) per tick	Opus, Codex

Prior R1/R2 Issues

✅ Lazy DI — fixed
✅ Cron off-by-one — fixed
⚠️ Cron DOM+DOW OR semantics — still present (MODERATE-3 above)
✅ Weekly skip — fixed
✅ Edit data loss — fixed
✅ Non-atomic file write — fixed (atomic-write pattern in place)
✅ Disk I/O under lock — fixed (clone/snapshot pattern)

Test Coverage

The clone/snapshot pattern, TurnEndFallbackTests, and DiagnosticsLogTests are well-tested. However, badge increment/clear logic (MODERATE-5) and scheduled task edge cases (failed-run retry, DOM+DOW cron matching) lack test coverage.

Recommended Action

⚠️ Request Changes — 5 MODERATE findings, notably the UI-thread invariant violation (MODERATE-1) and the DOM+DOW semantics regression (MODERATE-3). The core architecture is sound; these are focused fixes.

PureWeen · 2026-04-01T06:27:18Z

🔍 Multi-Model Code Review (Round 3) — PR #380

feat: Scheduled tasks (cron jobs) for recurring prompt execution
Reviewed by: Claude Opus 4.6 · Claude Sonnet 4.6 · GPT-5.3-Codex · 2-of-3 consensus applied

CI: ⚠️ No checks configured
Scope: 15 files, +2867/-30 (7 commits)
Prior reviews: R1 (4 CRITICAL + 2 MODERATE, all fixed) · R2 (approved) · 3 new commits since R2

R1/R2 Fixes — All Verified ✅

R1/R2 Finding	Status
Service eager start (timer not running)	✅ Fixed — `GetRequiredService` in MauiProgram.cs
Cron off-by-one (started from +1 minute)	✅ Fixed — starts from current minute
Weekly missed-day skip	✅ Fixed — mirrors Daily logic
Edit overwrites run history	✅ Fixed — UpdateTask merges only user-editable fields
Atomic file write	✅ Fixed — write .tmp + File.Move
I/O outside lock	✅ Fixed — snapshot under lock, serialize/write outside
RunNow fire-and-forget (R2)	✅ Fixed in `e37e5db`
IsValidTimeOfDay negative TimeSpan (R2)	✅ Fixed in `4316011`

🟡 MODERATE — Cron DOM+DOW uses AND semantics instead of standard OR

Flagged by: Opus ✓ · Codex ✓ (2/3)
File: PolyPilot/Models/ScheduledTask.cs, GetNextCronTimeUtc (~lines 335–340)

if (cron.Months.Contains(candidate.Month) &&
    cron.DaysOfMonth.Contains(candidate.Day) &&
    cron.DaysOfWeek.Contains((int)candidate.DayOfWeek) &&

Standard POSIX cron semantics: when both day-of-month and day-of-week are non-wildcard, they should be OR'd (fire if either matches). This implementation uses AND — a cron like 0 9 15 * 1 would only fire when the 15th falls on a Monday (~1-2 times/year) instead of every Monday AND every 15th (~56 times/year). Users familiar with crontab will be surprised.

Fix: Track whether DOM/DOW fields were wildcards in CronSchedule, and apply OR when both are explicit. Or document the deviation in the UI hint text.

🟢 MINOR — Cron brute-force O(527K) iteration per task per 30s tick

Flagged by: Opus ✓ · Sonnet ✓ (2/3)
File: PolyPilot/Models/ScheduledTask.cs, GetNextCronTimeUtc (~line 333)

The loop iterates up to 366 × 24 × 60 = 527,040 times for each cron task on every 30-second evaluation tick. A cron expression with no valid match within a year (e.g., 0 0 31 2 *) runs the full loop returning null, every 30s, forever.

Impact: Low for a handful of well-formed tasks (~10ms). Becomes a CPU concern with misconfigured expressions or many tasks.

Fix: Cache NextRunTime on the task and only recompute after a run or config change.

🟢 MINOR — Badge count inflates for reflection-cycle sub-turns

Flagged by: Opus ✓ · Sonnet ✓ (2/3)
File: PolyPilot/Services/CopilotService.Events.cs (~line 1217)

IncrementPendingCompletions is called in CompleteResponse, which fires on every reflection-cycle sub-turn, not just the terminal completion. A session running 5 reflection iterations in the background increments the badge 5 times. Additionally, frequent interval tasks (e.g., every 1 minute) grow the count unbounded.

Fix: Only badge when ReflectionCycle == null || ReflectionCycle.IsComplete, and consider capping the badge at 99.

Notable 1-of-3 Findings (not consensus, but worth awareness)

These were flagged by only one model but have real substance:

Finding	Model	Severity	Note
`run.Success = true` at dispatch, not response completion	Sonnet	CRITICAL	Success means "prompt dispatched", not "task completed successfully" — misleading run history
RunNow + timer can double-execute same task	Sonnet	HIGH	No per-task in-flight guard; `_evaluating` only prevents timer-vs-timer overlap
`SendPromptAsync` from ThreadPool violates `IsProcessing` UI-thread invariant	Opus	MODERATE	Timer callback → ExecuteTaskAsync → SendPromptAsync runs off UI thread
Failed runs set `LastRunAt`, blocking Daily/Weekly retry until next cycle	Opus	MODERATE	A failed 09:00 task won't retry today — `GetNextRunTimeUtc` returns tomorrow
`SaveTasks()` concurrent saves can interleave (same .tmp file)	Codex	MODERATE	Timer + UI save race on the same temp file path

Test Coverage

Area	Tested?
Model: serialization, schedule, cron, clone	✅ 71+ tests
Service: CRUD, persistence, execution, errors	✅
Cron: valid/invalid, ranges, steps, lists	✅
Validation: IsValidTimeOfDay, negative TimeSpan	✅
Thread safety: clone isolation, stale-clone exec	✅
Atomic write (no .tmp file remains)	✅
Badge: IncrementPendingCompletions / Clear	❌ No tests
Cron DOM+DOW OR semantics	❌ Not tested (bug)
Test services disposal (timer cleanup)	⚠️ Services not disposed in tests

⚠️ Recommended Action: Approve with Caveats

The architecture is solid — the clone/snapshot pattern, atomic persistence, lock discipline, and overlap guard are well-designed. All R1/R2 findings have been addressed. The 71+ tests are thorough.

Should fix before merge (consensus findings):

Cron DOM+DOW AND→OR semantics — real user-visible bug for combined day expressions

Good to fix (low priority):
2. Cache cron next-run time to avoid O(527K) per tick
3. Gate badge increment on reflection cycle completion

PureWeen · 2026-04-03T02:37:50Z

🔍 Multi-Model Code Review — PR #380 R5 (post-fix re-review)

CI status: ⚠️ No CI checks configured
Local validation: ✅ scheduled-task-focused tests and the full PolyPilot.Tests suite pass locally
Consensus: All 3 reviewers agree the previously blocking issues are now fixed and no new substantive defects were found.

Re-review status of the prior blocking findings

Finding	Status	Verdict
One long-running scheduled run could block unrelated due tasks	✅ FIXED	`EvaluateTasksAsync()` now dispatches due tasks without serially awaiting each one, so the scheduler no longer gets held hostage by a single long task
Closing/deleting the target session mid-run could stall until the 11-minute timeout	✅ FIXED	`WaitForSessionCompletionAsync()` now listens for session closure and exits quickly with a failed run result instead of spinning until timeout

Remaining findings

None. This re-review did not find any new high-confidence correctness, concurrency, persistence, or lifecycle issues in the updated diff.

Test coverage

Coverage is now materially stronger for the risky lifecycle edges introduced by the hardening work. In particular, the suite now includes dedicated regression tests for:

a long-running scheduled task not blocking another due task
a target session being closed while a scheduled run is waiting for completion
existing completion-wait, stale-edit/toggle, retry, and duplicate-session-name paths

Recommended action

✅ Approve — the blocking scheduler issues from the prior review are addressed, the new edge cases are covered by regression tests, and the PR is in good shape to merge.

Adds a cron-like scheduled tasks system for recurring prompt execution — daily stand-ups, periodic reviews, automated checks, etc. - Three schedule types: Interval (every N minutes), Daily (at time), Weekly (days + time) - Cron expression support: 5-field (min hour dom month dow) with wildcards, ranges, lists, and step values (e.g., '0 9 * * 1-5' for weekdays at 9am) - Full CRUD UI page at /scheduled-tasks with form validation - Task cards with toggle, edit, run-now, delete (with confirmation), and history - Expandable run history showing last 10 executions with timestamps and error details - Background timer evaluates due tasks every 30 seconds with overlap guard - Executes by sending prompt to existing session or creating a new one per run - Persists to ~/.polypilot/scheduled-tasks.json - Test isolation via SetTasksFilePathForTesting wired into TestSetup.cs - Fix Interval snap-forward: missed intervals return the last boundary, not 'now' - Fix Daily schedule: consistent local-time date comparison instead of mixed UTC/local - Fix SaveTasks race: RecordRunAndSave now holds lock through save - Fix CSS font-family enforcement: use var(--font-mono) not raw monospace - Fix .NET 10 Blazor type inference: use text input for time-of-day instead of type=time - Model: serialization, schedule descriptions, time parsing, due detection - Cron: valid/invalid expressions, ranges, steps, lists, JsonRoundTrip, next-run - Validation: IsValidTimeOfDay, ScheduleType.Cron enum, max interval - Schedule edge cases: daily never-run, daily already-ran-today, interval snap-forward - Service: persistence, CRUD, execution, error handling, corrupt file handling Fixes #367 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… canonical instance - ScheduledTask.Clone() deep-copies all fields including RecentRuns and DaysOfWeek - GetTasks() and GetTask() return clones so UI mutations cannot race with the timer - EvaluateTasksAsync collects task IDs (not direct references) before releasing lock - ExecuteTaskAsync(string taskId, ...) snapshots task data under lock, then executes async operations against the snapshot — no lock held across awaits - Convenience overload ExecuteTaskAsync(ScheduledTask, ...) delegates to ID-based version - RecordRunAndSave(string taskId, ...) looks up the canonical instance by ID under lock so stale UI clones can never corrupt the internal task's run history - 4 new tests: Clone independence, GetTasks/GetTask mutation isolation, stale-clone execution Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Review findings addressed (all from multi-model code review): 🔴 CRITICAL fixes: 1. Service eager start — ScheduledTaskService now resolved eagerly in MauiProgram.cs so the background timer starts on app launch, not just when the user visits the Scheduled Tasks page. 2. Cron off-by-one — GetNextCronTimeUtc now starts from the current minute (not +1). LastRunAt prevents re-firing within the same minute. 3. Weekly missed-day — Weekly schedule now mirrors Daily logic: only skips today's passed slot if LastRunAt shows the task already ran today. 4. Edit preserves run history — UpdateTask now merges only user-editable fields onto the canonical instance. LastRunAt and RecentRuns are never overwritten by stale clones from the edit form. 🟡 MODERATE fixes: 5. Atomic file write — SaveTasks uses write-to-temp + File.Move to prevent data loss on crash during write. 6. I/O outside lock — SaveTasks snapshots tasks under lock, then does serialization and file I/O outside the lock to prevent UI freezes. Additional hardening: - Start() checks _disposed guard to prevent zombie timers after Dispose - Per-task exception isolation in EvaluateTasksAsync foreach loop - Fixed 5 flaky timing tests in TurnEndFallbackTests using TCS pattern - Fixed minute-boundary race in CronSchedule_IsDue_ReturnsFalseWhenAlreadyRanThisMinute - Added tests: UpdateTask preserves run history, atomic write verification, Start-after-Dispose guard All 3,088 tests pass (0 failures). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…afety The 1024-byte threshold was too tight — when xUnit runs DiagnosticsLogTests in parallel (12 Theory cases + 4 Fact tests share one log file), concurrent writes can exceed 1KB. Bumped to 10KB which still validates the test's intent (file is nowhere near the 10MB rotation threshold). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Add BadgeHelper.cs (MacCatalyst) using UNUserNotificationCenter.SetBadgeCountAsync (Mac Catalyst 16+) with UIApplication fallback for 15.x - Track _pendingCompletionCount in CopilotService; increment in CompleteResponse for non-worker, non-active sessions only - Clear badge in SwitchSession, window.Activated, and OnResume so the count resets whenever the user returns to the app Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- RunNow: fire-and-forget ExecuteTaskAsync to avoid blocking the UI - SetEnabled: guard SaveTasks/OnTasksChanged with null check on task - IsValidTimeOfDay: reject negative TimeSpan values (add >= 0 check) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Per-task in-flight guard: prevent RunNow + timer double-execution of same task - Unique temp file path per SaveTasks() call to eliminate concurrent .tmp race - Gate dock badge on reflection cycle completion (skip sub-turn increments) - Badge code refactored: read cycle once, check IsActive before both badge and reflection paths Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Marshal scheduled Copilot calls back to the captured UI context, preserve enable toggles from stale edit snapshots, prevent stale save writes from winning, and keep daily/weekly tasks due after failed runs. Add regression tests for the retry and stale-edit cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Have scheduled runs wait for the session to actually finish before recording success, with an OnSessionComplete/poll fallback and timeout. This prevents false-positive green runs when the prompt dispatch succeeds but the turn later errors. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Prevent long-running scheduled runs from starving unrelated due tasks, fail fast when the target session closes mid-run, and add regression coverage for the new lifecycle edges. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

## Summary Extends the scheduled tasks feature (PR #380) with additional integration test scenarios, UI scaffolding, and bridge support. ### Changes **Tests:** - New integration test scenarios in `scheduled-task-scenarios.json` (desktop entrypoint, target existing session, slash command, persistence after relaunch, session close disables task) - New unit tests in `ScheduledTaskTests.cs` covering the added scenarios - `ScenarioReferenceTests` updated with structural coverage for all new scenarios - `BridgePromptQueueTests` hardened with cancellation token support and increased timeout - `SlashCommandAutocompleteTests` minor fixes **UI:** - `SessionSidebar.razor` — scheduled task indicators and overflow menu link to /scheduled-tasks - `SessionSidebar.razor.css` — styles for scheduled task indicators - `Dashboard.razor` — scheduled task UI scaffolding, injected `ScheduledTaskService` **Bridge:** - `WsBridgeServer.cs` — scheduled task command handling with session busy check **Model:** - `ScheduledTask.cs` — additional model support (149 lines) **Misc:** - `DemoService.cs` — removed unused imports - `index.html` — minor update ### Testing All 3,351 tests pass ✅ --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI assigned Copilot and PureWeen Mar 15, 2026

Copilot started work on behalf of PureWeen March 15, 2026 23:43 View session

Copilot AI changed the title ~~[WIP] [request] Add cron job for replicating new claude code tasks~~ Add scheduled tasks for recurring prompt execution Mar 16, 2026

Copilot AI requested a review from PureWeen March 16, 2026 00:14

Copilot finished work on behalf of PureWeen March 16, 2026 00:14

PureWeen force-pushed the copilot/add-cron-job-for-clause-tasks branch 2 times, most recently from 109fbaf to 01825de Compare March 29, 2026 16:49

PureWeen changed the title ~~Add scheduled tasks for recurring prompt execution~~ feat: Scheduled tasks (cron jobs) for recurring prompt execution Mar 29, 2026

PureWeen force-pushed the copilot/add-cron-job-for-clause-tasks branch from 9179d1a to 65565a1 Compare March 31, 2026 12:16

PureWeen marked this pull request as ready for review April 3, 2026 02:26

PureWeen and others added 9 commits April 5, 2026 22:58

fix: guard ParseTimeOfDay against negative TimeSpan values

3accf8e

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen force-pushed the copilot/add-cron-job-for-clause-tasks branch from 9c77e34 to f3cd0f1 Compare April 6, 2026 04:14

PureWeen and others added 2 commits April 5, 2026 23:21

fix: finish scheduled task hardening

7937bef

Prevent long-running scheduled runs from starving unrelated due tasks, fail fast when the target session closes mid-run, and add regression coverage for the new lifecycle edges. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

PureWeen merged commit 20680c3 into main Apr 6, 2026

PureWeen deleted the copilot/add-cron-job-for-clause-tasks branch April 6, 2026 06:30

PureWeen mentioned this pull request Apr 15, 2026

feat: scheduled task integration tests and UI improvements #587

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Scheduled tasks (cron jobs) for recurring prompt execution#380

feat: Scheduled tasks (cron jobs) for recurring prompt execution#380
PureWeen merged 11 commits intomainfrom
copilot/add-cron-job-for-clause-tasks

Copilot AI commented Mar 15, 2026 •

edited by PureWeen

Loading

Uh oh!

PureWeen commented Mar 30, 2026

Uh oh!

PureWeen commented Mar 30, 2026

Uh oh!

PureWeen commented Mar 30, 2026

Uh oh!

PureWeen commented Apr 1, 2026

Uh oh!

PureWeen commented Apr 1, 2026

Uh oh!

PureWeen commented Apr 3, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Copilot AI commented Mar 15, 2026 • edited by PureWeen Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Features

Bug Fixes (from original Copilot bot PR)

Tests (71 total)

Files Changed

Uh oh!

PureWeen commented Mar 30, 2026

🤖 Multi-Model Code Review — PR #380

🔴 CRITICAL — Scheduled task service never starts unless the page is visited

🔴 CRITICAL — Cron tasks never fire (off-by-one in time calculation)

🔴 CRITICAL — Weekly schedule skips missed same-day slots

🔴 CRITICAL — UI edit overwrites run history (data loss race)

🟡 MODERATE — Non-atomic file write risks total task loss

🟡 MODERATE — Disk I/O and serialization while holding lock

Notable Single-Model Findings (informational — did not meet 2-of-3 consensus)

📋 Summary

Uh oh!

PureWeen commented Mar 30, 2026

Uh oh!

PureWeen commented Mar 30, 2026

🤖 Multi-Model Code Review (Round 2) — PR #380

R1 Findings — Fix Verification

New Findings (Round 2)

🟡 MODERATE — Shared temp file path creates save race

🟢 MINOR — ExecuteTaskAsync doesn't re-check IsEnabled after snapshot

Notable Single-Model Findings (informational — did not meet 2-of-3 consensus)

✅ Positive Observations (all 3 models)

📋 Summary

Uh oh!

PureWeen commented Apr 1, 2026

🔍 Multi-Model Code Review — PR #380 (R3)

Review Method

Consensus Findings

🟡 MODERATE-1: SendPromptAsync from ThreadPool violates IsProcessing UI-thread invariant

🟡 MODERATE-2: Failed runs set LastRunAt, blocking Daily/Weekly retry until next cycle

🟡 MODERATE-3: Cron DOM+DOW uses AND semantics instead of standard OR

🟡 MODERATE-4: Start()/Dispose() race — no synchronization on _disposed

🟡 MODERATE-5: No tests for badge increment/clear/skip logic

🟢 MINOR-1: RunNow fire-and-forget — unobserved exception if handler throws

🟢 MINOR-2: Badge count grows unbounded for frequent interval tasks

🟢 MINOR-3: Cron brute-force O(527K) runs every 30s per cron task

Summary Table

Prior R1/R2 Issues

Test Coverage

Recommended Action

Uh oh!

PureWeen commented Apr 1, 2026

🔍 Multi-Model Code Review (Round 3) — PR #380

R1/R2 Fixes — All Verified ✅

🟡 MODERATE — Cron DOM+DOW uses AND semantics instead of standard OR

🟢 MINOR — Cron brute-force O(527K) iteration per task per 30s tick

🟢 MINOR — Badge count inflates for reflection-cycle sub-turns

Notable 1-of-3 Findings (not consensus, but worth awareness)

Test Coverage

⚠️ Recommended Action: Approve with Caveats

Uh oh!

PureWeen commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔍 Multi-Model Code Review — PR #380 R5 (post-fix re-review)

Re-review status of the prior blocking findings

Remaining findings

Test coverage

Recommended action

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Mar 15, 2026 •

edited by PureWeen

Loading

🟢 MINOR — `ExecuteTaskAsync` doesn't re-check `IsEnabled` after snapshot

🟡 MODERATE-1: `SendPromptAsync` from ThreadPool violates `IsProcessing` UI-thread invariant

🟡 MODERATE-2: Failed runs set `LastRunAt`, blocking Daily/Weekly retry until next cycle

🟡 MODERATE-4: `Start()`/`Dispose()` race — no synchronization on `_disposed`

🟢 MINOR-1: `RunNow` fire-and-forget — unobserved exception if handler throws

PureWeen commented Apr 3, 2026 •

edited

Loading