feat: scheduled-tasks integration test scenario by PureWeen · Pull Request #714 · PureWeen/PolyPilot

PureWeen · 2026-04-22T14:51:34Z

Adds end-to-end integration tests for the scheduled tasks feature (PR #587).

Test script (.github/integration-tests/scheduled-tasks.sh):

Navigate to /scheduled-tasks page
Create a new interval task via the form
Verify task card appears with correct schedule/prompt
Toggle disable/enable
Validate invalid cron rejection
Delete the task and verify removal

Usage:

gh workflow run polypilot-integration.yml -F scenario=scheduled-tasks -F pr_number=587 -F ref=scheduled-task-integration-tests

Runs on both Linux/GTK and Windows via MauiDevFlow CDP.

Adds a bash test script that validates the scheduled tasks feature end-to-end via MauiDevFlow CDP: 1. Navigate to /scheduled-tasks page 2. Create a task via the form (name, prompt, interval) 3. Verify task card appears with correct details 4. Toggle disable/enable 5. Validate invalid cron rejection 6. Delete the task Wired into polypilot-integration.yml as 'scheduled-tasks' scenario option, runs on both Linux and Windows when selected. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-04-22T15:08:26Z

🟡 Design-Level Finding: Missing Mac Catalyst coverage

Flagged by: 1/3 reviewers → verified by 2 follow-up reviewers (both agree)

The "Feature Test: Scheduled Tasks" step was added to both the integration-linux (line 263) and integration-windows (line 672) jobs, but not to integration-maccatalyst. When scenario=full or scenario=scheduled-tasks is used, Mac Catalyst silently skips this test while the other two platforms run it.

If a Blazor rendering or interaction bug is specific to macOS WebKit, it would go undetected.

Suggestion: Add the same step block to the integration-maccatalyst job (after the existing smoke/copilot steps, before "Upload artifacts"):

      - name: "Feature Test: Scheduled Tasks"
        if: steps.devflow.outputs.agent_port != '' && (inputs.scenario == 'scheduled-tasks' || inputs.scenario == 'full')
        run: |
          PORT=${{ steps.devflow.outputs.agent_port }}
          bash .github/integration-tests/scheduled-tasks.sh "$PORT"

Generated by Expert Code Review (auto) for issue #714 · ● 8.9M · ◷

github-actions

Expert Code Review — PR #714: scheduled-tasks integration test scenario

Methodology: 3 independent reviewers with adversarial consensus. Findings required 2/3+ agreement to be included. Disputed findings (1/3) were verified via targeted follow-up review. Model names are anonymized as Reviewer 1/2/3.

Findings Summary (ranked by severity)

#	Severity	Finding	Flagged by	Location
1	🟡 MODERATE	`${{ }}` expression injection in workflow `run:` blocks	2/3	`polypilot-integration.yml:266,676`
2	🟡 MODERATE	`set -euo pipefail` + jq pipe → silent script abort on CDP failure	2/3	`scheduled-tasks.sh:68,76`
3	🟡 MODERATE	Header lists 8 tests, only 6 implemented (missing "Run Now" and run-history)	3/3	`scheduled-tasks.sh:7-14`
4	🟡 MODERATE	Fixed `sleep` instead of `wait_for` polling → flaky race conditions	3/3	`scheduled-tasks.sh:153,161,183,190`
5	🟡 MODERATE	`cdp_set_value`/`cdp_select` embed values into single-quoted JS (fragile)	2/3	`scheduled-tasks.sh:52-54`
6	🟡 MODERATE	No scheduled-tasks step in Mac Catalyst job (platform gap)	1/3 + 2 follow-ups agree	`polypilot-integration.yml` (design)
7	🟢 MINOR	`cdp_eval` builds JSON via string interpolation (latent injection risk)	2/3	`scheduled-tasks.sh:37`
8	🟢 MINOR	No early bail-out on critical setup failure → cascading test noise	2/3	`scheduled-tasks.sh:111-114`

Discarded Findings (single reviewer only, not confirmed)

echo -e portability concern (1/3 — bash shebang makes this safe)
exit $FAIL exit code semantics (1/3 — max 12 tests, well under 255)
CDP endpoint path mismatch speculation (1/3 — no evidence of API difference)
Windows jq availability (1/3 → 0/3 after follow-up — jq is pre-installed on windows-latest)

CI Status

Commit status: pending (0 statuses reported)
Check runs: 6 total — 4 completed (all ✅ success), 2 in progress (agent jobs)
No prior reviews on this PR

Assessment

This PR adds solid integration test scaffolding with well-structured helpers and good test coverage for the scheduled-tasks CRUD flow. The main concerns are around test reliability (race conditions from fixed sleeps, silent abort from pipefail + jq) and maintainability (fragile string interpolation in CDP helpers, misleading header comments). The workflow integration pattern should use environment variables instead of inline ${{ }} expressions per GitHub security best practices. None of the findings indicate data loss or regression risk to existing functionality — this is net-new test code.

Generated by Expert Code Review (auto) for issue #714 · ● 8.9M

github-actions · 2026-04-22T15:08:26Z

+#   1. Navigate to /scheduled-tasks page
+#   2. Create a new interval task via the form
+#   3. Verify the task card appears with correct details
+#   4. Run the task immediately (Run Now)
+#   5. Verify run history shows success
+#   6. Disable/enable the task
+#   7. Delete the task
+#   8. Verify form validation (invalid cron)


🟡 MODERATE — Header lists 8 tests but only 6 are implemented · Flagged by: 3/3 reviewers

The header claims tests for "Run the task immediately (Run Now)" (#4) and "Verify run history shows success" (#5), but neither is implemented. The actual tests are:

Navigate to /scheduled-tasks

Create interval task

Verify task card details

Toggle disable/enable

Form validation (invalid cron)

Delete task

This gives false confidence that "Run Now" and run-history paths have integration coverage when they don't.

Suggestion: Update the header to match the 6 implemented tests, or add the missing "Run Now" and run-history assertions.

github-actions · 2026-04-22T15:08:27Z

+fi
+
+PROMPT_TEXT=$(cdp_text ".task-card[data-task-name='$TASK_NAME'] .task-prompt-preview")
+if echo "$PROMPT_TEXT" | grep -qi "hello"; then


🟡 MODERATE — Fixed sleep instead of polling creates race conditions · Flagged by: 3/3 reviewers

Multiple test steps use bare sleep 1 followed by an immediate state check, rather than the wait_for polling pattern used elsewhere in the script. On loaded CI runners, Blazor re-renders can exceed 1 second, causing flaky false negatives.

Affected locations:

Line 153: toggle disable check (sleep 1 → check .disabled class)

Line 161: toggle re-enable check

Line 190: validation error check after form submit

Line 183: Test 5 form readiness (uses sleep 1 instead of wait_for "#scheduled-task-form" like Test 2 does)

Suggestion: Replace bare sleep + immediate assertion with wait_for:

cdp_click "... [data-task-action='toggle-enabled']" if wait_for ".task-card[data-task-name='$TASK_NAME'].disabled" "task disabled" 5; then pass "Task disabled" else fail "Task not visually disabled after toggle" fi

github-actions · 2026-04-22T15:08:27Z

+  result=$(curl -s --max-time 10 "$BASE/api/cdp/evaluate" \
+    -H "Content-Type: application/json" \
+    -d "{\"expression\": \"$expr\"}" 2>/dev/null) || true
+  echo "$result"
+}


🟢 MINOR — cdp_eval builds JSON via string interpolation (latent injection risk) · Flagged by: 2/3 reviewers

-d "{\"expression\": \"$expr\"}" embeds $expr directly into JSON. Characters like ", \, or newlines in the expression would produce malformed JSON and a silent CDP failure (masked by || true).

Current callers are safe — all values are simple alphanumeric strings. However, this is a maintenance hazard: a future test with a value like He said "hi" or C:\path would silently break.

Suggestion: Use jq to construct the JSON payload safely:

cdp_eval() { local expr="$1" local payload payload=$(jq -n --arg e "$expr" '{"expression": $e}') curl -s --max-time 10 "$BASE/api/cdp/evaluate" \ -H "Content-Type: application/json" \ -d "$payload" 2>/dev/null || true }

github-actions · 2026-04-22T15:08:27Z

+      - name: "Feature Test: Scheduled Tasks"
+        if: steps.devflow.outputs.agent_port != '' && (inputs.scenario == 'scheduled-tasks' || inputs.scenario == 'full')
+        run: |
+          PORT=${{ steps.devflow.outputs.agent_port }}


🟡 MODERATE — $\{\{ }} expression injected directly into run: shell script · Flagged by: 2/3 reviewers

PORT=$\{\{ steps.devflow.outputs.agent_port }} injects the step output directly into the shell script before bash evaluates it. While agent_port is numeric in practice, this is the canonical GitHub Actions expression injection pattern — if the output were ever manipulated, it would allow arbitrary command execution. The same pattern appears in the Windows job (line 676).

Suggestion: Pass through environment variables instead:

env: PORT: $\{\{ steps.devflow.outputs.agent_port }} run: | echo "Running scheduled tasks integration tests against port $PORT" bash .github/integration-tests/scheduled-tasks.sh "$PORT"

github-actions · 2026-04-22T15:08:27Z

+}
+
+# Helper: get text content
+cdp_text() {


🟡 MODERATE — set -euo pipefail + jq pipe causes silent script abort on CDP failure · Flagged by: 2/3 reviewers

If the CDP endpoint returns empty/non-JSON (network hiccup, app restart, timeout), cdp_eval returns empty (curl has || true). Then echo "" | jq -r '.result.value // false' 2>/dev/null → jq fails on invalid JSON → with pipefail, the pipe returns non-zero. When called in a bare assignment like DISABLED=$(cdp_exists "...") (line 155), set -e kills the entire script instantly — no summary, no test results, cryptic CI output.

The same issue affects cdp_text (line 76).

Suggestion: Guard against empty/invalid JSON:

echo "$result" | jq -r '.result.value // false' 2>/dev/null || echo "false"

And for cdp_text:

echo "$result" | jq -r '.result.value // empty' 2>/dev/null || true

github-actions · 2026-04-22T15:08:27Z

+if wait_for "#scheduled-tasks-page" "scheduled tasks page" 10; then
+  pass "Navigated to /scheduled-tasks page"
+else
+  fail "Could not navigate to /scheduled-tasks page"


🟢 MINOR — No early bail-out on critical setup failures → cascading test noise · Flagged by: 2/3 reviewers

If Test 1 (navigation) or Test 2 (task creation) fails, fail() increments the counter but execution continues. All subsequent tests depend on the page being loaded and the task existing — they'll all fail too, producing 5+ cascading failures that obscure the root cause in CI logs.

Suggestion: After critical setup steps, bail out early:

if ! wait_for "#scheduled-tasks-page" ... 10; then fail "Could not navigate — aborting remaining tests" # print summary and exit exit $FAIL fi

github-actions · 2026-04-22T15:08:27Z

+  cdp_eval "(() => { const el = document.querySelector('$selector'); if (el) { el.value = '$value'; el.dispatchEvent(new Event('input', {bubbles:true})); el.dispatchEvent(new Event('change', {bubbles:true})); } return el ? 'ok' : 'not found'; })()"
+}
+
+# Helper: select dropdown option


🟡 MODERATE — cdp_set_value / cdp_select embed $value into single-quoted JS · Flagged by: 2/3 reviewers

el.value = '$value' interpolates the value inside a JS single-quoted string. If any caller passes a value containing ' (e.g., it's a test), the JavaScript becomes syntactically invalid and the CDP call silently fails.

Current callers use safe values, but this is a general-purpose helper that invites reuse with arbitrary strings.

Suggestion: Escape single quotes before interpolation, or pass the value via a JS template literal/variable assignment pattern that avoids inline string quoting issues.

PureWeen merged commit eccebe7 into main Apr 22, 2026

PureWeen deleted the feat/scheduled-task-integration-tests branch April 22, 2026 14:51

github-actions Bot mentioned this pull request Apr 22, 2026

[review-retro] Review Retrospective — PR #714 (feat: scheduled-tasks integration test scenario) #715

Closed

github-actions Bot reviewed Apr 22, 2026

View reviewed changes

github-actions Bot mentioned this pull request Apr 22, 2026

[review-retro] Review Retrospective — PR #719 (fix: fetch integration test scripts from main) #721

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: scheduled-tasks integration test scenario#714

feat: scheduled-tasks integration test scenario#714
PureWeen merged 1 commit intomainfrom
feat/scheduled-task-integration-tests

PureWeen commented Apr 22, 2026

Uh oh!

github-actions Bot commented Apr 22, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot Apr 22, 2026

Uh oh!

github-actions Bot Apr 22, 2026

Uh oh!

github-actions Bot Apr 22, 2026

Uh oh!

github-actions Bot Apr 22, 2026

Uh oh!

github-actions Bot Apr 22, 2026

Uh oh!

github-actions Bot Apr 22, 2026

Uh oh!

github-actions Bot Apr 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

PureWeen commented Apr 22, 2026

Uh oh!

github-actions Bot commented Apr 22, 2026

🟡 Design-Level Finding: Missing Mac Catalyst coverage

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Expert Code Review — PR #714: scheduled-tasks integration test scenario

Findings Summary (ranked by severity)

Discarded Findings (single reviewer only, not confirmed)

CI Status

Assessment

Uh oh!

github-actions Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot Apr 22, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant