Skip to content

feat: scheduled-tasks integration test scenario#714

Merged
PureWeen merged 1 commit intomainfrom
feat/scheduled-task-integration-tests
Apr 22, 2026
Merged

feat: scheduled-tasks integration test scenario#714
PureWeen merged 1 commit intomainfrom
feat/scheduled-task-integration-tests

Conversation

@PureWeen
Copy link
Copy Markdown
Owner

Adds end-to-end integration tests for the scheduled tasks feature (PR #587).

Test script (.github/integration-tests/scheduled-tasks.sh):

  1. Navigate to /scheduled-tasks page
  2. Create a new interval task via the form
  3. Verify task card appears with correct schedule/prompt
  4. Toggle disable/enable
  5. Validate invalid cron rejection
  6. Delete the task and verify removal

Usage:

gh workflow run polypilot-integration.yml -F scenario=scheduled-tasks -F pr_number=587 -F ref=scheduled-task-integration-tests

Runs on both Linux/GTK and Windows via MauiDevFlow CDP.

Adds a bash test script that validates the scheduled tasks feature
end-to-end via MauiDevFlow CDP:
1. Navigate to /scheduled-tasks page
2. Create a task via the form (name, prompt, interval)
3. Verify task card appears with correct details
4. Toggle disable/enable
5. Validate invalid cron rejection
6. Delete the task

Wired into polypilot-integration.yml as 'scheduled-tasks' scenario
option, runs on both Linux and Windows when selected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@PureWeen PureWeen merged commit eccebe7 into main Apr 22, 2026
@PureWeen PureWeen deleted the feat/scheduled-task-integration-tests branch April 22, 2026 14:51
@github-actions
Copy link
Copy Markdown
Contributor

🟡 Design-Level Finding: Missing Mac Catalyst coverage

Flagged by: 1/3 reviewers → verified by 2 follow-up reviewers (both agree)

The "Feature Test: Scheduled Tasks" step was added to both the integration-linux (line 263) and integration-windows (line 672) jobs, but not to integration-maccatalyst. When scenario=full or scenario=scheduled-tasks is used, Mac Catalyst silently skips this test while the other two platforms run it.

If a Blazor rendering or interaction bug is specific to macOS WebKit, it would go undetected.

Suggestion: Add the same step block to the integration-maccatalyst job (after the existing smoke/copilot steps, before "Upload artifacts"):

      - name: "Feature Test: Scheduled Tasks"
        if: steps.devflow.outputs.agent_port != '' && (inputs.scenario == 'scheduled-tasks' || inputs.scenario == 'full')
        run: |
          PORT=${{ steps.devflow.outputs.agent_port }}
          bash .github/integration-tests/scheduled-tasks.sh "$PORT"

Generated by Expert Code Review (auto) for issue #714 · ● 8.9M ·

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expert Code Review — PR #714: scheduled-tasks integration test scenario

Methodology: 3 independent reviewers with adversarial consensus. Findings required 2/3+ agreement to be included. Disputed findings (1/3) were verified via targeted follow-up review. Model names are anonymized as Reviewer 1/2/3.


Findings Summary (ranked by severity)

# Severity Finding Flagged by Location
1 🟡 MODERATE ${{ }} expression injection in workflow run: blocks 2/3 polypilot-integration.yml:266,676
2 🟡 MODERATE set -euo pipefail + jq pipe → silent script abort on CDP failure 2/3 scheduled-tasks.sh:68,76
3 🟡 MODERATE Header lists 8 tests, only 6 implemented (missing "Run Now" and run-history) 3/3 scheduled-tasks.sh:7-14
4 🟡 MODERATE Fixed sleep instead of wait_for polling → flaky race conditions 3/3 scheduled-tasks.sh:153,161,183,190
5 🟡 MODERATE cdp_set_value/cdp_select embed values into single-quoted JS (fragile) 2/3 scheduled-tasks.sh:52-54
6 🟡 MODERATE No scheduled-tasks step in Mac Catalyst job (platform gap) 1/3 + 2 follow-ups agree polypilot-integration.yml (design)
7 🟢 MINOR cdp_eval builds JSON via string interpolation (latent injection risk) 2/3 scheduled-tasks.sh:37
8 🟢 MINOR No early bail-out on critical setup failure → cascading test noise 2/3 scheduled-tasks.sh:111-114

Discarded Findings (single reviewer only, not confirmed)

  • echo -e portability concern (1/3 — bash shebang makes this safe)
  • exit $FAIL exit code semantics (1/3 — max 12 tests, well under 255)
  • CDP endpoint path mismatch speculation (1/3 — no evidence of API difference)
  • Windows jq availability (1/3 → 0/3 after follow-up — jq is pre-installed on windows-latest)

CI Status

  • Commit status: pending (0 statuses reported)
  • Check runs: 6 total — 4 completed (all ✅ success), 2 in progress (agent jobs)
  • No prior reviews on this PR

Assessment

This PR adds solid integration test scaffolding with well-structured helpers and good test coverage for the scheduled-tasks CRUD flow. The main concerns are around test reliability (race conditions from fixed sleeps, silent abort from pipefail + jq) and maintainability (fragile string interpolation in CDP helpers, misleading header comments). The workflow integration pattern should use environment variables instead of inline ${{ }} expressions per GitHub security best practices. None of the findings indicate data loss or regression risk to existing functionality — this is net-new test code.

Generated by Expert Code Review (auto) for issue #714 · ● 8.9M

Comment on lines +7 to +14
# 1. Navigate to /scheduled-tasks page
# 2. Create a new interval task via the form
# 3. Verify the task card appears with correct details
# 4. Run the task immediately (Run Now)
# 5. Verify run history shows success
# 6. Disable/enable the task
# 7. Delete the task
# 8. Verify form validation (invalid cron)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Header lists 8 tests but only 6 are implemented · Flagged by: 3/3 reviewers

The header claims tests for "Run the task immediately (Run Now)" (#4) and "Verify run history shows success" (#5), but neither is implemented. The actual tests are:

  1. Navigate to /scheduled-tasks
  2. Create interval task
  3. Verify task card details
  4. Toggle disable/enable
  5. Form validation (invalid cron)
  6. Delete task

This gives false confidence that "Run Now" and run-history paths have integration coverage when they don't.

Suggestion: Update the header to match the 6 implemented tests, or add the missing "Run Now" and run-history assertions.

Comment on lines +153 to +156
fi

PROMPT_TEXT=$(cdp_text ".task-card[data-task-name='$TASK_NAME'] .task-prompt-preview")
if echo "$PROMPT_TEXT" | grep -qi "hello"; then
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — Fixed sleep instead of polling creates race conditions · Flagged by: 3/3 reviewers

Multiple test steps use bare sleep 1 followed by an immediate state check, rather than the wait_for polling pattern used elsewhere in the script. On loaded CI runners, Blazor re-renders can exceed 1 second, causing flaky false negatives.

Affected locations:

  • Line 153: toggle disable check (sleep 1 → check .disabled class)
  • Line 161: toggle re-enable check
  • Line 190: validation error check after form submit
  • Line 183: Test 5 form readiness (uses sleep 1 instead of wait_for "#scheduled-task-form" like Test 2 does)

Suggestion: Replace bare sleep + immediate assertion with wait_for:

cdp_click "... [data-task-action='toggle-enabled']"
if wait_for ".task-card[data-task-name='$TASK_NAME'].disabled" "task disabled" 5; then
  pass "Task disabled"
else
  fail "Task not visually disabled after toggle"
fi

Comment on lines +35 to +39
result=$(curl -s --max-time 10 "$BASE/api/cdp/evaluate" \
-H "Content-Type: application/json" \
-d "{\"expression\": \"$expr\"}" 2>/dev/null) || true
echo "$result"
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — cdp_eval builds JSON via string interpolation (latent injection risk) · Flagged by: 2/3 reviewers

-d "{\"expression\": \"$expr\"}" embeds $expr directly into JSON. Characters like ", \, or newlines in the expression would produce malformed JSON and a silent CDP failure (masked by || true).

Current callers are safe — all values are simple alphanumeric strings. However, this is a maintenance hazard: a future test with a value like He said "hi" or C:\path would silently break.

Suggestion: Use jq to construct the JSON payload safely:

cdp_eval() {
  local expr="$1"
  local payload
  payload=$(jq -n --arg e "$expr" '{"expression": $e}')
  curl -s --max-time 10 "$BASE/api/cdp/evaluate" \
    -H "Content-Type: application/json" \
    -d "$payload" 2>/dev/null || true
}

- name: "Feature Test: Scheduled Tasks"
if: steps.devflow.outputs.agent_port != '' && (inputs.scenario == 'scheduled-tasks' || inputs.scenario == 'full')
run: |
PORT=${{ steps.devflow.outputs.agent_port }}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — $\{\{ }} expression injected directly into run: shell script · Flagged by: 2/3 reviewers

PORT=$\{\{ steps.devflow.outputs.agent_port }} injects the step output directly into the shell script before bash evaluates it. While agent_port is numeric in practice, this is the canonical GitHub Actions expression injection pattern — if the output were ever manipulated, it would allow arbitrary command execution. The same pattern appears in the Windows job (line 676).

Suggestion: Pass through environment variables instead:

env:
  PORT: $\{\{ steps.devflow.outputs.agent_port }}
run: |
  echo "Running scheduled tasks integration tests against port $PORT"
  bash .github/integration-tests/scheduled-tasks.sh "$PORT"

Comment on lines +67 to +70
}

# Helper: get text content
cdp_text() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — set -euo pipefail + jq pipe causes silent script abort on CDP failure · Flagged by: 2/3 reviewers

If the CDP endpoint returns empty/non-JSON (network hiccup, app restart, timeout), cdp_eval returns empty (curl has || true). Then echo "" | jq -r '.result.value // false' 2>/dev/null → jq fails on invalid JSON → with pipefail, the pipe returns non-zero. When called in a bare assignment like DISABLED=$(cdp_exists "...") (line 155), set -e kills the entire script instantly — no summary, no test results, cryptic CI output.

The same issue affects cdp_text (line 76).

Suggestion: Guard against empty/invalid JSON:

echo "$result" | jq -r '.result.value // false' 2>/dev/null || echo "false"

And for cdp_text:

echo "$result" | jq -r '.result.value // empty' 2>/dev/null || true

Comment on lines +111 to +114
if wait_for "#scheduled-tasks-page" "scheduled tasks page" 10; then
pass "Navigated to /scheduled-tasks page"
else
fail "Could not navigate to /scheduled-tasks page"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR — No early bail-out on critical setup failures → cascading test noise · Flagged by: 2/3 reviewers

If Test 1 (navigation) or Test 2 (task creation) fails, fail() increments the counter but execution continues. All subsequent tests depend on the page being loaded and the task existing — they'll all fail too, producing 5+ cascading failures that obscure the root cause in CI logs.

Suggestion: After critical setup steps, bail out early:

if ! wait_for "#scheduled-tasks-page" ... 10; then
  fail "Could not navigate — aborting remaining tests"
  # print summary and exit
  exit $FAIL
fi

Comment on lines +51 to +54
cdp_eval "(() => { const el = document.querySelector('$selector'); if (el) { el.value = '$value'; el.dispatchEvent(new Event('input', {bubbles:true})); el.dispatchEvent(new Event('change', {bubbles:true})); } return el ? 'ok' : 'not found'; })()"
}

# Helper: select dropdown option
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 MODERATE — cdp_set_value / cdp_select embed $value into single-quoted JS · Flagged by: 2/3 reviewers

el.value = '$value' interpolates the value inside a JS single-quoted string. If any caller passes a value containing ' (e.g., it's a test), the JavaScript becomes syntactically invalid and the CDP call silently fails.

Current callers use safe values, but this is a general-purpose helper that invites reuse with arbitrary strings.

Suggestion: Escape single quotes before interpolation, or pass the value via a JS template literal/variable assignment pattern that avoids inline string quoting issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant