Issue #194: Add cross-cutting E2E policy, guardrails, and CI lanes by auerbachb · Pull Request #222 · auerbachb/still-point

auerbachb · 2026-04-24T05:04:38Z

Summary

Implements the cross-cutting E2E testing policy baseline for web and iOS in this repository, including executable guardrails and CI lane wiring.

What changed

Added cross-cutting policy documentation:
- docs/testing/e2e-policy.md
- ios/E2E_RUNBOOK.md
- README + iOS release docs links to policy/runbooks
Added Playwright web E2E baseline:
- playwright.config.ts
- e2e/web/smoke.spec.ts
- e2e/web/critical.spec.ts
Added policy/guard scripts:
- scripts/e2e/assert-non-prod.mjs
- scripts/e2e/check-no-naked-sleep.mjs
- scripts/e2e/check-test-secrets.mjs
- scripts/e2e/run-playwright-lane.mjs
- scripts/e2e/run-ios-tests.sh
- scripts/e2e/report-web-perf.mjs
- scripts/e2e/report-ios-perf.mjs
Added CI lane workflows:
- .github/workflows/e2e-web.yml
- .github/workflows/e2e-ios.yml
Updated package scripts and ignores:
- package.json / package-lock.json
- .gitignore
Extended seed isolation support:
- scripts/seed.ts now supports E2E_RUN_ID for run-unique users
Updated .env.example with E2E placeholders only (no real creds)

Follow-up hardening after initial CI + review feedback

Web lanes no longer hard-require E2E credential secrets when current tagged specs do not consume auth credentials.
iOS lane runner now checks for suite presence before enforcing credential presence and supports E2E_SECRETS_REQUIRED fallback behavior.
iOS workflow refactored into a single matrix job (smoke, critical) to reduce duplication.
Removed duplicate web guard workflow steps now that lane scripts already invoke e2e:guard.
Added explicit note that web perf values are baseline placeholders in report-only mode.
Guard script now allows Vercel preview URLs when E2E_ALLOW_VERCEL_PREVIEW=true and warns on unparsable URL inputs.
Tightened perf parsing/validation:
- iOS perf now fails if boot metric is missing.
- iOS numeric parsing now rejects trailing garbage literals.
- Web perf threshold now validates positive finite threshold when enforcement is enabled.
Script clarity improvements:
- renamed parsePositiveInt -> parseNonNegativeInt.
- removed forced Playwright --reporter line so config reporters are respected.
- renamed iOS project variable/env (PROJECT_PATH / IOS_TEST_PROJECT) and improved secret-check condition readability.
Updated iOS runbook simulator destination guidance to be environment-flexible.
Sleep allowlist generalized to permit bounded numeric backoff formulas in workflows.
iOS lane runner now always writes lane status file (passed / failed / skipped) for downstream tooling.

Merge maintenance

Merged latest origin/main into this branch.
Resolved one conflict in ios/RELEASING.md by preserving both:
- main-branch release-readiness/app-store checklist updates
- this branch’s iOS E2E policy/runbook linkage section

Acceptance criteria coverage

Retries + non-retriable classes: documented lane-by-lane in policy + wired per lane scripts.
Test data isolation: E2E_RUN_ID support in scripts/seed.ts and documented disposable/reset strategy.
Deterministic waits: naked sleep ban documented and enforced by e2e:policy:no-sleep script.
Secrets policy: placeholders only in .env.example; CI/local env-only requirement and checker script.
Environment guard: hard-fail for production hosts via e2e:guard and iOS lane guardrails.
Artifacts on failure: Playwright trace/screenshot/video and workflow artifact uploads; iOS logs/xcresult upload wiring.
Flake triage rubric: documented classification, quarantine criteria, owner, SLA.
Tagging/selection strategy: @smoke, @critical, @visual, @perf policy + lane wiring.
Visual checks v1 scope: explicitly documented (targeted assertions in, snapshot gating out).
Performance hooks: one lightweight metric per platform with optional enforce toggles.
PR gating policy: documented required lanes for merge.
Local runbook: one-command lane entries and env/seed instructions.
Post-failure reproducibility: exact rerun guidance for web and iOS documented.
Child issue linkage: policy explicitly links [Testing] E2E (web desktop): core journeys, layout, a11y & errors #191/[Testing] E2E (mobile web): viewport, touch, layout & Safari-class behavior #192/[Testing] E2E (iOS native): journeys beyond auth, layout & VoiceOver smoke #193 and marks inheritance.

Validation run

npm run e2e:policy
npm run e2e:web:smoke
npm run e2e:web:critical
E2E_WEB_FIRST_RESPONSE_MS=800 E2E_WEB_FIRST_RESPONSE_THRESHOLD_MS=1200 E2E_WEB_PERF_ENFORCE=false npm run e2e:web:perf
IOS_E2E_APP_BOOT_SECONDS=4.2 npm run e2e:ios:perf
E2E_BASE_URL=http://127.0.0.1:3000 E2E_SECRETS_REQUIRED=false npm run e2e:ios:smoke (expected skip in this repo state because ios/StillPointUITests is not present)

Notes

coderabbit review --prompt-only could not be executed in this environment (coderabbit: command not found).

Summary by CodeRabbit

New Features
- End-to-end testing for web and iOS with smoke/critical lanes, local run commands, lane runners, and performance reporting.
Tests
- Added Playwright web smoke/critical UI checks and iOS E2E lane integrations.
Documentation
- E2E policy, platform runbooks, README guidance on standards, artifact expectations, and rerun/triage procedures.
Chores
- CI workflows, test guardrails/validation checks, perf metric capture, env example placeholders, gitignore updates, artifact handling, and run-scoped seeding.

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

vercel · 2026-04-24T05:04:43Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
still-point	Ready	Preview, Comment	Apr 24, 2026 6:34pm

coderabbitai · 2026-04-24T05:04:43Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: df955175-6b6e-436f-bd8c-8212f1f27a86

📥 Commits

Reviewing files that changed from the base of the PR and between 294fcaa and 8fd18e6.

📒 Files selected for processing (1)

ios/RELEASING.md

✅ Files skipped from review due to trivial changes (1)

ios/RELEASING.md

📝 Walkthrough

Walkthrough

Adds comprehensive E2E infrastructure: Playwright web tests and config, iOS XCUITest runner and workflows, CI policy gates and artifact uploads, repository guard and perf-reporting scripts, run-scoped seeding, docs/runbooks, .env example placeholders, and npm scripts to orchestrate lanes.

Changes

Cohort / File(s)	Summary
Env & repo hygiene `\.env\.example`, `\.gitignore`	Adds E2E env placeholders (`E2E_TEST_USER_EMAIL`, `E2E_TEST_USER_PASSWORD`, `E2E_BASE_URL`) and ignores Playwright/report and artifacts directories.
CI Workflows `.github/workflows/e2e-web.yml`, `.github/workflows/e2e-ios.yml`	Adds GitHub Actions workflows for e2e-policy gating and web/iOS lanes (smoke/critical), installs deps, runs lane scripts, reports perf, and uploads artifacts.
Playwright config & tests `playwright.config.ts`, `e2e/web/smoke.spec.ts`, `e2e/web/critical.spec.ts`	New Playwright config (baseURL, reporters, dev server) and two test suites (smoke, critical) using role-based selectors.
NPM scripts & deps `package.json`	Adds multiple `e2e:*` scripts (guards, policy checks, web/iOS lanes, perf) and `@playwright/test` dev dependency.
iOS runner & docs `scripts/e2e/run-ios-tests.sh`, `ios/E2E_RUNBOOK.md`, `ios/RELEASING.md`	New bash runner (xcodebuild orchestration, retries, artifacts) and iOS runbook plus releasing doc link.
Playwright runner & seed scoping `scripts/e2e/run-playwright-lane.mjs`, `scripts/seed.ts`	Adds Playwright lane runner and parameterizes seeding by `E2E_RUN_ID` to produce run-scoped emails/usernames.
Guard & policy scripts `scripts/e2e/assert-non-prod.mjs`, `scripts/e2e/check-no-naked-sleep.mjs`, `scripts/e2e/check-test-secrets.mjs`	Adds non-prod base URL assertion, naked-sleep detector across sources, and `.env.example` test-secret validations.
Perf reporters `scripts/e2e/report-web-perf.mjs`, `scripts/e2e/report-ios-perf.mjs`	Writes perf JSON artifacts (web first-response, iOS app boot), computes thresholds, and can fail CI when regressions exceed thresholds.
Run orchestration `scripts/e2e/run-playwright-lane.mjs`, `scripts/e2e/run-ios-tests.sh`	Runners construct and invoke Playwright and xcodebuild lanes with retries, env injection, and artifact placement.
Docs & policy `docs/testing/e2e-policy.md`, `README.md`, `ios/E2E_RUNBOOK.md`	Adds E2E policy document, README section linking runbooks, and platform runbook with local reproduction guidance.

Sequence Diagram(s)

sequenceDiagram
    participant CI as GitHub Actions (CI)
    participant Guard as Policy Scripts
    participant Repo as Repository
    participant Server as Dev Server
    participant Playwright as Playwright Runner
    participant App as Web App
    participant Art as Artifact Store

    CI->>Repo: checkout code + set envs
    CI->>Guard: run e2e:policy checks (assert-non-prod, no-naked-sleep, check-test-secrets)
    Guard-->>CI: pass/fail
    CI->>Server: start or reuse dev server (npm run dev)
    Server-->>CI: server ready
    CI->>Playwright: run npx playwright test --grep <lane>
    Playwright->>App: HTTP requests to E2E_BASE_URL
    App-->>Playwright: responses
    Playwright-->>CI: exit code + artifacts (trace/video/screenshots)
    CI->>Art: upload artifacts & perf reports

sequenceDiagram
    participant CI as GitHub Actions (CI)
    participant Runner as run-ios-tests.sh
    participant Seed as seed.ts
    participant Xcode as xcodebuild
    participant Perf as report-ios-perf
    participant Art as Artifact Store

    CI->>Runner: invoke lane (smoke/critical) with envs
    Runner->>Seed: seed fixtures using E2E_RUN_ID
    Seed-->>Runner: seeded test data
    Runner->>Runner: assert-non-prod & secrets checks
    Runner->>Xcode: run xcodebuild test -only-testing <target>
    Xcode-->>Runner: xcresult/logs + exit code
    alt retry on failure
        Runner->>Xcode: retry attempt
    end
    Runner->>Perf: compute/write perf-summary.json
    Runner->>Art: upload xcresult/logs/perf artifacts
    Runner-->>CI: final exit status

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

[Testing] E2E cross-cutting: flake policy, test accounts, visual & perf hooks #194 — Directly aligns: implements the E2E policy, workflows, guard scripts, and runbooks described in that issue.
auerbachb/skingod#253 — Related: adds CI E2E workflows and gating consistent with requested CI sequencing/enforcement.
auerbachb/skingod#210 — Related: introduces Playwright web E2E config and CI base-URL handling patterns used for preview deployments.

Possibly related PRs

Docs + seed script: non-prod DB workflow #141 — Overlap on seeding behavior and .env placeholders; both modify seed scripts and env examples.
Finalize iOS parity validation and App Store submission readiness #221 — Related iOS docs/workflow changes (RELEASING.md and runbook references).

Poem

🐇
I nibble logs and hop through lanes,
Playwright tails and Xcode trains.
No naked sleeps, secrets tucked tight,
Perf beeps glow in artifact light—
Carrots for tests, all passing bright! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 6.25% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: implementing a cross-cutting E2E testing policy framework with guardrails and CI automation for both web and iOS platforms.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch cursor/e2e-cross-cutting-policy-54da

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (9)

scripts/e2e/run-playwright-lane.mjs (1)
5-11: Minor: Function name parsePositiveInt accepts zero.

The check parsed < 0 allows zero through, but the name suggests only positive integers. This is fine for the use case (0 retries/workers is valid), but consider renaming to parseNonNegativeInt for clarity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/e2e/run-playwright-lane.mjs` around lines 5 - 11, The function
parsePositiveInt allows zero but its name implies strictly positive values;
rename the function to parseNonNegativeInt (and update all call sites) to
reflect that it accepts zero, and update any documentation/comments accordingly;
ensure the signature and behavior (using Number.parseInt, checking
Number.isFinite(parsed) || parsed < 0) remain unchanged except for the
identifier change so existing logic for retries/workers continues to work.
ios/E2E_RUNBOOK.md (1)
15-15: Consider using a more flexible simulator destination.

name=iPhone 16 is specific to a particular device model. Consider using a more flexible destination like name=iPhone 16 Pro or documenting that this should be adjusted based on available simulators. Alternatively, use a generic pattern in docs and let CI specify the exact device.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ios/E2E_RUNBOOK.md` at line 15, Update the IOS_TEST_DESTINATION example to
use a more flexible simulator destination or document that it should be adjusted
per environment: replace the explicit `name=iPhone 16` example in the
IOS_TEST_DESTINATION entry with a more general pattern (e.g., a broader device
name such as an iPhone 16 Pro or a pattern like name=iPhone*) or add a note that
CI/maintainers should pick an available simulator and adjust
IOS_TEST_DESTINATION accordingly; ensure the documentation line references the
variable IOS_TEST_DESTINATION so readers know where to make the change.
scripts/e2e/assert-non-prod.mjs (1)
10-18: Minor: parseHost silently returns null on invalid URLs.

This is handled correctly downstream (filtered out at line 29), but consider logging a warning when a URL fails to parse, as it could indicate a configuration typo that's silently ignored.
♻️ Proposed enhancement
 function parseHost(rawUrl) {
   if (!rawUrl) {
     return null;
   }
   try {
     return new URL(rawUrl).hostname.toLowerCase();
-  } catch {
+  } catch (err) {
+    console.warn(`Warning: Could not parse URL "${rawUrl}": ${err.message}`);
     return null;
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/e2e/assert-non-prod.mjs` around lines 10 - 18, The parseHost function
currently swallows invalid URLs and returns null; modify parseHost to log a
warning (including the rawUrl value) when URL parsing throws so configuration
typos are visible; keep the existing return null behavior so downstream
filtering (where nulls are ignored) still works—update the catch block inside
parseHost to emit a concise warning via the project's logger or console.warn
mentioning rawUrl and that parsing failed.
.github/workflows/e2e-ios.yml (1)
14-133: Consider reducing duplication between smoke and critical jobs.

The two jobs are nearly identical, differing only in lane name and artifact name. Consider using a matrix strategy to DRY this up:
♻️ Matrix-based refactor
jobs:
  ios-e2e:
    name: ios-e2e-${{ matrix.lane }}
    runs-on: macos-26
    timeout-minutes: 45
    strategy:
      fail-fast: false
      matrix:
        lane: [smoke, critical]
    steps:
      # ... shared steps ...
      - name: Run iOS ${{ matrix.lane }} lane
        env:
          # ... env vars ...
        run: npm run e2e:ios:${{ matrix.lane }}
      
      - name: Upload iOS E2E artifacts
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: e2e-ios-${{ matrix.lane }}-artifacts
          path: artifacts/e2e/ios/**
          # ...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e-ios.yml around lines 14 - 133, Replace the duplicated
ios-e2e-smoke and ios-e2e-critical jobs with a single matrix job (e.g., ios-e2e)
that uses strategy.matrix.lane: [smoke, critical] and fail-fast: false;
consolidate the shared steps (Checkout, Setup Node.js, Install dependencies,
Install XcodeGen, Generate Xcode project, Record perf, Upload artifacts) and
parameterize the lane-specific parts using matrix.lane for the Run step (npm run
e2e:ios:${{ matrix.lane }}) and for the artifact name (e2e-ios-${{ matrix.lane
}}-artifacts) so the two lanes run from the same job definition while preserving
identical env variables and artifact path.
scripts/e2e/check-no-naked-sleep.mjs (1)
6-8: Allowlist is narrowly scoped — document or consider generalizing.

The allowlist only permits sleep $((attempt * 10)) exactly. If a different bounded backoff pattern is used (e.g., sleep $((attempt * 5))), it will be flagged as a violation. This is fine if intentional, but consider documenting this constraint in the policy doc or generalizing the pattern to /sleep\s+\$$\(attempt \* \d+$\)/ if other bounded formulas should be allowed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/e2e/check-no-naked-sleep.mjs` around lines 6 - 8, The
ALLOWED_SLEEP_PATTERNS entry currently matches only the literal `sleep
$((attempt * 10))`, which will miss other bounded backoff variants; update the
allowlist in scripts/e2e/check-no-naked-sleep.mjs by either (a) generalizing the
regex in ALLOWED_SLEEP_PATTERNS to something like `/sleep\s+\$$\(attempt \*
\d+$\)/` to allow any numeric multiplier, or (b) keep the specific pattern but
add a clear comment/docstring next to the ALLOWED_SLEEP_PATTERNS constant
explaining the intentional narrow scope so reviewers know this is by design;
ensure you modify the ALLOWED_SLEEP_PATTERNS constant and its inline comment
accordingly.
.github/workflows/e2e-web.yml (2)
111-116: Placeholder performance metric value.

E2E_WEB_FIRST_RESPONSE_MS: "800" is a hardcoded placeholder, not an actual measured value from the tests. While this is report-only mode (E2E_WEB_PERF_ENFORCE: "false"), the resulting web-metric.json artifact may be misleading.

Consider either:

Adding a comment clarifying this is a baseline placeholder

Skipping the perf step until actual measurement instrumentation is in place

Documenting in the policy that v1 metrics are baseline placeholders
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e-web.yml around lines 111 - 116, The workflow step
"Record web performance metric (report-only by default)" uses a hardcoded
placeholder value E2E_WEB_FIRST_RESPONSE_MS: "800" which can be misleading;
update the e2e-web.yml to either (a) add an inline comment directly above that
job/step clarifying that E2E_WEB_FIRST_RESPONSE_MS and
E2E_WEB_FIRST_RESPONSE_THRESHOLD_MS are baseline placeholders used only for
reporting when E2E_WEB_PERF_ENFORCE is "false", or (b) gate the step behind a
boolean input/secret so it is skipped until you have real instrumentation (e.g.,
only run when a workflow input like run_perf == true), and ensure the comment
references the env names E2E_WEB_FIRST_RESPONSE_MS,
E2E_WEB_FIRST_RESPONSE_THRESHOLD_MS and E2E_WEB_PERF_ENFORCE so future readers
know the intent.
60-64: Redundant guard execution.

The npm run e2e:guard step on line 61 is redundant because npm run e2e:web:smoke (line 64) already runs e2e:guard as its first command (see package.json line 16: npm run e2e:guard && node scripts/e2e/run-playwright-lane.mjs ...).

Consider removing the explicit step to reduce workflow runtime:
-      - name: Assert non-production target
-        run: npm run e2e:guard
-
       - name: Run web smoke lane
         run: npm run e2e:web:smoke
The same applies to lines 105-109 in the critical job.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e-web.yml around lines 60 - 64, Remove the redundant
explicit "Assert non-production target" step that runs `npm run e2e:guard`
because the following "Run web smoke lane" step already invokes `npm run
e2e:guard` as part of `npm run e2e:web:smoke`; delete the `- name: Assert
non-production target` step (the `npm run e2e:guard` run) before `Run web smoke
lane` and do the same removal for the equivalent steps in the critical job (the
block around lines where the critical job runs `npm run e2e:guard` followed by
`npm run e2e:web:smoke`) so the workflow only runs the guard once per lane.
scripts/e2e/run-ios-tests.sh (2)
23-26: Bash syntax issue: compound condition needs different grouping.

The condition on line 23 uses (...) which spawns a subshell rather than grouping. For conditional grouping in [[ ]], use && directly or { ...; } for command grouping. However, this specific construct should work because it's within [[ ]], but the spacing and structure could be clearer.

Consider simplifying:
-if [[ "${SECRETS_REQUIRED}" == "true" && ("${E2E_TEST_USER_EMAIL:-}" == "" || "${E2E_TEST_USER_PASSWORD:-}" == "") ]]; then
+if [[ "${SECRETS_REQUIRED}" == "true" ]] && [[ -z "${E2E_TEST_USER_EMAIL:-}" || -z "${E2E_TEST_USER_PASSWORD:-}" ]]; then
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/e2e/run-ios-tests.sh` around lines 23 - 26, The compound bash
condition is fragile; replace the current if test with a clearer check using -z
and proper grouping inside [[ ]]. Change the if to: if [[ "${SECRETS_REQUIRED}"
== "true" && ( -z "${E2E_TEST_USER_EMAIL:-}" || -z "${E2E_TEST_USER_PASSWORD:-}"
) ]]; then so you test emptiness with -z for E2E_TEST_USER_EMAIL and
E2E_TEST_USER_PASSWORD and group the OR check correctly (references:
SECRETS_REQUIRED, E2E_TEST_USER_EMAIL, E2E_TEST_USER_PASSWORD and the if
condition).
36-36: Misleading variable name: WORKSPACE_PATH holds a project path.

The variable is named WORKSPACE_PATH but defaults to a .xcodeproj (project) and is used with -project flag on line 61. In Xcode terminology, .xcworkspace is a workspace and .xcodeproj is a project.

Consider renaming for clarity:
-WORKSPACE_PATH="${IOS_TEST_WORKSPACE:-ios/StillPoint.xcodeproj}"
+PROJECT_PATH="${IOS_TEST_PROJECT:-ios/StillPoint.xcodeproj}"
And update the env var name from IOS_TEST_WORKSPACE to IOS_TEST_PROJECT, plus the usage on line 61.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/e2e/run-ios-tests.sh` at line 36, The variable WORKSPACE_PATH is
misnamed because it holds an Xcode project (.xcodeproj) and is used with the
-project flag; rename WORKSPACE_PATH to PROJECT_PATH (or PROJECT_FILE) and
change the environment variable from IOS_TEST_WORKSPACE to IOS_TEST_PROJECT,
then update every reference (including the -project invocation that currently
uses WORKSPACE_PATH and any exports/usages) to use PROJECT_PATH/IOS_TEST_PROJECT
instead to make intent clear.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/e2e/assert-non-prod.mjs`:
- Line 8: The current blockedPatterns array includes /\.vercel\.app$/i which
prevents running E2E on Vercel previews; update the logic that defines
blockedPatterns in scripts/e2e/assert-non-prod.mjs so the Vercel pattern is not
applied when previews should be allowed (e.g., honor an env var like
E2E_ALLOW_VERCEL_PREVIEW=true) or remove the /\.vercel\.app$/i entry entirely if
previews are valid test targets; locate the blockedPatterns declaration and
conditionally push the /\.vercel\.app$/i pattern only when
process.env.E2E_ALLOW_VERCEL_PREVIEW is not set to 'true' (or replace with a
more specific production-only domain if available).

In `@scripts/e2e/report-ios-perf.mjs`:
- Around line 32-35: The current logic assigns 0 when IOS_E2E_APP_BOOT_SECONDS
and IOS_E2E_APP_BOOT_TIME are absent/invalid; change it to fail fast instead: in
the uiBootSeconds computation (the const using parseDurationToSeconds and env
vars IOS_E2E_APP_BOOT_SECONDS / IOS_E2E_APP_BOOT_TIME) detect when both parses
return null/undefined and throw an explicit Error or exit with a non‑zero status
and a clear message indicating the missing boot metric; apply the same failing
behavior to the other similar metric computed with parseDurationToSeconds on the
alternate env var at the other occurrence (the one mentioned for line 45) so
missing metrics do not default to 0.
- Around line 25-27: The current parsing uses Number.parseFloat(value) which
accepts partial prefixes like "12abc"; update the parsing to first validate that
the input string (value) is a well-formed numeric literal (e.g. match a regex
for integers/floats with optional sign and decimal, no trailing characters)
before converting; if the validation fails return null, otherwise parse to a
Number (keep using Number.parseFloat or Number(value)) and keep the existing
Number.isFinite(numeric) check to return numeric or null. Ensure you update the
function around the numeric/Number.parseFloat(value) usage and reference the
variables/function where numeric and value are used.

In `@scripts/e2e/report-web-perf.mjs`:
- Around line 17-18: Validate the parsed threshold before using it: ensure
thresholdMs (from Number(process.env.E2E_WEB_FIRST_RESPONSE_THRESHOLD_MS ??
"1200")) is a finite positive number and, if not, refuse to run enforcement when
thresholdEnforced is true. Update the code that sets thresholdMs and
thresholdEnforced to parse and check Number.isFinite(thresholdMs) && thresholdMs
> 0, and when thresholdEnforced is true but the threshold is invalid, log a
clear error and exit (or set thresholdEnforced to false) so the regression guard
(the block that uses thresholdEnforced) cannot be silently bypassed.

In `@scripts/e2e/run-playwright-lane.mjs`:
- Around line 33-34: The script run-playwright-lane.mjs is forcing Playwright's
reporter via the hardcoded flags "--reporter","line" which overrides the
CI-aware reporters in playwright.config.ts; remove those two entries (or wrap
them behind a non-CI condition using process.env.CI) so the config's reporters
take precedence, or if you want explicit control replace them with the CI
reporters you need (e.g., "github" and "html") when CI is true; look for the
array or function building the Playwright CLI args in run-playwright-lane.mjs
and update the handling of "--reporter" accordingly.

---

Nitpick comments:
In @.github/workflows/e2e-ios.yml:
- Around line 14-133: Replace the duplicated ios-e2e-smoke and ios-e2e-critical
jobs with a single matrix job (e.g., ios-e2e) that uses strategy.matrix.lane:
[smoke, critical] and fail-fast: false; consolidate the shared steps (Checkout,
Setup Node.js, Install dependencies, Install XcodeGen, Generate Xcode project,
Record perf, Upload artifacts) and parameterize the lane-specific parts using
matrix.lane for the Run step (npm run e2e:ios:${{ matrix.lane }}) and for the
artifact name (e2e-ios-${{ matrix.lane }}-artifacts) so the two lanes run from
the same job definition while preserving identical env variables and artifact
path.

In @.github/workflows/e2e-web.yml:
- Around line 111-116: The workflow step "Record web performance metric
(report-only by default)" uses a hardcoded placeholder value
E2E_WEB_FIRST_RESPONSE_MS: "800" which can be misleading; update the e2e-web.yml
to either (a) add an inline comment directly above that job/step clarifying that
E2E_WEB_FIRST_RESPONSE_MS and E2E_WEB_FIRST_RESPONSE_THRESHOLD_MS are baseline
placeholders used only for reporting when E2E_WEB_PERF_ENFORCE is "false", or
(b) gate the step behind a boolean input/secret so it is skipped until you have
real instrumentation (e.g., only run when a workflow input like run_perf ==
true), and ensure the comment references the env names
E2E_WEB_FIRST_RESPONSE_MS, E2E_WEB_FIRST_RESPONSE_THRESHOLD_MS and
E2E_WEB_PERF_ENFORCE so future readers know the intent.
- Around line 60-64: Remove the redundant explicit "Assert non-production
target" step that runs `npm run e2e:guard` because the following "Run web smoke
lane" step already invokes `npm run e2e:guard` as part of `npm run
e2e:web:smoke`; delete the `- name: Assert non-production target` step (the `npm
run e2e:guard` run) before `Run web smoke lane` and do the same removal for the
equivalent steps in the critical job (the block around lines where the critical
job runs `npm run e2e:guard` followed by `npm run e2e:web:smoke`) so the
workflow only runs the guard once per lane.

In `@ios/E2E_RUNBOOK.md`:
- Line 15: Update the IOS_TEST_DESTINATION example to use a more flexible
simulator destination or document that it should be adjusted per environment:
replace the explicit `name=iPhone 16` example in the IOS_TEST_DESTINATION entry
with a more general pattern (e.g., a broader device name such as an iPhone 16
Pro or a pattern like name=iPhone*) or add a note that CI/maintainers should
pick an available simulator and adjust IOS_TEST_DESTINATION accordingly; ensure
the documentation line references the variable IOS_TEST_DESTINATION so readers
know where to make the change.

In `@scripts/e2e/assert-non-prod.mjs`:
- Around line 10-18: The parseHost function currently swallows invalid URLs and
returns null; modify parseHost to log a warning (including the rawUrl value)
when URL parsing throws so configuration typos are visible; keep the existing
return null behavior so downstream filtering (where nulls are ignored) still
works—update the catch block inside parseHost to emit a concise warning via the
project's logger or console.warn mentioning rawUrl and that parsing failed.

In `@scripts/e2e/check-no-naked-sleep.mjs`:
- Around line 6-8: The ALLOWED_SLEEP_PATTERNS entry currently matches only the
literal `sleep $((attempt * 10))`, which will miss other bounded backoff
variants; update the allowlist in scripts/e2e/check-no-naked-sleep.mjs by either
(a) generalizing the regex in ALLOWED_SLEEP_PATTERNS to something like
`/sleep\s+\$\(\(attempt \* \d+\)\)/` to allow any numeric multiplier, or (b)
keep the specific pattern but add a clear comment/docstring next to the
ALLOWED_SLEEP_PATTERNS constant explaining the intentional narrow scope so
reviewers know this is by design; ensure you modify the ALLOWED_SLEEP_PATTERNS
constant and its inline comment accordingly.

In `@scripts/e2e/run-ios-tests.sh`:
- Around line 23-26: The compound bash condition is fragile; replace the current
if test with a clearer check using -z and proper grouping inside [[ ]]. Change
the if to: if [[ "${SECRETS_REQUIRED}" == "true" && ( -z
"${E2E_TEST_USER_EMAIL:-}" || -z "${E2E_TEST_USER_PASSWORD:-}" ) ]]; then so you
test emptiness with -z for E2E_TEST_USER_EMAIL and E2E_TEST_USER_PASSWORD and
group the OR check correctly (references: SECRETS_REQUIRED, E2E_TEST_USER_EMAIL,
E2E_TEST_USER_PASSWORD and the if condition).
- Line 36: The variable WORKSPACE_PATH is misnamed because it holds an Xcode
project (.xcodeproj) and is used with the -project flag; rename WORKSPACE_PATH
to PROJECT_PATH (or PROJECT_FILE) and change the environment variable from
IOS_TEST_WORKSPACE to IOS_TEST_PROJECT, then update every reference (including
the -project invocation that currently uses WORKSPACE_PATH and any
exports/usages) to use PROJECT_PATH/IOS_TEST_PROJECT instead to make intent
clear.

In `@scripts/e2e/run-playwright-lane.mjs`:
- Around line 5-11: The function parsePositiveInt allows zero but its name
implies strictly positive values; rename the function to parseNonNegativeInt
(and update all call sites) to reflect that it accepts zero, and update any
documentation/comments accordingly; ensure the signature and behavior (using
Number.parseInt, checking Number.isFinite(parsed) || parsed < 0) remain
unchanged except for the identifier change so existing logic for retries/workers
continues to work.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e08822c8-d093-4b74-b402-faaa9c3b7549

📥 Commits

Reviewing files that changed from the base of the PR and between b1a5c04 and 5e428ef.

⛔ Files ignored due to path filters (1)

package-lock.json is excluded by !**/package-lock.json

📒 Files selected for processing (20)

.env.example
.github/workflows/e2e-ios.yml
.github/workflows/e2e-web.yml
.gitignore
README.md
docs/testing/e2e-policy.md
e2e/web/critical.spec.ts
e2e/web/smoke.spec.ts
ios/E2E_RUNBOOK.md
ios/RELEASING.md
package.json
playwright.config.ts
scripts/e2e/assert-non-prod.mjs
scripts/e2e/check-no-naked-sleep.mjs
scripts/e2e/check-test-secrets.mjs
scripts/e2e/report-ios-perf.mjs
scripts/e2e/report-web-perf.mjs
scripts/e2e/run-ios-tests.sh
scripts/e2e/run-playwright-lane.mjs
scripts/seed.ts

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/e2e/check-no-naked-sleep.mjs`:
- Around line 49-53: Currently the try/catch around await walk(full, files)
swallows all errors; change the catch to only ignore missing-directory errors
and rethrow all others by checking the thrown error's code (e.g., err &&
err.code === 'ENOENT') or equivalent; update the catch block for the walk(full,
files) call so it returns/continues only when the error indicates a non-existent
optional directory and throws the error for permission/I/O or other failures.
- Around line 18-26: The hasNakedSleep function's detection is too narrow:
update the shell sleep regex to catch calls like sleep $((...)), sleep $VAR,
sleep "..." as well as sleep <number> by replacing /\bsleep\(|\bsleep\s+\d+/
with a single pattern that matches sleep followed by paren or any
whitespace+argument (e.g. /\bsleep(?:\s*\(|\s+[^\s]+)/). Also broaden the Swift
test path detection used to build swiftSleepPattern: change the isSwiftTestFile
regex from /\/Tests\/.*Tests\.swift$/i to a pattern that matches both Tests and
UITests directories (for example, detect path segments ending in Tests or
UITests and a .swift file) so swiftSleepPattern still checks
/\bThread\.sleep\b|\bTask\.sleep\b/ for those files. Ensure you update the
identifiers hasNakedSleep, unixPath, isSwiftTestFile, and swiftSleepPattern
accordingly.

In `@scripts/e2e/run-ios-tests.sh`:
- Line 5: MAX_RETRIES can be zero or non-numeric which allows skipping tests or
breaking comparisons; add an upfront validation for the MAX_RETRIES variable
before the retry loop: ensure MAX_RETRIES is a positive integer (>=1) by
checking it matches a digits-only pattern and converting/validating its numeric
value, and if validation fails or value < 1 print an error and exit non-zero;
reference the MAX_RETRIES variable and the retry loop in the script so the guard
runs prior to entering the loop.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 37614ec5-dd55-4dcf-9606-c975d9ec5d46

📥 Commits

Reviewing files that changed from the base of the PR and between 5e428ef and 28cbf83.

📒 Files selected for processing (10)

.github/workflows/e2e-ios.yml
.github/workflows/e2e-web.yml
docs/testing/e2e-policy.md
ios/E2E_RUNBOOK.md
scripts/e2e/assert-non-prod.mjs
scripts/e2e/check-no-naked-sleep.mjs
scripts/e2e/report-ios-perf.mjs
scripts/e2e/report-web-perf.mjs
scripts/e2e/run-ios-tests.sh
scripts/e2e/run-playwright-lane.mjs

✅ Files skipped from review due to trivial changes (3)

scripts/e2e/report-web-perf.mjs
.github/workflows/e2e-ios.yml
.github/workflows/e2e-web.yml

🚧 Files skipped from review as they are similar to previous changes (4)

scripts/e2e/assert-non-prod.mjs
scripts/e2e/report-ios-perf.mjs
scripts/e2e/run-playwright-lane.mjs
ios/E2E_RUNBOOK.md

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

coderabbitai

🧹 Nitpick comments (1)

scripts/e2e/run-ios-tests.sh (1)

20-26: Consider writing status file on pass/fail for consistency.

The ${LANE}.status file is written on skip (line 24), but not when tests pass (line 82) or fail (line 87). If downstream tooling expects a status file to always be present, this inconsistency could cause issues. Exit codes are sufficient for CI gating, so this is optional.

♻️ Proposed fix for status file consistency

   if run_lane "${LANE}"; then
     echo "iOS ${LANE} lane passed."
+    echo "passed" > "artifacts/e2e/ios/${LANE}.status"
     exit 0
   fi

   if [[ "$ATTEMPT" -ge "$MAX_RETRIES" ]]; then
     echo "iOS ${LANE} lane failed after ${MAX_RETRIES} attempt(s)."
+    echo "failed" > "artifacts/e2e/ios/${LANE}.status"
     exit 1
   fi

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/e2e/run-ios-tests.sh` around lines 20 - 26, The script currently
writes "artifacts/e2e/ios/${LANE}.status" only on skip; update run-ios-tests.sh
to also write a status file on test success and failure so downstream tooling
always finds a status. Specifically, when tests finish successfully write
"passed" to "artifacts/e2e/ios/${LANE}.status" (near the current success/exit
path, e.g. where the script currently exits at the success branch around the
current end of tests), and when tests fail write "failed" to that same path
(near the failure branch where the script currently exits with non-zero around
the existing failure handling). Use the same ${LANE} and artifacts directory and
keep the existing mkdir -p behavior; alternatively add a small exit trap that
writes "failed" on non-zero exit and write "passed" on normal completion to
ensure coverage for unexpected exits.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@scripts/e2e/run-ios-tests.sh`:
- Around line 20-26: The script currently writes
"artifacts/e2e/ios/${LANE}.status" only on skip; update run-ios-tests.sh to also
write a status file on test success and failure so downstream tooling always
finds a status. Specifically, when tests finish successfully write "passed" to
"artifacts/e2e/ios/${LANE}.status" (near the current success/exit path, e.g.
where the script currently exits at the success branch around the current end of
tests), and when tests fail write "failed" to that same path (near the failure
branch where the script currently exits with non-zero around the existing
failure handling). Use the same ${LANE} and artifacts directory and keep the
existing mkdir -p behavior; alternatively add a small exit trap that writes
"failed" on non-zero exit and write "passed" on normal completion to ensure
coverage for unexpected exits.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 12421f41-7484-4e7c-bf93-488accfb736c

📥 Commits

Reviewing files that changed from the base of the PR and between 28cbf83 and 0fb5b9a.

📒 Files selected for processing (2)

scripts/e2e/check-no-naked-sleep.mjs
scripts/e2e/run-ios-tests.sh

🚧 Files skipped from review as they are similar to previous changes (1)

scripts/e2e/check-no-naked-sleep.mjs

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

Add cross-cutting E2E policy, lanes, and guardrails

7111b09

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

vercel Bot deployed to Preview April 24, 2026 05:04 View deployment

auerbachb marked this pull request as ready for review April 24, 2026 05:07

Make E2E lanes skip gracefully when secrets or iOS suite absent

5e428ef

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

vercel Bot deployed to Preview April 24, 2026 05:11 View deployment

coderabbitai Bot reviewed Apr 24, 2026

View reviewed changes

Comment thread scripts/e2e/assert-non-prod.mjs Outdated

Comment thread scripts/e2e/report-ios-perf.mjs

Comment thread scripts/e2e/report-ios-perf.mjs Outdated

Comment thread scripts/e2e/report-web-perf.mjs Outdated

Comment thread scripts/e2e/run-playwright-lane.mjs Outdated

Address CodeRabbit E2E workflow and script feedback

28cbf83

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

vercel Bot deployed to Preview April 24, 2026 05:28 View deployment

coderabbitai Bot reviewed Apr 24, 2026

View reviewed changes

Comment thread scripts/e2e/check-no-naked-sleep.mjs

Comment thread scripts/e2e/check-no-naked-sleep.mjs

Comment thread scripts/e2e/run-ios-tests.sh

Harden sleep scanner and retry input validation

0fb5b9a

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

vercel Bot deployed to Preview April 24, 2026 14:38 View deployment

coderabbitai Bot reviewed Apr 24, 2026

View reviewed changes

Always write iOS lane status file on exit

294fcaa

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

vercel Bot deployed to Preview April 24, 2026 17:20 View deployment

Merge origin/main into branch and resolve release docs conflict

8fd18e6

Co-authored-by: Bretton Auerbach <auerbachb@users.noreply.github.com>

vercel Bot deployed to Preview April 24, 2026 18:34 View deployment

auerbachb merged commit d751f5d into main Apr 24, 2026
10 checks passed

auerbachb deleted the cursor/e2e-cross-cutting-policy-54da branch April 24, 2026 18:41

coderabbitai Bot mentioned this pull request Apr 24, 2026

Add mobile web E2E coverage for viewport, touch, safe-area, and overflow #225

Merged

This was referenced Apr 24, 2026

[Testing] E2E cross-cutting: flake policy, test accounts, visual & perf hooks #194

Closed

iOS E2E: Begin → SessionView mount must be covered against Release builds #253

Closed

Conversation

auerbachb commented Apr 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changed

Follow-up hardening after initial CI + review feedback

Merge maintenance

Acceptance criteria coverage

Validation run

Notes

Summary by CodeRabbit

Uh oh!

vercel Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

auerbachb commented Apr 24, 2026 •

edited by coderabbitai Bot

Loading

vercel Bot commented Apr 24, 2026 •

edited

Loading

coderabbitai Bot commented Apr 24, 2026 •

edited

Loading