
ci(ios-e2e): bump smoke + critical from 0 to 1 retry#323

Merged: auerbachb merged 1 commit into main from issue-322-ios-e2e-retry-bump on May 1, 2026

@auerbachb auerbachb commented May 1, 2026

User description

Summary

  • Flips package.json ios e2e args from 1 (= 1 attempt, 0 retries) to 2 (= 2 attempts, 1 retry).
  • Aligns with the cross-cutting policy at docs/testing/e2e-policy.md Section 1, which already permits 1 retry for both ios-e2e-smoke and ios-e2e-critical.
  • Activates the existing retry loop at scripts/e2e/run-ios-tests.sh:156-171; per-attempt artifacts (<lane>-attempt-1.xcresult, <lane>-attempt-2.xcresult) continue to upload.
  • Motivation: the macos-26 / iOS 26 simulator pipeline is producing intermittent EXC_GUARD / XPC_EXIT_REASON_FAULT crashes inside Apple's BackboardServices XPC handling on arbitrary SHAs. PR #308 (feat: timer extensions + bonus time (#90)) was green yesterday and red today on the same branch; PR #311 (feat(ios): Settings username inline edit (closes #282)) was green at 17:09 and red at 17:24 with only an unrelated 1-line SettingsView change in between. The faulting stacks contain zero StillPoint frames. Engineers are manually rerunning these via gh run rerun --failed; this PR moves the recovery into the runner so transient infra crashes self-heal without human intervention.
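
As a rough illustration of why the argument value 1 means "no retry" and 2 means "one retry", here is a minimal sketch of a loop of the shape while ATTEMPT <= MAX_RETRIES. Only the ATTEMPT and MAX_RETRIES names and the <lane>-attempt-N.xcresult naming come from this PR; run_attempt, run_lane, and SIMULATE_FLAKE are hypothetical stand-ins, and the real loop in scripts/e2e/run-ios-tests.sh may differ.

```shell
#!/bin/sh
# Sketch only: run_attempt and run_lane are hypothetical names, not the
# contents of scripts/e2e/run-ios-tests.sh.

# Stand-in for one test invocation. With SIMULATE_FLAKE=1 it fails on
# attempt 1 and passes afterwards, imitating a transient simulator crash.
run_attempt() {
  echo "lane=$1 attempt=$2 result=$1-attempt-$2.xcresult"
  if [ "${SIMULATE_FLAKE:-0}" = "1" ] && [ "$2" -eq 1 ]; then
    return 1
  fi
  return 0
}

# run_lane LANE MAX_RETRIES
# MAX_RETRIES counts total attempts: 1 means no retry, 2 means one retry.
run_lane() {
  LANE=$1
  MAX_RETRIES=$2
  ATTEMPT=1
  while [ "$ATTEMPT" -le "$MAX_RETRIES" ]; do
    if run_attempt "$LANE" "$ATTEMPT"; then
      echo "$LANE passed on attempt $ATTEMPT"
      return 0
    fi
    ATTEMPT=$((ATTEMPT + 1))
  done
  echo "$LANE failed after $MAX_RETRIES attempt(s)"
  return 1
}
```

With SIMULATE_FLAKE=1 set, run_lane ios-e2e-smoke 1 returns non-zero after its single attempt, while run_lane ios-e2e-smoke 2 returns zero on the second attempt.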

Closes #322

Test plan

  • package.json smoke arg flipped 1 → 2
  • package.json critical arg flipped 1 → 2
  • No other files changed (single-file +2/-2 diff)
  • Both ios-e2e-smoke and ios-e2e-critical lanes pass on this PR's HEAD
  • On a (likely) transient first-attempt failure, the retry kicks in and the lane still reports green; logs and artifacts contain attempt-2.xcresult evidence
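
One way to check the last item locally is to count the per-attempt result bundles after downloading a run's artifacts. This is a sketch under assumptions: count_attempts and retried are hypothetical helpers, and the only detail taken from this PR is the <lane>-attempt-N.xcresult naming.

```shell
#!/bin/sh
# count_attempts DIR LANE
# Prints how many <lane>-attempt-N.xcresult bundles exist in DIR.
# Hypothetical helper; DIR is wherever the artifacts were downloaded.
count_attempts() {
  n=0
  for bundle in "$1/$2"-attempt-*.xcresult; do
    if [ -e "$bundle" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# retried DIR LANE: succeeds when more than one attempt was recorded.
retried() {
  [ "$(count_attempts "$1" "$2")" -gt 1 ]
}
```

For example, retried ./artifacts ios-e2e-smoke succeeds only if an attempt beyond the first left a bundle in ./artifacts.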

Trade-offs

  • When the first attempt fails, total CI time for that lane roughly doubles (~25-30 min added). Common case (first attempt passes) adds zero cost.
  • A real product regression takes 2× the attempts to surface red, but engineers were already manually rerunning flakes, so the net wall-clock impact is neutral or better.
  • The script doesn't distinguish retriable (infra) vs non-retriable (assertion) failures — a real bug consistently fails both attempts and still shows red. Costs ~25 min of compute, not human time. Refining this is a separate larger ticket.

🤖 Generated with Claude Code


CodeAnt-AI Description

Retry iOS smoke and critical end-to-end tests once after transient failures

What Changed

  • iOS smoke and critical test runs now get one retry instead of failing after the first attempt
  • Transient simulator crashes are less likely to leave these CI lanes red when the next attempt passes

Impact

✅ Fewer flaky iOS CI failures
✅ Fewer manual reruns after simulator crashes
✅ Higher pass rate for iOS smoke and critical checks


…tor XPC faults

Flips the second arg passed to scripts/e2e/run-ios-tests.sh from 1 (=1
attempt, 0 retries — see while ATTEMPT <= MAX_RETRIES) to 2 (=2
attempts, 1 retry). Aligns with the cross-cutting policy at
docs/testing/e2e-policy.md Section 1, which already permits 1 retry
for both ios-e2e-smoke and ios-e2e-critical.

The retry loop is already implemented at scripts/e2e/run-ios-tests.sh
lines 156-171; this just turns it on. Per-attempt artifacts continue
to upload (lane-attempt-1.xcresult, lane-attempt-2.xcresult).

Motivation: macos-26 / iOS 26 simulator pipeline is producing
intermittent EXC_GUARD / XPC_EXIT_REASON_FAULT crashes inside Apple's
BackboardServices XPC handling on apparently arbitrary SHAs (PR #308
green yesterday red today on same branch; PR #311 green at 17:09 red
at 17:24 with only an unrelated 1-line SettingsView change between).
The faulting stacks contain zero StillPoint frames. Engineers are
manually rerunning these via gh run rerun --failed; this change moves
the same recovery into the test runner, so transient infra crashes
self-heal without human intervention.

Closes #322

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

codeant-ai Bot commented May 1, 2026

CodeAnt AI is reviewing your PR.


vercel Bot commented May 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
still-point | Ready | Preview, Comment | May 1, 2026 5:40pm


coderabbitai Bot commented May 1, 2026

Warning

Rate limit exceeded

@auerbachb has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 59 seconds before requesting another review.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ba39adc3-9cc8-4f57-bf64-f4ebf8c47aff

📥 Commits

Reviewing files that changed from the base of the PR and between c5b8760 and 6006164.

📒 Files selected for processing (1)
  • package.json

@codeant-ai codeant-ai Bot added the size:XS This PR changes 0-9 lines, ignoring generated files label May 1, 2026

codeant-ai Bot commented May 1, 2026

CodeAnt AI finished reviewing your PR.

@auerbachb
Owner Author

@cursor review


@cursor cursor Bot left a comment


✅ Bugbot reviewed your changes and found no new issues!

Reviewed by Cursor Bugbot for commit 6006164.

@auerbachb
Owner Author

@coderabbitai full review


coderabbitai Bot commented May 1, 2026

✅ Actions performed

Full review triggered.

@auerbachb
Owner Author

@graphite-app re-review

@auerbachb auerbachb merged commit de91966 into main May 1, 2026
16 checks passed
@auerbachb auerbachb deleted the issue-322-ios-e2e-retry-bump branch May 1, 2026 18:14
auerbachb added a commit that referenced this pull request May 1, 2026
…fast on real assertions (#328)

Closes #324

Adds a 4-tier classifier to scripts/e2e/run-ios-tests.sh that decides whether
to consume the retry budget after each failed xcodebuild attempt:

  1. Crash check — *StillPoint*.ips written after attempt-start marker → retry
  2. Timeout-shaped UI wait — XCTAssertTrue did-not-appear / Asynchronous wait
     Exceeded timeout → retry (covers waitForExistence + XCTestExpectation forms)
  3. Strict assertion (XCTAssertEqual, etc.) or method-level error frame → fail fast
  4. Default (infra/transient/unknown) → retry

Aligns docs/testing/e2e-policy.md Section 1 with the runner's actual behavior:
the table rows for ios-e2e-smoke / ios-e2e-critical now list .ips reports as
retriable (the original "deterministic app crash with same stack" rule is
intentionally not implemented — would require cross-attempt stack diffing).

Real product regressions surface red ~25 min faster (no second attempt on real
bugs); transient sim XPC faults and UI-timing flakes still self-heal as
designed by #323.
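
The four tiers described in this commit message could be sketched roughly as follows. This is an illustration only, assuming the tiers are evaluated in the listed order with the first match winning. The function name classify_failure, its arguments, and the grep patterns are hypothetical, not the actual contents of scripts/e2e/run-ios-tests.sh, and the "method-level error frame" check in tier 3 is omitted because its shape is not described here.

```shell
#!/bin/sh
# classify_failure LOG CRASH_DIR MARKER
# Prints "retry" or "fail-fast" for one failed attempt.
# All names and grep patterns below are assumptions for illustration.
classify_failure() {
  log=$1
  crash_dir=$2
  marker=$3

  # Tier 1: an app crash report written after the attempt-start marker -> retry.
  if [ -d "$crash_dir" ] && [ -e "$marker" ] &&
     [ -n "$(find "$crash_dir" -name '*StillPoint*.ips' -newer "$marker" 2>/dev/null)" ]; then
    echo retry
    return 0
  fi

  # Tier 2: timeout-shaped UI waits -> retry.
  if grep -Eq 'Exceeded timeout|did not appear' "$log"; then
    echo retry
    return 0
  fi

  # Tier 3: strict assertions -> fail fast, leaving the retry budget unused.
  if grep -Eq 'XCTAssert(Equal|NotEqual|Nil|NotNil) failed' "$log"; then
    echo fail-fast
    return 0
  fi

  # Tier 4: anything else is treated as infra/transient/unknown -> retry.
  echo retry
}
```

A caller would run this after a failed attempt and skip the remaining attempts when the output is fail-fast. Because tier 1 is checked first in this sketch, a crash report newer than the marker leads to a retry even if the log also contains a strict assertion failure.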

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

codeant-ai Bot commented May 8, 2026

Sequence Diagram

This PR updates the iOS smoke and critical e2e scripts to allow one automatic retry, so the shared iOS test runner can rerun failing lanes once before marking the CI job as failed.

sequenceDiagram
    participant CI
    participant NpmScripts
    participant IosTestRunner

    CI->>NpmScripts: Trigger ios e2e smoke or critical
    NpmScripts->>IosTestRunner: Run ios tests with 2 attempts

    loop Up to 2 attempts
        IosTestRunner->>IosTestRunner: Execute ios e2e lane attempt
        alt Attempt succeeds
            IosTestRunner-->>CI: Report lane success
        else Attempt fails and attempts remain
            IosTestRunner->>IosTestRunner: Schedule next attempt
        end
    end

Generated by CodeAnt AI

codeant-ai Bot commented May 8, 2026

Sequence Diagram

This PR updates the iOS E2E smoke and critical npm scripts to allow two attempts per lane, enabling a built-in retry so transient simulator or infra crashes are absorbed within CI without manual reruns.

sequenceDiagram
    participant CI
    participant NpmScripts
    participant IosTestRunner
    participant Simulator

    CI->>NpmScripts: Run e2e ios smoke or critical
    NpmScripts->>IosTestRunner: Start lane with max attempts 2
    IosTestRunner->>Simulator: Run iOS tests attempt 1

    alt Attempt 1 passes
        Simulator-->>IosTestRunner: Tests pass
        IosTestRunner-->>CI: Report success
    else Attempt 1 fails
        Simulator-->>IosTestRunner: Transient failure
        IosTestRunner->>Simulator: Run iOS tests attempt 2
        Simulator-->>IosTestRunner: Tests pass
        IosTestRunner-->>CI: Report success after retry
    end

Generated by CodeAnt AI

codeant-ai Bot commented May 8, 2026

Sequence Diagram

This PR updates the iOS smoke and critical npm scripts to run the shared iOS test runner with two total attempts, activating its built-in retry loop so transient simulator or infra failures can be auto-retried once before failing CI.

sequenceDiagram
    participant CI
    participant IosE2ERunner
    participant Xcodebuild

    CI->>IosE2ERunner: Start iOS smoke or critical lane (max attempts 2)
    IosE2ERunner->>IosE2ERunner: Configure lane and attempt counter

    loop Up to 2 attempts
        IosE2ERunner->>Xcodebuild: Run iOS tests for lane
        Xcodebuild-->>IosE2ERunner: Test result
    end

    alt Any attempt passes
        IosE2ERunner-->>CI: Report lane passed
    else All attempts fail
        IosE2ERunner-->>CI: Report lane failed
    end

Generated by CodeAnt AI

codeant-ai Bot commented May 8, 2026

Sequence Diagram

This PR updates the iOS smoke and critical test scripts so the shared iOS test runner allows one automatic retry (two total attempts) before marking the lane as failed, reducing noise from transient simulator crashes.

sequenceDiagram
    participant CI
    participant IOSRunner

    CI->>IOSRunner: Start iOS smoke or critical lane with max attempts 2
    loop Up to 2 attempts
        IOSRunner->>IOSRunner: Run lane tests and record attempt artifacts
        alt Attempt passes
            IOSRunner-->>CI: Report lane passed
            break
        else Attempt fails and another attempt allowed
            IOSRunner->>IOSRunner: Increment attempt and prepare retry
        end
    end

Generated by CodeAnt AI

codeant-ai Bot commented May 12, 2026

Sequence Diagram

This PR updates the iOS smoke and critical E2E CI scripts to allow one automatic retry, so transient simulator crashes are retried within the pipeline instead of requiring manual reruns.

sequenceDiagram
    participant CI
    participant NpmScripts
    participant IosTestRunner
    participant IosSimulator

    CI->>NpmScripts: Run e2e ios smoke or critical
    NpmScripts->>IosTestRunner: run-ios-tests lane with max attempts 2

    loop Up to 2 attempts
        IosTestRunner->>IosSimulator: Run iOS E2E lane
        alt Attempt passes
            IosSimulator-->>IosTestRunner: Test success
            break
        else Attempt fails with transient fault
            IosSimulator-->>IosTestRunner: Test failure
        end
    end

    alt At least one attempt succeeded
        IosTestRunner-->>CI: Report lane success
    else All attempts failed
        IosTestRunner-->>CI: Report lane failure
    end

Generated by CodeAnt AI


Labels

size:XS This PR changes 0-9 lines, ignoring generated files


Development

Successfully merging this pull request may close these issues.

ci(ios-e2e): bump smoke + critical from 0 to 1 retry to absorb simulator XPC faults

1 participant