Skip to content

chore: Run CI on changed files; add retries & gating#1614

Open
edwinjosechittilappilly wants to merge 1 commit into
mainfrom
fix-strict-pipeline
Open

chore: Run CI on changed files; add retries & gating#1614
edwinjosechittilappilly wants to merge 1 commit into
mainfrom
fix-strict-pipeline

Conversation

@edwinjosechittilappilly
Copy link
Copy Markdown
Collaborator

@edwinjosechittilappilly edwinjosechittilappilly commented May 15, 2026

Update GitHub Actions workflows (lint-backend, lint-frontend, test-e2e, test-integration) to only run on relevant changed files by adding a preliminary changes job that uses dorny/paths-filter and exposes a run flag. Introduce merge_group support and add lightweight required verifier jobs so skipped jobs report as passing to the merge queue. Serialize heavy tests (E2E and Integration) with a shared concurrency group and wrap Playwright and integration test steps with nick-fields/retry to add timeouts, retries, and cleanup/on-retry teardown logic. Misc: gate jobs on the changes output, small whitespace/checkout fixes, and improved logging for decision paths.

Summary by CodeRabbit

  • Chores
    • CI workflows now decide in-workflow whether to run based on changed files and run unconditionally for merge/merge-group events.
    • Linting, formatting and test jobs are conditionally gated by that decision to avoid unnecessary runs.
    • Added mandatory post-check steps that fail the workflow when checks are failed or cancelled.
    • Added retries, improved cleanup between attempts, and concurrency controls to reduce overlapping and flaky runs.

Review Change Stack

Copilot AI review requested due to automatic review settings May 15, 2026 03:39
@github-actions github-actions Bot added the ci ⬛ CI/CD, build, and infrastructure issues label May 15, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Walkthrough

This PR unifies four GitHub Actions workflows (lint-backend, lint-frontend, test-e2e, test-integration) with a consistent pattern: path-based conditional execution that skips unnecessary runs, workflow-level result validation that prevents silent failures, and enhanced test retry logic with cleanup to improve reliability.

Changes

CI/CD Workflow Improvements

Layer / File(s) Summary
Conditional execution via path filtering
.github/workflows/lint-backend.yml, .github/workflows/lint-frontend.yml, .github/workflows/test-e2e.yml, .github/workflows/test-integration.yml
All workflows introduce a changes job using dorny/paths-filter to gate execution: lint jobs run only when backend/frontend code changes; test jobs run on merge_group trigger or when relevant paths are modified. Test workflows add concurrency configuration to serialize heavy runs per ref.
Workflow result validation and fail-safe
.github/workflows/lint-backend.yml, .github/workflows/lint-frontend.yml, .github/workflows/test-e2e.yml, .github/workflows/test-integration.yml
A new required job runs after each main job with always(), reads its result status, and hard-fails the entire workflow if the main job ended in failure or cancelled, ensuring no test or lint failure is silently masked by conditional skipping.
Test retry with cleanup and teardown
.github/workflows/test-e2e.yml, .github/workflows/test-integration.yml
E2E and integration test execution is wrapped with nick-fields/retry@v3 (60/45-minute timeout, 2 attempts). Between retries, stale Playwright artifacts or Docker Compose state is cleaned up via on_retry_command to prevent cross-attempt contamination.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

enhancement

Suggested reviewers

  • mfortman11
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: running CI on changed files, adding retries, and adding gating/conditionals to workflows.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix-strict-pipeline

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/lint-backend.yml:
- Around line 119-131: The Verify lint result step currently only checks ${{
needs.lint.result }} so a skipped lint due to a failing changes job can be
overlooked; update that step to also inspect ${{ needs.changes.result }} (from
the 'changes' job) by capturing both into variables (e.g. result_lint and
result_changes) and failing the step (exit 1) if either equals "failure" or
"cancelled", otherwise print a passing message; keep the step name "Verify lint
result" and update the conditional logic accordingly.

In @.github/workflows/lint-frontend.yml:
- Around line 55-67: The guard in the required job only checks
needs.biome.result and misses failures from the changes job; update the run
script in the required job (the steps under the job named "required" that
currently reads needs.biome.result) to also inspect needs.changes.result (or the
appropriate needs.<job>.result for your changes job) and treat "failure" or
"cancelled" from either job as a failing condition—i.e., fetch both results, if
either equals "failure" or "cancelled" echo a descriptive message and exit 1,
otherwise continue and print a passing message.

In @.github/workflows/test-e2e.yml:
- Around line 196-208: The verification step in the "required" job only checks
needs.e2e.result and ignores the outcome of the needs.changes job, allowing a
failed changes job to still show green; update the Verify e2e result step (the
script in the "required" job) to also read and validate needs.changes.result (in
addition to needs.e2e.result) and fail if either result is "failure" or
"cancelled" — ensure the script references both needs.e2e.result and
needs.changes.result and returns non‑zero when any of those indicate
failure/cancellation.

In @.github/workflows/test-integration.yml:
- Around line 150-162: The 'required' job only inspects needs.tests.result so it
can pass if 'changes' failed and caused 'tests' to be skipped; update the job
named "required" to depend on the changes job and/or explicitly check its
result: add changes to the job's needs (e.g., needs: [tests, changes]) and
modify the shell check to consider both needs.tests.result and
needs.changes.result (treat failure or cancelled for either as a failing
condition) so the job correctly fails when changes fails.
- Around line 34-35: The globs 'src/**.py' and 'tests/**.py' do not match nested
files; update the workflow to use recursive globs by replacing those patterns
with 'src/**/*.py' and 'tests/**/*.py' so changes in nested directories (e.g.
src/a/b.py) trigger the integration tests; modify the patterns wherever
'src/**.py' and 'tests/**.py' appear in the test-integration.yml workflow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ecedfa54-67fa-45e4-a19b-c5c65f46139b

📥 Commits

Reviewing files that changed from the base of the PR and between e7fb324 and 4c4116b.

📒 Files selected for processing (4)
  • .github/workflows/lint-backend.yml
  • .github/workflows/lint-frontend.yml
  • .github/workflows/test-e2e.yml
  • .github/workflows/test-integration.yml

Comment on lines +119 to +131
required:
needs: lint
if: always()
runs-on: ubuntu-latest
steps:
- name: Verify lint result
run: |
result="${{ needs.lint.result }}"
if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
echo "lint job did not succeed: $result"
exit 1
fi
echo "lint result: $result (success or skipped — passing)"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include changes result in the required gate.

required only validates needs.lint.result. If changes fails, lint is skipped and this job still passes, which can hide a real CI failure.

Proposed fix
 required:
-  needs: lint
+  needs: [changes, lint]
   if: always()
   runs-on: ubuntu-latest
   steps:
     - name: Verify lint result
       run: |
-        result="${{ needs.lint.result }}"
-        if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
-          echo "lint job did not succeed: $result"
+        changes_result="${{ needs.changes.result }}"
+        lint_result="${{ needs.lint.result }}"
+        if [ "$changes_result" = "failure" ] || [ "$changes_result" = "cancelled" ]; then
+          echo "changes job did not succeed: $changes_result"
+          exit 1
+        fi
+        if [ "$lint_result" = "failure" ] || [ "$lint_result" = "cancelled" ]; then
+          echo "lint job did not succeed: $lint_result"
           exit 1
         fi
-        echo "lint result: $result (success or skipped — passing)"
+        echo "changes=$changes_result, lint=$lint_result (success or skipped — passing)"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
required:
needs: lint
if: always()
runs-on: ubuntu-latest
steps:
- name: Verify lint result
run: |
result="${{ needs.lint.result }}"
if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
echo "lint job did not succeed: $result"
exit 1
fi
echo "lint result: $result (success or skipped — passing)"
required:
needs: [changes, lint]
if: always()
runs-on: ubuntu-latest
steps:
- name: Verify lint result
run: |
changes_result="${{ needs.changes.result }}"
lint_result="${{ needs.lint.result }}"
if [ "$changes_result" = "failure" ] || [ "$changes_result" = "cancelled" ]; then
echo "changes job did not succeed: $changes_result"
exit 1
fi
if [ "$lint_result" = "failure" ] || [ "$lint_result" = "cancelled" ]; then
echo "lint job did not succeed: $lint_result"
exit 1
fi
echo "changes=$changes_result, lint=$lint_result (success or skipped — passing)"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/lint-backend.yml around lines 119 - 131, The Verify lint
result step currently only checks ${{ needs.lint.result }} so a skipped lint due
to a failing changes job can be overlooked; update that step to also inspect ${{
needs.changes.result }} (from the 'changes' job) by capturing both into
variables (e.g. result_lint and result_changes) and failing the step (exit 1) if
either equals "failure" or "cancelled", otherwise print a passing message; keep
the step name "Verify lint result" and update the conditional logic accordingly.

Comment on lines +55 to +67
required:
needs: biome
if: always()
runs-on: ubuntu-latest
steps:
- name: Verify biome result
run: |
result="${{ needs.biome.result }}"
if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
echo "biome job did not succeed: $result"
exit 1
fi
echo "biome result: $result (success or skipped — passing)"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail the required job when changes fails.

Current guard only inspects needs.biome.result. A failed changes job can still yield a green required result via skipped biome.

Proposed fix
 required:
-  needs: biome
+  needs: [changes, biome]
   if: always()
   runs-on: ubuntu-latest
   steps:
     - name: Verify biome result
       run: |
-        result="${{ needs.biome.result }}"
-        if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
-          echo "biome job did not succeed: $result"
+        changes_result="${{ needs.changes.result }}"
+        biome_result="${{ needs.biome.result }}"
+        if [ "$changes_result" = "failure" ] || [ "$changes_result" = "cancelled" ]; then
+          echo "changes job did not succeed: $changes_result"
+          exit 1
+        fi
+        if [ "$biome_result" = "failure" ] || [ "$biome_result" = "cancelled" ]; then
+          echo "biome job did not succeed: $biome_result"
           exit 1
         fi
-        echo "biome result: $result (success or skipped — passing)"
+        echo "changes=$changes_result, biome=$biome_result (success or skipped — passing)"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
required:
needs: biome
if: always()
runs-on: ubuntu-latest
steps:
- name: Verify biome result
run: |
result="${{ needs.biome.result }}"
if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
echo "biome job did not succeed: $result"
exit 1
fi
echo "biome result: $result (success or skipped — passing)"
required:
needs: [changes, biome]
if: always()
runs-on: ubuntu-latest
steps:
- name: Verify biome result
run: |
changes_result="${{ needs.changes.result }}"
biome_result="${{ needs.biome.result }}"
if [ "$changes_result" = "failure" ] || [ "$changes_result" = "cancelled" ]; then
echo "changes job did not succeed: $changes_result"
exit 1
fi
if [ "$biome_result" = "failure" ] || [ "$biome_result" = "cancelled" ]; then
echo "biome job did not succeed: $biome_result"
exit 1
fi
echo "changes=$changes_result, biome=$biome_result (success or skipped — passing)"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/lint-frontend.yml around lines 55 - 67, The guard in the
required job only checks needs.biome.result and misses failures from the changes
job; update the run script in the required job (the steps under the job named
"required" that currently reads needs.biome.result) to also inspect
needs.changes.result (or the appropriate needs.<job>.result for your changes
job) and treat "failure" or "cancelled" from either job as a failing
condition—i.e., fetch both results, if either equals "failure" or "cancelled"
echo a descriptive message and exit 1, otherwise continue and print a passing
message.

Comment on lines +196 to +208
required:
needs: e2e
if: always()
runs-on: ubuntu-latest
steps:
- name: Verify e2e result
run: |
result="${{ needs.e2e.result }}"
if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
echo "e2e job did not succeed: $result"
exit 1
fi
echo "e2e result: $result (success or skipped — passing)"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

required should validate both changes and e2e outcomes.

Right now a failed changes job can still surface as green because only needs.e2e.result is evaluated.

Proposed fix
 required:
-  needs: e2e
+  needs: [changes, e2e]
   if: always()
   runs-on: ubuntu-latest
   steps:
     - name: Verify e2e result
       run: |
-        result="${{ needs.e2e.result }}"
-        if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
-          echo "e2e job did not succeed: $result"
+        changes_result="${{ needs.changes.result }}"
+        e2e_result="${{ needs.e2e.result }}"
+        if [ "$changes_result" = "failure" ] || [ "$changes_result" = "cancelled" ]; then
+          echo "changes job did not succeed: $changes_result"
+          exit 1
+        fi
+        if [ "$e2e_result" = "failure" ] || [ "$e2e_result" = "cancelled" ]; then
+          echo "e2e job did not succeed: $e2e_result"
           exit 1
         fi
-        echo "e2e result: $result (success or skipped — passing)"
+        echo "changes=$changes_result, e2e=$e2e_result (success or skipped — passing)"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-e2e.yml around lines 196 - 208, The verification step
in the "required" job only checks needs.e2e.result and ignores the outcome of
the needs.changes job, allowing a failed changes job to still show green; update
the Verify e2e result step (the script in the "required" job) to also read and
validate needs.changes.result (in addition to needs.e2e.result) and fail if
either result is "failure" or "cancelled" — ensure the script references both
needs.e2e.result and needs.changes.result and returns non‑zero when any of those
indicate failure/cancellation.

Comment on lines +34 to +35
- 'src/**.py'
- 'tests/**.py'
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In picomatch (used by dorny/paths-filter), does src/.pymatch nested paths likesrc/a/b.py, or is src//*.py required for recursive matching?

💡 Result:

In picomatch, src/.py does NOT match nested paths like src/a/b.py; you need src//*.py for recursive matching. Reason (per picomatch glob semantics): * matches any characters excluding path separators, while is the special “globstar” that can match across path separators. The picomatch docs explicitly note that `*` won’t match `a/b.js` (it returns false for `a/b.js`) and that is what matches path separators when used appropriately (nested directories require ``). [1][2] So, src/.py is effectively “`src/` + any non-separator chars + `.py`”, not “`src/` + any number of nested directories + `.py`”. [1] For dorny/paths-filter: since it evaluates path expressions using picomatch, the same matching rules apply there. [3]

Citations:


Use recursive Python globs that match nested files.

src/**.py and tests/**.py do not match nested files like src/a/b.py with picomatch semantics (used by dorny/paths-filter). This causes integration tests to be skipped when changes occur in nested directories. Use src/**/*.py and tests/**/*.py for proper recursive matching.

Proposed fix
           filters: |
             relevant:
-              - 'src/**.py'
-              - 'tests/**.py'
+              - 'src/**/*.py'
+              - 'tests/**/*.py'
               - 'pyproject.toml'
               - 'uv.lock'
               - 'sdks/**'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- 'src/**.py'
- 'tests/**.py'
- 'src/**/*.py'
- 'tests/**/*.py'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-integration.yml around lines 34 - 35, The globs
'src/**.py' and 'tests/**.py' do not match nested files; update the workflow to
use recursive globs by replacing those patterns with 'src/**/*.py' and
'tests/**/*.py' so changes in nested directories (e.g. src/a/b.py) trigger the
integration tests; modify the patterns wherever 'src/**.py' and 'tests/**.py'
appear in the test-integration.yml workflow.

Comment on lines +150 to +162
required:
needs: tests
if: always()
runs-on: ubuntu-latest
steps:
- name: Verify tests result
run: |
# For PRs, always build locally since we're testing new code
# For workflow_dispatch, use the input (defaults to true)
USE_LOCAL="${{ inputs.use_local_images }}"
if [ "${{ github.event_name }}" == "pull_request" ] || [ "$USE_LOCAL" != "false" ]; then
echo "Running tests with locally built images..."
make test-ci-local
else
echo "Running tests with DockerHub images..."
make test-ci
result="${{ needs.tests.result }}"
if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
echo "tests job did not succeed: $result"
exit 1
fi
echo "Keys directory after tests:"
ls -la keys/ || echo "No keys directory"
echo "tests result: $result (success or skipped — passing)"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent required from passing when changes fails.

required only checks needs.tests.result; if changes fails, tests can be skipped and this job still passes.

Proposed fix
 required:
-  needs: tests
+  needs: [changes, tests]
   if: always()
   runs-on: ubuntu-latest
   steps:
     - name: Verify tests result
       run: |
-        result="${{ needs.tests.result }}"
-        if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
-          echo "tests job did not succeed: $result"
+        changes_result="${{ needs.changes.result }}"
+        tests_result="${{ needs.tests.result }}"
+        if [ "$changes_result" = "failure" ] || [ "$changes_result" = "cancelled" ]; then
+          echo "changes job did not succeed: $changes_result"
+          exit 1
+        fi
+        if [ "$tests_result" = "failure" ] || [ "$tests_result" = "cancelled" ]; then
+          echo "tests job did not succeed: $tests_result"
           exit 1
         fi
-        echo "tests result: $result (success or skipped — passing)"
+        echo "changes=$changes_result, tests=$tests_result (success or skipped — passing)"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/test-integration.yml around lines 150 - 162, The
'required' job only inspects needs.tests.result so it can pass if 'changes'
failed and caused 'tests' to be skipped; update the job named "required" to
depend on the changes job and/or explicitly check its result: add changes to the
job's needs (e.g., needs: [tests, changes]) and modify the shell check to
consider both needs.tests.result and needs.changes.result (treat failure or
cancelled for either as a failing condition) so the job correctly fails when
changes fails.

This comment was marked as outdated.

Update GitHub Actions workflows (lint-backend, lint-frontend, test-e2e, test-integration) to only run on relevant changed files by adding a preliminary `changes` job that uses dorny/paths-filter and exposes a run flag. Introduce merge_group support and add lightweight `required` verifier jobs so skipped jobs report as passing to the merge queue. Serialize heavy tests (E2E and Integration) with a shared concurrency group and wrap Playwright and integration test steps with nick-fields/retry to add timeouts, retries, and cleanup/on-retry teardown logic. Misc: gate jobs on the changes output, small whitespace/checkout fixes, and improved logging for decision paths.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/lint-frontend.yml (1)

55-74: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Apply the same conditional execution pattern to the typecheck job.

The typecheck job is not gated on the changes output, so it will always run even when no frontend files have changed. This is inconsistent with the conditional execution pattern applied to the biome job and defeats the core purpose of this PR.

🔧 Proposed fix
 typecheck:
+  needs: changes
+  if: needs.changes.outputs.run == 'true'
   name: TypeScript type check
   runs-on: ubuntu-latest
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/lint-frontend.yml around lines 55 - 74, The typecheck job
currently always runs; add the same conditional gating used by the biome job so
typecheck only runs when frontend changes were detected. Update the "typecheck"
job (name/typecheck) to include the same if: expression used by the biome job
(e.g. if: ${{ steps.files.outputs.changes == 'true' }} or the equivalent outputs
variable your workflow uses) so the job is skipped when no frontend files
changed.
♻️ Duplicate comments (1)
.github/workflows/lint-frontend.yml (1)

75-87: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Validate all upstream jobs in the required gate.

The required job has two critical gaps:

  1. Missing changes job validation (flagged in past review): If the changes job fails, biome is skipped, and required sees "skipped" and passes—allowing a failed workflow to report success.

  2. Missing typecheck job validation: The typecheck job is not validated at all. If type checking fails, the workflow will still pass because only biome is checked.

Both gaps violate the PR's goal of preventing silent failures.

🛡️ Proposed fix to validate all jobs
 required:
-  needs: biome
+  needs: [changes, biome, typecheck]
   if: always()
   runs-on: ubuntu-latest
   steps:
-    - name: Verify biome result
+    - name: Verify all job results
       run: |
-        result="${{ needs.biome.result }}"
-        if [ "$result" = "failure" ] || [ "$result" = "cancelled" ]; then
-          echo "biome job did not succeed: $result"
+        changes_result="${{ needs.changes.result }}"
+        biome_result="${{ needs.biome.result }}"
+        typecheck_result="${{ needs.typecheck.result }}"
+        
+        if [ "$changes_result" = "failure" ] || [ "$changes_result" = "cancelled" ]; then
+          echo "changes job did not succeed: $changes_result"
+          exit 1
+        fi
+        if [ "$biome_result" = "failure" ] || [ "$biome_result" = "cancelled" ]; then
+          echo "biome job did not succeed: $biome_result"
+          exit 1
+        fi
+        if [ "$typecheck_result" = "failure" ] || [ "$typecheck_result" = "cancelled" ]; then
+          echo "typecheck job did not succeed: $typecheck_result"
           exit 1
         fi
-        echo "biome result: $result (success or skipped — passing)"
+        echo "changes=$changes_result, biome=$biome_result, typecheck=$typecheck_result (all success or skipped — passing)"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/lint-frontend.yml around lines 75 - 87, Update the
required job's verification step to validate all upstream jobs instead of only
`biome`: check `needs.biome.result`, `needs.changes.result`, and
`needs.typecheck.result`, and fail the job if any of those results is not
"success" (treat "failure", "cancelled", or "skipped" as a non-success state).
Replace the single `result="${{ needs.biome.result }}"` check with a loop or
explicit checks that reads each `needs.<job>.result` and exits with an error and
a clear message if any is not "success", ensuring `required` cannot pass when
`changes` or `typecheck` (or `biome`) did not succeed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In @.github/workflows/lint-frontend.yml:
- Around line 55-74: The typecheck job currently always runs; add the same
conditional gating used by the biome job so typecheck only runs when frontend
changes were detected. Update the "typecheck" job (name/typecheck) to include
the same if: expression used by the biome job (e.g. if: ${{
steps.files.outputs.changes == 'true' }} or the equivalent outputs variable your
workflow uses) so the job is skipped when no frontend files changed.

---

Duplicate comments:
In @.github/workflows/lint-frontend.yml:
- Around line 75-87: Update the required job's verification step to validate all
upstream jobs instead of only `biome`: check `needs.biome.result`,
`needs.changes.result`, and `needs.typecheck.result`, and fail the job if any of
those results is not "success" (treat "failure", "cancelled", or "skipped" as a
non-success state). Replace the single `result="${{ needs.biome.result }}"`
check with a loop or explicit checks that reads each `needs.<job>.result` and
exits with an error and a clear message if any is not "success", ensuring
`required` cannot pass when `changes` or `typecheck` (or `biome`) did not
succeed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 95e7e81d-7a9c-4ff7-9407-ba957ffe5eff

📥 Commits

Reviewing files that changed from the base of the PR and between 4c4116b and ff48340.

📒 Files selected for processing (4)
  • .github/workflows/lint-backend.yml
  • .github/workflows/lint-frontend.yml
  • .github/workflows/test-e2e.yml
  • .github/workflows/test-integration.yml
🚧 Files skipped from review as they are similar to previous changes (3)
  • .github/workflows/lint-backend.yml
  • .github/workflows/test-e2e.yml
  • .github/workflows/test-integration.yml

@edwinjosechittilappilly
Copy link
Copy Markdown
Collaborator Author

@coderabbitai create a note on why this PR is relevant and draft a proposal also.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Tip

For best results, initiate chat on the files or code changes.

@edwinjosechittilappilly Sure! Here is a note on the relevance of this PR and a drafted proposal.


📋 Note: Why PR #1614 Is Relevant

Context

Modern CI pipelines face three recurring pain points: wasted compute on irrelevant changes, flaky test failures blocking merges, and merge queues misreading skipped jobs as failures. PR #1614 directly addresses all three in one cohesive change.

Why It Matters

1. 🚀 CI Efficiency — Running Only What's Needed

Currently, every push to a PR triggers the full lint and test suite regardless of whether the changed files are relevant to those jobs. By introducing a changes job powered by dorny/paths-filter, the workflow now inspects which files were actually modified and skips jobs entirely when no relevant paths were touched. This reduces:

  • Unnecessary runner minutes consumed
  • Developer wait time for unrelated CI jobs
  • Cost on CI compute resources

2. 🔁 Reliability — Retry Logic for Flaky Tests

E2E (Playwright) and Integration tests are inherently susceptible to environment-level flakiness (network timeouts, Docker startup races, etc.). Wrapping these steps with nick-fields/retry@v3 provides:

  • Up to 2 attempts with a defined timeout (60 min for E2E, 45 min for Integration)
  • Automatic teardown of stale state (test-results, playwright-report, Docker Compose stacks) between retries
  • Fewer false negatives blocking legitimate PRs

3. 🔀 Merge Queue Compatibility — merge_group Support + required Jobs

GitHub's merge queue requires that all required status checks pass, even if they were skipped. Without required verifier jobs, a skipped lint/test job registers as "missing" rather than "passing," blocking the merge queue. The new required jobs solve this by:

  • Always running (regardless of skip logic)
  • Reporting success when a job was skipped or passed
  • Only failing when the underlying job explicitly failed or was cancelled

4. ⛓️ Serialization — Preventing Resource Contention

Heavy E2E and Integration tests now share a concurrency group (heavy-tests-${{ github.ref }}), ensuring they never run in parallel on the same branch ref. This prevents resource contention and unpredictable failures from competing Docker environments.


📝 Proposal

Title: Adopt Path-Filtered, Retry-Gated, Merge-Queue-Compatible CI for OpenRAG

Objective

Modernize the GitHub Actions CI pipeline for the openrag repository to be efficient, resilient, and fully compatible with GitHub's merge queue feature.

Proposed Changes

flowchart TD
    A[Push / PR / merge_group] --> B{changes job\ndorny/paths-filter}
    B -- relevant files changed --> C[lint / biome / e2e / integration]
    B -- no relevant files changed --> D[skip job]
    C --> E{required job\nalways runs}
    D --> E
    E -- passed or skipped --> F[✅ Status Check: Passing]
    E -- failed or cancelled --> G[❌ Status Check: Failing]

    style F fill:rgba(0,200,0,0.5)
    style G fill:rgba(200,0,0,0.5)
    style B fill:rgba(0,100,255,0.5)
    style E fill:rgba(255,165,0,0.5)
Loading

Scope

Workflow Changes
lint-backend.yml Add changes + required jobs, remove paths trigger
lint-frontend.yml Add merge_group trigger, changes + required jobs
test-e2e.yml Add changes + required jobs, concurrency group, retry wrapper
test-integration.yml Add changes + required jobs, concurrency group, retry wrapper

Benefits

Benefit Before After
Unnecessary CI runs All jobs run on every push Only runs when relevant files change
Flaky test failures Single attempt, hard fail Up to 2 retries with clean state
Merge queue compatibility Skipped jobs block merge queue required jobs always report status
Heavy test parallelism May run concurrently Serialized per branch ref

Risks & Mitigations

Risk Mitigation
paths-filter missing a relevant path pattern Patterns should be reviewed carefully; merge_group always forces run=true as a safety net
Retry masking real failures Max 2 attempts only; teardown between retries ensures clean state
Concurrency group causing long queue waits cancel-in-progress: false ensures no run is silently dropped

Recommendation

Approve and merge this PR. It establishes a solid foundation for scalable and developer-friendly CI with minimal risk, and directly unblocks reliable merge queue usage.


Note to reviewers: Pay particular attention to the paths-filter patterns in each changes job — these determine which file changes trigger which jobs and should be validated against the actual project layout.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci ⬛ CI/CD, build, and infrastructure issues ignore-for-release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants