[fix] Removed 'FAILED (' from strict markers to unblock auto-retry #655
Conversation
Removed the 'FAILED' keyword from the strict failure markers, which was blocking the re-run mechanism
📝 Walkthrough
Two changes in the CI failure analysis logic and its tests: 'FAILED (' is removed from the strict failure markers, and WebDriverException is added to the transient failure markers. Sequence diagram(s) omitted.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Overview: the change removes 'FAILED (' from the strict failure markers so that unittest's summary line no longer blocks the auto-retry mechanism.
Files Reviewed (2 files)
Reviewed by kimi-k2.5-0127 · 134,461 tokens
Ensures that the standard 'FAILED (errors=x)' summary appended by the unittest framework does not falsely override transient crash detection and block the auto-retry mechanism.
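For context, here is a minimal sketch of the marker-based classification this commit adjusts. The marker lists and the `classify` helper below are illustrative stand-ins, not the actual code in `analyze_failure.py` (which also returns the processed log text):

```python
# Illustrative marker lists; the real ones live in analyze_failure.py.
STRICT_TEST_FAILURE_MARKERS = ["FAIL: ", "ERROR: "]  # "FAILED (" removed by this fix
TRANSIENT_ERROR_MARKERS = ["selenium.common.exceptions.WebDriverException"]


def classify(log: str) -> tuple[bool, bool]:
    """Return (tests_failed, transient_only) for one failing job's log."""
    has_strict_failure = any(m in log for m in STRICT_TEST_FAILURE_MARKERS)
    is_transient = any(m in log for m in TRANSIENT_ERROR_MARKERS)
    if is_transient and not has_strict_failure:
        return False, True  # forgive the crash; auto-retry is allowed
    return True, False  # treat as a real failure; auto-retry is blocked


# Before the fix, "FAILED (" was in the strict list, so unittest's terminal
# "FAILED (errors=1)" summary flagged even purely transient crashes as real
# failures and blocked the retry.
```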
Test Failures in CI
Hello @stktyagi, the CI pipeline failed due to a test failure in …
Failure: …
Fix: …
Added WebDriverException to the transient failure error markers
Test Failures in CI
Hello @stktyagi, there are 3 test failures in the CI logs: …
@coderabbitai full review
✅ Actions performed
Full review triggered.
@coderabbitai resolve
✅ Actions performed
Comments resolved and changes approved.
nemesifier left a comment:
@stktyagi I am thinking.. we are trying to figure out which of these 3 cases we're in:
- tests failed due to flakiness but also include real test failures
- tests failed only due to flakiness
- tests failed only due to real test failures
I think the problem is mainly to discern between cases 1 and 2. Implementing this with text patterns and making it work across different programming languages (Python, Lua, Node.js, etc.) will be super tricky.
Maybe we can make it work for Python, but I am skeptical this will also work for the Node.js repos.
Are you sure this is the right approach?
Why don't we give all the context to the LLM and ask it to tell us if we are under case 1 or 2? The LLM should be good at this. We could do this in a separate API request, which has the sole goal of understanding if we are falling under case 1 or case 2, based on the result of the response we'll either restart the build or not, what do you think of this?
My main concern with that approach would be handling hallucinations: we need a deterministic answer to whether a CI failure is due to an actual test failure or to flakiness. Also, while Node.js and Python format tracebacks differently, the infrastructure crashes we want to forgive are mostly consistent across all CIs, and even when we encounter new ones, including them iteratively until we reach maximum coverage seems fine. It feels like a trade-off: the non-deterministic approach would be language-versatile but might hallucinate, whereas the current approach is deterministic but requires us to keep monitoring for new transient errors to include. It depends on which seems better to us.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/actions/bot-ci-failure/test_analyze_failure.py:
- Around lines 341-354: Add a negative regression test to ensure a unittest
summary "FAILED (errors=1)" does not cancel a real error classification when
there is a non-transient "ERROR:" traceback; create a new test (e.g.,
test_unittest_failed_with_real_error_blocks_retry) that builds a log string
containing a real "ERROR:"/traceback (no transient WebDriverException markers)
plus the "FAILED (errors=1)" summary, call process_error_logs and assert
tests_failed is True and transient_only is False; reference the existing
test_transient_ignores_unittest_failed_summary and the flags
has_strict_failure/is_transient in analyze_failure.py as the behavior you are
locking in.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 36263c46-7352-42ba-8d54-5612a5a4b122
📒 Files selected for processing (2)
- .github/actions/bot-ci-failure/analyze_failure.py
- .github/actions/bot-ci-failure/test_analyze_failure.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Python==3.13 | django~=5.1.0
- GitHub Check: Python==3.11 | django~=5.0.0
- GitHub Check: Python==3.10 | django~=5.2.0
- GitHub Check: Python==3.13 | django~=5.2.0
- GitHub Check: Python==3.12 | django~=5.0.0
- GitHub Check: Python==3.10 | django~=4.2.0
- GitHub Check: Python==3.11 | django~=5.2.0
- GitHub Check: Python==3.10 | django~=5.1.0
- GitHub Check: Python==3.12 | django~=5.2.0
- GitHub Check: Python==3.12 | django~=4.2.0
- GitHub Check: Python==3.11 | django~=5.1.0
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: openwisp/openwisp-utils PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-03-14T20:44:14.568Z
Learning: Bug Fixes: Ensure the test is deterministic and not flaky - flag tests that depend on timing, sleeps, specific timezones, system time, randomness without fixed seed, race conditions, concurrency timing, network access, external services, filesystem state, environment-specific configuration, execution order, shared global state, hardcoded ports, or unawaited async operations
Learnt from: pushpitkamboj
Repo: openwisp/openwisp-utils PR: 584
File: .github/workflows/reusable-bot-changelog.yml:49-49
Timestamp: 2026-03-05T09:38:10.320Z
Learning: In openwisp-utils, PR title prefixes are strictly limited to `[feature]`, `[fix]`, and `[change]` (exact bracketed tags, no scoping/sub-types). The regex `^\[(feature|fix|change)\]` in `.github/workflows/reusable-bot-changelog.yml` is intentional and correct — scoped variants like `[feature/bots]` are not valid and should not be matched.
Learnt from: CR
Repo: openwisp/openwisp-utils PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-03-14T20:44:14.568Z
Learning: Features: Add tests for new features and ensure coverage does not decrease significantly; prefer Selenium browser tests for UI-impacting features
Learnt from: CR
Repo: openwisp/openwisp-utils PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-03-14T20:44:14.568Z
Learning: Bug Fixes: If the bug affects the user interface, include a Selenium browser test; if missing, raise a warning
🔇 Additional comments (2)
.github/actions/bot-ci-failure/analyze_failure.py (2)
10-13: Removal of `"FAILED ("` from strict markers looks safe.
Genuine unittest failures still emit per-test `FAIL: <test>` or `ERROR: <test>` lines (caught by `STRICT_TEST_FAILURE_MARKERS`/`GENERIC_TEST_FAILURE_MARKERS`), and assertion bugs still surface via `AssertionError`. The terminal `FAILED (errors=N, failures=M)` line is purely a summary, so dropping it from the strict set unblocks the transient-retry path without weakening detection of real failures. The new regression test in `test_analyze_failure.py` pins this behavior.
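As a quick illustration of that point, here is a made-up unittest-style log (not taken from this CI run): a genuine failure still trips the per-test strict markers even with `FAILED (` removed.

```python
# Hypothetical genuine-failure log: the per-test "ERROR: <test>" line still
# matches the strict markers, while the terminal "FAILED (errors=1)" summary,
# which every failed unittest run prints, no longer does.
genuine_failure_log = (
    "ERROR: test_device_registration (tests.TestDevices)\n"
    "Traceback (most recent call last):\n"
    "AssertionError: expected 200, got 500\n"
    "----------------------------------------------------------------------\n"
    "Ran 367 tests in 311.148s\n\n"
    "FAILED (errors=1)\n"
)
assert "ERROR: " in genuine_failure_log  # strict marker still fires
```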
37-37: Adding base-class `WebDriverException` as a transient marker, accepting the false-positive trade-off.
`selenium.common.exceptions.WebDriverException` is the base class of most Selenium exceptions (`NoSuchElementException`, `ElementClickInterceptedException`, `TimeoutException`, …). In practice that's fine for substring matching, because Python tracebacks print the concrete subclass FQN, so this marker will only fire when the runtime raises `WebDriverException` directly (typical for browser/marionette crashes, `about:neterror`, session teardown).
However, it does mean any future code path or test helper that raises a bare `WebDriverException`, including legitimate test logic bugs, will now be classified as transient and will forgive co-located `ERROR:`/`Traceback` markers in the same job, potentially masking a real failure and triggering an auto-retry loop. This iterative pattern-matching trade-off has been accepted.
Minor note: `selenium.common.exceptions.InvalidSessionIdException` on line 36 is a `WebDriverException` subclass, but explicit subclass entries remain useful because tracebacks print the concrete subclass name.
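To make the substring-matching point concrete, here is a self-contained demo with stand-in exception classes; the real Selenium hierarchy behaves the same way, and its tracebacks additionally show the full `selenium.common.exceptions.<Subclass>` path:

```python
import traceback


# Stand-ins for the Selenium hierarchy; only the naming behaviour matters here.
class WebDriverException(Exception):
    pass


class NoSuchElementException(WebDriverException):
    pass


try:
    raise NoSuchElementException("element not found")
except WebDriverException:
    tb = traceback.format_exc()

# The traceback names the concrete subclass that was raised, not the base
# class it was caught as, so a "WebDriverException" marker stays silent here.
assert "NoSuchElementException" in tb
assert "WebDriverException" not in tb
```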
```python
def test_transient_ignores_unittest_failed_summary(self):
    """Ensure unittest's 'FAILED (errors=1)' summary does not override transient crashes."""
    content = (
        "===== JOB 5 =====\n"
        "Traceback (most recent call last):\n"
        "selenium.common.exceptions.WebDriverException: Message: Reached error page: about:neterror\n"
        "----------------------------------------------------------------------\n"
        "Ran 367 tests in 311.148s\n\n"
        "FAILED (errors=1)\n"
    )
    text, tests_failed, transient_only = process_error_logs(content)
    self.assertFalse(tests_failed)
    self.assertTrue(transient_only)
```
🧹 Nitpick | 🔵 Trivial
LGTM — regression test correctly pins the fix.
The scenario exercises exactly the buggy classification path: a transient `WebDriverException`, plus a generic `Traceback`, plus the unittest `FAILED (errors=1)` summary. With the marker changes in `analyze_failure.py`, `has_strict_failure` is False and `is_transient` is True, so the transient branch forgives the generic traceback and the unittest summary, yielding `tests_failed=False`, `transient_only=True` as asserted.
One small follow-up worth considering (optional): also add a negative test where `FAILED (errors=1)` appears alongside a real `ERROR:` traceback but no transient marker, asserting `tests_failed=True`. `test_pure_generic_bug_blocks_retry` covers the spirit of this, but a variant that explicitly contains the `FAILED (errors=N)` summary would lock in the contract that the summary line alone never flips the classification either way.
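For reference, a sketch of that suggested negative test; the test name and log content are hypothetical, following the shape of the existing regression test above:

```python
def test_unittest_failed_with_real_error_blocks_retry(self):
    """'FAILED (errors=1)' plus a real ERROR: traceback must still block retry."""
    content = (
        "===== JOB 5 =====\n"
        "ERROR: test_real_bug (tests.TestSomething)\n"
        "Traceback (most recent call last):\n"
        "ValueError: something genuinely broke\n"
        "----------------------------------------------------------------------\n"
        "Ran 367 tests in 311.148s\n\n"
        "FAILED (errors=1)\n"
    )
    text, tests_failed, transient_only = process_error_logs(content)
    self.assertTrue(tests_failed)
    self.assertFalse(transient_only)
```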
Checklist
Description of Changes
The presence of the 'FAILED' keyword in the strict failure markers blocked the re-run mechanism, because that string occurs in every kind of failing CI log, transient or not.