feat: add retry with exponential backoff for GitHub API calls #1071
Caution: Review failed. The pull request is closed.
Walkthrough

Centralizes PyGithub blocking-call handling by adding an async retry wrapper.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Handler as Handler (e.g., PullRequestHandler)
    participant Wrapper as github_api_call()
    participant Thread as asyncio.to_thread()
    participant PyGithub as PyGithub (sync)
    participant GitHub as GitHub Server

    Handler->>Wrapper: await github_api_call(func, *args, logger, log_prefix)
    Wrapper->>Thread: offload func to thread
    Thread->>PyGithub: execute synchronous call
    PyGithub->>GitHub: HTTP request
    alt success
        GitHub-->>PyGithub: 200 OK
        PyGithub-->>Thread: return result
        Thread-->>Wrapper: result
        Wrapper-->>Handler: return result
    else transient 5xx / connection error
        GitHub-->>PyGithub: 5xx / connection fail
        PyGithub-->>Thread: raise exception
        Thread-->>Wrapper: exception
        Wrapper->>Wrapper: _is_retryable -> true
        Wrapper->>Wrapper: log warning (attempt N/M)
        Wrapper->>Wrapper: await asyncio.sleep(exponential_delay)
        Wrapper->>Thread: retry offload
    else non-retryable 4xx / auth / validation
        GitHub-->>PyGithub: 4xx
        PyGithub-->>Thread: raise exception
        Thread-->>Wrapper: exception
        Wrapper->>Wrapper: _is_retryable -> false
        Wrapper-->>Handler: re-raise immediately
    end
    alt retries exhausted
        Wrapper-->>Handler: re-raise last exception
    end
```
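The retry flow in the diagram can be condensed into a minimal sketch of such a wrapper (an illustration only; the parameter names, defaults, and classification logic here are assumptions, not the PR's actual implementation):

```python
import asyncio
import logging
from typing import Any, Callable

RETRYABLE_STATUS = {500, 502, 503, 504}


def _is_retryable(ex: BaseException) -> bool:
    # Connection-level failures are worth retrying; PyGithub exceptions
    # expose the HTTP status code as a `status` attribute.
    if isinstance(ex, ConnectionError):
        return True
    return getattr(ex, "status", None) in RETRYABLE_STATUS


async def github_api_call(
    func: Callable[..., Any],
    *args: Any,
    logger: logging.Logger,
    log_prefix: str = "",
    retries: int = 4,
    base_delay: float = 2.0,
    **kwargs: Any,
) -> Any:
    """Run a blocking PyGithub call in a worker thread, retrying transient failures."""
    for attempt in range(1, retries + 1):
        try:
            return await asyncio.to_thread(func, *args, **kwargs)
        except Exception as ex:
            if not _is_retryable(ex) or attempt == retries:
                raise
            delay = base_delay * 2 ** (attempt - 1)  # 2s, 4s, 8s, 16s
            logger.warning(f"{log_prefix} attempt {attempt}/{retries} failed: {ex}; retrying in {delay}s")
            await asyncio.sleep(delay)
```

A transient 503 is retried, while the classifier re-raises non-retryable errors immediately.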
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 16
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
webhook_server/libs/handlers/runner_handler.py (1)
1073-1088: ⚠️ Potential issue | 🔴 Critical

CRITICAL: `create_issue_comment()` is called with the wrong argument count.

`pull_request.create_issue_comment` accepts a single `body` parameter. This call passes two positional strings, so when `gh pr create` fails, the fallback comment will raise `TypeError` instead of posting recovery instructions. PyGithub's signature is `create_issue_comment(body: str)`.

Proposed fix
````diff
-await github_api_call(
-    pull_request.create_issue_comment,
-    "**Cherry-pick branch created, but PR creation failed**\n",
-    f"Branch `{new_branch_name}` was pushed to the repository.\n"
-    f"Create the PR manually:\n"
-    "```\n"
-    f"gh pr create --repo {repo_full_name}"
-    f" --base {target_branch}"
-    f" --head {new_branch_name}"
-    f" --label {cherry_picked_label}"
-    + (f" --label {AI_RESOLVED_CONFLICTS_LABEL}" if cherry_pick_had_conflicts else "")
-    + f" --title '{pr_title}'"
-    f" --body '{pr_body}'\n"
-    "```",
-    logger=self.logger,
-    log_prefix=self.log_prefix,
-)
+body = (
+    "**Cherry-pick branch created, but PR creation failed**\n"
+    f"Branch `{new_branch_name}` was pushed to the repository.\n"
+    f"Create the PR manually:\n"
+    "```\n"
+    f"gh pr create --repo {repo_full_name}"
+    f" --base {target_branch}"
+    f" --head {new_branch_name}"
+    f" --label {cherry_picked_label}"
+    + (f" --label {AI_RESOLVED_CONFLICTS_LABEL}" if cherry_pick_had_conflicts else "")
+    + f" --title '{pr_title}'"
+    f" --body '{pr_body}'\n"
+    "```"
+)
+await github_api_call(
+    pull_request.create_issue_comment,
+    body,
+    logger=self.logger,
+    log_prefix=self.log_prefix,
+)
````

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@webhook_server/libs/handlers/runner_handler.py` around lines 1073 - 1088, The call to pull_request.create_issue_comment is passing multiple positional strings, causing a TypeError because create_issue_comment(body: str) expects a single string; fix this in the github_api_call invocation by concatenating the message into one string (e.g., assemble a single variable like body using new_branch_name, repo_full_name, target_branch, cherry_picked_label, AI_RESOLVED_CONFLICTS_LABEL, pr_title, pr_body, and cherry_pick_had_conflicts) and then call await github_api_call(pull_request.create_issue_comment, body, logger=self.logger, log_prefix=self.log_prefix).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.dev/docker-compose.yaml:
- Line 15: Replace the overly-broad "privileged: true" setting with explicit
minimal capabilities and device/security options: remove privileged: true and
add cap_add entries (SYS_ADMIN, MKNOD, NET_ADMIN, SETUID, SETGID), devices entry
(/dev/fuse) and security_opt entries (label=disable, seccomp=unconfined);
optionally make privileged togglable by using an environment variable
(PRIVILEGED_MODE) and keeping the same cap_add/devices/security_opt when
enabled; apply the same change to both the .dev/docker-compose.yaml and
examples/docker-compose.yaml files, targeting the "privileged" key in each
compose service block.
- Around line 11-14: Disable unauthenticated endpoints by default and restrict
binding: set ENABLE_LOG_SERVER and ENABLE_MCP_SERVER to false in the
docker-compose service and change the port mapping from "19876:5000" to bind
only localhost (e.g., "127.0.0.1:19876:5000") or remove the mapping entirely so
the log/MCP server is not exposed on 0.0.0.0; update references to
ENABLE_LOG_SERVER and ENABLE_MCP_SERVER and the "19876:5000" mapping
accordingly.
In @.dev/local-server-with-smee.sh:
- Around line 8-9: The cleanup() trap must guard against unset or stale SMEE_PID
to avoid failures; update cleanup() to first check that the SMEE_PID variable is
non-empty and a valid positive integer and that a process with that PID exists
(e.g., test with kill -0 "$SMEE_PID" or pid existence check) before invoking
kill, and ensure any kill errors are suppressed/handled so the EXIT trap is
idempotent; reference the cleanup() function and SMEE_PID variable when applying
this change.
- Around line 2-6: The startup script must source the env file relative to the
script location and validate required env vars before starting processes:
compute the script directory and source the .env from that directory (instead of
relying on the current working directory), then check SMEE_URL, SERVER_PORT, and
DOCKER_COMPOSE_FILE are set and non-empty (exit with a clear error message if
any are missing). Also harden the cleanup/trap logic to only kill SMEE_PID if it
is set and a valid numeric PID (or check the process exists) to avoid killing an
empty/invalid PID. Update references in the script to use the validated
variables going forward.
In `@webhook_server/libs/github_api.py`:
- Around line 1182-1183: _get_last_commit currently consumes the PaginatedList
returned by pull_request.get_commits outside the retryable github_api_call
boundary; change the call so the pagination is iterated inside github_api_call
by passing a lambda that converts the generator to a list (e.g.,
github_api_call(lambda: list(pull_request.get_commits()), logger=self.logger,
log_prefix=self.log_prefix)), then return the last element from that returned
list; update the call site in _get_last_commit to use this pattern so iteration
is inside github_api_call.
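The materialize-inside-the-wrapper pattern recommended here can be demonstrated with a runnable stand-in (`FakePaginatedList` and the simplified `github_api_call` below are illustrative, not PyGithub's real classes):

```python
import asyncio


class FakePaginatedList:
    """Stands in for PyGithub's PaginatedList: each iteration step does I/O."""

    def __init__(self, items):
        self._items = items
        self.fetches = 0

    def __iter__(self):
        for item in self._items:
            self.fetches += 1  # simulates a blocking HTTP page fetch
            yield item


async def github_api_call(func, *args, **kwargs):
    # Simplified stand-in for the wrapper: offload the call to a thread.
    return await asyncio.to_thread(func, *args, **kwargs)


async def main():
    paginated = FakePaginatedList(["c1", "c2", "c3"])
    # Materialize inside the wrapper so all page fetches happen off the event loop.
    commits = await github_api_call(lambda: list(paginated))
    return commits[-1]  # plain list indexing, no further I/O
```

Iterating the lazy object directly on the event loop would instead trigger the simulated fetches outside the wrapper.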
In `@webhook_server/libs/handlers/check_run_handler.py`:
- Around line 424-426: The call to github_api_call passes pull_request.base.ref
directly, causing the property to be evaluated before the wrapper runs; defer
the access so it happens inside github_api_call, e.g. github_api_call(lambda:
self.repository.get_branch(pull_request.base.ref), ...), keeping the existing
logger/log_prefix parameters (self.logger, self.log_prefix) intact.
- Around line 265-267: The loop over the PaginatedList returned by
github_api_call(self.github_webhook.last_commit.get_check_runs, ...) causes lazy
page fetches on the event loop; change the call so the PaginatedList is fully
materialized inside the thread pool (e.g., invoke github_api_call with a lambda
that calls list(self.github_webhook.last_commit.get_check_runs()) or otherwise
wraps the method call in list(...)) so the returned object is a concrete list
and the subsequent for run in ... iteration does not block the event loop.
In `@webhook_server/libs/handlers/issue_comment_handler.py`:
- Around line 412-414: The current call materializes the PaginatedList outside
the retry wrapper (repo_contributors = list(await
github_api_call(self.repository.get_contributors, ...))), which can block the
event loop; change it so github_api_call performs the iteration by passing a
lambda that converts the PaginatedList to a list inside the retry wrapper (e.g.,
call github_api_call(lambda: list(self.repository.get_contributors()),
logger=self.logger, log_prefix=self.log_prefix)) so that `repo_contributors` is
assigned from the awaited result of that wrapped call; keep references to
github_api_call, self.repository.get_contributors, logger and log_prefix
unchanged.
In `@webhook_server/libs/handlers/labels_handler.py`:
- Around line 115-116: The current code calls
github_api_call(pull_request.get_labels, ...) and then iterates over the
returned PaginatedList outside github_api_call; change the call to materialize
the PaginatedList inside github_api_call by passing a lambda that returns
list(pull_request.get_labels()), e.g. github_api_call(lambda:
list(pull_request.get_labels()), logger=self.logger,
log_prefix=self.log_prefix), then iterate over that list to build and return
[lb.name for lb in labels]; ensure references to github_api_call and
pull_request.get_labels are used as shown.
In `@webhook_server/libs/handlers/owners_files_handler.py`:
- Around line 496-498: The current iteration over the PaginatedList returned by
pull_request.get_issue_comments occurs on the event loop; wrap the iteration
inside the worker thread by calling github_api_call with a lambda that
materializes the list (e.g., lambda: list(pull_request.get_issue_comments(...)))
so the full PaginatedList is fetched inside github_api_call; update the call
site where you currently pass pull_request.get_issue_comments to instead pass a
lambda that returns list(pull_request.get_issue_comments(...)) and then iterate
the returned concrete list.
In `@webhook_server/libs/handlers/pull_request_handler.py`:
- Around line 1559-1564: The code currently treats any falsy
pull_request.mergeable as a failure; change it to only fail when mergeable is
explicitly False by reusing the polling behavior in
label_pull_request_by_merge_state(): call github_api_call to fetch
pull_request.mergeable (as before) but if the result is None, poll/retry (with
the same delay/count/timeout logic used by label_pull_request_by_merge_state())
until mergeable is True or False, and only append the failure message when the
final value is False; keep using github_api_call and reference
pull_request.mergeable and label_pull_request_by_merge_state to locate and align
the retry logic.
- Around line 943-948: The PR comment creation is awaited inline inside the
tag-deletion critical path (calls to github_api_call with
pull_request.create_issue_comment), which can cause the delete functions
(_delete_registry_tag_via_regctl and _delete_ghcr_tag_via_github_api) to surface
failures from comment posting; change this to fire-and-forget: after the delete
completes, spawn a detached async task/future to call
github_api_call(pull_request.create_issue_comment, ...) and log any errors
inside that task without re-raising, so the main delete flow only fails on
actual delete errors and not on notification failures; use
self.logger/self.log_prefix for internal task logging and ensure the original
awaited delete call remains unchanged.
- Around line 224-226: The create calls (e.g.,
github_api_call(pull_request.create_issue_comment, ...)) are non-idempotent and
retries can duplicate comments; make these writes idempotent by adding a
deterministic marker/nonce to the comment body (e.g., "// marker:
welcome-<pr-number>" or similar) and change the flow to first search existing
comments via pull_request.get_issue_comments (or the corresponding list call)
for that marker and skip create if found, otherwise call github_api_call to
create; alternatively implement an upsert by creating then editing the comment
if a matching marker exists — apply this pattern to all affected operations
(create_issue_comment and the other create-style writes noted in the review) so
retries do not produce duplicates.
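The marker-based idempotency suggested above might look like this (a sketch only; the marker format and function names are hypothetical):

```python
# Deterministic, HTML-comment marker so retries can detect an existing comment.
WELCOME_MARKER = "<!-- marker: welcome-{pr_number} -->"


def build_welcome_body(pr_number: int, text: str) -> str:
    # Embed the marker invisibly at the top of the comment body.
    return f"{WELCOME_MARKER.format(pr_number=pr_number)}\n{text}"


def should_create_comment(pr_number: int, existing_comments: list[str]) -> bool:
    # Skip creation when any previously fetched comment body carries the marker.
    marker = WELCOME_MARKER.format(pr_number=pr_number)
    return not any(marker in body for body in existing_comments)
```

In the real flow, `existing_comments` would come from `pull_request.get_issue_comments()` (materialized inside the retry wrapper), and the create call runs only when `should_create_comment` returns True.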
In `@webhook_server/tests/test_github_retry.py`:
- Around line 99-288: Patch asyncio.to_thread in these tests to avoid real
thread offloading: add a patch for
"webhook_server.utils.github_retry.asyncio.to_thread" (new_callable=AsyncMock)
to each test decorator or as a class-level patch so each test gets an extra mock
parameter (e.g., mock_to_thread), and configure mock_to_thread.side_effect =
lambda fn, *a, **kw: fn(*a, **kw) so github_api_call (which delegates to
asyncio.to_thread) runs synchronously and deterministic for backoff/assertions.
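A minimal version of the suggested patch might look like this (the helper name is hypothetical; the real tests would patch the wrapper's own module path, e.g. `webhook_server.utils.github_retry.asyncio.to_thread`):

```python
import asyncio
from unittest.mock import AsyncMock, patch


async def run_with_sync_to_thread(func, *args, **kwargs):
    # Patch asyncio.to_thread with an AsyncMock whose side_effect executes the
    # callable inline, so no real worker thread is involved and behavior is
    # deterministic for backoff assertions.
    with patch("asyncio.to_thread", new_callable=AsyncMock) as mock_to_thread:
        mock_to_thread.side_effect = lambda fn, *a, **kw: fn(*a, **kw)
        return await asyncio.to_thread(func, *args, **kwargs)
```

Awaiting the AsyncMock returns the side_effect's value, so the wrapped function runs synchronously on the event loop thread.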
In `@webhook_server/utils/github_retry.py`:
- Around line 96-98: The retry wrapper currently allows re-executing mutating
GitHub calls (e.g., create_issue_comment, create_issue, create_review_request,
create_check_run, merge) which can duplicate side effects; modify the retry
logic in webhook_server/utils/github_retry.py to make write retries explicit by
splitting policies into read vs write (e.g., add two decorators or functions
like retry_read_only and retry_mutating) and change the default wrapper to use
retry_read_only; require an explicit opt-in flag or different callsite wrapper
(e.g., retry_mutating(retries=..., force_opt_in=True)) for mutating functions so
callers of
create_issue_comment/create_issue/create_review_request/create_check_run/merge
must opt-in to retry behavior.
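One way to sketch the proposed read/write split (the function names and the opt-in flag are illustrative, not the project's actual API):

```python
import asyncio
from typing import Any, Callable


async def _call_with_retries(func: Callable[..., Any], *args: Any, retries: int, **kwargs: Any) -> Any:
    last_exc: Exception | None = None
    for _ in range(retries):
        try:
            return await asyncio.to_thread(func, *args, **kwargs)
        except Exception as ex:  # retryability classification elided for brevity
            last_exc = ex
    raise last_exc  # type: ignore[misc]


async def retry_read_only(func, *args, retries=4, **kwargs):
    # Reads are side-effect free, so blind retries are safe.
    return await _call_with_retries(func, *args, retries=retries, **kwargs)


async def retry_mutating(func, *args, retries=1, force_opt_in=False, **kwargs):
    # Writes retry only when the caller explicitly opts in.
    effective = retries if force_opt_in else 1
    return await _call_with_retries(func, *args, retries=effective, **kwargs)
```

Without `force_opt_in=True`, a failing write is attempted exactly once, so transient errors cannot duplicate comments or merges.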
- Around line 46-62: The retry classifier _is_retryable currently skips explicit
handling of urllib3.exceptions.ResponseError; import ResponseError and update
_is_retryable to include an isinstance(ex, ResponseError) check alongside
RequestsConnectionError and MaxRetryError so ResponseError is treated as
retryable (follow the same pattern used for MaxRetryError in the existing
block).
---
Outside diff comments:
In `@webhook_server/libs/handlers/runner_handler.py`:
- Around line 1073-1088: The call to pull_request.create_issue_comment is
passing multiple positional strings, causing a TypeError because
create_issue_comment(body: str) expects a single string; fix this in the
github_api_call invocation by concatenating the message into one string (e.g.,
assemble a single variable like body using new_branch_name, repo_full_name,
target_branch, cherry_picked_label, AI_RESOLVED_CONFLICTS_LABEL, pr_title,
pr_body, and cherry_pick_had_conflicts) and then call await
github_api_call(pull_request.create_issue_comment, body, logger=self.logger,
log_prefix=self.log_prefix).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: cae89f37-bd88-4ca6-a359-fbe5e3002215
📒 Files selected for processing (16)
- .dev/docker-compose.yaml
- .dev/local-server-with-smee.sh
- CLAUDE.md
- webhook_server/libs/github_api.py
- webhook_server/libs/handlers/check_run_handler.py
- webhook_server/libs/handlers/issue_comment_handler.py
- webhook_server/libs/handlers/labels_handler.py
- webhook_server/libs/handlers/owners_files_handler.py
- webhook_server/libs/handlers/pull_request_handler.py
- webhook_server/libs/handlers/push_handler.py
- webhook_server/libs/handlers/runner_handler.py
- webhook_server/libs/test_oracle.py
- webhook_server/tests/test_clean_rebase_detection.py
- webhook_server/tests/test_github_retry.py
- webhook_server/tests/test_test_oracle.py
- webhook_server/utils/github_retry.py
Force-pushed: a0cae1c to 389d213 (Compare)
Force-pushed: 389d213 to c6feea9 (Compare)
@coderabbitai[bot] The following review comments were reviewed and a decision was made:
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
webhook_server/libs/handlers/issue_comment_handler.py (1)
100-127: ⚠️ Potential issue | 🔴 Critical

CRITICAL: `CancelledError` is silently swallowed in the command fan-out.

In Python, `asyncio.CancelledError` is a `BaseException`, not an `Exception`. When `asyncio.gather(..., return_exceptions=True)` collects results from cancelled tasks, it returns `CancelledError` objects. The current code checks `isinstance(result, Exception)` first, which fails for `CancelledError` since it's not a subclass of `Exception`. This means cancelled command tasks are treated as success and the handler step completes normally instead of propagating the cancellation signal, breaking handler cancellation semantics.

Fix
```diff
 failed_commands: list[tuple[str, Exception]] = []
 for idx, result in enumerate(results):
     user_command = _user_commands[idx]
+    if isinstance(result, asyncio.CancelledError):
+        raise result
     if isinstance(result, Exception):
-        # Re-raise CancelledError immediately to allow cancellation to propagate
-        if isinstance(result, asyncio.CancelledError):
-            raise result
         self.logger.error(f"{self.log_prefix} Command execution failed: /{user_command} - {result}")
         failed_commands.append((user_command, result))
```

Check for `CancelledError` before checking for other `Exception` types to ensure cancellation propagates correctly. Per coding guidelines: "Always re-raise `asyncio.CancelledError` in except blocks to preserve cancellation semantics".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@webhook_server/libs/handlers/issue_comment_handler.py` around lines 100 - 127, The code that processes results from asyncio.gather is incorrectly treating asyncio.CancelledError as a successful result because it first checks isinstance(result, Exception); change the result-handling logic in the block that iterates over results (the for loop using results, _user_commands, and idx) to first detect and re-raise asyncio.CancelledError (or isinstance(result, asyncio.CancelledError)) before handling other Exception types so that cancellations from tasks created via asyncio.create_task(self.user_commands(...)) propagate correctly; ensure this check occurs prior to any Exception-based handling of failures.
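The ordering requirement can be shown with a small runnable example: because `asyncio.CancelledError` is a `BaseException` on Python 3.8+, it must be checked before the generic `Exception` branch (the helper below is illustrative, not the handler's actual code):

```python
import asyncio


def process_results(results: list[object]) -> list[Exception]:
    """Split gather() results into failures, re-raising cancellations first."""
    failed: list[Exception] = []
    for result in results:
        if isinstance(result, asyncio.CancelledError):
            raise result  # propagate cancellation immediately
        if isinstance(result, Exception):
            failed.append(result)
    return failed
```

If the `Exception` check ran first, a `CancelledError` result would silently fall through both branches and be treated as success.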
♻️ Duplicate comments (2)
webhook_server/libs/handlers/pull_request_handler.py (1)
1171-1178: ⚠️ Potential issue | 🟠 Major

HIGH: `pull_request.user.login` is still evaluated before `github_api_call()` starts.

Python evaluates call arguments first, so both sites can still block the event loop and bypass retry entirely. In `add_pull_request_owner_as_assingee()`, a transient failure resolving `.user.login` will also send control into the fallback-assignee path even though the assignee lookup was the part that failed. Fetch the login first, e.g. `assignee_login = await github_api_call(lambda: pull_request.user.login, logger=self.logger, log_prefix=self.log_prefix)`, and then reuse `assignee_login` in the wrapped call.

Read-only verification
```shell
#!/bin/bash
set -euo pipefail
rg -n -C2 'pull_request\.user\.login' webhook_server/libs/handlers/pull_request_handler.py
fd -a 'CLAUDE.md' | xargs -r rg -n 'user\.login|base\.ref|Safe operations'
```

Based on learnings, "Only PyGithub properties or methods that perform live API calls (e.g., `.user.login`, `.base.ref`, `get_pull()`, `create_issue_comment()`, etc.) should be wrapped to avoid blocking."

Also applies to: 1497-1503
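The eager-evaluation pitfall is easy to demonstrate with a stand-in object whose property records where it was accessed (all names below are illustrative, not PyGithub's):

```python
import asyncio
import threading


class LazyUser:
    """Stands in for a PyGithub object whose property performs live I/O."""

    def __init__(self):
        self.accessed_in_thread = None

    @property
    def login(self):
        # Record whether the access happened on a worker thread.
        self.accessed_in_thread = threading.current_thread() is not threading.main_thread()
        return "octocat"


async def github_api_call(func, *args, **kwargs):
    # Simplified stand-in for the retry wrapper.
    return await asyncio.to_thread(func, *args, **kwargs)


async def main():
    user = LazyUser()
    # Deferring the access in a lambda moves it inside the worker thread;
    # passing user.login directly would evaluate it on the event loop first.
    login = await github_api_call(lambda: user.login)
    return login, user.accessed_in_thread
```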
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@webhook_server/libs/handlers/pull_request_handler.py` around lines 1171 - 1178, The call currently passes pull_request.user.login as a positional argument to github_api_call which evaluates the property before github_api_call runs; change this by first resolving the owner login with an awaited github_api_call wrapper (e.g., assignee_login = await github_api_call(lambda: pull_request.user.login, logger=self.logger, log_prefix=self.log_prefix)) and then call github_api_call(self.repository.create_issue, ..., assignee=assignee_login, ...); update the method containing this code (e.g., add_pull_request_owner_as_assingee) and apply the same pattern to the other occurrences noted (around the block at the later occurrence that references pull_request.user.login) so any PyGithub property access is wrapped and awaited before being passed into github_api_call.

webhook_server/libs/github_api.py (1)
1025-1029: ⚠️ Potential issue | 🟠 Major

HIGH: Materialize `get_pulls()` inside `github_api_call()`.

PyGithub documents `Commit.get_pulls()` as returning `PaginatedList[PullRequest]`, and `PaginatedList` is consumed through collection operations like indexing. `_pulls[0]` therefore still happens outside the retry/non-blocking boundary here, which reintroduces the same pagination bug you already fixed for `get_commits()`. (pygithub.readthedocs.io)

Proposed fix
```diff
 commit_obj = await github_api_call(
     self.repository.get_commit, commit["sha"], logger=self.logger, log_prefix=self.log_prefix
 )
 with contextlib.suppress(Exception):
-    _pulls = await github_api_call(commit_obj.get_pulls, logger=self.logger, log_prefix=self.log_prefix)
-    if _pulls:
-        self.logger.debug(f"{self.log_prefix} Found PR from commit SHA: {_pulls[0].number}")
-        return _pulls[0]
+    pulls = await github_api_call(
+        lambda: list(commit_obj.get_pulls()),
+        logger=self.logger,
+        log_prefix=self.log_prefix,
+    )
+    if pulls:
+        self.logger.debug(f"{self.log_prefix} Found PR from commit SHA: {pulls[0].number}")
+        return pulls[0]
```
github_api_call(lambda: list(...), ...)to prevent blocking the event loop."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@webhook_server/libs/github_api.py` around lines 1025 - 1029, The call to commit_obj.get_pulls is returning a PaginatedList and the indexing _pulls[0] happens outside the retry/non-blocking wrapper; change the github_api_call invocation to materialize the list within the wrapper (e.g., call github_api_call(lambda: list(commit_obj.get_pulls()), ...)) so the pagination iteration happens inside github_api_call, then use the returned list to check and log the first PR (the symbols to update are commit_obj.get_pulls, github_api_call, the local _pulls variable and the self.logger.debug(...) that references _pulls[0]).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@webhook_server/libs/handlers/owners_files_handler.py`:
- Around line 209-211: The code is using github_api_call for local filesystem
operations (e.g., reading a file via content_path.read_text and enumerating with
rglob) which misclassifies disk I/O as GitHub API failures; replace those
github_api_call usages in owners_files_handler.py with plain thread-offloading
using asyncio.to_thread: call await asyncio.to_thread(content_path.read_text,
encoding="utf-8") for reading file contents and use await
asyncio.to_thread(lambda: list(content_root.rglob(...))) or await
asyncio.to_thread(list, content_root.rglob(...)) for recursive file listing,
remove any retry/logging behavior tied to github_api_call for these paths, and
keep using self.logger/self.log_prefix for local-file error logging where
appropriate.
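A sketch of the suggested separation, using plain `asyncio.to_thread` for local disk I/O with no GitHub-style retry or classification (paths and function names are illustrative):

```python
import asyncio
import tempfile
from pathlib import Path


async def read_owners_file(content_path: Path) -> str:
    # Local file reads are offloaded to a thread, but never routed through
    # github_api_call: a disk error is not a GitHub API failure.
    return await asyncio.to_thread(content_path.read_text, encoding="utf-8")


async def list_files(content_root: Path) -> list[Path]:
    # Materialize the rglob generator inside the worker thread.
    return await asyncio.to_thread(lambda: list(content_root.rglob("*")))
```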
In `@webhook_server/libs/handlers/pull_request_handler.py`:
- Around line 1023-1033: The exception handler logs "Tag cleanup succeeded"
which contradicts reality—this branch runs when registry login/delete failed—so
update the message in the except branch (around github_api_call invoking
pull_request.create_issue_comment) to accurately state that tag cleanup failed
and the PR notification failed (e.g., use "Tag cleanup failed and PR
notification failed: {repository_full_tag}") and include self.log_prefix and the
caught exception context via self.logger.exception so the log reflects the
failure and includes error details.
- Around line 1766-1768: Remove the redundant truthiness guard around the
pull_request parameter: delete the "if pull_request and" short-circuit and
directly call github_api_call(lambda: pull_request.is_merged(), ...) so a
missing/None pull_request fails fast; alternatively add an explicit check that
raises a ValueError (e.g., if pull_request is None: raise
ValueError("pull_request is required")) before calling github_api_call. This
touches the usage of the pull_request parameter and the github_api_call(lambda:
pull_request.is_merged(), logger=self.logger, log_prefix=self.log_prefix)
invocation in the pull request handler.
In `@webhook_server/libs/handlers/runner_handler.py`:
- Around line 724-725: Change is_branch_exists to return a bool and handle 404
instead of propagating exceptions: update the signature of
is_branch_exists(self, branch: str) -> bool, import GithubException from
github.GithubException, call github_api_call(self.repository.get_branch, branch,
...) inside a try/except, return True on success, catch GithubException and if
ex.status == 404 return False, otherwise re-raise the exception; keep logger and
log_prefix parameters when calling github_api_call.
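A self-contained sketch of the suggested shape (GithubException is simulated below so the example runs standalone; the real code would import it from `github.GithubException` and call `self.repository.get_branch` via `github_api_call`):

```python
class GithubException(Exception):
    """Stand-in for github.GithubException, which carries an HTTP status."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_branch_exists(get_branch, branch: str) -> bool:
    """Return True if the branch resolves, False on 404, re-raise otherwise."""
    try:
        get_branch(branch)
        return True
    except GithubException as ex:
        if ex.status == 404:
            return False
        raise
```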
In `@webhook_server/tests/test_github_retry.py`:
- Around line 73-83: Add unit tests that explicitly exercise
urllib3.exceptions.ResponseError so the explicit type-check branch in
_is_retryable is covered: import ResponseError from urllib3.exceptions and add a
test like test_urllib3_response_error_is_retryable that constructs
ResponseError("connection reset by peer") and asserts _is_retryable(ex) is True;
additionally add a parallel case in the TestGithubApiCall retry-flow tests (the
tests that currently exercise RequestsConnectionError and MaxRetryError) to
simulate/raise a ResponseError and assert the retry behavior is triggered for
the same code paths. Ensure the tests reference _is_retryable and the
TestGithubApiCall retry logic so coverage protects the explicit ResponseError
branch.
In `@webhook_server/utils/github_retry.py`:
- Around line 65-70: The wrapper github_api_call currently erases the wrapped
callable's parameter types by using Callable[..., T] and *args: Any/**kwargs:
Any; fix this by introducing a ParamSpec (e.g., P) and make the function generic
over P and T, change the callable type to Callable[P, T], and update the
signature to accept *args: P.args and **kwargs: P.kwargs (keeping logger and
log_prefix params) so callers preserve proper type checking; also import
ParamSpec from typing and adjust any relevant type variables declarations (P, T)
accordingly.
---
Outside diff comments:
In `@webhook_server/libs/handlers/issue_comment_handler.py`:
- Around line 100-127: The code that processes results from asyncio.gather is
incorrectly treating asyncio.CancelledError as a successful result because it
first checks isinstance(result, Exception); change the result-handling logic in
the block that iterates over results (the for loop using results,
_user_commands, and idx) to first detect and re-raise asyncio.CancelledError (or
isinstance(result, asyncio.CancelledError)) before handling other Exception
types so that cancellations from tasks created via
asyncio.create_task(self.user_commands(...)) propagate correctly; ensure this
check occurs prior to any Exception-based handling of failures.
---
Duplicate comments:
In `@webhook_server/libs/github_api.py`:
- Around line 1025-1029: The call to commit_obj.get_pulls is returning a
PaginatedList and the indexing _pulls[0] happens outside the retry/non-blocking
wrapper; change the github_api_call invocation to materialize the list within
the wrapper (e.g., call github_api_call(lambda: list(commit_obj.get_pulls()),
...)) so the pagination iteration happens inside github_api_call, then use the
returned list to check and log the first PR (the symbols to update are
commit_obj.get_pulls, github_api_call, the local _pulls variable and the
self.logger.debug(...) that references _pulls[0]).
In `@webhook_server/libs/handlers/pull_request_handler.py`:
- Around line 1171-1178: The call currently passes pull_request.user.login as a
positional argument to github_api_call which evaluates the property before
github_api_call runs; change this by first resolving the owner login with an
awaited github_api_call wrapper (e.g., assignee_login = await
github_api_call(lambda: pull_request.user.login, logger=self.logger,
log_prefix=self.log_prefix)) and then call
github_api_call(self.repository.create_issue, ..., assignee=assignee_login,
...); update the method containing this code (e.g.,
add_pull_request_owner_as_assingee) and apply the same pattern to the other
occurrences noted (around the block at the later occurrence that references
pull_request.user.login) so any PyGithub property access is wrapped and awaited
before being passed into github_api_call.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 28cd11e8-16eb-420d-9716-a4c69483ad0b
📒 Files selected for processing (14)
- CLAUDE.md
- webhook_server/libs/github_api.py
- webhook_server/libs/handlers/check_run_handler.py
- webhook_server/libs/handlers/issue_comment_handler.py
- webhook_server/libs/handlers/labels_handler.py
- webhook_server/libs/handlers/owners_files_handler.py
- webhook_server/libs/handlers/pull_request_handler.py
- webhook_server/libs/handlers/push_handler.py
- webhook_server/libs/handlers/runner_handler.py
- webhook_server/libs/test_oracle.py
- webhook_server/tests/test_clean_rebase_detection.py
- webhook_server/tests/test_github_retry.py
- webhook_server/tests/test_test_oracle.py
- webhook_server/utils/github_retry.py
Force-pushed: c6feea9 to 94f2b86 (Compare)
@coderabbitai[bot] The following review comments were reviewed and a decision was made:
@coderabbitai review

✅ Actions performed: Review triggered.
New container for ghcr.io/myk-org/github-webhook-server:latest published
Summary
Replace all `asyncio.to_thread()` GitHub API calls with a `github_api_call()` wrapper that retries on transient HTTP 500/502/503/504 errors with exponential backoff (2s, 4s, 8s, 16s — max 4 retries).

Problem
When GitHub's API experiences temporary issues (HTTP 500 errors), all API calls fail immediately after urllib3's built-in retries exhaust. This was observed in production on the RedHat webhook server for `mtv-api-tests`, where multiple webhooks failed with "too many 500 error responses".

Solution
- `github_api_call()` function in `webhook_server/utils/github_retry.py` — drop-in replacement for `asyncio.to_thread()` with retry logic
- Also retries on `ConnectionError` and `MaxRetryError`
- All `asyncio.to_thread()` GitHub API calls replaced across 9 handler files
- The GraphQL (`httpx`) path in `get_unresolved_review_threads` also gets retry logic
- Non-GitHub blocking calls (`shutil.rmtree`) remain as `asyncio.to_thread()`
- `CLAUDE.md` updated to enforce `github_api_call()` over raw `asyncio.to_thread()`

Testing
- Unit tests added (`github_retry.py`)
- End-to-end run against `myk-org/for-testing-only` — PR created, labels/reviewers/check runs all processed correctly with zero errors

Also included
- `.dev/local-server-with-smee.sh` — proper script with smee background process and cleanup trap
- `.dev/docker-compose.yaml` — added `develop.watch` for `docker compose --watch` support

Closes #1070
Summary by CodeRabbit
New Features
Documentation
Tests