feat: detect clean rebase on PR synchronize and preserve review labels#1060
feat: detect clean rebase on PR synchronize and preserve review labels#1060
Conversation
When a PR is synchronized, detect if it's a clean rebase (same code changes replayed on a newer base) by comparing SHA-256 hashes of git diff output. On clean rebase, preserve review labels (approved-by, lgtm-by, verified, commented-by, changes-requested-by) and post a comment listing preserved labels. Closes #1059
|
Caution Review failedPull request was closed or merged during review WalkthroughDetects clean rebases on PR synchronize by hashing git diffs before/after the push; when detected, preserves review/verified labels, posts a best-effort comment listing preserved labels, and continues normal synchronize processing while skipping label removal and adjusting verified-check handling. Changes
Sequence DiagramsequenceDiagram
participant GitHub as GitHub Webhook
participant Handler as PullRequestHandler
participant Git as Git Operations
participant Hasher as SHA256 Hasher
participant Labels as Label Manager
participant Comment as Comment Poster
GitHub->>Handler: synchronize event (before, after)
Handler->>Git: git -C <clone_dir> fetch origin (timeout=60)
Handler->>Git: git -C <clone_dir> merge-base origin/<base_ref> <before> (timeout=60)
Handler->>Git: git -C <clone_dir> merge-base origin/<base_ref> <after> (timeout=60)
par Compute before/after hashes
Handler->>Git: git -C <clone_dir> diff --binary <merge-base>..<before> (timeout=60)
Git-->>Hasher: diff bytes (before)
Hasher->>Hasher: SHA-256(before)
and
Handler->>Git: git -C <clone_dir> diff --binary <merge-base>..<after> (timeout=60)
Git-->>Hasher: diff bytes (after)
Hasher->>Hasher: SHA-256(after)
end
Hasher-->>Handler: hashes equal?
alt Clean rebase
Handler->>Comment: _post_clean_rebase_comment(preserved_labels)
Handler->>Handler: process_opened_or_synchronize_pull_request(..., is_clean_rebase=True, label_names=...)
Handler->>Labels: _sync_verified_check_for_clean_rebase(label_names)
else Not clean
Handler->>Labels: remove_labels_when_pull_request_sync()
Handler->>Handler: process_opened_or_synchronize_pull_request(..., is_clean_rebase=False)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Report bugs in Issues Welcome! 🎉This pull request will be automatically processed with the following features: 🔄 Automatic Actions
📋 Available CommandsPR Status Management
Review & Approval
Testing & Validation
Container Operations
Cherry-pick Operations
Label Management
✅ Merge RequirementsThis PR will be automatically approved when the following conditions are met:
📊 Review ProcessApprovers and ReviewersApprovers:
Reviewers:
Available Labels
AI Features
💡 Tips
For more information, please refer to the project documentation or contact the maintainers. |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@webhook_server/libs/handlers/pull_request_handler.py`:
- Around line 227-255: The current clean-rebase flow eagerly reads
pull_request.labels and builds a comment before scheduling sync tasks, so a
failure fetching labels prevents process_opened_or_synchronize_pull_request from
running; make the comment-posting best-effort by moving label reading and
create_issue_comment into a separate helper (e.g., _post_clean_rebase_comment)
that wraps the label fetch and comment creation in try/except (re-raising
asyncio.CancelledError) and logs exceptions, and ensure sync_tasks always
includes self.process_opened_or_synchronize_pull_request(pull_request=...,
is_clean_rebase=True) regardless of comment fetch/post success so CI/setup
continues even if posting the clean-rebase comment fails.
- Around line 92-168: The _is_clean_rebase logic must "fail closed": wrap the
parts that can raise (accessing pull_request.base.ref and each await
run_command(...) call inside the _is_clean_rebase function) in try/except that
catches Exception, logs a warning including the exception info, and returns
False; ensure you re-raise asyncio.CancelledError immediately if caught.
Specifically, guard the access to pull_request.base.ref, and each call to
run_command, so any unexpected exception results in an immediate return False
(with a warning) rather than propagating, while preserving the special-case
re-raise for asyncio.CancelledError.
In `@webhook_server/tests/test_clean_rebase_detection.py`:
- Around line 371-399: The test currently only asserts different hashes for two
strings and doesn't verify that _is_clean_rebase actually uses hashlib.sha256;
update the test_hashes_diff_output_with_sha256 to patch
webhook_server.libs.handlers.pull_request_handler.hashlib.sha256 with
wraps=hashlib.sha256 (using patch(..., wraps=hashlib.sha256) as mock_sha256)
before calling await handler._is_clean_rebase(mock_pull_request), then assert
result is False and that mock_sha256.call_count == 2 to prove both diffs were
hashed by that function.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 84acb62c-be97-4142-b291-2cf238a952b8
📒 Files selected for processing (3)
webhook_server/libs/handlers/pull_request_handler.pywebhook_server/tests/test_clean_rebase_detection.pywebhook_server/tests/test_pull_request_handler.py
Binary files produce generic "Binary files differ" text in git diff, which would hash identically for different binary changes, causing false positive clean rebase detection. The --binary flag outputs full binary content (base85-encoded) ensuring different changes produce different hashes.
- Wrap _is_clean_rebase() in try/except to return False on any unexpected exception (re-raises CancelledError) - Add timeout=60 to all run_command calls to prevent hangs - Extract comment posting into _post_clean_rebase_comment() helper with its own try/except so CI/setup always runs even if comment fails - Fix test to verify hashlib.sha256 is actually called
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@webhook_server/libs/handlers/pull_request_handler.py`:
- Around line 189-212: The handler is directly reading labels from
pull_request.labels and posting comments via pull_request.create_issue_comment;
instead, replace the direct label access with a call to
self.labels_handler.pull_request_labels_names(...) to get review_labels, and
send the comment through the unified API on self.github_webhook.unified_api (do
not call pull_request.create_issue_comment). Keep the same comment_body
construction logic but obtain labels and perform the write using the repo's
unified API wrapper to preserve centralized retry/auth/logging.
- Around line 128-144: The code currently assumes run_command success implies a
valid merge-base; instead, after calling run_command for both old and new heads
(checking old_merge_base_out and new_merge_base_out), verify that
old_merge_base_out.strip() and new_merge_base_out.strip() are non-empty and
return False (treat as non-clean) if either is empty before creating
old_merge_base_q / new_merge_base_q; update the handling around the variables
old_merge_base_out, new_merge_base_out, old_merge_base, new_merge_base and the
run_command calls so an empty stdout leads to logging a warning and returning
False rather than proceeding to build the diff range.
In `@webhook_server/tests/test_clean_rebase_detection.py`:
- Line 13: Tests currently mutate MagicMock.labels at the class level by
assigning to type(mock_pull_request).labels, leaking state between tests;
replace those assignments with a scoped patch using unittest.mock.patch.object
and unittest.mock.PropertyMock (e.g., use patch.object(type(mock_pull_request),
"labels", new_callable=PropertyMock) as pm: pm.return_value = [...] or use the
context/fixture form) so the labels attribute is set only for the duration of
the test and is automatically restored; update occurrences that reference
type(mock_pull_request).labels (the places around mock_pull_request in these
tests) to use this scoped patch pattern.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1327f9bd-1cf3-449f-9900-adc214aab9d7
📒 Files selected for processing (2)
webhook_server/libs/handlers/pull_request_handler.pywebhook_server/tests/test_clean_rebase_detection.py
- Guard against empty merge-base stdout before building diff ranges - Use labels_handler.pull_request_labels_names() instead of direct pull_request.labels access in _post_clean_rebase_comment - Replace class-level type() mutations with scoped AsyncMock in tests
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@webhook_server/libs/handlers/pull_request_handler.py`:
- Around line 93-96: Replace the live PyGithub lookup used to set base_ref
(currently assigned via await asyncio.to_thread(lambda: pull_request.base.ref))
with the value from the webhook payload: read
self.hook_data["pull_request"]["base"]["ref"] directly when assigning base_ref;
keep the same variable name (base_ref: str) and preserve the surrounding code
that uses before_sha, after_sha and clone_dir
(self.github_webhook.clone_repo_dir); optionally add a simple fallback or
validation if the payload key is missing (e.g., default to an empty string) to
avoid KeyError.
In `@webhook_server/tests/test_clean_rebase_detection.py`:
- Around line 95-124: Patch the scheduled external integration call_test_oracle
to an AsyncMock in the shared test setup so process_pull_request_webhook_data()
doesn't schedule the real background task; inside the handler fixture where
PullRequestHandler is constructed, import AsyncMock and use
monkeypatch.setattr(...) (or unittest.mock.patch) to replace the module-level
call_test_oracle with AsyncMock(), ensuring any asyncio.create_task(...)
invocations use the stubbed AsyncMock instead of the real function; reference
process_pull_request_webhook_data and call_test_oracle to locate where to apply
the monkeypatch in the fixture.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8775fef3-b0e8-456c-87f2-a11146c93e3d
📒 Files selected for processing (2)
webhook_server/libs/handlers/pull_request_handler.pywebhook_server/tests/test_clean_rebase_detection.py
- Read base.ref from hook_data instead of live PyGithub API call - Stub call_test_oracle in all synchronize tests to prevent background task leaks - Rename unused pull_request param to _pull_request in _is_clean_rebase
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@webhook_server/libs/handlers/pull_request_handler.py`:
- Around line 1036-1039: The verification tip text in _prepare_tips_section is
outdated because clean rebases now preserve the verified label; update the
welcome/tips message to reflect that verified is not removed on a clean rebase
instead of saying “verified is removed on every new commit.” Locate
_prepare_tips_section and revise the string(s) that mention the verified label
to clarify the new behavior (mention that verified is preserved for clean
rebases), and ensure consistency with the logic in
_sync_verified_check_for_clean_rebase and
_process_verified_for_update_or_new_pull_request.
- Around line 284-289: The code performs two separate label reads for the same
PR during a clean rebase; fetch the PR's label snapshot once and pass it into
both helpers to avoid duplicate API calls and race windows. Call
pull_request_labels_names(pull_request) in the clean_rebase branch, store the
result (e.g., label_names) and pass label_names as an extra argument into
_post_clean_rebase_comment and into the path that calls
_sync_verified_check_for_clean_rebase (e.g., add a label_names parameter to
process_opened_or_synchronize_pull_request or to the internal
_sync_verified_check_for_clean_rebase), update those functions' signatures to
accept the label_names snapshot and use it instead of calling
pull_request_labels_names() internally.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 985bdcff-c23e-4b42-80db-10d9c2cd19ea
📒 Files selected for processing (2)
webhook_server/libs/handlers/pull_request_handler.pywebhook_server/tests/test_clean_rebase_detection.py
- Fetch labels once in synchronize handler and pass to both helpers to avoid duplicate API calls and race windows - Update verification tip in welcome message to reflect clean rebase behavior
There was a problem hiding this comment.
♻️ Duplicate comments (1)
webhook_server/libs/handlers/pull_request_handler.py (1)
285-295:⚠️ Potential issue | 🟠 MajorHIGH: Don’t let label snapshotting block clean-rebase synchronize.
Line 287 reintroduces the earlier failure mode: if
pull_request_labels_names()raises,process_opened_or_synchronize_pull_request(..., is_clean_rebase=True)never gets scheduled, so CI/setup stops on the path that is supposed to be best-effort. Reuse the already-loaded webhook payload labels here, or catch the lookup failure and fall back to the normal synchronize flow.🛠️ Possible fix
if clean_rebase: before_sha: str = self.hook_data["before"] - label_names = await self.labels_handler.pull_request_labels_names(pull_request=pull_request) + label_names = [label["name"] for label in pull_request_data["labels"]] sync_tasks = [ self._post_clean_rebase_comment( pull_request=pull_request, before_sha=before_sha, label_names=label_names ), self.process_opened_or_synchronize_pull_request( pull_request=pull_request, is_clean_rebase=True, label_names=label_names ), ]Based on learnings, "For this repository, prioritize speed and minimizing API calls in reviews and suggestions: reuse webhook payload data, batch GraphQL queries, cache IDs (labels/users), and avoid N+1 patterns."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@webhook_server/libs/handlers/pull_request_handler.py` around lines 285 - 295, The current clean-rebase path calls labels_handler.pull_request_labels_names(...) which can raise and prevent scheduling process_opened_or_synchronize_pull_request; modify the block around clean_rebase so you first try to reuse the webhook payload label snapshot (self.hook_data["pull_request"]["labels"] or similar) and only call pull_request_labels_names() as a fallback, or wrap that call in a try/except that logs the error and still appends the process_opened_or_synchronize_pull_request task with label_names=None (or the payload labels), ensuring both _post_clean_rebase_comment and process_opened_or_synchronize_pull_request are always scheduled regardless of label lookup failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@webhook_server/libs/handlers/pull_request_handler.py`:
- Around line 285-295: The current clean-rebase path calls
labels_handler.pull_request_labels_names(...) which can raise and prevent
scheduling process_opened_or_synchronize_pull_request; modify the block around
clean_rebase so you first try to reuse the webhook payload label snapshot
(self.hook_data["pull_request"]["labels"] or similar) and only call
pull_request_labels_names() as a fallback, or wrap that call in a try/except
that logs the error and still appends the
process_opened_or_synchronize_pull_request task with label_names=None (or the
payload labels), ensuring both _post_clean_rebase_comment and
process_opened_or_synchronize_pull_request are always scheduled regardless of
label lookup failures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2f9ca3e2-4574-4471-bea8-227126fafe07
📒 Files selected for processing (2)
webhook_server/libs/handlers/pull_request_handler.pywebhook_server/tests/test_clean_rebase_detection.py
Use pull_request_data["labels"] from the webhook payload instead of calling pull_request_labels_names() to avoid blocking CI if the API call fails on the clean rebase path.
|
@coderabbitai[bot] The following review comments were reviewed and a decision was made:
|
|
Tip For best results, initiate chat on the files or code changes.
🧠 Learnings usedIf you found this review helpful, would you consider giving us a shout-out on X? Thank you for using CodeRabbit! |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
New container for ghcr.io/myk-org/github-webhook-server:latest published |
Summary
synchronizeevents by comparing SHA-256 hashes ofgit diffoutput before/after the pushapproved-by-*,lgtm-by-*,verified,commented-by-*,changes-requested-by-*) on clean rebase instead of stripping themImplementation Details
_is_clean_rebase()method computes merge-base diffs for old and new heads, hashes them with SHA-256, and comparesFalse(strip labels as usual) on any git command failureshlex.quote()on all interpolated git command values for safety_process_verified_for_update_or_new_pull_request()on clean rebase to truly preserve the verified labelasyncio.gatherTest plan
test_clean_rebase_detection.pycovering:shlex.quote()usage verificationCloses #1059
Summary by CodeRabbit
New Features
Tests