Skip to content

fix: Add task correlation fields and structured log#890

Merged
myakove merged 34 commits intomainfrom
logs-server-fix
Nov 10, 2025
Merged

fix: Add task correlation fields and structured log#890
myakove merged 34 commits intomainfrom
logs-server-fix

Conversation

@myakove
Copy link
Copy Markdown
Collaborator

@myakove myakove commented Nov 7, 2025

  • Add task_id, task_type, task_status fields to LogEntry dataclass
  • Add format_task_fields() and _sanitize_log_value() helper functions
  • Update log_viewer.py to use task correlation fields in timeline
  • Update all logger.step() and logger.success() calls to use format_task_fields()
  • Update CSS, JS, and HTML templates from PR feat: Add GraphQL API support with PyGithub-compatible wrappers #878
  • Improve log parsing with enhanced regex patterns and timezone support
  • Add exception chaining and improve error handling in log_viewer.py
  • Fix mypy return type issues in github_api.py handler methods

Summary by CodeRabbit

  • New Features

    • Webhooks queued for background processing; added push flows (container/PyPI), Slack notifications, Push/Flow/PR modals, and a comment-with-details formatter.
  • Bug Fixes

    • Safer temp workspace cleanup, graceful shutdown/wait-for-tasks, improved error traces, token-metrics logging, and log sanitization/redaction.
  • UI/UX

    • Responsive, accessible DOM-driven log viewer with task grouping, modals, timelines, and extra entry fields (task id/type/status, token spend).
  • Infrastructure

    • Image/build tweaks (PYTHONUNBUFFERED) and revised CLI tooling installation in image build.
  • Tests/Tooling

    • Expanded test coverage, Ruff config, and ESLint pre-commit hook.

…ging

- Add task_id, task_type, task_status fields to LogEntry dataclass
- Add format_task_fields() and _sanitize_log_value() helper functions
- Update log_viewer.py to use task correlation fields in timeline
- Update all logger.step() and logger.success() calls to use format_task_fields()
- Update CSS, JS, and HTML templates from PR #878
- Improve log parsing with enhanced regex patterns and timezone support
- Add exception chaining and improve error handling in log_viewer.py
- Fix mypy return type issues in github_api.py handler methods
…lation

- Add systemd-machine-id-setup for systemd compatibility
- Remove hub from dnf install (package no longer available)
- Install hub manually via GitHub releases (v2.14.2)
@myakove-bot
Copy link
Copy Markdown
Collaborator

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Pre-commit Checks: pre-commit runs automatically if .pre-commit-config.yaml exists
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest python-module-install - Test Python package installation
  • /retest pre-commit - Run pre-commit hooks and checks
  • /retest conventional-title - Validate commit message format
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 1 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No WIP, hold, or conflict labels
  5. Verified: PR must be marked as verified (if verification is enabled)

📊 Review Process

Approvers and Reviewers

Approvers:

  • myakove
  • rnetser

Reviewers:

  • myakove
  • rnetser
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is automatically removed on each new commit
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Nov 7, 2025

Walkthrough

Consolidates handler modules under a new handlers package, moves webhook processing to background tasks, adds structured/task-aware logging and sanitization, extends log parsing/viewer (backend + frontend), introduces notification/container utilities, updates Dockerfile/tooling, and adapts tests and imports.

Changes

Cohort / File(s) Summary
Handlers package & core handlers
\webhook_server/libs/handlers/init.py`, `webhook_server/libs/handlers/*``
New handlers package; moved/refactored handler modules and imports; pervasive addition of structured/task-status logging (format_task_fields), many async conversions, caching/parallelism, redaction/masking propagation, and addition of PushHandler.
Individual handler modules (detailed)
\.../check_run_handler.py`, `.../issue_comment_handler.py`, `.../labels_handler.py`, `.../owners_files_handler.py`, `.../pull_request_handler.py`, `.../pull_request_review_handler.py`, `.../push_handler.py`, `.../runner_handler.py``
Per-module changes: import path updates, structured start/processing/completed/failed logs, async/await conversions (some signature changes), cached properties and parallel gather usage, tag/registry deletion scaffolding, run_podman_command signature extended, and new methods for push/pypi/tag flows.
GitHub API & dispatch
\webhook_server/libs/github_api.py``
Reworked event routing and process(); added github_api and initial_rate_limit_remaining attributes; token-metrics logging; per-delivery temp clone dirs with cleanup; normalized build/command args; removed Slack helper.
App lifecycle & webhook backgrounding
\webhook_server/app.py``
Webhooks now queue processing as background tasks and return 200 immediately; global background-task tracking and graceful shutdown; LOG_SERVER_ENABLED gating and require_log_server_enabled dependency; added /mcp endpoint and stricter validation/allowlist handling.
Log parsing & model
\webhook_server/libs/log_parser.py``
LogEntry extended with task_id, task_type, task_status, token_spend; added regex helpers, token-spend extraction, message cleaning/unescaping, and timezone-aware timestamp parsing.
Log viewer backend & streaming
\webhook_server/web/log_viewer.py``
Timeline now includes task fields and token_spend; improved exception chaining and HTML fallback; deque-based streaming buffer for reverse/yield; increased streaming/export thresholds and structured log-prefix builder.
Log viewer frontend & template
\web/static/js/log_viewer.js`, `web/static/css/log_viewer.css`, `web/templates/log_viewer.html``
New modal-driven Flow and PR UIs, accessibility/focus handling, DOM-safe rendering and deduplication; CSS/styles for flow/task UI; template extended with filters, controls, and modal containers; CONFIG constant added.
Helpers & new utilities
\webhook_server/utils/helpers.py`, `webhook_server/utils/comment_utils.py`, `webhook_server/utils/notification_utils.py`, `webhook_server/utils/container_utils.py`, `webhook_server/utils/constants.py``
Added sanitization helpers (_sanitize_log_value, format_task_fields, _redact_secrets, _truncate_output, strip_ansi_codes); run_command signature extended (redact_secrets, stdin_input, timeout, mask_sensitive); added comment_with_details, send_slack_message, get_container_repository_and_tag, and ROOT_APPROVERS_KEY.
Config & exceptions
\webhook_server/libs/config.py`, `webhook_server/libs/exceptions.py``
Config.get_value supports dot-notation/nested lookups and improved error handling; added _get_nested_value; removed ProcessGithubWebhookError; added docstrings to exceptions.
Dockerfile & tooling
\Dockerfile`, `pyproject.toml`, `.pre-commit-config.yaml`, `eslint.config.js`, `.gitignore`, `entrypoint.py`, `README.md``
Dockerfile: installs hub from tarball, sets PYTHONUNBUFFERED=1, and tweaks installs; pyproject reorganized dev/test groups and added Ruff config; ESLint pre-commit hook added; .gitignore and README updated.
Tests (many files)
\webhook_server/tests/**``
Tests updated for handler import path moves, async/background semantics, new sanitization/token/task tests, notification/container tests, consolidated imports, and adjusted expectations (HTML fallback, logger.step/completion, token_spend extraction).
Misc scripts & docs
\scripts/generate_changelog.py``, small files
Non-functional doc/comment tweaks; linting/config updates and small formatting changes.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

  • Areas needing extra attention:
    • webhook_server/app.py — background task lifecycle, immediate-200 semantics, /mcp endpoint, allowlist gating, and shutdown cancellation.
    • webhook_server/libs/github_api.py — event dispatch, temp-dir lifecycle, token-metrics accounting, and PR/commit derivation logic.
    • webhook_server/libs/log_parser.py & webhook_server/web/log_viewer.py — regex extraction, message-cleaning side-effects, token_spend propagation, timezone handling, and streaming/reversal memory behavior.
    • webhook_server/utils/helpers.pyrun_command signature change and redaction/masking semantics; verify all call sites.
    • webhook_server/libs/handlers/* — async conversions, structured logging side-effects, caching/parallel gather usage, and new tag/registry deletion flows.
    • Frontend (web/static/js, templates, css) — ensure backend data shapes match frontend expectations and accessibility/focus logic.
    • Tests — many patches and AsyncMock adjustments; validate critical end-to-end behavior.

Possibly related PRs

Suggested reviewers

  • rnetser
  • dbasunag

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 73.19% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: Add task correlation fields and structured log' clearly describes the main changes: adding task correlation fields and implementing structured logging throughout the codebase.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch logs-server-fix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@myakove myakove changed the title Apply PR #878 changes: Add task correlation fields and structured log fix: Add task correlation fields and structured log Nov 7, 2025
- Replace deprecated tool.uv.dev-dependencies with dependency-groups.dev
- Add pytest-xdist and pytest-asyncio to dependency-groups.tests
- Update test_log_parser.py tests to match new workflow step detection logic:
  - is_workflow_step now checks for task_id and task_status instead of STEP level
  - test_log_entry_to_dict includes new task correlation fields
  - extract_workflow_steps tests updated to use task fields
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
webhook_server/web/static/css/log_viewer.css (1)

447-470: Style the modal “Close” buttons

The JS error modals render their dismissal button with class btn-secondary, but this stylesheet never defines that class, so the button falls back to the browser default styling. Please add a .btn-secondary (and hover) rule so those controls match the rest of the UI.

Apply this diff:

 .btn {
   padding: 10px 20px;
   background-color: var(--button-bg);
   color: white;
   border: none;
   border-radius: 4px;
   cursor: pointer;
   margin-right: 10px;
   transition: background-color 0.3s ease;
 }
 
 .btn:hover {
   background-color: var(--button-hover);
 }
+
+.btn-secondary {
+  padding: 8px 16px;
+  background-color: transparent;
+  color: var(--button-bg);
+  border: 1px solid var(--button-bg);
+  border-radius: 4px;
+  cursor: pointer;
+  transition:
+    background-color 0.3s ease,
+    color 0.3s ease;
+}
+
+.btn-secondary:hover {
+  background-color: var(--button-bg);
+  color: white;
+}
🧹 Nitpick comments (1)
Dockerfile (1)

65-71: Clean up hub install artifacts correctly

Untarring into the default CWD drops hub-linux-amd64-2.14.2/ under /, while rm -rf ${BIN_DIR}/hub-linux-amd64-2.14.2* never touches it or the downloaded tarball. The image keeps these stray files and future builds may run into stale leftovers. Extract to a temp directory (or ${BIN_DIR}) and delete both the temp tree and the archive once the binary is moved.

-  && tar xvf ${BIN_DIR}/hub-linux-amd64.tgz \
-  && mv hub-linux-amd64-2.14.2/bin/hub ${BIN_DIR}/hub \
-  && chmod +x ${BIN_DIR}/hub \
-  && rm -rf ${BIN_DIR}/hub-linux-amd64-2.14.2*
+  && tmp_dir="$(mktemp -d)" \
+  && tar xvf ${BIN_DIR}/hub-linux-amd64.tgz -C "${tmp_dir}" \
+  && mv "${tmp_dir}"/hub-linux-amd64-2.14.2/bin/hub ${BIN_DIR}/hub \
+  && chmod +x ${BIN_DIR}/hub \
+  && rm -rf "${tmp_dir}" ${BIN_DIR}/hub-linux-amd64.tgz
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d83bbb7 and f294792.

📒 Files selected for processing (8)
  • Dockerfile (2 hunks)
  • webhook_server/libs/github_api.py (2 hunks)
  • webhook_server/libs/log_parser.py (7 hunks)
  • webhook_server/utils/helpers.py (2 hunks)
  • webhook_server/web/log_viewer.py (9 hunks)
  • webhook_server/web/static/css/log_viewer.css (13 hunks)
  • webhook_server/web/static/js/log_viewer.js (19 hunks)
  • webhook_server/web/templates/log_viewer.html (1 hunks)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 840
File: webhook_server/docs/pr-flow-api.md:17-26
Timestamp: 2025-08-05T21:15:07.060Z
Learning: The PR Flow API endpoint (/logs/api/pr-flow/{hook_id}) in webhook_server/docs/pr-flow-api.md has no built-in authentication and is designed to be deployed in trusted networks or behind a reverse proxy for security, rather than using API-level authentication mechanisms.
📚 Learning: 2024-10-29T08:09:57.157Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 612
File: webhook_server_container/libs/github_api.py:2089-2100
Timestamp: 2024-10-29T08:09:57.157Z
Learning: In `webhook_server_container/libs/github_api.py`, when the function `_keep_approved_by_approvers_after_rebase` is called, existing approval labels have already been cleared after pushing new changes, so there's no need to check for existing approvals within this function.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2024-10-08T09:19:56.185Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 586
File: webhook_server_container/libs/github_api.py:1947-1956
Timestamp: 2024-10-08T09:19:56.185Z
Learning: In `webhook_server_container/libs/github_api.py`, the indentation style used in the `set_pull_request_automerge` method is acceptable as per the project's coding standards.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2025-10-30T00:18:06.176Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 878
File: webhook_server/libs/github_api.py:111-118
Timestamp: 2025-10-30T00:18:06.176Z
Learning: In webhook_server/libs/github_api.py, when creating temporary directories or performing operations that need repository names, prefer using self.repository_name (from webhook payload, always available) over dereferencing self.repository.name or self.repository_by_github_app.name, which may be None. This avoids AttributeError and keeps the code simple and reliable.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2024-10-14T14:13:21.316Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 588
File: webhook_server_container/libs/github_api.py:1632-1637
Timestamp: 2024-10-14T14:13:21.316Z
Learning: In the `ProcessGithubWehook` class in `webhook_server_container/libs/github_api.py`, avoid using environment variables to pass tokens because multiple commands with multiple tokens can run at the same time.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2024-10-14T14:12:28.924Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 588
File: webhook_server_container/libs/github_api.py:1638-1639
Timestamp: 2024-10-14T14:12:28.924Z
Learning: In `webhook_server_container/libs/github_api.py`, it is acceptable for `self.repository.owner.email` to be `None` since internal commands can handle this case.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2024-10-29T10:42:50.163Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 612
File: webhook_server_container/libs/github_api.py:925-926
Timestamp: 2024-10-29T10:42:50.163Z
Learning: In `webhook_server_container/libs/github_api.py`, the method `self._keep_approved_by_approvers_after_rebase()` must be called after removing labels when synchronizing a pull request. Therefore, it should be placed outside the `ThreadPoolExecutor` to ensure it runs sequentially after label removal.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2025-05-10T22:07:53.544Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 773
File: webhook_server/libs/pull_request_handler.py:553-557
Timestamp: 2025-05-10T22:07:53.544Z
Learning: In the GitHub webhook server codebase, `root_approvers` and `root_reviewers` are properties of the GithubWebhook class, not methods, so they can be used with `.copy()` without parentheses.

Applied to files:

  • webhook_server/libs/github_api.py
📚 Learning: 2025-08-05T21:15:07.060Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 840
File: webhook_server/docs/pr-flow-api.md:17-26
Timestamp: 2025-08-05T21:15:07.060Z
Learning: The PR Flow API endpoint (/logs/api/pr-flow/{hook_id}) in webhook_server/docs/pr-flow-api.md has no built-in authentication and is designed to be deployed in trusted networks or behind a reverse proxy for security, rather than using API-level authentication mechanisms.

Applied to files:

  • webhook_server/web/static/js/log_viewer.js
🧬 Code graph analysis (1)
webhook_server/libs/github_api.py (2)
webhook_server/utils/helpers.py (1)
  • format_task_fields (110-129)
webhook_server/libs/push_handler.py (1)
  • process_push_webhook_data (26-44)
🪛 Ruff (0.14.3)
webhook_server/web/log_viewer.py

85-85: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


88-88: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


217-217: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


220-220: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


362-362: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


488-488: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


534-534: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


713-713: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

Comment thread webhook_server/libs/github_api.py
Comment thread webhook_server/libs/log_parser.py Outdated
Comment thread webhook_server/web/static/js/log_viewer.js Outdated
…phQL)

- Refactor process_webhook to use asyncio.create_task for background processing
- Change return type to JSONResponse for immediate 200 OK response
- Add LOG_SERVER_ENABLED constant and require_log_server_enabled dependency
- Update lifespan to fail-close on IP verification errors (RuntimeError)
- Add background task cleanup in lifespan shutdown
- Add Query validators to log viewer endpoints
- Update log_viewer.py export limit from 50000 to 100000
- Fix tests to match new background error handling behavior
- Fix tests to match new fail-close IP verification behavior
- Fix F821 undefined name errors in test files (add missing imports)
- Fix logging.success not emitted for CAN_BE_MERGED_STR check_run events
- Fix original_index mismatch in log_viewer.js after filtering
- Fix regex patterns for task fields to handle escaped brackets
- Apply changes from feature/graphql-migration branch to app.py
- Update log_viewer.py export limit to match Query validator
- Fix all tests to match new background error handling behavior
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
pyproject.toml (1)

93-94: Resolve naming collision and clarify psutil scope in dependency definitions.

The review comment is substantively correct. Your codebase violates PEP 735 best practices by defining both [project.optional-dependencies] tests (line 94) and [dependency-groups] tests (line 108) with the same name. Avoid name collisions: don't name a dependency-group the same as an extra, as tools may error or behave inconsistently.

Additionally, psutil appears in both the main dependencies (line 78) and the tests dependency-group (line 108), creating ambiguity about its scope.

Recommended actions:

  1. Remove [project.optional-dependencies] tests — Since PEP 508 extras are for optional dependencies that are part of a published distribution, while PEP 735 dependency-groups are for local/dev/environment/tooling groups that are NOT published as package metadata, your test group belongs only in [dependency-groups].

  2. Clarify psutil's scope — If psutil is test-only, remove it from line 78; if it's a production dependency, remove it from line 108.

webhook_server/web/log_viewer.py (1)

637-666: Newest log entries are no longer returned

The new chunked streaming walks each file from the top, emits the first chunk immediately, and get_log_entries() stops once it hits limit. That means we now return the oldest block of the current log and never reach the fresh lines at the tail (same for exports/timelines), breaking the viewer’s core use case. Buffer the tail before yielding so we still serve the most recent entries while keeping the memory cap.

Apply this diff to stream from each file’s tail while respecting max_entries:

-                file_entries: list[LogEntry] = []
-
-                # Parse file in one go (files are typically reasonable size individually)
-                with open(log_file, encoding="utf-8") as f:
-                    for _, line in enumerate(f, 1):
-                        if total_yielded >= max_entries:
-                            break
-
-                        entry = self.log_parser.parse_log_entry(line)
-                        if entry:
-                            file_entries.append(entry)
-
-                        # Process in chunks to avoid memory buildup for large files
-                        if len(file_entries) >= chunk_size:
-                            # Sort chunk by timestamp (newest first) and yield
-                            file_entries.sort(key=lambda x: x.timestamp, reverse=True)
-                            for entry in file_entries:
-                                yield entry
-                                total_yielded += 1
-                                if total_yielded >= max_entries:
-                                    break
-                            file_entries.clear()  # Free memory
-
-                # Yield remaining entries from this file
-                if file_entries and total_yielded < max_entries:
-                    file_entries.sort(key=lambda x: x.timestamp, reverse=True)
-                    for entry in file_entries:
-                        if total_yielded >= max_entries:
-                            break
-                        yield entry
-                        total_yielded += 1
+                remaining_capacity = max_entries - total_yielded
+                if remaining_capacity <= 0:
+                    break
+
+                buffer: deque[LogEntry] = deque(maxlen=remaining_capacity)
+
+                with open(log_file, encoding="utf-8") as f:
+                    for line in f:
+                        entry = self.log_parser.parse_log_entry(line)
+                        if entry:
+                            buffer.append(entry)
+
+                for entry in reversed(buffer):
+                    if total_yielded >= max_entries:
+                        break
+                    yield entry
+                    total_yielded += 1
webhook_server/utils/helpers.py (1)

280-296: Use total_seconds() to report accurate reset delay

.seconds drops whole days and wraps negative deltas, so if the reset is more than 24 h out or already in the past you’ll log a wildly wrong wait (e.g., “23:59:xx” instead of “0:00:xx”). That misleads operators now that PyGithub returns timezone-aware timestamps. Swap to delta.total_seconds() and clamp at zero before formatting.(pygithub.readthedocs.io)

-    time_for_limit_reset: int = (rate_limit.rate.reset - datetime.datetime.now(tz=datetime.UTC)).seconds
+    delta = rate_limit.rate.reset - datetime.datetime.now(tz=datetime.UTC)
+    time_for_limit_reset = max(int(delta.total_seconds()), 0)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f294792 and 69bfa02.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (38)
  • pyproject.toml (3 hunks)
  • scripts/generate_changelog.py (1 hunks)
  • webhook_server/app.py (14 hunks)
  • webhook_server/libs/check_run_handler.py (4 hunks)
  • webhook_server/libs/config.py (1 hunks)
  • webhook_server/libs/github_api.py (4 hunks)
  • webhook_server/libs/issue_comment_handler.py (6 hunks)
  • webhook_server/libs/labels_handler.py (6 hunks)
  • webhook_server/libs/log_parser.py (9 hunks)
  • webhook_server/libs/owners_files_handler.py (3 hunks)
  • webhook_server/libs/pull_request_handler.py (18 hunks)
  • webhook_server/libs/push_handler.py (5 hunks)
  • webhook_server/libs/runner_handler.py (20 hunks)
  • webhook_server/tests/conftest.py (2 hunks)
  • webhook_server/tests/test_app.py (12 hunks)
  • webhook_server/tests/test_branch_protection.py (1 hunks)
  • webhook_server/tests/test_config.py (1 hunks)
  • webhook_server/tests/test_config_schema.py (21 hunks)
  • webhook_server/tests/test_edge_cases_validation.py (13 hunks)
  • webhook_server/tests/test_github_api.py (2 hunks)
  • webhook_server/tests/test_helpers.py (2 hunks)
  • webhook_server/tests/test_labels_handler.py (1 hunks)
  • webhook_server/tests/test_log_api.py (1 hunks)
  • webhook_server/tests/test_log_parser.py (9 hunks)
  • webhook_server/tests/test_memory_optimization.py (3 hunks)
  • webhook_server/tests/test_owners_files_handler.py (1 hunks)
  • webhook_server/tests/test_performance_benchmarks.py (4 hunks)
  • webhook_server/tests/test_prepare_retest_wellcome_comment.py (1 hunks)
  • webhook_server/tests/test_pull_request_handler.py (2 hunks)
  • webhook_server/tests/test_pull_request_review_handler.py (1 hunks)
  • webhook_server/tests/test_runner_handler.py (1 hunks)
  • webhook_server/tests/test_schema_validator.py (3 hunks)
  • webhook_server/utils/app_utils.py (2 hunks)
  • webhook_server/utils/github_repository_settings.py (5 hunks)
  • webhook_server/utils/helpers.py (4 hunks)
  • webhook_server/utils/webhook.py (1 hunks)
  • webhook_server/web/log_viewer.py (11 hunks)
  • webhook_server/web/static/js/log_viewer.js (19 hunks)
✅ Files skipped from review due to trivial changes (3)
  • webhook_server/tests/test_branch_protection.py
  • webhook_server/tests/test_pull_request_review_handler.py
  • scripts/generate_changelog.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • webhook_server/libs/github_api.py
🧰 Additional context used
🧠 Learnings (18)
📚 Learning: 2024-11-26T14:30:22.906Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 631
File: webhook_server_container/tests/test_github_api.py:321-327
Timestamp: 2024-11-26T14:30:22.906Z
Learning: In `webhook_server_container/tests/test_github_api.py`, when using `pytest.mark.parametrize` with indirect parameters, the `pytest.param` function may require nested lists to match the expected input structure.

Applied to files:

  • webhook_server/tests/test_log_api.py
📚 Learning: 2024-10-29T10:42:50.163Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 612
File: webhook_server_container/libs/github_api.py:925-926
Timestamp: 2024-10-29T10:42:50.163Z
Learning: In `webhook_server_container/libs/github_api.py`, the method `self._keep_approved_by_approvers_after_rebase()` must be called after removing labels when synchronizing a pull request. Therefore, it should be placed outside the `ThreadPoolExecutor` to ensure it runs sequentially after label removal.

Applied to files:

  • webhook_server/tests/test_labels_handler.py
  • webhook_server/tests/test_pull_request_handler.py
📚 Learning: 2024-10-29T08:09:57.157Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 612
File: webhook_server_container/libs/github_api.py:2089-2100
Timestamp: 2024-10-29T08:09:57.157Z
Learning: In `webhook_server_container/libs/github_api.py`, when the function `_keep_approved_by_approvers_after_rebase` is called, existing approval labels have already been cleared after pushing new changes, so there's no need to check for existing approvals within this function.

Applied to files:

  • webhook_server/tests/test_pull_request_handler.py
  • webhook_server/libs/pull_request_handler.py
  • webhook_server/tests/test_github_api.py
  • webhook_server/tests/test_app.py
📚 Learning: 2024-10-08T09:19:56.185Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 586
File: webhook_server_container/libs/github_api.py:1947-1956
Timestamp: 2024-10-08T09:19:56.185Z
Learning: In `webhook_server_container/libs/github_api.py`, the indentation style used in the `set_pull_request_automerge` method is acceptable as per the project's coding standards.

Applied to files:

  • webhook_server/tests/test_pull_request_handler.py
  • webhook_server/libs/pull_request_handler.py
  • webhook_server/tests/test_github_api.py
  • webhook_server/utils/github_repository_settings.py
📚 Learning: 2025-10-30T00:18:06.176Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 878
File: webhook_server/libs/github_api.py:111-118
Timestamp: 2025-10-30T00:18:06.176Z
Learning: In webhook_server/libs/github_api.py, when creating temporary directories or performing operations that need repository names, prefer using self.repository_name (from webhook payload, always available) over dereferencing self.repository.name or self.repository_by_github_app.name, which may be None. This avoids AttributeError and keeps the code simple and reliable.

Applied to files:

  • webhook_server/libs/config.py
  • webhook_server/libs/owners_files_handler.py
  • webhook_server/tests/test_app.py
  • webhook_server/utils/github_repository_settings.py
📚 Learning: 2024-10-09T09:16:45.452Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 579
File: webhook_server_container/libs/github_api.py:1098-1101
Timestamp: 2024-10-09T09:16:45.452Z
Learning: In the Python method `_run_tox` within `webhook_server_container/libs/github_api.py`, the variable `_tox_tests` is already comma-separated, so removing spaces with `_tox_tests.replace(" ", "")` is appropriate to handle any accidental spaces when specifying Tox environments.

Applied to files:

  • webhook_server/tests/test_prepare_retest_wellcome_comment.py
  • webhook_server/libs/runner_handler.py
📚 Learning: 2025-10-28T13:04:00.466Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 878
File: webhook_server/libs/handlers/runner_handler.py:491-571
Timestamp: 2025-10-28T13:04:00.466Z
Learning: In webhook_server/libs/handlers/runner_handler.py, the run_build_container method is designed with the pattern that push=True is always called with set_check=False in production code, so no check-run status needs to be finalized after push operations.

Applied to files:

  • webhook_server/libs/runner_handler.py
  • webhook_server/libs/push_handler.py
📚 Learning: 2024-10-15T13:18:38.590Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 602
File: webhook_server_container/libs/github_api.py:2053-2062
Timestamp: 2024-10-15T13:18:38.590Z
Learning: In the `run_podman_command` method in `webhook_server_container/libs/github_api.py`, the `rc` variable is a boolean that returns `True` when the command succeeds and `False` when it fails.

Applied to files:

  • webhook_server/libs/runner_handler.py
📚 Learning: 2025-05-10T21:39:34.243Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 773
File: webhook_server/libs/pull_request_handler.py:281-308
Timestamp: 2025-05-10T21:39:34.243Z
Learning: In the refactored codebase, the `run_podman_command` method in `RunnerHandler` returns a tuple `(bool, str, str)` where the boolean `True` represents success and `False` represents failure, which is the opposite of traditional shell exit codes.

Applied to files:

  • webhook_server/libs/runner_handler.py
📚 Learning: 2025-02-25T12:01:42.999Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 666
File: webhook_server_container/libs/github_api.py:2254-2255
Timestamp: 2025-02-25T12:01:42.999Z
Learning: In the ProcessGithubWehook class, the `self.conventional_title` attribute is set during initialization by the `_repo_data_from_config()` method, and doesn't need additional existence checks.

Applied to files:

  • webhook_server/libs/runner_handler.py
📚 Learning: 2024-10-14T14:12:28.924Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 588
File: webhook_server_container/libs/github_api.py:1638-1639
Timestamp: 2024-10-14T14:12:28.924Z
Learning: In `webhook_server_container/libs/github_api.py`, it is acceptable for `self.repository.owner.email` to be `None` since internal commands can handle this case.

Applied to files:

  • webhook_server/libs/owners_files_handler.py
  • webhook_server/tests/test_app.py
📚 Learning: 2024-11-21T13:34:45.218Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 624
File: webhook_server_container/libs/github_api.py:596-604
Timestamp: 2024-11-21T13:34:45.218Z
Learning: In the codebase, aggregation of reviewers and approvers from all OWNERS files is handled within the `assign_reviewers` method.

Applied to files:

  • webhook_server/libs/owners_files_handler.py
  • webhook_server/libs/pull_request_handler.py
📚 Learning: 2024-10-15T10:37:45.791Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 598
File: webhook_server_container/libs/github_api.py:1860-1874
Timestamp: 2024-10-15T10:37:45.791Z
Learning: In the `process_retest_command` method in `webhook_server_container/libs/github_api.py`, `_target_tests` is defined before use.

Applied to files:

  • webhook_server/tests/test_github_api.py
📚 Learning: 2025-05-10T22:07:53.544Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 773
File: webhook_server/libs/pull_request_handler.py:553-557
Timestamp: 2025-05-10T22:07:53.544Z
Learning: In the GitHub webhook server codebase, `root_approvers` and `root_reviewers` are properties of the GithubWebhook class, not methods, so they can be used with `.copy()` without parentheses.

Applied to files:

  • webhook_server/tests/test_github_api.py
  • webhook_server/tests/test_owners_files_handler.py
📚 Learning: 2025-05-10T22:07:53.544Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 773
File: webhook_server/libs/pull_request_handler.py:553-557
Timestamp: 2025-05-10T22:07:53.544Z
Learning: In the webhook server codebase, `root_approvers` and `root_reviewers` are defined as properties in the `GithubWebhook` class, not methods, so they can be accessed without parentheses and used with methods like `.copy()` directly.

Applied to files:

  • webhook_server/tests/test_owners_files_handler.py
📚 Learning: 2024-10-14T14:13:21.316Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 588
File: webhook_server_container/libs/github_api.py:1632-1637
Timestamp: 2024-10-14T14:13:21.316Z
Learning: In the `ProcessGithubWehook` class in `webhook_server_container/libs/github_api.py`, avoid using environment variables to pass tokens because multiple commands with multiple tokens can run at the same time.

Applied to files:

  • webhook_server/libs/push_handler.py
  • webhook_server/app.py
📚 Learning: 2025-05-13T12:06:27.297Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 778
File: webhook_server/libs/pull_request_handler.py:327-330
Timestamp: 2025-05-13T12:06:27.297Z
Learning: In the GitHub webhook server, synchronous GitHub API calls (like create_issue_comment, add_to_assignees, etc.) in async methods should be awaited using asyncio.to_thread or loop.run_in_executor to prevent blocking the event loop.

Applied to files:

  • webhook_server/app.py
  • webhook_server/libs/issue_comment_handler.py
📚 Learning: 2025-08-05T21:15:07.060Z
Learnt from: myakove
Repo: myk-org/github-webhook-server PR: 840
File: webhook_server/docs/pr-flow-api.md:17-26
Timestamp: 2025-08-05T21:15:07.060Z
Learning: The PR Flow API endpoint (/logs/api/pr-flow/{hook_id}) in webhook_server/docs/pr-flow-api.md has no built-in authentication and is designed to be deployed in trusted networks or behind a reverse proxy for security, rather than using API-level authentication mechanisms.

Applied to files:

  • webhook_server/app.py
  • webhook_server/web/static/js/log_viewer.js
🧬 Code graph analysis (17)
webhook_server/tests/test_log_api.py (2)
webhook_server/libs/log_parser.py (1)
  • LogEntry (15-46)
webhook_server/web/log_viewer.py (1)
  • LogViewerController (19-906)
webhook_server/libs/config.py (1)
webhook_server/utils/helpers.py (1)
  • get_github_repo_api (146-150)
webhook_server/libs/runner_handler.py (1)
webhook_server/utils/helpers.py (2)
  • format_task_fields (110-129)
  • run_command (153-207)
webhook_server/libs/owners_files_handler.py (1)
webhook_server/utils/helpers.py (1)
  • format_task_fields (110-129)
webhook_server/libs/labels_handler.py (1)
webhook_server/utils/helpers.py (1)
  • format_task_fields (110-129)
webhook_server/libs/check_run_handler.py (1)
webhook_server/utils/helpers.py (1)
  • format_task_fields (110-129)
webhook_server/libs/pull_request_handler.py (2)
webhook_server/utils/helpers.py (1)
  • format_task_fields (110-129)
webhook_server/libs/owners_files_handler.py (1)
  • OwnersFileHandler (22-367)
webhook_server/tests/test_github_api.py (1)
webhook_server/libs/github_api.py (1)
  • add_api_users_to_auto_verified_and_merged_users (283-292)
webhook_server/libs/push_handler.py (2)
webhook_server/utils/helpers.py (2)
  • format_task_fields (110-129)
  • run_command (153-207)
webhook_server/libs/runner_handler.py (1)
  • run_build_container (312-461)
webhook_server/app.py (4)
webhook_server/utils/helpers.py (2)
  • get_logger_with_params (25-89)
  • prepare_log_prefix (373-410)
webhook_server/libs/github_api.py (3)
  • prepare_log_prefix (294-302)
  • GithubWebhook (47-455)
  • process (118-280)
webhook_server/libs/exceptions.py (1)
  • RepositoryNotFoundInConfigError (1-2)
webhook_server/web/log_viewer.py (1)
  • export_logs (268-363)
webhook_server/tests/test_app.py (3)
webhook_server/app.py (1)
  • lifespan (70-178)
webhook_server/tests/test_webhook.py (1)
  • mock_config (319-329)
webhook_server/tests/test_github_repository_and_webhook_settings.py (1)
  • mock_config (66-70)
webhook_server/tests/test_edge_cases_validation.py (2)
webhook_server/tests/test_log_api.py (1)
  • controller (31-38)
webhook_server/tests/test_frontend_performance.py (1)
  • controller (18-21)
webhook_server/tests/test_memory_optimization.py (3)
webhook_server/libs/log_parser.py (1)
  • LogEntry (15-46)
webhook_server/web/log_viewer.py (2)
  • LogViewerController (19-906)
  • _stream_log_entries (586-671)
webhook_server/tests/test_log_api.py (1)
  • controller (31-38)
webhook_server/tests/conftest.py (1)
webhook_server/libs/log_parser.py (1)
  • LogEntry (15-46)
webhook_server/tests/test_helpers.py (3)
webhook_server/libs/config.py (1)
  • Config (11-88)
webhook_server/libs/exceptions.py (1)
  • NoApiTokenError (11-12)
webhook_server/utils/helpers.py (8)
  • extract_key_from_dict (132-143)
  • get_api_with_highest_rate_limit (220-273)
  • get_apis_and_tokes_from_config (210-217)
  • get_future_results (302-316)
  • get_github_repo_api (146-150)
  • get_logger_with_params (25-89)
  • log_rate_limit (276-299)
  • run_command (153-207)
webhook_server/libs/issue_comment_handler.py (1)
webhook_server/utils/helpers.py (1)
  • format_task_fields (110-129)
webhook_server/tests/test_log_parser.py (1)
webhook_server/libs/log_parser.py (4)
  • LogParser (49-341)
  • LogEntry (15-46)
  • is_workflow_step (94-107)
  • extract_workflow_steps (109-120)
🪛 Ruff (0.14.3)
webhook_server/libs/config.py

48-48: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

webhook_server/web/log_viewer.py

85-85: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


88-88: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


217-217: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


220-220: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


362-362: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


488-488: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


534-534: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


713-713: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

webhook_server/app.py

141-145: Abstract raise to an inner function

(TRY301)


141-145: Avoid specifying long messages outside the exception class

(TRY003)


262-262: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


332-332: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🔇 Additional comments (25)
webhook_server/tests/test_config.py (1)

2-5: LGTM! Clean import consolidation.

Moving the shutil import to module level eliminates redundant local imports and follows Python best practices for import organization.

webhook_server/tests/test_config_schema.py (2)

2-2: LGTM! Import consolidation follows best practices.

Moving shutil to module level eliminates redundant local imports in finally blocks throughout the test suite.


147-147: LGTM! Simplified file I/O calls.

Using the default read mode (removing explicit "r") is cleaner and equally clear, as open() defaults to read mode in Python.

Also applies to: 183-183, 200-200, 223-223, 244-244, 265-265, 287-287, 313-313, 333-333, 353-353, 370-370, 486-486, 509-509, 535-535, 556-556, 573-573, 591-591, 608-608, 624-624, 639-639

pyproject.toml (2)

17-21: Ruff linting configuration is well-structured.

The addition of standard linting rules (E, F, W, I, B, UP, PLC0415, ARG) with sensible per-file ignores for test modules follows best practices.


46-46: Typo fix is correct.

The change from "reposotories" to "repositories" fixes a spelling error in the project description.

webhook_server/utils/webhook.py (1)

1-3: LGTM: Standard typing migration.

The migration of Callable from typing to collections.abc aligns with modern Python best practices (PEP 585). No functional changes, just improved import organization.

webhook_server/tests/test_schema_validator.py (1)

11-11: LGTM: Modern type hints and simplified file opening.

The changes adopt Python 3.10+ union syntax (str | Path instead of Union[str, Path]) and simplify the file opening by relying on the default read mode. These are clean modernizations with no functional impact.

Also applies to: 242-242, 253-253

webhook_server/utils/github_repository_settings.py (1)

4-7: LGTM: Type hint modernization and improved log readability.

The changes include:

  • Migration of Callable from typing to collections.abc (consistent with modern Python)
  • Removal of string literal forward references for Future type hints (no longer needed with PEP 563)
  • Split long log messages for better readability

All changes are non-functional and improve code quality.

Also applies to: 74-75, 260-260, 323-323, 356-358

webhook_server/tests/test_performance_benchmarks.py (1)

5-5: LGTM: Modern exception handling and code quality improvements.

Key changes:

  • Line 452: Updated from asyncio.TimeoutError to TimeoutError (the built-in). This aligns with Python 3.11+ where asyncio.TimeoutError is deprecated in favor of the built-in TimeoutError.
  • Lines 277, 458: Underscore-prefixed loop variables (_i, _entry) properly indicate intentionally unused variables.
  • Lines 208-211: Multi-line string concatenation improves readability.
  • Line 5: Top-level gc import eliminates redundant imports.

All changes improve code quality and Python version compatibility.

Also applies to: 208-211, 277-277, 452-452, 458-458

webhook_server/tests/test_runner_handler.py (1)

1-1: LGTM: Consistent typing migration.

The migration of Generator from typing to collections.abc is consistent with the typing modernization across the PR.

webhook_server/tests/test_pull_request_handler.py (1)

2-3: LGTM: Improved import organization.

Adding CHERRY_PICKED_LABEL_PREFIX to the top-level imports eliminates redundant in-function imports, improving code organization and consistency.

Also applies to: 12-12

webhook_server/tests/test_labels_handler.py (1)

1-1: LGTM: Centralized asyncio import.

Moving the asyncio import to the module level eliminates redundant in-function imports and follows standard Python import conventions.

webhook_server/tests/test_owners_files_handler.py (1)

5-5: LGTM! Import organization improved.

Moving GithubException to module-level imports eliminates redundant in-test imports and improves code organization.

webhook_server/tests/test_prepare_retest_wellcome_comment.py (1)

25-28: LGTM! Improved string readability.

The multi-line string formatting with explicit concatenation improves readability while preserving the exact content of expected test strings.

Also applies to: 36-37, 45-48, 56-63, 71-72

webhook_server/tests/test_log_api.py (1)

13-13: LGTM! Import consolidation follows best practices.

Moving imports to module level eliminates redundant in-test imports and improves test file organization.

Also applies to: 17-17, 19-19

webhook_server/utils/app_utils.py (1)

84-85: LGTM! Exception chaining preserves error context.

Adding from e to exception raises maintains the full traceback chain, which is essential for debugging and aligns with Python best practices (PEP 3134).

Also applies to: 119-119

webhook_server/tests/test_github_api.py (2)

8-8: LGTM! Import organization improved.

Moving GithubException to module-level imports follows best practices and eliminates redundant in-test imports.


649-649: LGTM! Property evaluation correctly forced.

Assigning the property to _ correctly forces evaluation of add_api_users_to_auto_verified_and_merged_users, which has side effects (populating the auto-verified users list). This aligns with the property's behavior in webhook_server/libs/github_api.py.

webhook_server/libs/owners_files_handler.py (2)

2-2: LGTM! Import modernization.

Importing Coroutine from collections.abc instead of typing is the preferred approach in modern Python (3.9+) and improves consistency with standard library conventions.


16-16: LGTM! Structured logging enhances observability.

The addition of format_task_fields provides task correlation fields (task_id, task_type, task_status) that enable better tracking and debugging of reviewer assignment workflows. The consistent use of # type: ignore[attr-defined] is appropriate since step() is dynamically added to the logger.

Also applies to: 246-290

webhook_server/tests/test_memory_optimization.py (3)

3-3: LGTM! Import organization improved.

Consolidating imports at module level follows Python best practices and improves test file maintainability.

Also applies to: 5-6, 9-9, 11-11, 14-14


99-99: LGTM! Unused variable naming convention.

Prefixing the loop variable with _ correctly indicates it's intentionally unused, following Python conventions.


288-290: LGTM! Improved string formatting.

Splitting the long string across multiple lines enhances readability while preserving the exact content.

webhook_server/libs/labels_handler.py (2)

26-26: LGTM! Structured logging enhances observability.

The addition of format_task_fields provides task correlation fields that enable better tracking of label management workflows. The consistent use of # type: ignore[attr-defined] is appropriate since step() is dynamically added to the logger.

Also applies to: 50-53, 69-72, 221-224, 245-248


126-126: LGTM! Unused variable naming convention.

Renaming to _threshold correctly indicates the loop variable is intentionally unused, following Python conventions and avoiding potential shadowing issues.

Comment thread webhook_server/libs/handlers/check_run_handler.py
Comment thread webhook_server/libs/config.py Outdated
Comment thread webhook_server/libs/pull_request_handler.py Outdated
Comment thread webhook_server/libs/handlers/runner_handler.py Outdated
Comment thread webhook_server/web/log_viewer.py
Comment thread webhook_server/web/static/js/log_viewer.js
@myakove
Copy link
Copy Markdown
Collaborator Author

myakove commented Nov 7, 2025

/verified
/lgtm
/approve
/wip
/hold

@myakove-bot myakove-bot changed the title fix: Add task correlation fields and structured log WIP: fix: Add task correlation fields and structured log Nov 7, 2025
@myakove
Copy link
Copy Markdown
Collaborator Author

myakove commented Nov 7, 2025

/verified cancel
/lgtm cancel
/approve cancel
/wip cancel
/hold cancel

- Add else clause in process() to log when get_pull_request() returns None
- Log WARNING when no PR found with event type
- Log SUCCESS completion with token metrics for consistency
- Add DEBUG logs to get_pull_request() for all lookup strategies
- Log when GithubException is caught during PR lookup
- Log when each strategy (payload, commit SHA, check_run) is attempted

Fixes silent webhook processing that previously logged "Processing webhook"
then nothing when no PR could be found. Affects pull_request, check_run,
issue_comment, and pull_request_review events.

Improves observability by showing:
- Which PR lookup strategies were attempted
- Why each strategy failed
- When webhooks complete successfully with no PR
@myakove
Copy link
Copy Markdown
Collaborator Author

myakove commented Nov 9, 2025

/verified
/approve

Convert naive datetime objects to UTC-aware datetimes in log parser
to fix timezone comparison issues in the log server.

This ensures consistent timestamp handling across the application by
guaranteeing all parsed timestamps have explicit timezone information.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
webhook_server/libs/log_parser.py (2)

238-251: Consider duplicate task field edge case.

The extraction uses count=1 in each .sub() call, which only removes the first occurrence of each task field. If a log message somehow contains duplicate task fields (e.g., [task_id=foo] ... [task_id=bar]), only the first is extracted and removed, leaving the duplicate in the message.

For structured logging, duplicates should be rare or impossible, but if they can occur (e.g., through string concatenation or nested contexts), you may want to:

  • Remove all occurrences by omitting count=1, or
  • Validate and warn on duplicates

344-350: Update docstring to reflect unused parameter.

The parameter is named _pattern with an underscore prefix (indicating intentional disuse) and the implementation hardcodes "*.log" at line 360, completely ignoring the parameter value. However, the docstring still documents a pattern parameter.

Apply this diff to align the docstring with the implementation:

     async def monitor_log_directory(self, log_dir: Path, _pattern: str = "*.log") -> AsyncGenerator[LogEntry, None]:
         """
         Monitor a directory for log files and yield new entries from all files.
 
         Args:
             log_dir: Directory path containing log files
-            pattern: Glob pattern for log files (default: "*.log")
+            _pattern: Unused parameter (kept for API compatibility)
 
         Yields:
             LogEntry objects from all monitored log files
         """

Alternatively, if the pattern should be respected, use it at line 360:

-        log_files.extend(log_dir.glob("*.log"))
+        log_files.extend(log_dir.glob(_pattern))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5654b7 and 7e5eff8.

📒 Files selected for processing (1)
  • webhook_server/libs/log_parser.py (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
webhook_server/libs/log_parser.py (1)
webhook_server/web/static/js/log_viewer.js (5)
  • timestamp (160-160)
  • timestamp (1510-1510)
  • message (172-172)
  • message (1114-1114)
  • message (1194-1194)
🔇 Additional comments (6)
webhook_server/libs/log_parser.py (6)

27-30: LGTM! Task correlation and token tracking fields are well-structured.

The new fields integrate cleanly with the existing LogEntry schema, maintain consistent typing with | None optionals, and are properly serialized in to_dict().

Also applies to: 44-47


90-94: Escaped bracket handling correctly implemented.

The past review concern about escaped brackets has been fully addressed:

  • Patterns use ((?:\\.|[^\]])+) to correctly capture escaped characters within task field values
  • _unescape_task_value() properly converts escape sequences back to literal brackets
  • All extraction points (lines 240, 245, 250) apply unescaping

This ensures that task field values containing ] characters (e.g., [task_id=foo]bar]) are correctly parsed as foo]bar with no remnants left in the message.

Also applies to: 238-258


149-156: LGTM! Robust timezone handling.

The timestamp parsing correctly:

  • Normalizes 'Z' suffix to '+00:00' for fromisoformat compatibility
  • Ensures all timestamps are timezone-aware by defaulting to UTC for naive timestamps
  • Handles existing timezone offsets without modification

96-99: LGTM! Token spend extraction handles both formats.

The TOKEN_SPEND_PATTERN correctly handles:

  • Original: "Token spend: 35 API calls"
  • Masked: "token ***** 35 API calls"

The regex flexibility (\s*:?\s* for optional colon and whitespace) ensures both formats are captured.

Also applies to: 260-279


100-113: LGTM! Workflow step detection now uses task correlation.

The updated logic correctly leverages the new task fields to identify workflow milestones. Requiring both task_id and task_status effectively filters out internal logs and surfaces only meaningful business events in the timeline.


296-297: LGTM! Good code hygiene improvements.

Explicitly specifying encoding="utf-8" prevents platform-dependent encoding issues, and the _line_num prefix correctly signals an intentionally unused loop variable.

Also applies to: 327-327

Updated 5 test assertions to expect UTC timezone-aware datetime objects
instead of naive datetimes. This aligns with the fix in log_parser.py that
ensures all parsed timestamps are timezone-aware (UTC).

All 37 tests in test_log_parser.py now pass.
The pattern parameter was incorrectly prefixed with underscore
(indicating unused) and the method was hardcoding "*.log" instead
of using the parameter value.

Changes:
- Renamed _pattern → pattern (removed underscore prefix)
- Changed log_dir.glob("*.log") → log_dir.glob(pattern)
- Makes the API more flexible while maintaining backward compatibility
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
webhook_server/tests/test_log_parser.py (1)

579-579: Consider adding timezone information to timestamps for consistency.

The workflow test timestamps don't include tzinfo=datetime.UTC, unlike the earlier parsing tests. While this works for these tests since they're just creating LogEntry objects directly, it's inconsistent with the rest of the test file and may cause confusion or comparison issues if these entries are used alongside parsed entries.

Apply this diff to add timezone information:

-            timestamp=datetime.datetime(2025, 7, 31, 12, 0, 0),
+            timestamp=datetime.datetime(2025, 7, 31, 12, 0, 0, tzinfo=datetime.UTC),

(Apply to all workflow test timestamp definitions)

Also applies to: 595-595, 603-603, 620-620, 629-629, 636-636, 645-645, 670-670, 677-677

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e5eff8 and ee384d6.

📒 Files selected for processing (1)
  • webhook_server/tests/test_log_parser.py (13 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
webhook_server/tests/test_log_parser.py (1)
webhook_server/libs/log_parser.py (6)
  • LogParser (51-375)
  • parse_log_entry (128-182)
  • LogEntry (15-48)
  • is_workflow_step (100-113)
  • extract_workflow_steps (115-126)
  • extract_token_spend (260-279)
🔇 Additional comments (7)
webhook_server/tests/test_log_parser.py (7)

6-6: LGTM!

The added imports are appropriately used for testing error logging scenarios and mocking file operations.

Also applies to: 8-8


30-30: LGTM!

The timestamp assertions correctly validate timezone-aware timestamps with UTC timezone, consistent with the parser's normalization behavior.

Also applies to: 65-65, 86-86, 109-109, 130-130


77-79: LGTM!

The test data correctly includes ANSI color codes to validate the parser's ability to strip color codes from production logs.

Also applies to: 100-102


159-171: LGTM!

The reformatted log content improves readability while maintaining the same test coverage.


259-259: LGTM!

Using the built-in TimeoutError is the correct and idiomatic approach for handling asyncio.wait_for() timeouts.


534-537: LGTM!

The test correctly validates that all new task correlation and token spend fields are included in the to_dict() serialization.


699-760: LGTM!

The token spend extraction tests provide comprehensive coverage of all format variations (original, masked, masked with colon) and edge cases (not found, invalid format). The integration test at the end validates the full parsing pipeline.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
webhook_server/libs/log_parser.py (1)

163-181: Consider removing token spend text from message for consistency.

Task correlation fields are extracted and removed from the message (line 163), but token spend is extracted without removal (line 166). This creates an inconsistency where token_spend=35 is stored in the field AND "Token spend: 35 API calls" remains in the message text.

Consider whether token spend should be stripped from the message to match the task field behavior, improving structured data consistency.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ee384d6 and 56e9aa9.

📒 Files selected for processing (1)
  • webhook_server/libs/log_parser.py (9 hunks)
🔇 Additional comments (7)
webhook_server/libs/log_parser.py (7)

6-9: LGTM! Modern import style.

Using collections.abc.AsyncGenerator instead of typing.AsyncGenerator follows current Python best practices (PEP 585).


27-30: LGTM! Clean schema extension.

The new task correlation fields and token_spend are properly integrated into both the dataclass and serialization method.

Also applies to: 44-47


90-98: Escaped bracket handling correctly implemented.

The regex patterns now properly handle escaped brackets with ((?:\\.|[^\]])+), which matches either escaped sequences or non-bracket characters. Combined with the _unescape_task_value helper (line 256-258), this fully addresses the past review concern about values containing ] characters.


100-113: LGTM! Improved milestone detection.

Checking for both task_id and task_status provides more precise workflow milestone identification than the previous level-based approach.


149-156: LGTM! Robust timezone handling.

Normalizing 'Z' to '+00:00' and ensuring timezone-aware timestamps prevents common datetime comparison issues.


218-253: LGTM! Clean task field extraction.

The method correctly extracts task correlation fields using precompiled patterns, unescapes values, and removes tokens from the message. The two-step approach (regex match + unescape) properly handles edge cases.


255-258: LGTM! Simple and correct unescaping.

The static helper correctly unescapes bracket sequences in task field values.

@myakove
Copy link
Copy Markdown
Collaborator Author

myakove commented Nov 10, 2025

/verified
/approve

@myakove myakove merged commit ab1e28d into main Nov 10, 2025
8 of 9 checks passed
@myakove-bot
Copy link
Copy Markdown
Collaborator

Successfully removed PR tag: ghcr.io/myk-org/github-webhook-server:pr-890.

@myakove-bot
Copy link
Copy Markdown
Collaborator

New container for ghcr.io/myk-org/github-webhook-server:latest published

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants