feat: add approval workflow gates to TaskEngine by Aureliolo · Pull Request #365 · Aureliolo/synthorg

Aureliolo · 2026-03-13T22:42:36Z

Summary

Approval gate integration: Wire SecOps ESCALATE verdicts and request_human_approval tool calls into the engine execution loop, parking agent context when human approval is required
RequestHumanApprovalTool: Agent-callable tool that creates ApprovalItem in the approval store and signals execution parking via metadata
ApprovalGate service: Coordinates context serialization (via ParkService), persistence (via ParkedContextRepository), and resume with decision message injection
Loop integration: Both ReactLoop and PlanExecuteLoop check for escalations after tool execution and return PARKED termination reason
ToolInvoker escalation tracking: Detects ESCALATE verdicts and parking metadata, exposes pending_escalations with proper tuple[EscalationInfo, ...] typing
API controller hardening: requested_by bound to auth user (not request body), prompt injection mitigation in resume messages, broadened WebSocket error handling
WebSocket fix: Frontend approval store now uses approval_id key from backend payload and fetches full items via API
Observability: 8 structured event constants for approval gate lifecycle
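
The API hardening item above (binding `requested_by` to the authenticated user) can be illustrated with a minimal sketch. `AuthUser`, the field names on `CreateApprovalRequest`, and the returned dictionary shape are assumptions made for illustration and are not taken from the PR's code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AuthUser:
    """Hypothetical stand-in for the authenticated principal."""

    user_id: str


@dataclass(frozen=True)
class CreateApprovalRequest:
    """Request body; note that it has no requested_by field at all."""

    action_type: str
    title: str


class UnauthorizedError(Exception):
    """Raised when the request carries no authenticated user."""


def create_approval(
    body: CreateApprovalRequest, auth_user: AuthUser | None
) -> dict[str, str]:
    """Build the stored item, taking the requester from auth, never the body."""
    if auth_user is None:
        raise UnauthorizedError("authentication required")
    return {
        "action_type": body.action_type,
        "title": body.title,
        "requested_by": auth_user.user_id,
    }
```

Because the body type has no `requested_by` attribute, a client cannot claim to be another user even by sending extra fields.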

Closes #258, closes #259

Review coverage

Pre-reviewed by 15 specialized agents (docs-consistency, code-reviewer, python-reviewer, test-analyzer, silent-failure-hunter, comment-analyzer, type-design-analyzer, logging-audit, resilience-audit, conventions-enforcer, security-reviewer, api-contract-drift, test-quality-reviewer, async-concurrency-reviewer, issue-resolution-verifier).

43 findings identified and addressed across Critical (14), Major (23), and Medium (6) severities. Key fixes:

Security: identity spoofing, prompt injection, schema length limits
Lifecycle: premature context deletion, orphaned records, zombie parking states
Type safety: tuple[Any, ...] → tuple[EscalationInfo, ...]
Frontend: WebSocket payload mismatch (real-time updates were completely broken)
Tests: markers, precise assertions, new error path coverage
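
As a rough illustration of the type-safety fix, the sketch below shows why `tuple[EscalationInfo, ...]` is more useful than `tuple[Any, ...]`: consumers can rely on the element type. The PR describes `EscalationInfo` as a frozen Pydantic model; this sketch uses a standard-library dataclass instead, and every field other than `approval_id` is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EscalationInfo:
    """Immutable record of one pending escalation (field set is assumed)."""

    approval_id: str
    tool_call_id: str  # hypothetical field
    tool_name: str  # hypothetical field

    def __post_init__(self) -> None:
        # Reject blank strings, mimicking a not-blank string constraint.
        for f in fields(self):
            if not getattr(self, f.name).strip():
                raise ValueError(f"{f.name} must not be blank")


def first_escalation(
    pending: tuple[EscalationInfo, ...],
) -> EscalationInfo | None:
    """Return the first pending escalation, or None when nothing is pending."""
    return pending[0] if pending else None
```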

Test plan

uv run ruff check src/ tests/ — clean
uv run mypy src/ tests/ — clean (926 files, strict)
uv run pytest tests/ -n auto --cov=ai_company --cov-fail-under=80 — 7540 passed, 94.60% coverage
Verify WebSocket real-time updates work in dashboard (manual)
Verify approval create/approve/reject flows via API (manual)

Implement the approval gate layer that parks agent execution when human approval is required and provides infrastructure for resuming after a decision. Also adds the `request_human_approval` tool for agents to explicitly request approval.

- Add EscalationInfo/ResumePayload models and approval gate events
- Add ApprovalGate service (park/resume via ParkService)
- Add RequestHumanApprovalTool (creates ApprovalItem, signals parking)
- Add ToolInvoker escalation tracking (ESCALATE verdicts + tool metadata)
- Integrate approval gate into execute_tool_calls and both loop types
- Wire into AgentEngine, AppState, and approvals controller resume path

Closes #258
Closes #259

github-actions · 2026-03-13T22:42:46Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

coderabbitai · 2026-03-13T22:42:56Z

Caution

Review failed

The pull request is closed.

📥 Commits

Reviewing files that changed from the base of the PR and between 0664529 and 91823f4.

📒 Files selected for processing (1)

src/ai_company/api/controllers/approvals.py

📝 Walkthrough

Summary by CodeRabbit

Release Notes

New Features
- Human approval workflow system for sensitive agent actions with request/approve/reject capabilities
- Parked context support for agents awaiting approval decisions
- Real-time WebSocket updates for approval status changes
- Enhanced approval item expiration with callback handling
Documentation
- Added approval gate integration details to API, engine, and tools documentation
- Expanded logging conventions with structured event guidance and examples
Bug Fixes
- Improved approval decision conflict detection and error handling
- Enhanced WebSocket payload validation for approval events

Walkthrough

This PR implements approval workflow gates in the engine by introducing an ApprovalGate service that coordinates parking and resumption of agent execution pending human approval. It adds a request_human_approval tool, integrates approval checks into execution loops, updates the API layer for approval handling, and extends frontend WebSocket handling to fetch full approval items.

Changes

Cohort / File(s)	Summary
Approval Gate Core `src/ai_company/engine/approval_gate.py`, `src/ai_company/engine/approval_gate_models.py`	New ApprovalGate service to manage parking/resumption with EscalationInfo and ResumePayload models; handles context serialization, persistence, and resume flow with comprehensive error handling.
Tool & Invoker Integration `src/ai_company/tools/approval_tool.py`, `src/ai_company/tools/invoker.py`, `src/ai_company/tools/registry.py`, `src/ai_company/tools/__init__.py`	RequestHumanApprovalTool added for escalation initiation; ToolInvoker extended with pending_escalations tracking for security and parking-based escalations; ToolRegistry.all_tools() method added.
Engine Loop Integration `src/ai_company/engine/agent_engine.py`, `src/ai_company/engine/loop_helpers.py`, `src/ai_company/engine/plan_execute_loop.py`, `src/ai_company/engine/react_loop.py`, `src/ai_company/engine/__init__.py`	Approval gate wired into execution loops; execute_tool_calls extended with approval gate parameter and parking flow; PlanExecuteLoop and ReactLoop now accept approval_gate and pass it through; AgentEngine creates and configures approval gate when approval_store is present.
API Layer `src/ai_company/api/approval_store.py`, `src/ai_company/api/controllers/approvals.py`, `src/ai_company/api/dto.py`, `src/ai_company/api/state.py`	ApprovalStore.save_if_pending() added for conditional persistence; approvals controller updated with resume triggering and conflict detection; CreateApprovalRequest.requested_by removed (set from auth user); AppState now holds ApprovalGate reference.
Factory & Configuration `src/ai_company/engine/_security_factory.py`	New factory module providing make_security_interceptor and registry_with_approval_tool; wires approval store into SecurityInterceptionStrategy and augments tool registry with approval tool.
Park/Resume Services `src/ai_company/security/timeout/park_service.py`, `src/ai_company/security/timeout/parked_context.py`	ParkService and ParkedContext updated to support optional task_id (None for taskless agents).
Observability `src/ai_company/observability/events/approval_gate.py`	New module defining `APPROVAL_GATE_*` event constants for initialization, escalation detection, context parking/resumption, and failure tracking.
Frontend `web/src/stores/approvals.ts`	WebSocket event handlers refactored to fetch full approval items by approval_id instead of working with partial payloads; added async flow for submitted/approved/rejected/expired events.
Documentation & Tests `CLAUDE.md`, `README.md`, `docs/design/engine.md`, `tests/unit/engine/test_approval_gate.py`, `tests/unit/engine/test_approval_gate_models.py`, `tests/unit/engine/test_loop_helpers_approval.py`, `tests/unit/tools/test_approval_tool.py`, `tests/unit/tools/test_invoker_escalation.py`, `tests/unit/observability/test_events.py`, `web/src/__tests__/stores/approvals.test.ts`, `tests/unit/api/controllers/test_approvals.py`, `tests/unit/api/test_dto.py`	Updated documentation reflecting approval gate integration; comprehensive test coverage for approval gate lifecycle, escalation tracking, tool execution, and frontend WS handling.

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent Loop
    participant ToolInvoker as Tool Invoker
    participant Gate as ApprovalGate
    participant Store as ParkService
    participant Repo as ParkedContextRepository
    participant API as Approval API

    Agent->>ToolInvoker: execute_tool_calls(with approval_gate)
    ToolInvoker->>ToolInvoker: invoke tools
    ToolInvoker->>Gate: check pending_escalations
    Gate->>Agent: has escalations?
    
    alt Escalation Detected
        Agent->>Gate: should_park(escalations)
        Gate-->>Agent: EscalationInfo
        Agent->>Store: serialize context
        Store-->>Agent: serialized context
        Agent->>Repo: persist parked context
        Repo-->>Agent: ParkedContext with approval_id
        Agent-->>Agent: return PARKED ExecutionResult
    else No Escalation
        Agent-->>Agent: continue execution normally
    end
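
The parking branch of the diagram above can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: the in-memory repository, the JSON serialization, the function name `park_if_escalated`, and the shape of the returned dictionary are all hypothetical and stand in for the PR's `ParkService`, `ParkedContextRepository`, and `ExecutionResult`.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ParkedContext:
    approval_id: str
    agent_id: str
    task_id: str | None  # None for taskless agents
    context_json: str


class InMemoryParkedContextRepository:
    """Hypothetical stand-in for the persistence layer."""

    def __init__(self) -> None:
        self._by_approval: dict[str, ParkedContext] = {}

    def save(self, parked: ParkedContext) -> None:
        self._by_approval[parked.approval_id] = parked

    def get_by_approval(self, approval_id: str) -> ParkedContext | None:
        return self._by_approval.get(approval_id)


def park_if_escalated(
    pending_approval_ids: tuple[str, ...],
    context: dict[str, Any],
    repo: InMemoryParkedContextRepository,
    agent_id: str,
    task_id: str | None = None,
) -> dict[str, str] | None:
    """Park on the first escalation; return None to keep looping."""
    if not pending_approval_ids:
        return None
    approval_id = pending_approval_ids[0]
    # Serialize first, then persist, so a failed dump never leaves a record.
    parked = ParkedContext(approval_id, agent_id, task_id, json.dumps(context))
    repo.save(parked)
    return {"termination_reason": "PARKED", "approval_id": approval_id}
```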

sequenceDiagram
    participant API as Approval API
    participant Store as ApprovalStore
    participant Gate as ApprovalGate
    participant ParkSvc as ParkService
    participant Repo as ParkedContextRepository
    participant AgentLoop as Agent Loop

    API->>Store: approve(approval_id, decision)
    Store->>Store: save_if_pending(item)
    Store-->>API: updated item or None
    
    alt Approval Saved
        API->>Gate: resume_context(approval_id)
        Gate->>ParkSvc: deserialize context
        Gate->>Repo: load parked context
        Repo-->>Gate: ParkedContext
        Gate-->>API: (AgentContext, decision)
        API->>API: _trigger_resume with decision_reason
        API->>AgentLoop: inject resume message
        AgentLoop-->>AgentLoop: continue with approval decision
        Gate->>Repo: delete parked context
    else Conflict (Not PENDING)
        API-->>API: raise ConflictError
    end
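
The `save_if_pending` step in the diagram above is a compare-and-set: the decision is written only if the stored item is still pending, so two concurrent decisions cannot both succeed. A minimal sketch, assuming an in-memory store guarded by a lock (the real store's storage and concurrency model are not shown in this thread, and the class and helper names here are illustrative):

```python
from __future__ import annotations

import threading
from dataclasses import dataclass
from enum import Enum


class ApprovalStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass(frozen=True)
class ApprovalItem:
    id: str
    status: ApprovalStatus = ApprovalStatus.PENDING


class ConflictError(Exception):
    """Raised when a decision arrives for an item that is no longer pending."""


class InMemoryApprovalStore:
    def __init__(self) -> None:
        self._items: dict[str, ApprovalItem] = {}
        self._lock = threading.Lock()

    def add(self, item: ApprovalItem) -> None:
        with self._lock:
            self._items[item.id] = item

    def save_if_pending(self, updated: ApprovalItem) -> ApprovalItem | None:
        """Persist `updated` only if the stored item is still PENDING."""
        with self._lock:
            current = self._items.get(updated.id)
            if current is None or current.status is not ApprovalStatus.PENDING:
                return None
            self._items[updated.id] = updated
            return updated


def decide(
    store: InMemoryApprovalStore, approval_id: str, status: ApprovalStatus
) -> ApprovalItem:
    saved = store.save_if_pending(ApprovalItem(approval_id, status))
    if saved is None:
        raise ConflictError(f"approval {approval_id!r} is not pending")
    return saved
```

The check and the write happen under the same lock, which is what closes the window between reading the status and saving the decision.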

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

feat: implement approval workflow gates in engine #258: Implements approval workflow gates in engine—directly addresses the core requirement for engine-level blocking and resuming execution pending human approval, including SecOps integration and autonomy level enforcement.
feat: implement request_human_approval tool for approval workflow #259: Implements the request_human_approval tool for approval workflow—directly addresses the tool-based escalation mechanism for initiating approval requests.

Possibly related PRs

feat: incremental AgentEngine → TaskEngine status sync #331: Both PRs modify AgentEngine to extend its initialization and execution path; the related PR adds incremental TaskEngine synchronization while this PR integrates approval gate and parking flows with PARKED termination handling.
feat: add autonomy levels and approval timeout policies (#42, #126) #197: Both PRs modify the same park-and-resume infrastructure (ParkService, ParkedContext, parked-context persistence) and approval/timeout observability events, representing related phases of approval workflow implementation.
feat: web dashboard pages — views, components, tests, and review fixes #354: Both PRs modify the web approvals frontend store (approvals.ts) and WebSocket handling to improve approval item fetching and state synchronization.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 42.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat: add approval workflow gates to TaskEngine' is clear, specific, and directly reflects the primary change: integrating approval gating into the engine execution pipeline.
Description check	✅ Passed	The description is comprehensive and related to the changeset, detailing approval gate integration, RequestHumanApprovalTool, ApprovalGate service, loop integration, ToolInvoker escalation tracking, API hardening, WebSocket fixes, and observability additions.
Linked Issues check	✅ Passed	The PR fully implements both linked issues: `#258` wires SecOps ESCALATE verdicts into the engine loop with context parking [ApprovalGate, loop_helpers, ReactLoop, PlanExecuteLoop], and `#259` delivers the RequestHumanApprovalTool with ApprovalItem creation and resume injection [approval_tool, approval_gate].
Out of Scope Changes check	✅ Passed	All changes are scoped to approval workflow gating: core approval gate implementation, engine loop integration, API controller updates, ToolInvoker escalation tracking, WebSocket corrections, and comprehensive test coverage. No unrelated refactoring or feature creep detected.


gemini-code-assist · 2026-03-13T22:43:08Z

Summary of Changes

This pull request introduces a comprehensive human approval workflow into the TaskEngine, allowing agents to pause execution and request human intervention for sensitive actions. It establishes a dedicated approval gate service, integrates it deeply into the agent execution loops, and provides a new tool for agents to explicitly request approvals. The changes also include significant API and frontend enhancements to support this new functionality, ensuring secure and observable approval processes.

Highlights

Approval Gate Integration: Integrated SecOps ESCALATE verdicts and request_human_approval tool calls into the engine execution loop, parking agent context when human approval is required.
RequestHumanApprovalTool: Introduced an agent-callable tool that creates an ApprovalItem in the approval store and signals execution parking via metadata.
ApprovalGate Service: Developed a new service to coordinate context serialization, persistence, and resume with decision message injection for approval workflows.
Execution Loop Integration: Both ReactLoop and PlanExecuteLoop now check for escalations after tool execution and return a PARKED termination reason when approval is needed.
ToolInvoker Escalation Tracking: Enhanced ToolInvoker to detect ESCALATE verdicts and parking metadata, exposing pending_escalations with proper typing.
API Controller Hardening: Improved security by binding requested_by to the authenticated user, mitigating prompt injection in resume messages, and broadening WebSocket error handling.
WebSocket Frontend Fix: Corrected the frontend approval store to use the approval_id key from the backend payload and fetch full items via API, resolving real-time update issues.
Observability: Added 8 structured event constants to track the approval gate lifecycle for enhanced monitoring.
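
The escalation tracking highlighted above might look roughly like the sketch below: pending escalations are reset at the start of each invocation, and a tool result whose metadata requests parking is recorded. The `requires_parking` and `approval_id` metadata keys follow the review text elsewhere in this thread; the class names, the callable-based tool registry, and the field set of `EscalationInfo` are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class ToolResult:
    content: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class EscalationInfo:
    approval_id: str
    tool_name: str  # hypothetical field


class SketchToolInvoker:
    """Illustrative invoker; not the PR's ToolInvoker."""

    def __init__(
        self, tools: dict[str, Callable[[dict[str, Any]], ToolResult]]
    ) -> None:
        self._tools = tools
        self._pending: list[EscalationInfo] = []

    @property
    def pending_escalations(self) -> tuple[EscalationInfo, ...]:
        # Expose an immutable, precisely typed snapshot.
        return tuple(self._pending)

    def invoke(self, name: str, args: dict[str, Any]) -> ToolResult:
        self._pending.clear()  # stale escalations must not leak across calls
        result = self._tools[name](args)
        self._track_parking_metadata(name, result)
        return result

    def _track_parking_metadata(self, name: str, result: ToolResult) -> None:
        approval_id = result.metadata.get("approval_id")
        if result.metadata.get("requires_parking") and approval_id:
            self._pending.append(EscalationInfo(approval_id, name))
```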

Changelog

CLAUDE.md
- Updated descriptions for api/, engine/, and tools/ directories to reflect approval gate integration.
- Added APPROVAL_GATE_ESCALATION_DETECTED to the list of observability event constants.
README.md
- Updated the project status to indicate that approval workflow gates are now implemented.
docs/design/engine.md
- Added documentation explaining how the ApprovalGate handles escalations, context parking, and resume within the engine's run process.
src/ai_company/api/approval_store.py
- Added robust error handling for the _on_expire callback to prevent failures from propagating.
src/ai_company/api/controllers/approvals.py
- Improved error handling for WebSocket event publishing.
- Introduced a new helper function _log_approval_decision for consistent logging of approval outcomes.
- Enforced that the requested_by field in create_approval is populated from the authenticated user, not the request body, and added an UnauthorizedError check.
- Refactored approve and reject methods to use the new _log_approval_decision helper.
src/ai_company/api/state.py
- Imported ApprovalGate and added it as a managed dependency within the AppState.
src/ai_company/engine/__init__.py
- Imported and exposed ApprovalGate, EscalationInfo, and ResumePayload from the new approval gate modules.
src/ai_company/engine/agent_engine.py
- Modified the AgentEngine constructor to initialize the ApprovalGate and integrate it into the execution loop.
- Added methods _make_approval_gate and _make_default_loop to configure the approval gate and default ReactLoop.
- Updated _make_tool_invoker to dynamically register the RequestHumanApprovalTool if an approval store is available.
src/ai_company/engine/approval_gate.py
- Added new file implementing the ApprovalGate service, responsible for parking and resuming agent contexts based on approval decisions.
src/ai_company/engine/approval_gate_models.py
- Added new file defining EscalationInfo and ResumePayload Pydantic models for data transfer within the approval gate system.
src/ai_company/engine/loop_helpers.py
- Imported approval gate related events and models.
- Modified execute_tool_calls to incorporate approval_gate checks and park the agent context if an escalation is detected.
- Added a new helper function _park_for_approval to encapsulate the logic for parking an agent context.
src/ai_company/engine/plan_execute_loop.py
- Modified the PlanExecuteLoop constructor to accept an optional approval_gate instance.
- Updated _handle_step_tool_calls to pass the approval_gate to execute_tool_calls for escalation handling.
src/ai_company/engine/react_loop.py
- Modified the ReactLoop constructor to accept an optional approval_gate instance.
- Updated _process_turn_response to pass the approval_gate to execute_tool_calls for escalation handling.
src/ai_company/observability/events/approval_gate.py
- Added new file defining constants for various approval gate lifecycle events.
src/ai_company/tools/__init__.py
- Imported and exposed the new RequestHumanApprovalTool.
src/ai_company/tools/approval_tool.py
- Added new file implementing the RequestHumanApprovalTool, allowing agents to request human approval for actions.
src/ai_company/tools/invoker.py
- Added _pending_escalations to track escalations and a pending_escalations property for external access.
- Modified _check_security to populate _pending_escalations when a security verdict is ESCALATE and includes an approval_id.
- Refactored invoke and invoke_all methods to clear _pending_escalations at the start of each call.
- Introduced _track_parking_metadata to detect and record parking requests from tool execution results.
src/ai_company/tools/registry.py
- Added an all_tools method to retrieve all registered tool instances.
tests/unit/engine/test_approval_gate.py
- Added new unit tests for the ApprovalGate service, covering parking, resuming, and message building.
tests/unit/engine/test_approval_gate_models.py
- Added new unit tests for the EscalationInfo and ResumePayload models, verifying their structure and immutability.
tests/unit/engine/test_loop_helpers_approval.py
- Added new unit tests for the approval gate integration within loop helpers, specifically execute_tool_calls.
tests/unit/observability/test_events.py
- Updated the test to include the new approval_gate domain module in the discovery process.
tests/unit/tools/test_approval_tool.py
- Added new unit tests for the RequestHumanApprovalTool, covering item creation, metadata, risk classification, and error handling.
tests/unit/tools/test_invoker_escalation.py
- Added new unit tests for ToolInvoker's escalation tracking, including security verdicts and tool metadata.
web/src/stores/approvals.ts
- Updated handleWsEvent to correctly process WebSocket events using approval_id from the backend payload.
- Implemented fetching full approval items via API for new submissions and status changes.
- Added logic to remove approval items from the local list if they are deleted or expired on the backend.

Activity

The pull request was pre-reviewed by 15 specialized agents, including docs-consistency, code-reviewer, python-reviewer, security-reviewer, and test-quality-reviewer.
A total of 43 findings were identified and addressed across Critical (14), Major (23), and Medium (6) severities.
Key fixes included security vulnerabilities (identity spoofing, prompt injection, schema length limits), lifecycle issues (premature context deletion, orphaned records, zombie parking states), type safety improvements, a critical WebSocket payload mismatch fix, and enhanced test coverage.
The author has performed static analysis checks (ruff check, mypy) and unit tests (pytest) with high coverage (94.60%).
Manual verification steps are planned for WebSocket real-time updates and approval flows via API.

gemini-code-assist

Code Review

This pull request introduces a comprehensive approval workflow system, a significant and well-executed feature. The changes are extensive, touching the agent engine, API, tools, and frontend, and include the addition of an ApprovalGate service, a RequestHumanApprovalTool, and robust context parking/resuming logic. The implementation quality is high, with strong attention to security (e.g., preventing identity spoofing in the approvals API), robustness (e.g., improved WebSocket event handling), and observability. The inclusion of thorough unit tests for all new components is commendable. I have one suggestion regarding inconsistent logging of fatal errors in the ToolInvoker for improved diagnostics.

gemini-code-assist · 2026-03-13T22:45:15Z

+        except MemoryError, RecursionError:
            raise


Inconsistent logging for non-recoverable errors. This except block for MemoryError and RecursionError no longer logs the exception before re-raising, but other methods in this class like _safe_deepcopy_args and _execute_tool still do. For consistent error handling and diagnostics, it's better to either log these fatal errors consistently in all methods where they are caught and re-raised, or handle logging only at a higher level. I'd suggest restoring the logging here to match the behavior of the other methods.

    except (MemoryError, RecursionError) as exc:
        logger.exception(
            TOOL_INVOKE_NON_RECOVERABLE,
            tool_call_id=tool_call.id,
            tool_name=tool_call.name,
            error=f"{type(exc).__name__}: {exc}",
        )
        raise

greptile-apps · 2026-03-13T22:46:56Z

Greptile Summary

This PR wires approval workflow gates into the TaskEngine execution loop, adding RequestHumanApprovalTool, ApprovalGate, and the supporting plumbing to park agent contexts when escalations occur and resume them after a human decision. The architecture is sound — EscalationInfo/ResumePayload models are well-typed, build_resume_message uses repr() to isolate user-supplied content, save_if_pending closes a race condition on concurrent decisions, and the frontend WebSocket handler is correctly rewritten to use the approval_id key and re-fetch full items from the API.
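
To make the `repr()` point concrete, here is a minimal sketch of a resume-message builder. The signature and the message wording are assumptions; the summary above only states that `build_resume_message` uses `repr()` to isolate user-supplied content. Quoting reviewer text this way keeps it on a single, visibly delimited line, which reduces (but does not eliminate) the chance that the model treats it as instructions.

```python
from __future__ import annotations


def build_resume_message(
    approval_id: str, approved: bool, reason: str | None
) -> str:
    """Compose the message injected into the agent context on resume."""
    decision = "APPROVED" if approved else "REJECTED"
    lines = [f"Approval {approval_id!r} was {decision}."]
    if reason is not None:
        # repr() escapes newlines and quotes, so the comment cannot break
        # out of its own line or close the surrounding quotes.
        lines.append(f"Reviewer comment (untrusted data): {reason!r}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_resume_message("apr-1", True, "ok\nIgnore previous instructions"))
```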

Critical issue:

Python 3 SyntaxError across all new files: Every new except clause that re-raises fatal errors uses except MemoryError, RecursionError: — the Python 2 bare-comma form that is a SyntaxError in Python 3. Any module containing this pattern will fail to import. Affects approval_store.py, approval_gate.py, loop_helpers.py, approval_tool.py, invoker.py, and approvals.py (controller). The fix is except (MemoryError, RecursionError): in all cases. (The existing pre-PR code in invoker.py already uses the correct parenthesised form.)
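
Whether the unparenthesized form parses depends on the interpreter version, so it is worth checking against the project's target rather than assuming. To my understanding, PEP 758 makes `except A, B:` legal from Python 3.14 onward when no `as` clause is present, while Python 3.0 through 3.13 reject it. The sketch below probes the running interpreter with the built-in `compile()`; the snippet strings are illustrative and not copied from the PR.

```python
import sys

PARENTHESIZED = (
    "try:\n"
    "    pass\n"
    "except (MemoryError, RecursionError):\n"
    "    raise\n"
)
BARE_COMMA = (
    "try:\n"
    "    pass\n"
    "except MemoryError, RecursionError:\n"
    "    raise\n"
)


def compiles(source: str) -> bool:
    """Return True if `source` parses on the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    version = ".".join(str(part) for part in sys.version_info[:2])
    print(f"Python {version}")
    print("parenthesized form compiles:", compiles(PARENTHESIZED))
    print("bare-comma form compiles:", compiles(BARE_COMMA))
```

The parenthesized form is accepted by every Python 3 release, so it remains the portable choice if the code must run on interpreters older than 3.14.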

Other findings:

APPROVAL_GATE_CONTEXT_PARKED is emitted in the "no task_execution" debug path inside _park_for_approval before park_context() is called, producing a spurious or duplicate event on both failure and success paths.
In the frontend approvals.ts, total.value is incremented unconditionally when an approval.submitted WS event arrives while filters are active, even though the new item may not satisfy the current filter criteria — this can cause the displayed total to drift from reality.
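
For the first finding above, one way to avoid a spurious event is to emit the "parked" event only after the parking call has returned successfully. The sketch below shows that ordering; the event string values, the `APPROVAL_GATE_PARK_FAILED` name, and the callback-based signature are hypothetical and are not the PR's `_park_for_approval`.

```python
from __future__ import annotations

from typing import Callable

APPROVAL_GATE_CONTEXT_PARKED = "approval_gate.context.parked"  # value assumed
APPROVAL_GATE_PARK_FAILED = "approval_gate.park.failed"  # hypothetical name


def park_for_approval(
    park_context: Callable[[], str],
    emit: Callable[[str], None],
) -> str | None:
    """Park the context and emit exactly one lifecycle event."""
    try:
        parked_id = park_context()
    except (MemoryError, RecursionError):
        # Non-recoverable errors propagate untouched.
        raise
    except Exception:
        emit(APPROVAL_GATE_PARK_FAILED)
        return None
    # Emitted only once parking has actually succeeded.
    emit(APPROVAL_GATE_CONTEXT_PARKED)
    return parked_id
```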

Confidence Score: 1/5

Not safe to merge — all new modules will fail to import in Python 3 due to a systematic syntax error.
The PR's architecture and security improvements are well-thought-out, but all six new/modified backend files consistently use except MemoryError, RecursionError: — a Python 2 bare-comma except form that is a SyntaxError in Python 3. Any attempt to import approval_store, approval_gate, loop_helpers, approval_tool, invoker (new sections), or approvals (controller) will raise SyntaxError before any code runs. This single pattern blocks the entire feature from functioning and must be resolved before the PR can land.
src/ai_company/api/approval_store.py, src/ai_company/engine/approval_gate.py, src/ai_company/engine/loop_helpers.py, src/ai_company/tools/approval_tool.py, src/ai_company/tools/invoker.py, src/ai_company/api/controllers/approvals.py — all contain except MemoryError, RecursionError: (Python 3 SyntaxError).

Important Files Changed

Filename	Overview
src/ai_company/api/approval_store.py	Adds `save_if_pending` for race-condition-safe concurrent decisions and wraps the `_on_expire` callback in a try/except. New except block uses Python 2 bare-comma syntax (`except MemoryError, RecursionError:`) which is a SyntaxError in Python 3 — the entire module will fail to import.
src/ai_company/api/controllers/approvals.py	Adds `_trigger_resume` (log-only stub for future scheduler), binds `requested_by` to auth user (security improvement), uses `save_if_pending` for optimistic locking, and broadens WebSocket error handling. `_trigger_resume` correctly avoids consuming the parked record. Python 2 except syntax also present at line 100.
src/ai_company/engine/approval_gate.py	New `ApprovalGate` service coordinates parking and resume. Good lifecycle management (deserialization failure preserves parked record; delete failure logs but still returns context). `build_resume_message` uses `repr()` to isolate user-supplied content — solid prompt-injection mitigation. All four `except MemoryError, RecursionError:` clauses are Python 3 SyntaxErrors.
src/ai_company/engine/loop_helpers.py	Adds `_park_for_approval` helper and threads `approval_gate` through `execute_tool_calls`. Five `except MemoryError, RecursionError:` clauses are Python 3 SyntaxErrors. Additionally, `APPROVAL_GATE_CONTEXT_PARKED` is emitted before parking actually occurs in the "no task_execution" debug path.
src/ai_company/tools/approval_tool.py	New `RequestHumanApprovalTool` creates `ApprovalItem` in the store and returns `requires_parking` metadata. Correctly uses `APPROVAL_GATE_RISK_CLASSIFIED` for no-classifier fallback and `APPROVAL_GATE_ESCALATION_DETECTED` for success log. Two `except MemoryError, RecursionError:` clauses are Python 3 SyntaxErrors.
src/ai_company/tools/invoker.py	Adds `pending_escalations` property and `_track_parking_metadata` method. Correctly clears escalations at the start of `invoke()` and `invoke_all()`, and sorts escalations deterministically by tool-call index after `invoke_all`. Three `except MemoryError, RecursionError:` clauses are Python 3 SyntaxErrors (existing code in same file already uses the correct parenthesised form).
src/ai_company/api/dto.py	Adds `action_type` format validator (`category:action`), removes `requested_by` from create payload (now bound server-side from auth), and adds `max_length=4096` to `ApproveRequest.comment`. Well-tested and clean.
src/ai_company/engine/approval_gate_models.py	New frozen Pydantic models `EscalationInfo` and `ResumePayload` with `NotBlankStr` validation on all string fields. Clean, well-typed, and immutable. No issues.
web/src/stores/approvals.ts	Rewrites WS handler to use `approval_id` key from backend payload and fetch full items via API. The fire-and-forget async IIFE pattern is appropriate. One logic issue: `total` is unconditionally incremented on `approval.submitted` even when active filters are set and the new item may not match them, causing count drift.
src/ai_company/engine/react_loop.py	Adds optional `approval_gate` constructor parameter and threads it into `execute_tool_calls`. Minimal, clean change; no issues.
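
The DTO constraints listed in the table (a `category:action` format for `action_type` and a 4096-character cap on the approval comment) can be sketched as plain validation functions. The PR implements these on Pydantic models; the allowed character set in the regular expression and the function names below are assumptions.

```python
from __future__ import annotations

import re

# Assumed character set: lowercase letters, digits, and underscores.
_ACTION_TYPE_RE = re.compile(r"[a-z0-9_]+:[a-z0-9_]+")
MAX_COMMENT_LENGTH = 4096


def validate_action_type(value: str) -> str:
    """Accept only values shaped like 'category:action'."""
    if _ACTION_TYPE_RE.fullmatch(value) is None:
        raise ValueError("action_type must look like 'category:action'")
    return value


def validate_comment(comment: str | None) -> str | None:
    """Reject comments longer than the schema limit."""
    if comment is not None and len(comment) > MAX_COMMENT_LENGTH:
        raise ValueError(f"comment exceeds {MAX_COMMENT_LENGTH} characters")
    return comment
```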

Sequence Diagram

sequenceDiagram
    participant Agent
    participant ToolInvoker
    participant ApprovalGate
    participant ParkService
    participant ParkedContextRepo
    participant ApprovalStore
    participant ReactLoop
    participant HumanReviewer
    participant ApprovalsController

    Note over Agent,ReactLoop: Escalation detected during tool execution

    Agent->>ToolInvoker: invoke(tool_call)
    alt SecOps ESCALATE verdict
        ToolInvoker->>ToolInvoker: append to _pending_escalations (approval_id from verdict)
    else RequestHumanApprovalTool
        ToolInvoker->>ApprovalStore: add(ApprovalItem)
        ToolInvoker->>ToolInvoker: _track_parking_metadata() → append EscalationInfo
    end

    ReactLoop->>ToolInvoker: pending_escalations
    ReactLoop->>ApprovalGate: should_park(escalations)
    ApprovalGate-->>ReactLoop: EscalationInfo (first escalation)

    ReactLoop->>ApprovalGate: park_context(escalation, context, agent_id, task_id)
    ApprovalGate->>ParkService: park(context, approval_id, ...)
    ParkService-->>ApprovalGate: ParkedContext (serialized JSON)
    ApprovalGate->>ParkedContextRepo: save(parked)
    ApprovalGate-->>ReactLoop: ParkedContext

    ReactLoop-->>Agent: ExecutionResult(PARKED, metadata={approval_id})

    Note over HumanReviewer,ApprovalsController: Human makes a decision

    HumanReviewer->>ApprovalsController: POST /approvals/{id}/approve or /reject
    ApprovalsController->>ApprovalStore: save_if_pending(updated_item)
    ApprovalsController->>ApprovalsController: _log_approval_decision()
    ApprovalsController->>ApprovalsController: _trigger_resume() — logs intent only
    Note right of ApprovalsController: Future scheduler will call<br/>ApprovalGate.resume_context()<br/>(not yet implemented)

    Note over ApprovalGate,ParkedContextRepo: Resume path (future scheduler)
    ApprovalGate->>ParkedContextRepo: get_by_approval(approval_id)
    ParkedContextRepo-->>ApprovalGate: ParkedContext
    ApprovalGate->>ParkService: resume(parked)
    ParkService-->>ApprovalGate: AgentContext
    ApprovalGate->>ParkedContextRepo: delete(parked.id)
    ApprovalGate-->>Agent: (AgentContext, parked_id) + resume_message injected
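
The park and resume paths in the diagram can be sketched as a small in-memory model. This is only an illustration of the flow under simplifying assumptions: `Escalation`, `Parked`, `InMemoryParkedRepo`, and `ToyApprovalGate` are hypothetical stand-ins, the real services are async, and serialization is reduced to plain JSON.

```python
import json
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Escalation:
    """Hypothetical stand-in for EscalationInfo."""

    approval_id: str
    reason: str


@dataclass(frozen=True)
class Parked:
    """Hypothetical stand-in for ParkedContext."""

    id: str
    approval_id: str
    context_json: str


class InMemoryParkedRepo:
    """Toy repository; not the real ParkedContextRepository."""

    def __init__(self) -> None:
        self._rows = {}

    def save(self, parked: Parked) -> None:
        self._rows[parked.id] = parked

    def get_by_approval(self, approval_id: str) -> Optional[Parked]:
        for parked in self._rows.values():
            if parked.approval_id == approval_id:
                return parked
        return None

    def delete(self, parked_id: str) -> bool:
        # True only if a row was actually removed.
        return self._rows.pop(parked_id, None) is not None


class ToyApprovalGate:
    """Illustrative gate: park on the first escalation, resume by approval id."""

    def __init__(self, repo: InMemoryParkedRepo) -> None:
        self._repo = repo

    def should_park(self, escalations):
        # Mirrors the diagram: the first pending escalation wins.
        return escalations[0] if escalations else None

    def park_context(self, escalation: Escalation, context: dict) -> Parked:
        parked = Parked(
            id=uuid.uuid4().hex,
            approval_id=escalation.approval_id,
            context_json=json.dumps(context),
        )
        self._repo.save(parked)
        return parked

    def resume_context(self, approval_id: str):
        parked = self._repo.get_by_approval(approval_id)
        if parked is None:
            return None
        context = json.loads(parked.context_json)
        self._repo.delete(parked.id)
        return context, parked.id
```

Deleting the parked row only after deserialization succeeds keeps a failed resume retryable, which matches the ordering shown in the diagram.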

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/ai_company/api/approval_store.py
Line: 172-173

Comment:
**Python 3 `except` tuple syntax missing parentheses — SyntaxError across all new files**

`except MemoryError, RecursionError:` is Python 2 syntax. In Python 3 the comma-separated form (without parentheses) is a `SyntaxError`; the tuple must be explicitly wrapped. Any module that contains this form will fail to import entirely.

The same bug appears throughout all the new files added in this PR:
- `src/ai_company/api/approval_store.py:172`
- `src/ai_company/api/controllers/approvals.py:100`
- `src/ai_company/engine/approval_gate.py:127, 149, 202, 216`
- `src/ai_company/engine/loop_helpers.py:69, 114, 186, 303, 387`
- `src/ai_company/tools/approval_tool.py:162, 238`
- `src/ai_company/tools/invoker.py:212, 295, 639`

Note: the existing (pre-PR) code in `invoker.py` already uses the correct form `except (MemoryError, RecursionError) as exc:` at lines 453, 539, 575, and 698 — confirming this is an inconsistency introduced only in the new additions.

Fix every occurrence by wrapping the two types in parentheses:

```suggestion
                except (MemoryError, RecursionError):
                    raise
```

The pattern to apply globally is:
```python
# Wrong (Python 2)
except MemoryError, RecursionError:

# Correct (Python 3)
except (MemoryError, RecursionError):
```

How can I resolve this? If you propose a fix, please make it concise.
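
Whether the unparenthesized form is a syntax error depends on the interpreter version. PEP 758 (Python 3.14) permits `except A, B:` when there is no `as` clause, and the project guidelines quoted further down this page require that form; a clause with `as exc` still needs parentheses. A minimal sketch for checking the running interpreter before applying the fix above (the helper name is illustrative, not part of the PR):

```python
import sys

# The exact clause flagged in the comment above, with no "as" binding.
SNIPPET = """
try:
    pass
except MemoryError, RecursionError:
    raise
"""


def unparenthesized_except_compiles() -> bool:
    """Return True if this interpreter accepts `except A, B:` without parentheses."""
    try:
        compile(SNIPPET, "<pep758-check>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    # Reports the interpreter version alongside the result.
    print(sys.version_info[:2], unparenthesized_except_compiles())
```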

---

This is a comment left during a code review.
Path: src/ai_company/engine/loop_helpers.py
Line: 372-378

Comment:
**`APPROVAL_GATE_CONTEXT_PARKED` emitted before parking has occurred**

This debug log fires when `ctx.task_execution` is `None` — i.e. before `park_context()` is even called. `APPROVAL_GATE_CONTEXT_PARKED` implies the context *has* been parked, so any log aggregation or alerting rule watching for `approval_gate.context.parked` will see a spurious event even if the subsequent `park_context()` call fails.

An informational constant such as `APPROVAL_GATE_ESCALATION_DETECTED` (or a dedicated "parking skipped task_id" diagnostic) is more accurate here. The successful-park event is already emitted inside `ApprovalGate.park_context()`, so this line also produces a duplicate on the happy path.

```suggestion
        logger.debug(
            APPROVAL_GATE_ESCALATION_DETECTED,
            approval_id=escalation.approval_id,
            agent_id=agent_id,
            note="No task_execution on context — task_id will be None",
        )
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: web/src/stores/approvals.ts
Line: 82-91

Comment:
**`total` incremented unconditionally when filters are active, regardless of whether the new item matches them**

When `activeFilters.value` is truthy, `total.value++` is incremented for every `approval.submitted` event even though the newly submitted item may not satisfy the current filter (e.g. the user is filtering by `status: 'rejected'` while a new `pending` item arrives). This inflates the displayed count and can cause pagination to request pages that don't exist.

A stricter approach would be to skip the increment when the item's status/risk_level/action_type don't match the active filter criteria, or to do a quick attribute check on the already-fetched `item` object:

```typescript
if (activeFilters.value) {
  // Only bump total if the item matches the active filter constraints
  const f = activeFilters.value
  const matches =
    (!f.status      || item.status     === f.status) &&
    (!f.risk_level  || item.risk_level === f.risk_level) &&
    (!f.action_type || item.action_type === f.action_type)
  if (matches) total.value++
} else {
  approvals.value = [item, ...approvals.value]
  total.value++
}
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: 91823f4

Copilot

Pull request overview

This PR adds an approval-gating mechanism to the agent execution engine so SecOps “ESCALATE” verdicts and explicit request_human_approval tool calls can park execution pending a human decision, along with API/UI plumbing and observability events to support the workflow.

Changes:

Introduces ApprovalGate + models, and integrates escalation/parking checks into ReAct and Plan-and-Execute loops.
Adds RequestHumanApprovalTool and extends ToolInvoker to track and expose pending_escalations.
Hardens approvals API identity handling and fixes frontend WebSocket approval updates by fetching full items via API.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
web/src/stores/approvals.ts	Updates WS approval event handling to use `approval_id` and fetch full items from API
tests/unit/tools/test_invoker_escalation.py	Adds tests for escalation tracking in `ToolInvoker`
tests/unit/tools/test_approval_tool.py	Adds tests for `RequestHumanApprovalTool` behavior/validation
tests/unit/observability/test_events.py	Registers `approval_gate` module in event discovery tests
tests/unit/engine/test_loop_helpers_approval.py	Adds tests for loop helper parking behavior when escalations occur
tests/unit/engine/test_approval_gate_models.py	Adds tests for `EscalationInfo` / `ResumePayload` validation
tests/unit/engine/test_approval_gate.py	Adds tests for `ApprovalGate` park/resume and resume-message formatting
src/ai_company/tools/registry.py	Adds `ToolRegistry.all_tools()` for enumerating tool instances
src/ai_company/tools/invoker.py	Tracks escalations (SecOps + parking metadata) via `pending_escalations`
src/ai_company/tools/approval_tool.py	Implements `RequestHumanApprovalTool` (creates `ApprovalItem` + parking metadata)
src/ai_company/tools/__init__.py	Exports `RequestHumanApprovalTool`
src/ai_company/observability/events/approval_gate.py	Adds approval-gate lifecycle event constants
src/ai_company/engine/react_loop.py	Wires optional `ApprovalGate` into ReAct loop tool execution
src/ai_company/engine/plan_execute_loop.py	Wires optional `ApprovalGate` into Plan-and-Execute step tool execution
src/ai_company/engine/loop_helpers.py	Adds parking path returning `TerminationReason.PARKED` after tool calls
src/ai_company/engine/approval_gate_models.py	Adds frozen Pydantic models for escalation + resume payload
src/ai_company/engine/approval_gate.py	Adds `ApprovalGate` service (park/resume + resume-message builder)
src/ai_company/engine/agent_engine.py	Constructs `ApprovalGate`, injects into default loop, registers approval tool when configured
src/ai_company/engine/__init__.py	Exposes approval-gate types from the engine package
src/ai_company/api/state.py	Adds `approval_gate` to app state
src/ai_company/api/controllers/approvals.py	Hardens `requested_by` binding to auth user; refactors decision logging; broadens WS publish error handling
src/ai_company/api/approval_store.py	Wraps `on_expire` callback in exception handling
docs/design/engine.md	Documents new approval-gate parking behavior in the engine pipeline
README.md	Updates status text to reflect approval gates being implemented
CLAUDE.md	Updates repo/module documentation to mention approval gate + approval tool

Comments suppressed due to low confidence (1)

src/ai_company/api/controllers/approvals.py:105

_publish_approval_event() now catches a broad Exception. If asyncio.CancelledError is an Exception in the runtime, this will swallow request cancellation and continue processing. Consider explicitly re-raising cancellation (and any other control-flow exceptions you treat as non-recoverable) alongside MemoryError/RecursionError.

    try:
        channels_plugin = _get_channels_plugin(request)
        channels_plugin.publish(
            event.model_dump_json(),
            channels=[CHANNEL_APPROVALS],
        )
    except MemoryError, RecursionError:
        raise
    except Exception:
        logger.warning(
            API_APPROVAL_PUBLISH_FAILED,
            approval_id=item.id,
            event_type=event_type.value,
            exc_info=True,
        )
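
On the cancellation concern: since Python 3.8, `asyncio.CancelledError` derives from `BaseException`, so a bare `except Exception` does not intercept it. The sketch below checks this on the running interpreter; `publish_with_broad_handler` is a hypothetical stand-in for the handler above, not the PR's code.

```python
import asyncio


async def publish_with_broad_handler() -> str:
    """Stand-in for a publish call wrapped in a broad `except Exception`."""
    try:
        await asyncio.sleep(10)
    except Exception:
        return "swallowed"
    return "done"


async def main() -> bool:
    """Return True if cancellation propagated past the broad handler."""
    task = asyncio.create_task(publish_with_broad_handler())
    await asyncio.sleep(0)  # give the task a chance to start waiting
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return True
    return False


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If the codebase ever needs to support a runtime where this does not hold, an explicit `except asyncio.CancelledError: raise` ahead of the broad handler would make the intent unambiguous.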

+                        risk_level=verdict.risk_level,
+                        reason=verdict.reason,
+                    ),
+                )


+        turns,
+        metadata={
+            "approval_id": escalation.approval_id,
+            "parking_failed": str(parking_failed),


+   parking is needed. If so, the context is serialized via `ParkService`,
+   persisted, and the loop returns a `PARKED` result.


+        self._park_service = park_service
+        self._parked_context_repo = parked_context_repo
+        logger.debug(
+            APPROVAL_GATE_ESCALATION_DETECTED,


+        logger.debug(
+            APPROVAL_GATE_ESCALATION_DETECTED,
+            action_type=action_type,
+            note="No risk classifier — defaulting to HIGH",
+        )


+          // Item may have been deleted — remove from local list
+          approvals.value = approvals.value.filter((a) => a.id !== approvalId)
+          total.value = Math.max(0, total.value - 1)


coderabbitai

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tests/unit/observability/test_events.py (1)
175-225: 🧹 Nitpick | 🔵 Trivial

Add direct assertions for the new approval_gate module.

This only proves the module is discoverable. If src/ai_company/observability/events/approval_gate.py ships with a missing constant or a misspelled value, this file still passes as long as the module exists and the remaining strings are unique. Please add a test_approval_gate_events_exist() block like the other domains.

Based on learnings, "Event names must always use constants from domain-specific modules under ai_company.observability.events."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/observability/test_events.py` around lines 175 - 225, Add a new
unit test in tests/unit/observability/test_events.py named
test_approval_gate_events_exist that imports
ai_company.observability.events.approval_gate and asserts the specific expected
event constants (or their string values) are present and correct (similar style
to other domain-specific tests), so that existence and correctness of constants
in approval_gate.py are validated rather than merely the module being
discoverable; reference the existing test_all_domain_modules_discovered for
placement and use approval_gate module symbols to construct the assertions.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/ai_company/api/approval_store.py`:
- Around line 151-160: The except clause currently catches the built-in
MemoryError but not the project-specific MemoryError defined in
ai_company.memory.errors, so re-raised project errors fall into the generic
Exception handler; import the project error (e.g., from ai_company.memory.errors
import MemoryError as ProjectMemoryError) and update the except line to
explicitly re-raise both RecursionError and the project MemoryError (or include
both types in the except tuple used for re-raising) before the generic Exception
that logs via logger.exception(API_APPROVAL_EXPIRED, approval_id=item.id,
note="on_expire callback failed"); ensure the symbol _on_expire remains where
called and no error types are swallowed.
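
The name-shadowing issue described for `approval_store.py` can be illustrated in isolation. In this sketch `ProjectMemoryError` is a hypothetical stand-in for `ai_company.memory.errors.MemoryError`, and it is assumed to derive from `Exception`; the real class hierarchy is not shown on this page.

```python
class ProjectMemoryError(Exception):
    """Hypothetical stand-in for the project's own MemoryError class."""


def handle(exc: BaseException, reraise_types: tuple) -> str:
    """Report which handler a raised exception lands in."""
    try:
        raise exc
    except reraise_types:
        return "re-raised"
    except Exception:
        return "logged"


# Only the builtin is listed, so the project error falls through to the logger.
builtin_only = (MemoryError, RecursionError)
# Listing the project error as well routes it to the re-raise branch.
with_project = (MemoryError, RecursionError, ProjectMemoryError)

if __name__ == "__main__":
    print(handle(ProjectMemoryError("oom"), builtin_only))
    print(handle(ProjectMemoryError("oom"), with_project))
```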

In `@src/ai_company/engine/agent_engine.py`:
- Around line 150-152: The default approval gate created by
_make_approval_gate() is missing the parked_context_repo so
ApprovalGate.resume_context() is a no-op and PARKED runs can't be resumed;
update the factory so the ApprovalGate created in _make_approval_gate (and any
place that constructs ApprovalGate in this class) receives the persistent
parked-context storage (self._approval_store or its parked_context_repo
attribute) as the parked_context_repo argument when constructing ApprovalGate,
ensuring resume_context() works and PARKED runs are resumable.

In `@src/ai_company/engine/approval_gate.py`:
- Around line 214-231: The current resume flow ignores the boolean return from
ParkedContextRepository.delete() and always logs APPROVAL_GATE_CONTEXT_RESUMED;
update the code in the method containing the await
self._parked_context_repo.delete(parked.id) call (e.g., the resume handler in
approval_gate.py) to capture the delete result, and if it returns False treat it
as a cleanup failure: log/exception with APPROVAL_GATE_RESUME_DELETE_FAILED
including approval_id and parked_id (and do not log
APPROVAL_GATE_CONTEXT_RESUMED), otherwise proceed to log
APPROVAL_GATE_CONTEXT_RESUMED and return (context, parked.id); preserve existing
exception handling for MemoryError/RecursionError.

In `@src/ai_company/engine/loop_helpers.py`:
- Around line 345-406: The metadata currently stores parking_failed as a string
in _park_for_approval; change it to store a boolean instead (i.e.,
parking_failed: parking_failed) when calling build_result so downstream
consumers receive a true boolean under the "parking_failed" metadata key; ensure
any serialization/consumption points that expect a string are updated to accept
a boolean (references: function _park_for_approval, local variable
parking_failed, build_result call, metadata key "parking_failed").
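
The practical risk with the stringified flag is that `"False"` is a non-empty string and therefore truthy. A small sketch of the difference, using two hypothetical metadata builders rather than the PR's `build_result`:

```python
def build_metadata_str(parking_failed: bool) -> dict:
    """Current behaviour as described: the flag is stored as a string."""
    return {"parking_failed": str(parking_failed)}


def build_metadata_bool(parking_failed: bool) -> dict:
    """Suggested behaviour: the flag is stored as a real boolean."""
    return {"parking_failed": parking_failed}


stringly = build_metadata_str(False)
typed = build_metadata_bool(False)

# A consumer that writes `if metadata["parking_failed"]:` misfires on the
# string form, because any non-empty string is truthy.
misfires = bool(stringly["parking_failed"])
correct = bool(typed["parking_failed"])
```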

In `@src/ai_company/tools/approval_tool.py`:
- Around line 96-176: The execute() method is too long and should be split into
small helpers: keep execute() to orchestrate only (extract validation, item
construction, persistence, and result shaping into private methods).
Specifically, replace the inline logic with calls to helpers such as
_validate_action_type(action_type) (already exists),
_build_approval_item(action_type, title, description) to construct the
ApprovalItem (move the ApprovalItem import and all field population there),
_persist_approval_item(item) to run the try/except around await
self._approval_store.add(item) (re-raise MemoryError/RecursionError and on
Exception call logger.exception(APPROVAL_GATE_ESCALATION_FAILED, ...) and return
an error ToolExecutionResult), and _format_success_result(approval_id,
action_type, risk_level, title) to return the success ToolExecutionResult and
log APPROVAL_GATE_ESCALATION_DETECTED. After extraction, execute() should call
these helpers and return early on validation/persistence errors so the function
body remains under the 50-line limit.

In `@tests/unit/engine/test_approval_gate.py`:
- Around line 51-274: Tests repeat setup of MagicMock/AsyncMock objects
(park_service, repo, parked) across many test coroutines in TestParking and
TestResumeContext; extract these into pytest fixtures to DRY the test file.
Create fixtures named e.g. park_service, parked_mock (or parked), and
parked_repo (or repo) that return the configured MagicMock/AsyncMock instances
and use them by adding them as parameters to tests that currently call
ApprovalGate(...)/park_context()/resume_context(), updating tests like
test_calls_park_service, test_persists_to_repo_when_available,
test_successful_resume, etc., to accept the fixtures and remove the repeated
wiring inside each test while preserving any test-specific side_effects or
return overrides.
- Around line 279-310: The three tests test_approved_without_reason,
test_rejected_with_reason, and test_approved_with_reason duplicate the same
structure calling ApprovalGate.build_resume_message and asserting substrings;
refactor them into a single `@pytest.mark.parametrize` test that iterates over
cases (id, approved boolean, decided_by, decision_reason, expected_substrings)
and calls ApprovalGate.build_resume_message once per case, asserting each
expected substring is in the returned msg; update or remove the original three
test functions and keep descriptive parameter values to preserve coverage.

In `@tests/unit/engine/test_loop_helpers_approval.py`:
- Line 3: The test unnecessarily uses PropertyMock to stub pending_escalations;
replace type(invoker).pending_escalations =
PropertyMock(return_value=escalations) with a simple instance attribute
assignment (e.g., invoker.pending_escalations = escalations or setattr(invoker,
"pending_escalations", escalations)) in the helper and the other occurrence
around lines 55–64, and remove PropertyMock from the imports in the file
(adjusting the import list to only AsyncMock, MagicMock, patch).
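
For a plain `MagicMock`, both stubbing styles yield the same attribute value, which is why the simpler assignment is enough here. A minimal comparison, with a placeholder tuple standing in for real `EscalationInfo` objects:

```python
from unittest.mock import MagicMock, PropertyMock

escalations = ("esc-1",)  # placeholder for a tuple of EscalationInfo objects

# Style flagged in the review: patch a property onto the mock's type.
invoker_a = MagicMock()
type(invoker_a).pending_escalations = PropertyMock(return_value=escalations)

# Suggested style: assign the value directly on the instance.
invoker_b = MagicMock()
invoker_b.pending_escalations = escalations
```

`PropertyMock` remains useful when a test needs to assert how often the property was read or to vary its value per access; neither appears to be required by the finding above.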

In `@web/src/stores/approvals.ts`:
- Around line 72-80: The catch currently treats any error from
approvalsApi.getApproval as a delete; change it to inspect the error response
and only remove/ignore the item when the error is a definitive "not found" (HTTP
404 or 410) — for other errors (timeouts, 5xx, network) leave approvals.value
and total.value untouched and trigger the existing list refresh/fallback path
instead of decrementing state; locate both call sites to
approvalsApi.getApproval (the try/catch around getApproval and the similar block
at lines handling the other event) and update their catch handlers to branch on
error status (404/410 vs others) so only 404/410 perform the remove/ignore
behavior and other errors preserve current state and perform a refresh.
- Around line 71-77: When a websocket-driven update fetches an approval via
approvalsApi.getApproval(approvalId), do not unconditionally prepend it when
activeFilters.value is falsy; instead run the current filter predicate against
the fetched item (use the same filter logic used to build approvals.value) and
insert the item only if it matches, or remove it from approvals.value if it no
longer matches, and update total.value accordingly (increment when inserting a
new matching item, decrement when removing a previously-present matching item).
Apply the same re-filter/insert-or-remove logic to the other WS-handling block
(the code around the approvals update at lines similar to 86-90) so both
websocket update paths respect activeFilters.value (including empty objects) and
keep total.value in sync.

---

Outside diff comments:
In `@tests/unit/observability/test_events.py`:
- Around line 175-225: Add a new unit test in
tests/unit/observability/test_events.py named test_approval_gate_events_exist
that imports ai_company.observability.events.approval_gate and asserts the
specific expected event constants (or their string values) are present and
correct (similar style to other domain-specific tests), so that existence and
correctness of constants in approval_gate.py are validated rather than merely
the module being discoverable; reference the existing
test_all_domain_modules_discovered for placement and use approval_gate module
symbols to construct the assertions.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: afb7d111-796d-4bf3-800d-74573c983a99

📥 Commits

Reviewing files that changed from the base of the PR and between 494013f and b999b65.

📒 Files selected for processing (25)

CLAUDE.md
README.md
docs/design/engine.md
src/ai_company/api/approval_store.py
src/ai_company/api/controllers/approvals.py
src/ai_company/api/state.py
src/ai_company/engine/__init__.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/approval_gate.py
src/ai_company/engine/approval_gate_models.py
src/ai_company/engine/loop_helpers.py
src/ai_company/engine/plan_execute_loop.py
src/ai_company/engine/react_loop.py
src/ai_company/observability/events/approval_gate.py
src/ai_company/tools/__init__.py
src/ai_company/tools/approval_tool.py
src/ai_company/tools/invoker.py
src/ai_company/tools/registry.py
tests/unit/engine/test_approval_gate.py
tests/unit/engine/test_approval_gate_models.py
tests/unit/engine/test_loop_helpers_approval.py
tests/unit/observability/test_events.py
tests/unit/tools/test_approval_tool.py
tests/unit/tools/test_invoker_escalation.py
web/src/stores/approvals.ts

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)

GitHub Check: Agent
GitHub Check: Build Backend
GitHub Check: Build Web
GitHub Check: Greptile Review
GitHub Check: Test (Python 3.14)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (8)

web/src/**/*.{vue,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Web dashboard: Vue 3 + PrimeVue + Tailwind CSS, organized by feature in src/components/, src/stores/, src/views/. Enforce with ESLint and vue-tsc type-checking.

Files:

web/src/stores/approvals.ts

web/src/stores/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Frontend state management: Pinia stores organized by feature in src/stores/

Files:

web/src/stores/approvals.ts

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python version 3.14+ with PEP 649 native lazy annotations required.
Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14.
Line length must be 88 characters, enforced by ruff.
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Tests must use test-provider, test-small-001, etc.

Files:

tests/unit/engine/test_loop_helpers_approval.py
tests/unit/observability/test_events.py
tests/unit/tools/test_invoker_escalation.py
src/ai_company/engine/plan_execute_loop.py
src/ai_company/engine/approval_gate.py
src/ai_company/engine/loop_helpers.py
src/ai_company/api/state.py
src/ai_company/observability/events/approval_gate.py
src/ai_company/tools/invoker.py
src/ai_company/engine/__init__.py
src/ai_company/tools/__init__.py
tests/unit/tools/test_approval_tool.py
src/ai_company/api/approval_store.py
src/ai_company/tools/registry.py
src/ai_company/engine/approval_gate_models.py
tests/unit/engine/test_approval_gate_models.py
src/ai_company/tools/approval_tool.py
src/ai_company/api/controllers/approvals.py
tests/unit/engine/test_approval_gate.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/react_loop.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Tests must use markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Test coverage minimum: 80% (enforced in CI).
Async tests: use asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed.
Test timeout: 30 seconds per test.
Prefer @pytest.mark.parametrize for testing similar cases.

Files:

tests/unit/engine/test_loop_helpers_approval.py
tests/unit/observability/test_events.py
tests/unit/tools/test_invoker_escalation.py
tests/unit/tools/test_approval_tool.py
tests/unit/engine/test_approval_gate_models.py
tests/unit/engine/test_approval_gate.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: All public functions must have type hints. Enforce via mypy strict mode.
Google-style docstrings required on public classes and functions, enforced by ruff D rules.
Create new objects instead of mutating existing ones. For non-Pydantic internal collections, use copy.deepcopy() at construction plus MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with BaseModel, model_validator, computed_field, ConfigDict. Use @computed_field for derived values instead of storing redundant fields (e.g., TokenUsage.total_tokens).
Use NotBlankStr from core.types for all identifier/name fields in Pydantic models, including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants, instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries: user input, external APIs, config files.
NEVER use import logging, logging.getLogger(), or print() in application code.
Always use logger as the variable name for loggers (not _logger, not log).
Event names must always use constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget). Import dire...

Files:

src/ai_company/engine/plan_execute_loop.py
src/ai_company/engine/approval_gate.py
src/ai_company/engine/loop_helpers.py
src/ai_company/api/state.py
src/ai_company/observability/events/approval_gate.py
src/ai_company/tools/invoker.py
src/ai_company/engine/__init__.py
src/ai_company/tools/__init__.py
src/ai_company/api/approval_store.py
src/ai_company/tools/registry.py
src/ai_company/engine/approval_gate_models.py
src/ai_company/tools/approval_tool.py
src/ai_company/api/controllers/approvals.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/react_loop.py

src/ai_company/**/[!_]*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)

Files:

src/ai_company/engine/plan_execute_loop.py
src/ai_company/engine/approval_gate.py
src/ai_company/engine/loop_helpers.py
src/ai_company/api/state.py
src/ai_company/observability/events/approval_gate.py
src/ai_company/tools/invoker.py
src/ai_company/api/approval_store.py
src/ai_company/tools/registry.py
src/ai_company/engine/approval_gate_models.py
src/ai_company/tools/approval_tool.py
src/ai_company/api/controllers/approvals.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/react_loop.py

docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When approved deviations occur, update the relevant docs/design/ page to reflect the new reality.

Files:

docs/design/engine.md

docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Documentation source in docs/ built with Zensical. Design spec in docs/design/ (7 pages). Architecture in docs/architecture/. Roadmap in docs/roadmap/. Security in docs/security.md.

Files:

docs/design/engine.md

🧠 Learnings (9)

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

tests/unit/observability/test_events.py
src/ai_company/engine/loop_helpers.py
src/ai_company/observability/events/approval_gate.py
CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/ai_company/**/[!_]*.py : Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

src/ai_company/api/state.py
CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Structured logging: always use `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Always use `logger` as the variable name for loggers (not `_logger`, not `log`).

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : DEBUG logging for object creation, internal flow, and entry/exit of key functions.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

CLAUDE.md

🧬 Code graph analysis (18)

web/src/stores/approvals.ts (1)

web/src/api/types.ts (1)

WsEvent (519-524)

tests/unit/engine/test_loop_helpers_approval.py (7)

src/ai_company/engine/approval_gate.py (3)

ApprovalGate (39-263)

should_park (71-90)

park_context (92-161)

src/ai_company/engine/approval_gate_models.py (1)

EscalationInfo (14-33)

src/ai_company/engine/loop_helpers.py (1)

execute_tool_calls (240-342)

src/ai_company/engine/loop_protocol.py (2)

ExecutionResult (79-140)

TerminationReason (28-36)

src/ai_company/providers/enums.py (1)

FinishReason (15-22)

src/ai_company/providers/models.py (3)

CompletionResponse (257-306)

ToolCall (96-119)

ToolResult (122-135)

src/ai_company/engine/run_result.py (1)

termination_reason (64-66)

tests/unit/tools/test_invoker_escalation.py (5)

src/ai_company/providers/models.py (1)

ToolCall (96-119)

src/ai_company/security/models.py (2)

SecurityVerdict (35-67)

SecurityVerdictType (23-32)

src/ai_company/tools/base.py (3)

BaseTool (57-184)

ToolExecutionResult (25-54)

description (138-140)

src/ai_company/tools/invoker.py (5)

ToolInvoker (73-756)

registry (123-125)

pending_escalations (128-136)

invoke (343-365)

invoke_all (694-756)

src/ai_company/tools/registry.py (1)

ToolRegistry (30-126)

src/ai_company/engine/plan_execute_loop.py (2)

src/ai_company/api/state.py (1)

approval_gate (139-141)

src/ai_company/engine/approval_gate.py (1)

ApprovalGate (39-263)

src/ai_company/engine/approval_gate.py (5)

src/ai_company/persistence/repositories.py (1)

ParkedContextRepository (199-267)

src/ai_company/security/timeout/park_service.py (3)

ParkService (29-144)

park (37-106)

resume (108-144)

src/ai_company/security/timeout/parked_context.py (1)

ParkedContext (19-64)

src/ai_company/engine/approval_gate_models.py (1)

EscalationInfo (14-33)

src/ai_company/engine/context.py (1)

AgentContext (87-307)

src/ai_company/api/state.py (1)

src/ai_company/engine/approval_gate.py (1)

ApprovalGate (39-263)

src/ai_company/tools/invoker.py (2)

src/ai_company/core/enums.py (1)

ApprovalRiskLevel (443-449)

src/ai_company/engine/approval_gate_models.py (1)

EscalationInfo (14-33)

src/ai_company/engine/__init__.py (3)

src/ai_company/api/state.py (1)

approval_gate (139-141)

src/ai_company/engine/approval_gate.py (1)

ApprovalGate (39-263)

src/ai_company/engine/approval_gate_models.py (2)

EscalationInfo (14-33)

ResumePayload (36-51)

src/ai_company/tools/__init__.py (1)

src/ai_company/tools/approval_tool.py (1)

RequestHumanApprovalTool (31-208)

tests/unit/tools/test_approval_tool.py (3)

src/ai_company/api/approval_store.py (3)

ApprovalStore (27-162)

get (61-73)

add (42-59)

src/ai_company/security/timeout/risk_tier_classifier.py (1)

DefaultRiskTierClassifier (62-101)

src/ai_company/tools/approval_tool.py (2)

RequestHumanApprovalTool (31-208)

execute (96-176)

src/ai_company/api/approval_store.py (2)

src/ai_company/api/app.py (1)

_on_expire (74-99)

src/ai_company/memory/errors.py (1)

MemoryError (13-14)

src/ai_company/tools/registry.py (1)

src/ai_company/tools/base.py (2)

BaseTool (57-184)

name (123-125)

src/ai_company/engine/approval_gate_models.py (1)

src/ai_company/tools/base.py (1)

action_type (133-135)

tests/unit/engine/test_approval_gate_models.py (1)

src/ai_company/engine/approval_gate_models.py (2)

EscalationInfo (14-33)

ResumePayload (36-51)

src/ai_company/tools/approval_tool.py (5)

src/ai_company/core/enums.py (1)

ToolCategory (294-308)

src/ai_company/tools/base.py (6)

BaseTool (57-184)

ToolExecutionResult (25-54)

description (138-140)

category (128-130)

action_type (133-135)

parameters_schema (143-151)

src/ai_company/api/approval_store.py (2)

ApprovalStore (27-162)

add (42-59)

src/ai_company/security/timeout/risk_tier_classifier.py (1)

DefaultRiskTierClassifier (62-101)

src/ai_company/memory/errors.py (1)

MemoryError (13-14)

tests/unit/engine/test_approval_gate.py (3)

src/ai_company/engine/approval_gate.py (5)

ApprovalGate (39-263)

should_park (71-90)

park_context (92-161)

resume_context (163-231)

build_resume_message (234-263)

src/ai_company/engine/approval_gate_models.py (1)

EscalationInfo (14-33)

src/ai_company/security/timeout/park_service.py (3)

ParkService (29-144)

park (37-106)

resume (108-144)

src/ai_company/engine/agent_engine.py (6)

src/ai_company/engine/approval_gate.py (1)

ApprovalGate (39-263)

src/ai_company/security/timeout/park_service.py (1)

ParkService (29-144)

src/ai_company/core/agent.py (1)

AgentIdentity (266-342)

src/ai_company/tools/registry.py (2)

ToolRegistry (30-126)

all_tools (102-104)

src/ai_company/tools/approval_tool.py (1)

RequestHumanApprovalTool (31-208)

src/ai_company/security/timeout/risk_tier_classifier.py (1)

DefaultRiskTierClassifier (62-101)

src/ai_company/engine/react_loop.py (3)

tests/unit/engine/conftest.py (1)

engine (449-460)

src/ai_company/api/state.py (1)

approval_gate (139-141)

src/ai_company/engine/approval_gate.py (1)

ApprovalGate (39-263)

🪛 LanguageTool

CLAUDE.md

[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...

(EG_NO_COMMA)

🔇 Additional comments (24)

src/ai_company/engine/__init__.py (1)

9-10: LGTM!

The new public API exports for ApprovalGate, EscalationInfo, and ResumePayload are correctly imported and added to __all__ in alphabetical order, maintaining consistency with the existing export pattern.

docs/design/engine.md (1)

450-467: LGTM!

The documentation accurately captures the new escalation-handling flow and PARKED termination behavior. The description aligns with the implementation in loop_helpers.py and ApprovalGate, including the interaction with ToolInvoker.pending_escalations, ParkService, and the approval-timeout policy.

tests/unit/tools/test_approval_tool.py (4)

1-53: LGTM!

Well-structured test file with proper pytest markers (unit, timeout(30)). The fixtures correctly set up the ApprovalStore and RequestHumanApprovalTool instances. The TestToolCreation class comprehensively validates tool properties including name, action_type, and parameters schema.

56-144: LGTM!

Thorough execution tests covering the happy path, metadata validation, default risk level, content verification, and the task_id=None scenario. The assertions correctly verify both the ToolExecutionResult metadata and the persisted ApprovalItem state.

177-204: LGTM!

Good use of @pytest.mark.parametrize to cover multiple invalid action_type formats efficiently. The test cases cover edge cases like missing category, missing action, extra colons, and whitespace-only values.
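As a rough illustration of the kind of check these parametrized cases exercise, the sketch below assumes that action_type must be exactly `category:action` with non-blank text on both sides. The `is_valid_action_type` helper and the sample values are hypothetical and are not taken from the PR; the real validation in approval_tool.py may differ.

```python
def is_valid_action_type(value: str) -> bool:
    """Hypothetical check: exactly one colon, non-blank text on both sides."""
    parts = value.split(":")
    if len(parts) != 2:
        return False  # no colon at all, or extra colons
    category, action = parts
    return bool(category.strip()) and bool(action.strip())


# Invalid shapes mirroring the cases named in the review comment:
# missing category, missing action, extra colons, whitespace-only, no colon.
INVALID_CASES = [":deploy", "code:", "a:b:c", "   ", "deploy"]

if __name__ == "__main__":
    for case in INVALID_CASES:
        print(repr(case), is_valid_action_type(case))
```

Listing the invalid inputs as data keeps each new edge case a one-line addition, which is the same benefit the parametrized test gets.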

220-256: LGTM!

The error handling test correctly verifies graceful degradation when the store fails. The monkeypatch approach is appropriate for simulating store failures in unit tests.

src/ai_company/api/controllers/approvals.py (4)

97-105: LGTM!

Correct use of PEP 758 except syntax (except MemoryError, RecursionError:) and proper error handling pattern — re-raising critical errors while logging warnings for recoverable failures.

148-166: LGTM!

Clean helper function that centralizes approval decision logging with appropriate event constants. The docstring correctly notes that context resumption is out of scope for this controller.

260-278: Good security hardening: requested_by bound to authenticated user.

The change correctly enforces that requested_by is populated from the authenticated user's username rather than the request body, preventing potential spoofing. The UnauthorizedError is appropriately raised when authentication is missing.

367-371: LGTM!

Correctly delegates to _log_approval_decision for consistent observability across approval decisions.

src/ai_company/engine/approval_gate_models.py (2)

14-33: LGTM!

Well-designed frozen Pydantic model with proper use of NotBlankStr for all identifier fields and ApprovalRiskLevel enum for type safety. The docstring clearly documents each attribute's purpose.

36-51: LGTM!

Clean model design for approval decision payloads. The optional decision_reason field correctly uses NotBlankStr | None to ensure non-blank values when provided.

src/ai_company/engine/loop_helpers.py (3)

11-13: LGTM!

Correct import of event constant from the domain-specific observability module. Based on learnings: "Event names must always use constants from domain-specific modules under ai_company.observability.events".

240-248: LGTM!

Clean API extension with a keyword-only approval_gate parameter defaulting to None. This preserves backward compatibility while enabling the escalation parking flow when configured.

329-342: LGTM!

The escalation check is correctly placed after tool execution completes. The flow properly delegates to _park_for_approval when an escalation warrants parking.

src/ai_company/engine/react_loop.py (2)

65-70: LGTM!

Clean constructor extension with keyword-only parameter and None default, maintaining backward compatibility.

217-224: LGTM!

The approval_gate is correctly forwarded to execute_tool_calls, enabling the escalation parking flow when configured.

src/ai_company/api/state.py (4)

14-14: LGTM!

Correct import added with noqa: TC001 for type-checking compliance.

39-49: LGTM!

The __slots__ tuple is correctly updated with the new _approval_gate entry, maintaining alphabetical order.

61-66: LGTM!

Constructor correctly accepts the optional approval_gate parameter and stores it in the private attribute.

138-141: LGTM!

The property correctly returns ApprovalGate | None rather than using _require_service, which aligns with the design that approval gate is optional and doesn't require a 503 error when unconfigured.
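To make the distinction concrete, here is a small sketch contrasting a required service with an optional one. The `_require_service` signature, the `task_engine` property, and `ServiceUnavailableError` are assumptions made for illustration; only the idea that `approval_gate` returns `None` when unconfigured comes from the review.

```python
class ServiceUnavailableError(Exception):
    """Hypothetical error that an API layer could map to HTTP 503."""


class AppState:
    __slots__ = ("_approval_gate", "_task_engine")

    def __init__(self, *, task_engine=None, approval_gate=None):
        self._task_engine = task_engine
        self._approval_gate = approval_gate

    def _require_service(self, service, name):
        if service is None:
            raise ServiceUnavailableError(f"{name} is not configured")
        return service

    @property
    def task_engine(self):
        # Required: callers cannot proceed without it, so fail loudly.
        return self._require_service(self._task_engine, "task_engine")

    @property
    def approval_gate(self):
        # Optional: None simply means no approval gating is configured.
        return self._approval_gate
```

Callers of the optional property are expected to branch on `None` themselves instead of catching an error.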

tests/unit/engine/test_approval_gate.py (3)

12-30: Solid test module scaffolding and helper setup.

pytestmark usage and _make_escalation helper keep the tests consistent and readable.

33-46: should_park behavior coverage is concise and correct.

The empty-tuple and first-escalation assertions match the intended contract exactly.

51-274: Good depth on parking/resume lifecycle paths.

These tests cover success and failure branches (serialization, persistence, deserialization, cleanup failure) with the right behavioral assertions.

coderabbitai · 2026-03-13T22:53:33Z

+                try:
+                    self._on_expire(expired)
+                except MemoryError, RecursionError:
+                    raise
+                except Exception:
+                    logger.exception(
+                        API_APPROVAL_EXPIRED,
+                        approval_id=item.id,
+                        note="on_expire callback failed",
+                    )


⚠️ Potential issue | 🟠 Major

Import the project MemoryError here.

src/ai_company/memory/errors.py defines the project-specific MemoryError, and src/ai_company/api/app.py deliberately re-raises it from _on_expire(). Because this module never imports that type, Line 153 only catches the built-in MemoryError; the project exception falls into except Exception and gets downgraded to a log.

Suggested fix

 from ai_company.api.errors import ConflictError
 from ai_company.core.approval import ApprovalItem  # noqa: TC001
 from ai_company.core.enums import (
     ApprovalRiskLevel,
     ApprovalStatus,
 )
+from ai_company.memory.errors import MemoryError as StoreMemoryError
 ...
-                except MemoryError, RecursionError:
+                except StoreMemoryError, MemoryError, RecursionError:
                     raise

As per coding guidelines, "Handle errors explicitly, never silently swallow exceptions."
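The name collision behind this finding can be reproduced in isolation. The sketch below assumes the project error is a plain `Exception` subclass that happens to be named `MemoryError` (its real base class is not shown in this thread). It uses the parenthesized `except (A, B):` form so it also runs on interpreters older than 3.14; the project itself uses the unparenthesized PEP 758 form.

```python
def make_project_error() -> type[Exception]:
    """Build a class named MemoryError without shadowing the builtin here."""

    class MemoryError(Exception):  # same name, unrelated to the builtin
        """Stand-in for a project-specific memory-store error."""

    return MemoryError


ProjectMemoryError = make_project_error()


def classify(exc: Exception) -> str:
    """Mimic a handler in a module that never imported the project type."""
    try:
        raise exc
    except (MemoryError, RecursionError):  # resolves to the *builtin*
        return "re-raised"
    except Exception:
        return "logged and swallowed"


if __name__ == "__main__":
    print(classify(MemoryError()))             # builtin error
    print(classify(ProjectMemoryError("x")))   # project error
```

Importing the project type under an alias, as the suggested fix does, lets the handler name both classes without ambiguity.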

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/api/approval_store.py` around lines 151 - 160, The except clause currently catches the built-in MemoryError but not the project-specific MemoryError defined in ai_company.memory.errors, so re-raised project errors fall into the generic Exception handler; import the project error (e.g., from ai_company.memory.errors import MemoryError as ProjectMemoryError) and update the except line to explicitly re-raise both RecursionError and the project MemoryError (or include both types in the except tuple used for re-raising) before the generic Exception that logs via logger.exception(API_APPROVAL_EXPIRED, approval_id=item.id, note="on_expire callback failed"); ensure the symbol _on_expire remains where called and no error types are swallowed.

coderabbitai · 2026-03-13T22:53:33Z

+        try:
+            await self._parked_context_repo.delete(parked.id)
+        except MemoryError, RecursionError:
+            raise
+        except Exception:
+            logger.exception(
+                APPROVAL_GATE_RESUME_DELETE_FAILED,
+                approval_id=approval_id,
+                parked_id=parked.id,
+                note="Context resumed but parked record not cleaned up",
+            )
+
+        logger.info(
+            APPROVAL_GATE_CONTEXT_RESUMED,
+            approval_id=approval_id,
+            parked_id=parked.id,
+        )
+        return context, parked.id


⚠️ Potential issue | 🟠 Major

Check the repository's delete() result before claiming resume success.

ParkedContextRepository.delete() returns bool, but this method ignores a False result and still logs APPROVAL_GATE_CONTEXT_RESUMED. If cleanup fails silently, the parked record is still present and the same approval can be resumed again.

Possible fix

         try:
-            await self._parked_context_repo.delete(parked.id)
+            deleted = await self._parked_context_repo.delete(parked.id)
         except MemoryError, RecursionError:
             raise
         except Exception:
             logger.exception(
                 APPROVAL_GATE_RESUME_DELETE_FAILED,
                 approval_id=approval_id,
                 parked_id=parked.id,
                 note="Context resumed but parked record not cleaned up",
             )
+        else:
+            if not deleted:
+                logger.warning(
+                    APPROVAL_GATE_RESUME_DELETE_FAILED,
+                    approval_id=approval_id,
+                    parked_id=parked.id,
+                    note="Context resumed but parked record not cleaned up",
+                )
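The three cleanup outcomes (deleted, nothing deleted, delete raised) can be told apart with a `try`/`except`/`else` structure. The following is a simplified, synchronous sketch: `cleanup_parked` and its string return values are hypothetical, the `delete` argument is any callable returning a bool, and the real repository method is async.

```python
import logging

logger = logging.getLogger("approval_gate_sketch")


def cleanup_parked(delete, parked_id: str) -> str:
    """Return which outcome occurred: 'deleted', 'missing', or 'errored'."""
    try:
        deleted = delete(parked_id)
    except (MemoryError, RecursionError):
        raise  # never swallow these
    except Exception:
        logger.exception("delete raised for %s", parked_id)
        return "errored"
    else:
        # Only reached when delete() returned normally, so a False result
        # is never confused with (or double-logged alongside) an exception.
        if not deleted:
            logger.warning("delete returned False for %s", parked_id)
            return "missing"
    return "deleted"
```

Placing the `False` check in the `else` branch means exactly one log record is emitted per failure mode.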

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/approval_gate.py` around lines 214 - 231, The current resume flow ignores the boolean return from ParkedContextRepository.delete() and always logs APPROVAL_GATE_CONTEXT_RESUMED; update the code in the method containing the await self._parked_context_repo.delete(parked.id) call (e.g., the resume handler in approval_gate.py) to capture the delete result, and if it returns False treat it as a cleanup failure: log/exception with APPROVAL_GATE_RESUME_DELETE_FAILED including approval_id and parked_id (and do not log APPROVAL_GATE_CONTEXT_RESUMED), otherwise proceed to log APPROVAL_GATE_CONTEXT_RESUMED and return (context, parked.id); preserve existing exception handling for MemoryError/RecursionError.

coderabbitai · 2026-03-13T22:53:34Z

+    def test_approved_without_reason(self) -> None:
+        msg = ApprovalGate.build_resume_message(
+            "approval-1",
+            approved=True,
+            decided_by="admin",
+        )
+        assert "APPROVED" in msg
+        assert "approval-1" in msg
+        assert "admin" in msg
+        assert "data only" in msg
+
+    def test_rejected_with_reason(self) -> None:
+        msg = ApprovalGate.build_resume_message(
+            "approval-1",
+            approved=False,
+            decided_by="reviewer",
+            decision_reason="Too risky for production",
+        )
+        assert "REJECTED" in msg
+        assert "approval-1" in msg
+        assert "reviewer" in msg
+        assert "Too risky for production" in msg
+
+    def test_approved_with_reason(self) -> None:
+        msg = ApprovalGate.build_resume_message(
+            "approval-1",
+            approved=True,
+            decided_by="admin",
+            decision_reason="Looks good",
+        )
+        assert "APPROVED" in msg
+        assert "Looks good" in msg


🧹 Nitpick | 🔵 Trivial

Parameterize message-variant assertions to reduce duplication.

These three tests are structurally the same and are a good fit for a single parameterized test.

♻️ Suggested refactor

 class TestBuildResumeMessage:
-    def test_approved_without_reason(self) -> None:
-        ...
-
-    def test_rejected_with_reason(self) -> None:
-        ...
-
-    def test_approved_with_reason(self) -> None:
-        ...
+    @pytest.mark.parametrize(
+        ("approved", "decided_by", "decision_reason", "expected_tokens"),
+        [
+            (True, "admin", None, ["APPROVED", "approval-1", "admin", "data only"]),
+            (
+                False,
+                "reviewer",
+                "Too risky for production",
+                ["REJECTED", "approval-1", "reviewer", "Too risky for production"],
+            ),
+            (True, "admin", "Looks good", ["APPROVED", "Looks good"]),
+        ],
+    )
+    def test_build_resume_message_variants(
+        self,
+        approved: bool,
+        decided_by: str,
+        decision_reason: str | None,
+        expected_tokens: list[str],
+    ) -> None:
+        msg = ApprovalGate.build_resume_message(
+            "approval-1",
+            approved=approved,
+            decided_by=decided_by,
+            decision_reason=decision_reason,
+        )
+        for token in expected_tokens:
+            assert token in msg

As per coding guidelines, "Prefer @pytest.mark.parametrize for testing similar cases."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unit/engine/test_approval_gate.py` around lines 279 - 310, The three tests test_approved_without_reason, test_rejected_with_reason, and test_approved_with_reason duplicate the same structure calling ApprovalGate.build_resume_message and asserting substrings; refactor them into a single `@pytest.mark.parametrize` test that iterates over cases (id, approved boolean, decided_by, decision_reason, expected_substrings) and calls ApprovalGate.build_resume_message once per case, asserting each expected substring is in the returned msg; update or remove the original three test functions and keep descriptive parameter values to preserve coverage.

codecov · 2026-03-13T22:56:14Z

Codecov Report

❌ Patch coverage is 75.30864% with 80 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.75%. Comparing base (494013f) to head (91823f4).
⚠️ Report is 5 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/ai_company/engine/_security_factory.py	40.38%	28 Missing and 3 partials ⚠️
src/ai_company/api/controllers/approvals.py	53.84%	10 Missing and 2 partials ⚠️
src/ai_company/api/approval_store.py	33.33%	8 Missing and 2 partials ⚠️
src/ai_company/tools/invoker.py	66.66%	7 Missing and 1 partial ⚠️
src/ai_company/tools/approval_tool.py	89.83%	6 Missing ⚠️
src/ai_company/engine/approval_gate.py	94.02%	4 Missing ⚠️
src/ai_company/api/dto.py	75.00%	2 Missing and 1 partial ⚠️
src/ai_company/engine/agent_engine.py	80.00%	2 Missing and 1 partial ⚠️
src/ai_company/engine/loop_helpers.py	85.00%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #365      +/-   ##
==========================================
- Coverage   93.90%   93.75%   -0.16%     
==========================================
  Files         447      452       +5     
  Lines       20819    21082     +263     
  Branches     2011     2034      +23     
==========================================
+ Hits        19551    19765     +214     
- Misses        981     1021      +40     
- Partials      287      296       +9


Critical: empty task_id ValidationError, parking metadata bypass, PARKED→ERROR on park failure, wire parked_context_repo, TOCTOU race via save_if_pending, resume trigger wiring in approve/reject.
Security: prompt injection mitigation, max_length on comment, ttl_seconds cap, action_type format validation, remove phantom requested_by field.
Logging: wrong event constants for init/risk/park-info, consistent MemoryError logging, new INITIALIZED/RISK_CLASSIFIED/RESUME_TRIGGERED constants.
Code quality: extract security/tool factories (962→860 lines), split execute() into helpers, deterministic escalation order, boolean parking_failed, check delete() result, risk classifier error handling.
Frontend: async handler type fix, WS test payload/await fix, filter-aware WS, 404 vs transient error discrimination, total desync fix, empty catch logging.
Tests: park failure expects ERROR, module-level markers, extracted fixtures, simplified PropertyMock, updated resume message assertions.
Docs: persistence qualifier, CLAUDE.md events, docstring fixes.

coderabbitai

Actionable comments posted: 7

♻️ Duplicate comments (2)

web/src/stores/approvals.ts (1)

76-95: ⚠️ Potential issue | 🟡 Minor

WS approval.submitted handler still doesn't verify filter match.

When activeFilters.value is truthy, the code bumps total without checking if the fetched item actually matches the active filters. This can cause total to drift from the actual filtered count.

Consider verifying the fetched item against active filters and only incrementing total (and optionally inserting into approvals.value) when it matches:

🔧 Suggested approach

           case 'approval.submitted':
             if (!approvals.value.some((a) => a.id === approvalId)) {
               try {
                 const item = await approvalsApi.getApproval(approvalId)
-                if (activeFilters.value) {
-                  // Filters are active — item may not match; just bump total
-                  total.value++
-                } else {
+                const matchesFilters = !activeFilters.value || itemMatchesFilters(item, activeFilters.value)
+                if (matchesFilters) {
                   approvals.value = [item, ...approvals.value]
                   total.value++
                 }

You would need to implement an itemMatchesFilters helper that checks the item's status, risk_level, etc. against activeFilters.value.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@web/src/stores/approvals.ts` around lines 76 - 95, The handler for
'approval.submitted' currently increments total.value when activeFilters.value
is set without validating the new item against those filters; implement an
itemMatchesFilters(item, filters) helper and after fetching the item via
approvalsApi.getApproval(approvalId) call it when activeFilters.value is truthy
— only increment total.value (and insert item into approvals.value if you want
it visible) when itemMatchesFilters returns true; keep existing 404/410 handling
and only fall back to the previous behavior when no filters are active.

tests/unit/engine/test_approval_gate.py (1)

296-329: 🧹 Nitpick | 🔵 Trivial

Consider parameterizing similar message tests.

These three tests follow the same pattern and could be consolidated using @pytest.mark.parametrize. However, the current form is readable and the tests are few, so this is optional.

Parameterized version (optional)

@pytest.mark.parametrize(
    ("approved", "decided_by", "decision_reason", "expected_tokens"),
    [
        (True, "admin", None, ["APPROVED", "approval-1", "admin", "[SYSTEM:"]),
        (False, "reviewer", "Too risky for production", 
         ["REJECTED", "approval-1", "reviewer", "Too risky for production", "USER-SUPPLIED REASON", "untrusted data"]),
        (True, "admin", "Looks good", ["APPROVED", "Looks good", "USER-SUPPLIED REASON"]),
    ],
)
def test_build_resume_message_variants(
    self,
    approved: bool,
    decided_by: str,
    decision_reason: str | None,
    expected_tokens: list[str],
) -> None:
    msg = ApprovalGate.build_resume_message(
        "approval-1",
        approved=approved,
        decided_by=decided_by,
        decision_reason=decision_reason,
    )
    for token in expected_tokens:
        assert token in msg

As per coding guidelines: "Prefer @pytest.mark.parametrize for testing similar cases."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/engine/test_approval_gate.py` around lines 296 - 329, The three
repetitive tests (test_approved_without_reason, test_rejected_with_reason,
test_approved_with_reason) should be consolidated into a single parametrized
test that calls ApprovalGate.build_resume_message with different (approved,
decided_by, decision_reason) inputs and asserts expected substrings; add a
pytest.mark.parametrize decorator (e.g., test_build_resume_message_variants)
with rows for each case and loop over expected_tokens asserting each is in msg,
then remove the three original test_... functions.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CLAUDE.md`:
- Line 154: The sentence listing event constants after "Event names" is missing
a comma after "e.g."; update the phrase "e.g. `PROVIDER_CALL_START` ..." to
"e.g., `PROVIDER_CALL_START` ..." so the abbreviation is punctuated consistently
in the long list (look for the "Event names" paragraph and the "e.g." before the
`PROVIDER_CALL_START`/events constant list).

In `@src/ai_company/api/controllers/approvals.py`:
- Around line 219-225: Replace the success event used in the exception handler:
create a new event constant APPROVAL_GATE_RESUME_FAILED in
ai_company.observability.events.approval_gate, import it into approvals.py, and
change the logger.warning call inside the except block to use
APPROVAL_GATE_RESUME_FAILED (keeping the same message, approval_id, note and
exc_info=True) instead of APPROVAL_GATE_RESUME_TRIGGERED so failure logging is
correctly categorized.

In `@src/ai_company/engine/approval_gate.py`:
- Around line 213-232: The warning "delete() returned False" can be emitted
incorrectly when delete() actually raised and was logged by logger.exception;
fix by tracking whether delete raised: introduce a boolean (e.g. delete_errored
= False) before calling self._parked_context_repo.delete, set delete_errored =
True inside the broad except Exception block that calls
logger.exception(APPROVAL_GATE_RESUME_DELETE_FAILED, ...), and change the final
check to only log the warning when not deleted and not delete_errored (if not
deleted and not delete_errored: logger.warning(...)). This uses the existing
symbols self._parked_context_repo.delete, deleted, logger.exception,
logger.warning, and APPROVAL_GATE_RESUME_DELETE_FAILED.

In `@src/ai_company/engine/loop_helpers.py`:
- Around line 373-378: The debug log currently emits the
APPROVAL_GATE_CONTEXT_PARKED event while only noting a missing task_id; change
this to a semantically correct diagnostic: either remove the parked event
constant and call logger.debug with a plain message about the missing
task_execution/task_id, or add a new event constant (e.g.,
APPROVAL_GATE_CONTEXT_PARK_NO_TASK) to
ai_company.observability.events.approval_gate and use that here instead of
APPROVAL_GATE_CONTEXT_PARKED; update the logger.debug invocation in
loop_helpers.py (the call that passes approval_id, agent_id and note="No
task_execution on context — task_id will be None") to use the chosen option so
observability events remain accurate.

In `@src/ai_company/tools/invoker.py`:
- Around line 633-634: The construction of ApprovalRiskLevel from
result.metadata can raise ValueError for invalid strings; update the code that
builds the object (the ApprovalRiskLevel(...) call using
result.metadata.get("risk_level", "high")) to validate or convert the metadata
value first: read risk_str = result.metadata.get("risk_level"), check if
risk_str is a valid member of ApprovalRiskLevel (or wrap
ApprovalRiskLevel(risk_str) in a try/except ValueError), and on failure log a
clear metadata validation warning and fall back to a safe default (e.g., "high")
before passing into ApprovalRiskLevel; ensure the change is applied where
ApprovalRiskLevel is instantiated so invalid metadata never raises an uncaught
ValueError.

In `@web/src/__tests__/stores/approvals.test.ts`:
- Around line 286-290: Add an afterEach hook that calls vi.restoreAllMocks() to
fully restore spies between tests (in addition to the existing
vi.clearAllMocks()); also change the axios.isAxiosError spy at the second
location to capture the original implementation (like the first instance) and
delegate to it for non-matching errors by storing (await
import('axios')).default.isAxiosError in originalIsAxiosError and calling
originalIsAxiosError(err) when err !== axiosError so the spy preserves the
original behavior.

In `@web/src/stores/approvals.ts`:
- Around line 96-117: The handler for
'approval.approved'/'approval.rejected'/'approval.expired' currently replaces
the item in approvals.value regardless of current activeFilters; instead, after
fetching the updated approval via approvalsApi.getApproval(approvalId) check
whether the updated item matches the current activeFilters (use whatever
predicate/function you use elsewhere to test filter membership) and if it does
replace the item in approvals.value, otherwise remove it from approvals.value
and decrement total.value accordingly; keep the existing 404/410 removal branch
as-is and perform this filter-check/replace-or-remove logic in the same switch
case that currently updates approvals.value.
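For the invoker.py item in the list above (an invalid `risk_level` string raising ValueError), a defensive parse might look like the sketch below. The enum members shown are assumptions, since the real `ApprovalRiskLevel` in ai_company.core.enums is not reproduced in this thread, and `parse_risk_level` is a hypothetical helper name. The `"high"` default follows the snippet quoted in the comment.

```python
import logging
from enum import Enum

logger = logging.getLogger("invoker_sketch")


class ApprovalRiskLevel(str, Enum):
    """Assumed members; the project's real enum may differ."""

    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


def parse_risk_level(metadata: dict[str, object]) -> ApprovalRiskLevel:
    """Convert metadata['risk_level'] to the enum, falling back to HIGH."""
    raw = metadata.get("risk_level", "high")
    try:
        return ApprovalRiskLevel(raw)
    except ValueError:
        # Unknown or malformed value: warn and use the conservative default.
        logger.warning("invalid risk_level metadata: %r", raw)
        return ApprovalRiskLevel.HIGH
```

Falling back to the higher risk tier errs on the side of more scrutiny when tool metadata cannot be trusted.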


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: cf611785-f95c-4d33-8547-1af4d9797c41

📥 Commits

Reviewing files that changed from the base of the PR and between b999b65 and 3b5068f.

📒 Files selected for processing (22)

CLAUDE.md
docs/design/engine.md
src/ai_company/api/approval_store.py
src/ai_company/api/controllers/approvals.py
src/ai_company/api/dto.py
src/ai_company/api/ws_models.py
src/ai_company/engine/_security_factory.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/approval_gate.py
src/ai_company/engine/loop_helpers.py
src/ai_company/observability/events/approval_gate.py
src/ai_company/security/timeout/park_service.py
src/ai_company/security/timeout/parked_context.py
src/ai_company/tools/approval_tool.py
src/ai_company/tools/invoker.py
tests/unit/api/controllers/test_approvals.py
tests/unit/api/test_dto.py
tests/unit/engine/test_approval_gate.py
tests/unit/engine/test_loop_helpers_approval.py
tests/unit/observability/test_events.py
web/src/__tests__/stores/approvals.test.ts
web/src/stores/approvals.ts

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Build Backend
GitHub Check: Build Web
GitHub Check: Greptile Review
GitHub Check: Test (Python 3.14)

🧰 Additional context used

📓 Path-based instructions (9)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python version 3.14+ with PEP 649 native lazy annotations required.
Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14.
Line length must be 88 characters, enforced by ruff.
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Tests must use test-provider, test-small-001, etc.

Files:

tests/unit/api/controllers/test_approvals.py
src/ai_company/security/timeout/park_service.py
src/ai_company/security/timeout/parked_context.py
src/ai_company/api/controllers/approvals.py
src/ai_company/engine/_security_factory.py
tests/unit/engine/test_loop_helpers_approval.py
src/ai_company/observability/events/approval_gate.py
src/ai_company/api/ws_models.py
src/ai_company/tools/invoker.py
src/ai_company/engine/approval_gate.py
src/ai_company/api/dto.py
src/ai_company/tools/approval_tool.py
tests/unit/api/test_dto.py
src/ai_company/engine/agent_engine.py
src/ai_company/api/approval_store.py
tests/unit/engine/test_approval_gate.py
src/ai_company/engine/loop_helpers.py
tests/unit/observability/test_events.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Tests must use markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Test coverage minimum: 80% (enforced in CI).
Async tests: use asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed.
Test timeout: 30 seconds per test.
Prefer @pytest.mark.parametrize for testing similar cases.

Files:

tests/unit/api/controllers/test_approvals.py
tests/unit/engine/test_loop_helpers_approval.py
tests/unit/api/test_dto.py
tests/unit/engine/test_approval_gate.py
tests/unit/observability/test_events.py

src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: All public functions must have type hints. Enforce via mypy strict mode.
Google-style docstrings required on public classes and functions, enforced by ruff D rules.
Create new objects instead of mutating existing ones. For non-Pydantic internal collections, use copy.deepcopy() at construction plus MappingProxyType wrapping for read-only enforcement.
For dict/list fields in frozen Pydantic models, rely on frozen=True for field reassignment prevention and use copy.deepcopy() at system boundaries (tool execution, LLM provider serialization, inter-agent delegation, persistence serialization).
Use frozen Pydantic models for config/identity; use separate mutable-via-copy models for runtime state that evolves. Never mix static config fields with mutable runtime fields in one model.
Use Pydantic v2 with BaseModel, model_validator, computed_field, ConfigDict. Use @computed_field for derived values instead of storing redundant fields (e.g., TokenUsage.total_tokens).
Use NotBlankStr from core.types for all identifier/name fields in Pydantic models, including optional (NotBlankStr | None) and tuple (tuple[NotBlankStr, ...]) variants, instead of manual whitespace validators.
Prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code (multiple tool invocations, parallel agent calls). Prefer structured concurrency over bare create_task.
Functions must be less than 50 lines; files must be less than 800 lines.
Handle errors explicitly, never silently swallow exceptions.
Validate at system boundaries: user input, external APIs, config files.
NEVER use import logging, logging.getLogger(), or print() in application code.
Always use logger as the variable name for loggers (not _logger, not log).
Event names must always use constants from domain-specific modules under ai_company.observability.events (e.g., PROVIDER_CALL_START from events.provider, BUDGET_RECORD_ADDED from events.budget). Import dire...

Files:

src/ai_company/security/timeout/park_service.py
src/ai_company/security/timeout/parked_context.py
src/ai_company/api/controllers/approvals.py
src/ai_company/engine/_security_factory.py
src/ai_company/observability/events/approval_gate.py
src/ai_company/api/ws_models.py
src/ai_company/tools/invoker.py
src/ai_company/engine/approval_gate.py
src/ai_company/api/dto.py
src/ai_company/tools/approval_tool.py
src/ai_company/engine/agent_engine.py
src/ai_company/api/approval_store.py
src/ai_company/engine/loop_helpers.py

src/ai_company/**/[!_]*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Every module with business logic MUST have: from ai_company.observability import get_logger then logger = get_logger(__name__)

Files:

src/ai_company/security/timeout/park_service.py
src/ai_company/security/timeout/parked_context.py
src/ai_company/api/controllers/approvals.py
src/ai_company/observability/events/approval_gate.py
src/ai_company/api/ws_models.py
src/ai_company/tools/invoker.py
src/ai_company/engine/approval_gate.py
src/ai_company/api/dto.py
src/ai_company/tools/approval_tool.py
src/ai_company/engine/agent_engine.py
src/ai_company/api/approval_store.py
src/ai_company/engine/loop_helpers.py

web/src/**/*.{vue,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Web dashboard: Vue 3 + PrimeVue + Tailwind CSS, organized by feature in src/components/, src/stores/, src/views/. Enforce with ESLint and vue-tsc type-checking.

Files:

web/src/stores/approvals.ts
web/src/__tests__/stores/approvals.test.ts

web/src/stores/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Frontend state management: Pinia stores organized by feature in src/stores/

Files:

web/src/stores/approvals.ts

web/src/__tests__/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Web dashboard tests: Vitest unit tests organized by feature in __tests__/

Files:

web/src/__tests__/stores/approvals.test.ts

docs/design/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When approved deviations occur, update the relevant docs/design/ page to reflect the new reality.

Files:

docs/design/engine.md

docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Documentation source in docs/ built with Zensical. Design spec in docs/design/ (7 pages). Architecture in docs/architecture/. Roadmap in docs/roadmap/. Security in docs/security.md.

Files:

docs/design/engine.md

🧠 Learnings (16)

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/ai_company/**/[!_]*.py : Every module with business logic MUST have: `from ai_company.observability import get_logger` then `logger = get_logger(__name__)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Event names must always use constants from domain-specific modules under `ai_company.observability.events` (e.g., `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`

Applied to files:

CLAUDE.md
src/ai_company/observability/events/approval_gate.py
src/ai_company/engine/agent_engine.py
src/ai_company/engine/loop_helpers.py
tests/unit/observability/test_events.py

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Structured logging: always use `logger.info(EVENT, key=value)` — never `logger.info("msg %s", val)`

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All state transitions must log at INFO level.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : NEVER use `import logging`, `logging.getLogger()`, or `print()` in application code.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : All error paths must log at WARNING or ERROR with context before raising.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Always use `logger` as the variable name for loggers (not `_logger`, not `log`).

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : DEBUG logging for object creation, internal flow, and entry/exit of key functions.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Pure data models, enums, and re-exports do NOT need logging.

Applied to files:

CLAUDE.md

📚 Learning: 2026-03-13T21:03:58.907Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.907Z
Learning: When review agents find valid issues (including pre-existing issues in surrounding code, suggestions, and findings adjacent to the PR's changes), fix them all. No deferring, no "out of scope" skipping.

Applied to files:

web/src/stores/approvals.ts

📚 Learning: 2026-03-13T21:03:58.907Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.907Z
Learning: Applies to web/src/stores/**/*.ts : Frontend state management: Pinia stores organized by feature in src/stores/

Applied to files:

web/src/stores/approvals.ts

📚 Learning: 2026-03-13T21:03:58.907Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.907Z
Learning: Applies to web/src/__tests__/**/*.ts : Web dashboard tests: Vitest unit tests organized by feature in __tests__/

Applied to files:

web/src/__tests__/stores/approvals.test.ts

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to src/**/*.py : Use Pydantic v2 with `BaseModel`, `model_validator`, `computed_field`, `ConfigDict`. Use `computed_field` for derived values instead of storing redundant fields (e.g., `TokenUsage.total_tokens`).

Applied to files:

src/ai_company/api/dto.py

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases.

Applied to files:

tests/unit/engine/test_approval_gate.py

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to tests/**/*.py : Tests must use markers: `pytest.mark.unit`, `pytest.mark.integration`, `pytest.mark.e2e`, `pytest.mark.slow`

Applied to files:

tests/unit/observability/test_events.py

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to tests/**/*.py : Test timeout: 30 seconds per test.

Applied to files:

tests/unit/observability/test_events.py

🧬 Code graph analysis (14)

src/ai_company/security/timeout/park_service.py (1)

src/ai_company/engine/parallel_models.py (1)

task_id (87-89)

src/ai_company/security/timeout/parked_context.py (2)

src/ai_company/engine/parallel_models.py (1)

task_id (87-89)

src/ai_company/tools/base.py (1)

description (138-140)

src/ai_company/api/controllers/approvals.py (5)

src/ai_company/api/state.py (2)

approval_gate (139-141)

AppState (23-163)

src/ai_company/engine/approval_gate.py (3)

ApprovalGate (40-274)

resume_context (162-239)

build_resume_message (242-274)

src/ai_company/memory/errors.py (1)

MemoryError (13-14)

src/ai_company/api/approval_store.py (2)

get (61-73)

save_if_pending (125-142)

src/ai_company/api/errors.py (2)

UnauthorizedError (59-65)

ConflictError (41-47)

tests/unit/engine/test_loop_helpers_approval.py (4)

src/ai_company/engine/approval_gate.py (3)

ApprovalGate (40-274)

should_park (71-90)

park_context (92-160)

src/ai_company/engine/loop_helpers.py (1)

execute_tool_calls (241-343)

src/ai_company/engine/loop_protocol.py (2)

ExecutionResult (79-140)

TerminationReason (28-36)

src/ai_company/providers/models.py (1)

CompletionResponse (257-306)

web/src/stores/approvals.ts (2)

src/ai_company/api/ws_models.py (1)

WsEvent (48-73)

web/src/api/types.ts (1)

WsEvent (519-524)

src/ai_company/tools/invoker.py (3)

src/ai_company/core/enums.py (1)

ApprovalRiskLevel (443-449)

src/ai_company/engine/approval_gate_models.py (1)

EscalationInfo (14-33)

src/ai_company/tools/errors.py (1)

ToolExecutionError (55-56)

web/src/__tests__/stores/approvals.test.ts (2)

web/src/stores/approvals.ts (1)

useApprovalStore (8-136)

web/src/api/types.ts (1)

WsEvent (519-524)

src/ai_company/engine/approval_gate.py (5)

src/ai_company/persistence/repositories.py (1)

ParkedContextRepository (199-267)

src/ai_company/security/timeout/park_service.py (3)

ParkService (29-144)

park (37-106)

resume (108-144)

src/ai_company/security/timeout/parked_context.py (1)

ParkedContext (19-66)

src/ai_company/engine/approval_gate_models.py (1)

EscalationInfo (14-33)

src/ai_company/engine/context.py (1)

AgentContext (87-307)

src/ai_company/api/dto.py (1)

src/ai_company/engine/parallel_models.py (1)

task_id (87-89)

src/ai_company/tools/approval_tool.py (6)

src/ai_company/core/enums.py (1)

ToolCategory (294-308)

src/ai_company/observability/_logger.py (1)

get_logger (8-28)

src/ai_company/tools/base.py (6)

BaseTool (57-184)

ToolExecutionResult (25-54)

description (138-140)

category (128-130)

action_type (133-135)

parameters_schema (143-151)

src/ai_company/api/approval_store.py (1)

add (42-59)

src/ai_company/security/timeout/risk_tier_classifier.py (1)

DefaultRiskTierClassifier (62-101)

src/ai_company/memory/errors.py (1)

MemoryError (13-14)

tests/unit/api/test_dto.py (3)

src/ai_company/tools/base.py (2)

action_type (133-135)

description (138-140)

src/ai_company/core/enums.py (1)

ApprovalRiskLevel (443-449)

src/ai_company/api/dto.py (1)

CreateApprovalRequest (196-249)

src/ai_company/engine/agent_engine.py (5)

src/ai_company/engine/_security_factory.py (2)

make_security_interceptor (39-117)

registry_with_approval_tool (120-148)

src/ai_company/engine/approval_gate.py (1)

ApprovalGate (40-274)

src/ai_company/persistence/repositories.py (1)

ParkedContextRepository (199-267)

src/ai_company/engine/loop_protocol.py (1)

ExecutionLoop (151-189)

src/ai_company/engine/react_loop.py (1)

ReactLoop (55-269)

src/ai_company/api/approval_store.py (4)

src/ai_company/core/approval.py (1)

ApprovalItem (24-96)

src/ai_company/core/enums.py (1)

ApprovalStatus (434-440)

src/ai_company/api/app.py (1)

_on_expire (74-99)

src/ai_company/memory/errors.py (1)

MemoryError (13-14)

tests/unit/engine/test_approval_gate.py (3)

src/ai_company/engine/approval_gate.py (3)

ApprovalGate (40-274)

park_context (92-160)

build_resume_message (242-274)

src/ai_company/persistence/repositories.py (1)

ParkedContextRepository (199-267)

src/ai_company/security/timeout/park_service.py (3)

ParkService (29-144)

park (37-106)

resume (108-144)

🪛 LanguageTool

CLAUDE.md

[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...

(EG_NO_COMMA)

🔇 Additional comments (37)

src/ai_company/api/ws_models.py (1)

45-45: LGTM!

The new APPROVAL_RESUMED enum member follows the established naming convention and aligns with the other approval-related WebSocket event types.

src/ai_company/security/timeout/parked_context.py (1)

41-43: LGTM!

The change to make task_id optional (NotBlankStr | None) correctly supports the taskless agents scenario. The default of None and updated description align with the PR objectives.

src/ai_company/security/timeout/park_service.py (1)

37-45: LGTM!

The signature change to accept optional task_id is well-implemented. The equality check at lines 92-97 correctly handles None comparison (Python's None == None evaluates to True), and the docstring clearly documents the taskless agent use case.

web/src/stores/approvals.ts (1)

87-93: LGTM on error handling differentiation.

The error handling now correctly distinguishes between definitive 404/410 responses (item genuinely gone) and transient errors (timeouts, 5xx). This addresses the previous review concern about treating all fetch errors as deletes.

Also applies to: 104-115

src/ai_company/api/dto.py (2)

221-232: LGTM!

The action_type field validator correctly enforces the category:action format. Using strip() to check for whitespace-only parts and the _ACTION_TYPE_PARTS constant for the magic number are good practices.
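
A minimal sketch of the check described above, written as a plain function rather than the actual Pydantic field validator. Only the `category:action` format and the `_ACTION_TYPE_PARTS` constant come from the review; the function name `validate_action_type` and the error messages are assumptions made for illustration.

```python
# Hypothetical stand-in for the validator; not the project's actual code.
_ACTION_TYPE_PARTS = 2  # one "category" part and one "action" part


def validate_action_type(value: str) -> str:
    """Return value unchanged if it has the form 'category:action'."""
    parts = value.split(":")
    if len(parts) != _ACTION_TYPE_PARTS:
        raise ValueError("action_type must use the 'category:action' format")
    # strip() catches parts that are empty or whitespace-only.
    if any(not part.strip() for part in parts):
        raise ValueError("action_type parts must not be blank")
    return value
```

Under these assumptions, `"code:merge"` passes, while `"code_merge"`, `"code:"`, and `"a:b:c"` are rejected.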

217-217: LGTM!

The ttl_seconds upper bound of 604800 (7 days) provides reasonable protection against excessive TTL values while maintaining flexibility.

CLAUDE.md (1)

101-101: LGTM!

Documentation updates accurately reflect the new approval gate integration across the API, engine, and tools modules.

Also applies to: 107-107, 115-115

src/ai_company/engine/_security_factory.py (2)

39-117: LGTM!

The make_security_interceptor function has proper defensive checks for configuration mismatches (autonomy without security config), appropriate logging before raising errors, and clean construction of the security stack with conditional detector inclusion.

120-148: LGTM!

The registry_with_approval_tool factory cleanly handles the optional approval store case and uses deferred imports to avoid circular dependencies. The pattern of building a new registry with existing tools plus the approval tool maintains immutability.

src/ai_company/tools/approval_tool.py (4)

97-133: LGTM!

The execute method is well-structured as an orchestrator, delegating to focused helper methods for validation, risk classification, persistence, and result building. This addresses the previous review concern about method length.

162-163: LGTM on PEP 758 exception syntax.

The except MemoryError, RecursionError: syntax correctly uses the Python 3.14 PEP 758 format (comma-separated, no parentheses) as required by the coding guidelines.

Also applies to: 238-239

135-178: LGTM!

The _persist_item helper properly handles persistence errors, re-raises critical exceptions (MemoryError, RecursionError), and logs failures with context before returning an error result.

229-252: LGTM!

The _classify_risk method correctly implements fail-safe behavior by defaulting to HIGH when classification fails or no classifier is configured, aligning with the D19 design spec mentioned in DefaultRiskTierClassifier.
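
The fail-safe behavior can be sketched roughly as follows. `RiskLevel` is a hypothetical stand-in for `ApprovalRiskLevel`, and `classify_risk` is an illustrative free function, not the tool's real `_classify_risk` method. The sketch uses the parenthesized `except` form so it also runs on interpreters older than 3.14, and it omits the logging the project would require.

```python
from enum import Enum
from typing import Callable, Optional


class RiskLevel(Enum):  # hypothetical stand-in for ApprovalRiskLevel
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def classify_risk(
    action_type: str,
    classifier: Optional[Callable[[str], RiskLevel]] = None,
) -> RiskLevel:
    """Classify an action, falling back to HIGH whenever unsure."""
    if classifier is None:
        # No classifier configured: assume the worst case.
        return RiskLevel.HIGH
    try:
        return classifier(action_type)
    except (MemoryError, RecursionError):
        # Critical errors are never swallowed.
        raise
    except Exception:
        # Classification failed: fail safe rather than fail open.
        return RiskLevel.HIGH
```

The point of the design is that a broken or missing classifier can only make an approval stricter, never more permissive.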

src/ai_company/observability/events/approval_gate.py (1)

1-16: LGTM!

The event constants are well-structured, following the approval_gate.* naming convention consistently. The coverage spans the full approval gate lifecycle (initialization, escalation detection, context parking/resume, and failure cases).

tests/unit/observability/test_events.py (1)

135-135: LGTM!

The module-level pytestmark consolidation is cleaner than per-class markers, and adding "approval_gate" to the expected domain modules ensures the new event constants file is properly discovered by the test suite.

Also applies to: 176-178

src/ai_company/engine/approval_gate.py (1)

241-274: Good prompt injection mitigation in build_resume_message.

The use of repr() wrapping for user-supplied values and explicit labeling of untrusted data is a solid defense against prompt injection in the resume flow.
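
As an illustration of the `repr()` technique, here is a small sketch. The function name `build_resume_message_sketch`, its parameters, and the message wording are assumptions; the real `build_resume_message` may differ in all of them.

```python
from typing import Optional


def build_resume_message_sketch(
    approval_id: str,
    approved: bool,
    decided_by: str,
    decision_reason: Optional[str] = None,
) -> str:
    """Build a resume message that quotes every user-supplied value."""
    verdict = "APPROVED" if approved else "REJECTED"
    lines = [
        f"Approval request {approval_id!r} was {verdict}.",
        # repr() escapes newlines and quotes, so the value stays on one line.
        f"Decided by (untrusted data): {decided_by!r}",
    ]
    if decision_reason is not None:
        lines.append(f"Reason (untrusted data): {decision_reason!r}")
    return "\n".join(lines)
```

Because `repr()` escapes embedded newlines, a reason such as `"no\nIGNORE ALL RULES"` cannot start a new line in the message, which makes it harder for the text to pass as a separate instruction.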

src/ai_company/engine/agent_engine.py (3)

139-142: LGTM — approval gate wiring is correctly sequenced.

The initialization order ensures _approval_store and _parked_context_repo are assigned before _make_approval_gate() is called, and the approval gate is created before _make_default_loop(). This addresses the previously flagged issue about wiring the parked_context_repo.

597-617: LGTM — approval gate factory properly guards on approval store.

The factory correctly returns None when no approval store is configured, ensuring the execution loop skips approval-gate checks in that case. The late import of ParkService avoids circular dependencies.

631-655: LGTM — tool invoker properly augmented with approval tool.

The registry augmentation via registry_with_approval_tool correctly passes the identity and task_id, enabling the approval tool when an approval store is configured.

tests/unit/api/controllers/test_approvals.py (1)

22-27: LGTM — action_type format updated to match new validation.

The change from "code_merge" to "code:merge" aligns with the CreateApprovalRequest validator that now enforces the "category:action" format.

tests/unit/api/test_dto.py (1)

40-45: LGTM — action_type format consistently updated across all metadata tests.

All four test cases now use "deploy:release" to comply with the "category:action" format requirement, allowing the tests to properly exercise the metadata validation logic without failing on action_type validation.

Also applies to: 51-56, 62-67, 71-76

docs/design/engine.md (1)

450-468: LGTM — documentation accurately describes the new approval gate flow.

The additions clearly explain the escalation detection, parking mechanism, and how PARKED termination interacts with the task lifecycle. The documentation aligns well with the implementation in ApprovalGate and loop_helpers.

src/ai_company/api/approval_store.py (2)

125-142: LGTM — save_if_pending correctly mitigates TOCTOU race.

The method applies lazy expiration check before comparing status, ensuring concurrent expirations don't result in saving a decision on an already-expired item. Returning None when the item is no longer PENDING allows callers to detect and handle concurrent decisions.
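
The check-then-save pattern described here might look like the sketch below. `Item` and `InMemoryStore` are hypothetical, much-reduced stand-ins for `ApprovalItem` and the approval store, and the explicit `now` parameter is an assumption added to make the behavior easy to exercise. The real method is presumably async and richer than this.

```python
import time
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Item:  # hypothetical, far smaller than ApprovalItem
    id: str
    status: str  # "pending", "approved", "rejected", or "expired"
    expires_at: float


class InMemoryStore:
    def __init__(self) -> None:
        self._items: dict[str, Item] = {}

    def add(self, item: Item) -> None:
        self._items[item.id] = item

    def save_if_pending(
        self, decided: Item, now: Optional[float] = None
    ) -> Optional[Item]:
        """Save the decision only if the stored item is still pending."""
        now = time.time() if now is None else now
        current = self._items.get(decided.id)
        if current is None:
            return None
        # Lazy expiration: mark the item expired before checking status.
        if current.status == "pending" and current.expires_at <= now:
            current = replace(current, status="expired")
            self._items[current.id] = current
        if current.status != "pending":
            return None  # someone else decided, or the item expired
        self._items[decided.id] = decided
        return decided
```

The check and the write happen inside one call with no suspension point between them, so a second decision on the same item gets `None` back and the caller can answer with a conflict. A store shared across threads would additionally need a lock.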

170-179: LGTM — exception handling in _check_expiration is appropriate.

The try/except properly re-raises MemoryError and RecursionError as critical errors, while logging other callback failures without disrupting the expiration flow. This aligns with the coding guideline to handle errors explicitly.

tests/unit/engine/test_loop_helpers_approval.py (1)

1-205: LGTM! Comprehensive test coverage for approval gate integration.

The test file properly covers the key scenarios:

No approval gate returns context normally

With gate but no escalation returns context

Escalation triggers parking with correct metadata

Park failure correctly returns ERROR (not PARKED)

Past review feedback has been addressed: PropertyMock removed in favor of simple attribute assignment, and park failure now expects TerminationReason.ERROR as appropriate.

src/ai_company/engine/loop_helpers.py (2)

241-343: Well-structured approval gate integration.

The execute_tool_calls function cleanly integrates the approval gate check after tool results are processed. The flow correctly:

Checks for escalations only when approval_gate is provided

Delegates to _park_for_approval for the parking logic

Returns updated context when no parking is needed

346-419: Proper error handling in parking flow.

The _park_for_approval helper correctly:

Returns ERROR (not PARKED) when parking fails — this ensures non-resumable failures are not masked

Uses boolean parking_failed in metadata (addresses past review feedback)

Re-raises MemoryError and RecursionError appropriately
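
The three points above can be sketched as a single function. `Termination`, `park_for_approval`, and the `park` callable are hypothetical simplifications; the real `_park_for_approval` helper works with `ApprovalGate`, `AgentContext`, and `ExecutionResult`, and logs before returning.

```python
from enum import Enum
from typing import Any, Callable


class Termination(Enum):  # hypothetical subset of TerminationReason
    PARKED = "parked"
    ERROR = "error"


def park_for_approval(
    park: Callable[[Any], None], context: Any
) -> tuple[Termination, dict[str, Any]]:
    """Try to park the context and report ERROR if that fails."""
    try:
        park(context)
    except (MemoryError, RecursionError):
        # Critical errors propagate unchanged.
        raise
    except Exception as exc:
        # A failed park is not resumable, so it must not look like PARKED.
        return Termination.ERROR, {"parking_failed": True, "error": str(exc)}
    return Termination.PARKED, {"parking_failed": False}
```

Returning `ERROR` here keeps callers from waiting on an approval whose context was never persisted.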

src/ai_company/api/controllers/approvals.py (3)

101-109: Correct PEP 758 except syntax and error handling.

The exception handling properly re-raises MemoryError and RecursionError while allowing other exceptions to be logged as warnings without disrupting the approval flow.

319-322: Authentication enforcement for create_approval.

Good security hardening — requested_by is now bound to the authenticated user's username rather than accepting it from the request payload, mitigating potential spoofing.

410-418: TOCTOU race condition mitigated with save_if_pending.

Using save_if_pending instead of a separate check-then-save properly handles concurrent approval decisions. The ConflictError (409) response is appropriate for this scenario.

src/ai_company/tools/invoker.py (3)

120-136: Escalation tracking property is well-documented.

The pending_escalations property clearly documents when it's populated (ESCALATE verdicts, parking metadata) and when it's cleared (start of invoke/invoke_all). The tuple return type ensures immutability for callers.

Note: The past review concern about "approval gating discovered too late" is an architectural limitation — concurrent tool calls can complete before parking is checked. This would require a fundamentally different invocation model to address.

773-779: Deterministic escalation ordering.

Sorting escalations by original tool-call index ensures consistent behavior regardless of concurrent execution order. This is important for reproducible parking decisions.
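
To illustrate why the sort matters, the sketch below runs several fake tool calls concurrently, collects their results in completion order, and then restores the original call order. `run_tool` and `collect_escalations` are hypothetical names, and the random sleep only simulates tools finishing at different times.

```python
import asyncio
import random


async def run_tool(index: int, name: str) -> tuple[int, str]:
    """Pretend to run a tool and return (original index, escalation)."""
    await asyncio.sleep(random.uniform(0, 0.01))
    return index, f"escalation from {name}"


async def collect_escalations(tool_names: list[str]) -> tuple[str, ...]:
    tasks = [
        asyncio.create_task(run_tool(i, name))
        for i, name in enumerate(tool_names)
    ]
    found = []
    # as_completed yields in finish order, which varies between runs.
    for finished in asyncio.as_completed(tasks):
        found.append(await finished)
    # Sorting by the original tool-call index makes the result stable.
    found.sort(key=lambda pair: pair[0])
    return tuple(message for _, message in found)
```

With the sort in place, `asyncio.run(collect_escalations(["a", "b", "c"]))` yields the escalations for `a`, `b`, and `c` in that order, regardless of which task finished first.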

601-652: Defensive escalation tracking with fail-closed behavior.

The _track_parking_metadata helper properly:

Checks both requires_parking=True and presence of approval_id

Re-raises MemoryError/RecursionError

Raises ToolExecutionError on tracking failure to prevent silent bypass of approval gate

tests/unit/engine/test_approval_gate.py (2)

34-54: Good use of fixtures to reduce test setup duplication.

The park_service, parked_mock, and repo fixtures cleanly extract the repeated mock wiring that was noted in past reviews. This improves maintainability.

57-290: Comprehensive test coverage for ApprovalGate.

The tests thoroughly cover:

should_park: empty input and multiple escalations

park_context: service calls, repo persistence, no-repo path, serialization errors, repo save errors

resume_context: successful resume, unknown approval, no repo, deletion after resume, deserialization failure, delete failure resilience

Edge cases like delete failure not losing context (lines 269-290) are particularly valuable for reliability.

web/src/__tests__/stores/approvals.test.ts (2)

6-6: Good async test synchronization for handleWsEvent.

Using flushPromises here is a solid fit since handleWsEvent schedules async work and returns synchronously.

163-279: WS event coverage is strong and aligned with the new approval_id contract.

These cases now validate fetch-on-event behavior, duplicate prevention, and total-count handling under filtered and error conditions well.

Also applies to: 341-366

coderabbitai · 2026-03-13T23:39:00Z

 - **Never** use `import logging` / `logging.getLogger()` / `print()` in application code
 - **Variable name**: always `logger` (not `_logger`, not `log`)
- **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `API_ROUTE_NOT_FOUND` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`, `SECURITY_EVALUATE_START` from `events.security`, `HR_HIRING_REQUEST_CREATED` from `events.hr`, `PERF_METRIC_RECORDED` from `events.performance`, `TRUST_EVALUATE_START` from `events.trust`, `PROMOTION_EVALUATE_START` from `events.promotion`, `PROMPT_BUILD_START` from `events.prompt`, `MEMORY_RETRIEVAL_START` from `events.memory`, `MEMORY_BACKEND_CONNECTED` from `events.memory`, `MEMORY_ENTRY_STORED` from `events.memory`, `MEMORY_BACKEND_SYSTEM_ERROR` from `events.memory`, `AUTONOMY_ACTION_AUTO_APPROVED` from `events.autonomy`, `TIMEOUT_POLICY_EVALUATED` from `events.timeout`, `PERSISTENCE_AUDIT_ENTRY_SAVED` from `events.persistence`, `TASK_ENGINE_STARTED` from `events.task_engine`, `COORDINATION_STARTED` from `events.coordination`, `COMMUNICATION_DISPATCH_START` from `events.communication`, `COMPANY_STARTED` from `events.company`, `CONFIG_LOADED` from `events.config`, `CORRELATION_ID_CREATED` from `events.correlation`, `DECOMPOSITION_STARTED` from `events.decomposition`, `DELEGATION_STARTED` from `events.delegation`, `EXECUTION_LOOP_STARTED` from `events.execution`, `GIT_OPERATION_START` from `events.git`, `PARALLEL_EXECUTION_STARTED` from `events.parallel`, `PERSONALITY_LOADED` from `events.personality`, `QUOTA_CHECKED` from `events.quota`, `ROLE_ASSIGNED` from `events.role`, `ROUTING_STARTED` from `events.routing`, `SANDBOX_EXECUTE_START` from `events.sandbox`, `TASK_CREATED` from `events.task`, `TASK_ASSIGNMENT_STARTED` from `events.task_assignment`, `TASK_ROUTING_STARTED` from `events.task_routing`, `TEMPLATE_LOADED` from `events.template`, `TOOL_INVOKE_START` from `events.tool`, `WORKSPACE_CREATED` from `events.workspace`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`
+- **Event names**: always use constants from the domain-specific module under `ai_company.observability.events` (e.g. `PROVIDER_CALL_START` from `events.provider`, `BUDGET_RECORD_ADDED` from `events.budget`, `CFO_ANOMALY_DETECTED` from `events.cfo`, `CONFLICT_DETECTED` from `events.conflict`, `MEETING_STARTED` from `events.meeting`, `CLASSIFICATION_START` from `events.classification`, `CONSOLIDATION_START` from `events.consolidation`, `ORG_MEMORY_QUERY_START` from `events.org_memory`, `API_REQUEST_STARTED` from `events.api`, `API_ROUTE_NOT_FOUND` from `events.api`, `CODE_RUNNER_EXECUTE_START` from `events.code_runner`, `DOCKER_EXECUTE_START` from `events.docker`, `MCP_INVOKE_START` from `events.mcp`, `SECURITY_EVALUATE_START` from `events.security`, `HR_HIRING_REQUEST_CREATED` from `events.hr`, `PERF_METRIC_RECORDED` from `events.performance`, `TRUST_EVALUATE_START` from `events.trust`, `PROMOTION_EVALUATE_START` from `events.promotion`, `PROMPT_BUILD_START` from `events.prompt`, `MEMORY_RETRIEVAL_START` from `events.memory`, `MEMORY_BACKEND_CONNECTED` from `events.memory`, `MEMORY_ENTRY_STORED` from `events.memory`, `MEMORY_BACKEND_SYSTEM_ERROR` from `events.memory`, `AUTONOMY_ACTION_AUTO_APPROVED` from `events.autonomy`, `TIMEOUT_POLICY_EVALUATED` from `events.timeout`, `PERSISTENCE_AUDIT_ENTRY_SAVED` from `events.persistence`, `TASK_ENGINE_STARTED` from `events.task_engine`, `COORDINATION_STARTED` from `events.coordination`, `COMMUNICATION_DISPATCH_START` from `events.communication`, `COMPANY_STARTED` from `events.company`, `CONFIG_LOADED` from `events.config`, `CORRELATION_ID_CREATED` from `events.correlation`, `DECOMPOSITION_STARTED` from `events.decomposition`, `DELEGATION_STARTED` from `events.delegation`, `EXECUTION_LOOP_STARTED` from `events.execution`, `GIT_OPERATION_START` from `events.git`, `PARALLEL_EXECUTION_STARTED` from `events.parallel`, `PERSONALITY_LOADED` from `events.personality`, `QUOTA_CHECKED` from `events.quota`, `ROLE_ASSIGNED` from `events.role`, `ROUTING_STARTED` from `events.routing`, `SANDBOX_EXECUTE_START` from `events.sandbox`, `TASK_CREATED` from `events.task`, `TASK_ASSIGNMENT_STARTED` from `events.task_assignment`, `TASK_ROUTING_STARTED` from `events.task_routing`, `TEMPLATE_LOADED` from `events.template`, `TOOL_INVOKE_START` from `events.tool`, `WORKSPACE_CREATED` from `events.workspace`, `APPROVAL_GATE_ESCALATION_DETECTED` from `events.approval_gate`, `APPROVAL_GATE_INITIALIZED` from `events.approval_gate`, `APPROVAL_GATE_RISK_CLASSIFIED` from `events.approval_gate`, `APPROVAL_GATE_RESUME_TRIGGERED` from `events.approval_gate`). Import directly: `from ai_company.observability.events.<domain> import EVENT_CONSTANT`


🧹 Nitpick | 🔵 Trivial

Consider adding comma after "e.g." for consistency.

Static analysis flagged the missing comma. While minor, it improves readability in the dense event constant listing.

🧰 Tools

🪛 LanguageTool

[style] ~154-~154: A comma is missing here.
Context: ...nder ai_company.observability.events (e.g. PROVIDER_CALL_START from `events.prov...

(EG_NO_COMMA)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@CLAUDE.md` at line 154, The sentence listing event constants after "Event names" is missing a comma after "e.g."; update the phrase "e.g. `PROVIDER_CALL_START` ..." to "e.g., `PROVIDER_CALL_START` ..." so the abbreviation is punctuated consistently in the long list (look for the "Event names" paragraph and the "e.g." before the `PROVIDER_CALL_START`/events constant list).

coderabbitai · 2026-03-13T23:39:00Z

+    except Exception:
+        logger.warning(
+            APPROVAL_GATE_RESUME_TRIGGERED,
+            approval_id=approval_id,
+            note="Resume trigger failed — decision was saved successfully",
+            exc_info=True,
+        )


⚠️ Potential issue | 🟡 Minor

Success event used for failure logging.

APPROVAL_GATE_RESUME_TRIGGERED semantically indicates a successful resume trigger. Using it for the failure path (with exc_info=True) will confuse observability dashboards and alerting.

Consider using a separate event constant like APPROVAL_GATE_RESUME_FAILED for this error path.

Suggested fix

     except Exception:
         logger.warning(
-            APPROVAL_GATE_RESUME_TRIGGERED,
+            APPROVAL_GATE_RESUME_FAILED,
             approval_id=approval_id,
             note="Resume trigger failed — decision was saved successfully",
             exc_info=True,
         )

You'll need to add APPROVAL_GATE_RESUME_FAILED to ai_company.observability.events.approval_gate and import it.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/api/controllers/approvals.py` around lines 219 - 225, Replace the success event used in the exception handler: create a new event constant APPROVAL_GATE_RESUME_FAILED in ai_company.observability.events.approval_gate, import it into approvals.py, and change the logger.warning call inside the except block to use APPROVAL_GATE_RESUME_FAILED (keeping the same message, approval_id, note and exc_info=True) instead of APPROVAL_GATE_RESUME_TRIGGERED so failure logging is correctly categorized.
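The split between a success event and a failure event can be sketched as follows. This is a minimal illustration rather than the project's code: it uses the standard `logging` module instead of the project's structured logger, the dotted event strings are assumed spellings, and `trigger_resume` with its `resume` callback is a hypothetical helper.

```python
import logging
from collections.abc import Callable

# Assumed spellings; the real constants live in
# ai_company.observability.events.approval_gate and may differ.
APPROVAL_GATE_RESUME_TRIGGERED = "approval_gate.resume.triggered"
APPROVAL_GATE_RESUME_FAILED = "approval_gate.resume.failed"

logger = logging.getLogger("approval_gate_sketch")


def trigger_resume(approval_id: str, resume: Callable[[str], None]) -> str:
    """Run a best-effort resume and return the event name that was logged.

    ``resume`` is a hypothetical callback standing in for the real trigger.
    """
    try:
        resume(approval_id)
    except Exception:
        # Failure path gets its own event, so dashboards that count
        # RESUME_TRIGGERED never see failed attempts.
        logger.warning(
            "%s approval_id=%s",
            APPROVAL_GATE_RESUME_FAILED,
            approval_id,
            exc_info=True,
        )
        return APPROVAL_GATE_RESUME_FAILED
    logger.info("%s approval_id=%s", APPROVAL_GATE_RESUME_TRIGGERED, approval_id)
    return APPROVAL_GATE_RESUME_TRIGGERED
```

Returning the event name is only there to make the sketch easy to check; the real handler would not need it.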

coderabbitai · 2026-03-13T23:39:00Z

+        deleted = False
+        try:
+            deleted = await self._parked_context_repo.delete(parked.id)
+        except MemoryError, RecursionError:
+            raise
+        except Exception:
+            logger.exception(
+                APPROVAL_GATE_RESUME_DELETE_FAILED,
+                approval_id=approval_id,
+                parked_id=parked.id,
+                note="Context resumed but parked record not cleaned up",
+            )
+
+        if not deleted:
+            logger.warning(
+                APPROVAL_GATE_RESUME_DELETE_FAILED,
+                approval_id=approval_id,
+                parked_id=parked.id,
+                note="delete() returned False — parked record may still exist",
+            )


⚠️ Potential issue | 🟡 Minor

Misleading warning message when delete throws an exception.

If delete() raises an exception, it's logged via logger.exception(), but then the if not deleted: block also fires with the note "delete() returned False" — which is incorrect since it threw, not returned False. This creates redundant/confusing logging.

Suggested fix

         deleted = False
+        delete_exception = False
         try:
             deleted = await self._parked_context_repo.delete(parked.id)
         except MemoryError, RecursionError:
             raise
         except Exception:
+            delete_exception = True
             logger.exception(
                 APPROVAL_GATE_RESUME_DELETE_FAILED,
                 approval_id=approval_id,
                 parked_id=parked.id,
                 note="Context resumed but parked record not cleaned up",
             )
 
-        if not deleted:
+        if not deleted and not delete_exception:
             logger.warning(
                 APPROVAL_GATE_RESUME_DELETE_FAILED,
                 approval_id=approval_id,
                 parked_id=parked.id,
                 note="delete() returned False — parked record may still exist",
             )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-        deleted = False
-        try:
-            deleted = await self._parked_context_repo.delete(parked.id)
-        except MemoryError, RecursionError:
-            raise
-        except Exception:
-            logger.exception(
-                APPROVAL_GATE_RESUME_DELETE_FAILED,
-                approval_id=approval_id,
-                parked_id=parked.id,
-                note="Context resumed but parked record not cleaned up",
-            )
-
-        if not deleted:
-            logger.warning(
-                APPROVAL_GATE_RESUME_DELETE_FAILED,
-                approval_id=approval_id,
-                parked_id=parked.id,
-                note="delete() returned False — parked record may still exist",
-            )
+        deleted = False
+        delete_exception = False
+        try:
+            deleted = await self._parked_context_repo.delete(parked.id)
+        except MemoryError, RecursionError:
+            raise
+        except Exception:
+            delete_exception = True
+            logger.exception(
+                APPROVAL_GATE_RESUME_DELETE_FAILED,
+                approval_id=approval_id,
+                parked_id=parked.id,
+                note="Context resumed but parked record not cleaned up",
+            )
+
+        if not deleted and not delete_exception:
+            logger.warning(
+                APPROVAL_GATE_RESUME_DELETE_FAILED,
+                approval_id=approval_id,
+                parked_id=parked.id,
+                note="delete() returned False — parked record may still exist",
+            )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/approval_gate.py` around lines 213 - 232, The warning "delete() returned False" can be emitted incorrectly when delete() actually raised and was logged by logger.exception; fix by tracking whether delete raised: introduce a boolean (e.g. delete_errored = False) before calling self._parked_context_repo.delete, set delete_errored = True inside the broad except Exception block that calls logger.exception(APPROVAL_GATE_RESUME_DELETE_FAILED, ...), and change the final check to only log the warning when not deleted and not delete_errored (if not deleted and not delete_errored: logger.warning(...)). This uses the existing symbols self._parked_context_repo.delete, deleted, logger.exception, logger.warning, and APPROVAL_GATE_RESUME_DELETE_FAILED.
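The flag-tracking fix described above can be exercised in isolation. The sketch below is a simplified stand-in: `cleanup_parked` and its `delete` callback are hypothetical names, and it collects notes in a list instead of logging so the three outcomes (deleted, returned False, raised) are easy to tell apart.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def cleanup_parked(
    delete: Callable[[str], Awaitable[bool]],
    parked_id: str,
) -> list[str]:
    """Delete a parked record and report what went wrong, if anything.

    ``delete`` is a hypothetical stand-in for the repository's delete().
    """
    notes: list[str] = []
    deleted = False
    delete_errored = False
    try:
        deleted = await delete(parked_id)
    # Parenthesized form so the sketch also runs on interpreters older
    # than 3.14; the project itself uses the PEP 758 form.
    except (MemoryError, RecursionError):
        raise
    except Exception:
        delete_errored = True
        notes.append("delete raised")
    # Only report "returned False" when delete() actually returned.
    if not deleted and not delete_errored:
        notes.append("delete returned False")
    return notes
```

With the flag in place, an exception yields exactly one note, so the "returned False" message can no longer appear for a call that never returned.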

coderabbitai · 2026-03-13T23:39:01Z

+        logger.debug(
+            APPROVAL_GATE_CONTEXT_PARKED,
+            approval_id=escalation.approval_id,
+            agent_id=agent_id,
+            note="No task_execution on context — task_id will be None",
+        )


⚠️ Potential issue | 🟡 Minor

Misleading event constant for debug log.

APPROVAL_GATE_CONTEXT_PARKED is logged before parking actually occurs. This log is meant to note a missing task_id, not a successful park. Using a "parked" event here is semantically incorrect and could confuse observability tooling.

Consider using a DEBUG-level log without the parked event constant, or introduce a separate event for this diagnostic note.

Suggested fix

     else:
-        logger.debug(
-            APPROVAL_GATE_CONTEXT_PARKED,
+        logger.debug(
+            "approval_gate_context_park_no_task",
             approval_id=escalation.approval_id,
             agent_id=agent_id,
             note="No task_execution on context — task_id will be None",
         )

Alternatively, define a new event constant like APPROVAL_GATE_CONTEXT_PARK_NO_TASK in events.approval_gate if this diagnostic is important for observability.

Based on learnings: "Event names must always use constants from domain-specific modules under ai_company.observability.events" — the current usage misapplies the PARKED event for a pre-park diagnostic.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/engine/loop_helpers.py` around lines 373 - 378, The debug log currently emits the APPROVAL_GATE_CONTEXT_PARKED event while only noting a missing task_id; change this to a semantically correct diagnostic: either remove the parked event constant and call logger.debug with a plain message about the missing task_execution/task_id, or add a new event constant (e.g., APPROVAL_GATE_CONTEXT_PARK_NO_TASK) to ai_company.observability.events.approval_gate and use that here instead of APPROVAL_GATE_CONTEXT_PARKED; update the logger.debug invocation in loop_helpers.py (the call that passes approval_id, agent_id and note="No task_execution on context — task_id will be None") to use the chosen option so observability events remain accurate.

coderabbitai · 2026-03-13T23:39:01Z

+                    risk_level=ApprovalRiskLevel(
+                        result.metadata.get("risk_level", "high"),


⚠️ Potential issue | 🟡 Minor

Invalid risk_level metadata causes unhandled exception.

If result.metadata["risk_level"] contains an invalid value (not in ApprovalRiskLevel enum), ApprovalRiskLevel(...) raises ValueError. While the outer except Exception catches this, the error message won't clearly indicate a metadata validation issue.

Consider validating explicitly or using a safer fallback:

Safer risk_level handling

+            raw_level = result.metadata.get("risk_level", "high")
+            try:
+                risk = ApprovalRiskLevel(raw_level)
+            except ValueError:
+                risk = ApprovalRiskLevel.HIGH
+
             self._pending_escalations.append(
                 _EscalationInfo(
                     approval_id=str(result.metadata["approval_id"]),
                     tool_call_id=tool_call.id,
                     tool_name=tool.name,
                     action_type=tool.action_type,
-                    risk_level=ApprovalRiskLevel(
-                        result.metadata.get("risk_level", "high"),
-                    ),
+                    risk_level=risk,
                     reason="Agent requested human approval",
                 ),
             )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/ai_company/tools/invoker.py` around lines 633 - 634, The construction of ApprovalRiskLevel from result.metadata can raise ValueError for invalid strings; update the code that builds the object (the ApprovalRiskLevel(...) call using result.metadata.get("risk_level", "high")) to validate or convert the metadata value first: read risk_str = result.metadata.get("risk_level"), check if risk_str is a valid member of ApprovalRiskLevel (or wrap ApprovalRiskLevel(risk_str) in a try/except ValueError), and on failure log a clear metadata validation warning and fall back to a safe default (e.g., "high") before passing into ApprovalRiskLevel; ensure the change is applied where ApprovalRiskLevel is instantiated so invalid metadata never raises an uncaught ValueError.
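A self-contained sketch of the fallback follows. The enum here is a stand-in: only the `"high"` value and the `HIGH` member appear in the review, so the other members and the `parse_risk_level` helper name are assumptions made for illustration.

```python
from enum import Enum
from typing import Any


class ApprovalRiskLevel(str, Enum):
    """Stand-in enum; members other than HIGH are assumed for illustration."""

    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


def parse_risk_level(metadata: dict[str, Any]) -> ApprovalRiskLevel:
    """Read risk_level from tool metadata, falling back to HIGH.

    A missing key and an unrecognized value both resolve to HIGH, so bad
    metadata can only make the approval request stricter, never laxer.
    """
    raw_level = metadata.get("risk_level", "high")
    try:
        return ApprovalRiskLevel(raw_level)
    except ValueError:
        return ApprovalRiskLevel.HIGH
```

Defaulting to the stricter level is a design choice worth stating: a typo in metadata should not silently downgrade the risk classification.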

coderabbitai · 2026-03-13T23:39:01Z

+      const originalIsAxiosError = (await import('axios')).default.isAxiosError
+      vi.spyOn((await import('axios')).default, 'isAxiosError').mockImplementation((err) => {
+        if (err === axiosError) return true
+        return originalIsAxiosError(err)
+      })


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify spy restoration hygiene in this test file (read-only).
rg -n "vi\\.spyOn\\(" web/src/__tests__/stores/approvals.test.ts -C2
rg -n "vi\\.(clearAllMocks|restoreAllMocks)" web/src/__tests__/stores/approvals.test.ts -C2

Repository: Aureliolo/synthorg

Length of output: 1006

🏁 Script executed:

cat -n web/src/__tests__/stores/approvals.test.ts | head -100

Repository: Aureliolo/synthorg

Length of output: 3921

🏁 Script executed:

cat -n web/src/__tests__/stores/approvals.test.ts | tail -100

Repository: Aureliolo/synthorg

Length of output: 4523

🏁 Script executed:

# Check for afterEach in the entire test file
rg -n "afterEach|afterAll" web/src/__tests__/stores/approvals.test.ts -C2

Repository: Aureliolo/synthorg

Length of output: 44

Add afterEach hook with vi.restoreAllMocks() and fix axios.isAxiosError spy at lines 316–319.

Line 42 calls vi.clearAllMocks(), which only clears call history—not spy implementations. The axios.isAxiosError spies can leak into later tests. Additionally, the spy at lines 316–319 returns false for unmatched errors instead of delegating to the original implementation, creating inconsistency with lines 286–290.

Two fixes required:

Add afterEach block to restore all spies globally

Update lines 316–319 to preserve the original isAxiosError implementation

Suggested fixes

-import { describe, it, expect, beforeEach, vi } from 'vitest'
+import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'
 ...
   beforeEach(() => {
     setActivePinia(createPinia())
     vi.clearAllMocks()
   })
+
+  afterEach(() => {
+    vi.restoreAllMocks()
+  })

At lines 316–319, update to match the pattern at lines 286–290:

+      const originalIsAxiosError = (await import('axios')).default.isAxiosError
       vi.spyOn((await import('axios')).default, 'isAxiosError').mockImplementation((err) => {
         if (err === axiosError) return true
-        return false
+        return originalIsAxiosError(err)
       })

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@web/src/__tests__/stores/approvals.test.ts` around lines 286 - 290, Add an afterEach hook that calls vi.restoreAllMocks() to fully restore spies between tests (in addition to the existing vi.clearAllMocks()); also change the axios.isAxiosError spy at the second location to capture the original implementation (like the first instance) and delegate to it for non-matching errors by storing (await import('axios')).default.isAxiosError in originalIsAxiosError and calling originalIsAxiosError(err) when err !== axiosError so the spy preserves the original behavior.

coderabbitai · 2026-03-13T23:39:01Z

+          case 'approval.approved':
+          case 'approval.rejected':
+          case 'approval.expired':
+            try {
+              const updated = await approvalsApi.getApproval(approvalId)
+              approvals.value = approvals.value.map((a) =>
+                a.id === approvalId ? updated : a,
+              )
+            } catch (err) {
+              if (axios.isAxiosError(err) && (err.response?.status === 404 || err.response?.status === 410)) {
+                // Item genuinely gone — remove from local list
+                const lengthBefore = approvals.value.length
+                approvals.value = approvals.value.filter((a) => a.id !== approvalId)
+                const removed = lengthBefore - approvals.value.length
+                if (removed > 0) {
+                  total.value = Math.max(0, total.value - removed)
+                }
+              } else {
+                console.warn('Failed to fetch approval:', approvalId, err)
+              }
+            }
+            break


⚠️ Potential issue | 🟡 Minor

Status-change handler updates items regardless of current filters.

When an approval is approved, rejected, or expired, the code updates the item in place (lines 101-103) even if the updated status no longer matches activeFilters. This can leave non-matching items visible until the next manual refresh.

Consider filtering out items that no longer match after status changes.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@web/src/stores/approvals.ts` around lines 96 - 117, The handler for 'approval.approved'/'approval.rejected'/'approval.expired' currently replaces the item in approvals.value regardless of current activeFilters; instead, after fetching the updated approval via approvalsApi.getApproval(approvalId) check whether the updated item matches the current activeFilters (use whatever predicate/function you use elsewhere to test filter membership) and if it does replace the item in approvals.value, otherwise remove it from approvals.value and decrement total.value accordingly; keep the existing 404/410 removal branch as-is and perform this filter-check/replace-or-remove logic in the same switch case that currently updates approvals.value.

coderabbitai

♻️ Duplicate comments (1)

tests/unit/engine/test_approval_gate.py (1)

294-330: 🧹 Nitpick | 🔵 Trivial

Parameterize message-variant assertions to reduce duplication.

These three tests are structurally identical and are a good fit for a single parameterized test.

As per coding guidelines, "Prefer @pytest.mark.parametrize for testing similar cases."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unit/engine/test_approval_gate.py` around lines 294 - 330, Replace the
three nearly identical tests in TestBuildResumeMessage with a single
`@pytest.mark.parametrize` test that iterates over cases for
ApprovalGate.build_resume_message; for each tuple include (approved, decided_by,
decision_reason, expected_substrings) and assert all expected substrings are in
the returned msg (e.g., "APPROVED"/"REJECTED", approval ID, actor, optional
reason, and markers like "USER-SUPPLIED REASON" or "untrusted data" when
applicable). Keep the call to ApprovalGate.build_resume_message("approval-1",
...) and name the param set to make failures readable; remove the three
individual test_* methods after adding the parameterized test.
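A parameterized version might look like the sketch below. `build_resume_message` here is a stand-in whose wording is assumed; only the keyword arguments and the substring markers ("APPROVED", "REJECTED", "USER-SUPPLIED REASON", "untrusted data") come from the review text, and the real test would call `ApprovalGate.build_resume_message` instead.

```python
import pytest


def build_resume_message(approval_id, *, approved, decided_by, decision_reason=None):
    """Stand-in for ApprovalGate.build_resume_message; wording is assumed."""
    verdict = "APPROVED" if approved else "REJECTED"
    msg = f"Approval {approval_id} was {verdict} by {decided_by}."
    if decision_reason:
        # Reason is labeled so the agent treats it as data, not instructions.
        msg += f"\nUSER-SUPPLIED REASON (untrusted data): {decision_reason}"
    return msg


@pytest.mark.parametrize(
    ("approved", "decided_by", "decision_reason", "expected"),
    [
        pytest.param(
            True, "alice", None, ("APPROVED", "approval-1", "alice"),
            id="approved-no-reason",
        ),
        pytest.param(
            False, "bob", None, ("REJECTED", "approval-1", "bob"),
            id="rejected-no-reason",
        ),
        pytest.param(
            False, "bob", "too risky",
            ("REJECTED", "USER-SUPPLIED REASON", "untrusted data", "too risky"),
            id="rejected-with-reason",
        ),
    ],
)
def test_build_resume_message(approved, decided_by, decision_reason, expected):
    msg = build_resume_message(
        "approval-1",
        approved=approved,
        decided_by=decided_by,
        decision_reason=decision_reason,
    )
    for substring in expected:
        assert substring in msg
```

The `id=` values show up in pytest's output, so a failing case is identified by name rather than by its position in the list.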


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e5d8def0-b19b-4e4e-b300-716b9404a82c

📥 Commits

Reviewing files that changed from the base of the PR and between 3b5068f and 0664529.

📒 Files selected for processing (1)

tests/unit/engine/test_approval_gate.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Agent
GitHub Check: Greptile Review

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Python version 3.14+ with PEP 649 native lazy annotations required.
Do NOT use from __future__ import annotations — Python 3.14 has PEP 649 native lazy annotations.
Use PEP 758 except syntax: except A, B: (no parentheses) — ruff enforces this on Python 3.14.
Line length must be 88 characters, enforced by ruff.
NEVER use real vendor names (Anthropic, OpenAI, Claude, GPT, etc.) in project-owned code, docstrings, comments, tests, or config examples. Use generic names: example-provider, example-large-001, example-medium-001, example-small-001. Tests must use test-provider, test-small-001, etc.

Files:

tests/unit/engine/test_approval_gate.py

tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Tests must use markers: @pytest.mark.unit, @pytest.mark.integration, @pytest.mark.e2e, @pytest.mark.slow
Test coverage minimum: 80% (enforced in CI).
Async tests: use asyncio_mode = "auto" — no manual @pytest.mark.asyncio needed.
Test timeout: 30 seconds per test.
Prefer @pytest.mark.parametrize for testing similar cases.

Files:

tests/unit/engine/test_approval_gate.py

🧠 Learnings (1)

📚 Learning: 2026-03-13T21:03:58.906Z

Learnt from: CR
Repo: Aureliolo/synthorg PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-03-13T21:03:58.906Z
Learning: Applies to tests/**/*.py : Prefer `pytest.mark.parametrize` for testing similar cases.

Applied to files:

tests/unit/engine/test_approval_gate.py

🧬 Code graph analysis (1)

tests/unit/engine/test_approval_gate.py (2)

src/ai_company/engine/approval_gate.py (5)

ApprovalGate (40-274)

should_park (71-90)

park_context (92-160)

resume_context (162-239)

build_resume_message (242-274)

src/ai_company/security/timeout/park_service.py (3)

ParkService (29-144)

park (37-106)

resume (108-144)

🔇 Additional comments (5)

tests/unit/engine/test_approval_gate.py (5)

1-13: LGTM!

Module setup correctly applies unit test markers and timeout. No from __future__ import annotations as required for Python 3.14+.

16-31: LGTM!

Clean test factory with sensible defaults. The noqa: PLR0913 is appropriate for test helper functions.

34-55: LGTM!

Well-designed fixtures that compose correctly. The parked_mock fixture's dependency on park_service ensures consistent mock state across tests.

58-70: LGTM!

Tests correctly verify the first-or-none semantics of should_park. Using a real ParkService() is acceptable here since the method doesn't interact with internal service state.

73-291: LGTM!

Comprehensive test coverage for park_context and resume_context flows. Good edge case handling:

Serialization/persistence error propagation (lines 144-182)

Parked record preservation on deserialization failure (lines 250-268)

Context returned despite delete failure (lines 270-291)

Copilot

Pull request overview

Adds an approval-gating workflow to the execution engine so SecOps ESCALATE verdicts and the new request_human_approval tool can “park” an execution pending a human decision, with supporting API/WebSocket and dashboard updates.

Changes:

Introduces ApprovalGate, EscalationInfo/ResumePayload, and loop/helper integration to terminate runs as PARKED when approval is required.
Adds RequestHumanApprovalTool and extends ToolInvoker + ToolRegistry to track and surface pending escalations.
Hardens approvals API (auth-bound requested_by, conflict-safe saves) and fixes frontend WS handling to use approval_id + re-fetch items.

Reviewed changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 6 comments.


File	Description
web/src/stores/approvals.ts	Updates WS approval event handling to use `approval_id` and re-fetch full items.
web/src/__tests__/stores/approvals.test.ts	Expands WS event test coverage for new fetch-based behavior and error paths.
tests/unit/tools/test_invoker_escalation.py	Adds unit tests for `ToolInvoker.pending_escalations` (SecOps + tool metadata).
tests/unit/tools/test_approval_tool.py	Adds unit tests for `RequestHumanApprovalTool` behavior, validation, and failure handling.
tests/unit/observability/test_events.py	Updates event discovery/markers to include new `approval_gate` domain module.
tests/unit/engine/test_loop_helpers_approval.py	Tests loop helper parking behavior and error handling when escalations occur.
tests/unit/engine/test_approval_gate_models.py	Tests Pydantic models for escalation + resume payloads.
tests/unit/engine/test_approval_gate.py	Tests `ApprovalGate` park/resume behavior and resume message construction.
tests/unit/api/test_dto.py	Updates DTO tests for new `action_type` format validation and removed fields.
tests/unit/api/controllers/test_approvals.py	Updates controller tests for new create payload semantics.
src/ai_company/tools/registry.py	Adds `all_tools()` to enumerate tool instances deterministically.
src/ai_company/tools/invoker.py	Tracks escalation info from SecOps ESCALATE verdicts and tool parking metadata.
src/ai_company/tools/approval_tool.py	Implements `RequestHumanApprovalTool` that creates `ApprovalItem` and signals parking.
src/ai_company/tools/__init__.py	Exports `RequestHumanApprovalTool`.
src/ai_company/security/timeout/parked_context.py	Allows `task_id` to be `None` for taskless agents.
src/ai_company/security/timeout/park_service.py	Makes `ParkService.park()` accept optional `task_id`.
src/ai_company/observability/events/approval_gate.py	Adds structured event constants for approval gate lifecycle.
src/ai_company/engine/react_loop.py	Wires optional approval gate into ReAct loop tool-call execution path.
src/ai_company/engine/plan_execute_loop.py	Wires optional approval gate into Plan/Execute tool-call execution path.
src/ai_company/engine/loop_helpers.py	Adds escalation check after tool execution and parks context via approval gate.
src/ai_company/engine/approval_gate_models.py	Adds frozen models for escalation tracking and resume decision payloads.
src/ai_company/engine/approval_gate.py	Implements park/resume coordination and safe resume-message formatting.
src/ai_company/engine/agent_engine.py	Integrates approval gate + tool registry augmentation; extracts security factory.
src/ai_company/engine/_security_factory.py	Centralizes SecOps interceptor creation and registry augmentation with approval tool.
src/ai_company/engine/__init__.py	Exposes approval gate types from engine package.
src/ai_company/api/ws_models.py	Adds `approval.resumed` WS event type.
src/ai_company/api/state.py	Stores optional `ApprovalGate` on `AppState`.
src/ai_company/api/dto.py	Tightens DTO validation (action_type format, TTL bounds, comment max length).
src/ai_company/api/controllers/approvals.py	Binds `requested_by` to auth user; uses conflict-safe saves; attempts resume trigger.
src/ai_company/api/approval_store.py	Adds `save_if_pending()` and hardens `on_expire` callback error handling.
docs/design/engine.md	Documents parking behavior on escalations in engine run flow.
README.md	Updates status to reflect approval gate implementation.
CLAUDE.md	Updates repo module descriptions and event-constant guidance to include approval gate.


+      vi.spyOn((await import('axios')).default, 'isAxiosError').mockImplementation((err) => {
+        if (err === axiosError) return true
+        return false
+      })


+    try:
+        result = await approval_gate.resume_context(approval_id)
+        if result is None:
+            return
+
+        _ctx, parked_id = result
+        resume_message = ApprovalGate.build_resume_message(
+            approval_id,
+            approved=approved,
+            decided_by=decided_by,
+            decision_reason=decision_reason,
+        )
+        logger.info(
+            APPROVAL_GATE_RESUME_TRIGGERED,
+            approval_id=approval_id,
+            parked_id=parked_id,
+            approved=approved,
+            decided_by=decided_by,
+            resume_message_length=len(resume_message),
+        )


+    """Best-effort resume of a parked agent context after a decision.
+
+    If an ``ApprovalGate`` is configured, loads the parked context,
+    builds a resume message, and publishes a WebSocket event.
+    Failures are logged at WARNING and never propagate to the caller.
+


+    task_id: NotBlankStr | None = Field(
+        default=None, description="Task identifier (None for taskless agents)"
+    )


+                const item = await approvalsApi.getApproval(approvalId)
+                if (activeFilters.value) {
+                  // Filters are active — item may not match; just bump total
+                  total.value++
+                } else {
+                  approvals.value = [item, ...approvals.value]
+                  total.value++
+                }


+      // Patch axios.isAxiosError to recognize our mock
+      const originalIsAxiosError = (await import('axios')).default.isAxiosError
+      vi.spyOn((await import('axios')).default, 'isAxiosError').mockImplementation((err) => {
+        if (err === axiosError) return true
+        return originalIsAxiosError(err)
+      })


- _trigger_resume no longer calls resume_context(), which would delete the parked record before a scheduler can consume it; it now only logs the resume trigger for scheduler observation
- Remove dead APPROVAL_RESUMED WsEventType (never emitted)
- Remove unused ApprovalGate import from controller

greptile-apps · 2026-03-14T08:15:05Z

+                except MemoryError, RecursionError:
+                    raise


Python 3 except tuple syntax missing parentheses — SyntaxError across all new files

except MemoryError, RecursionError: is Python 2 syntax. In Python 3 the comma-separated form (without parentheses) is a SyntaxError; the tuple must be explicitly wrapped. Any module that contains this form will fail to import entirely.

The same bug appears throughout all the new files added in this PR:

src/ai_company/api/approval_store.py:172

src/ai_company/api/controllers/approvals.py:100

src/ai_company/engine/approval_gate.py:127, 149, 202, 216

src/ai_company/engine/loop_helpers.py:69, 114, 186, 303, 387

src/ai_company/tools/approval_tool.py:162, 238

src/ai_company/tools/invoker.py:212, 295, 639

Note: the existing (pre-PR) code in invoker.py already uses the correct form except (MemoryError, RecursionError) as exc: at lines 453, 539, 575, and 698 — confirming this is an inconsistency introduced only in the new additions.

Fix every occurrence by wrapping the two types in parentheses:

Suggested change

-                except MemoryError, RecursionError:
-                    raise
+                except (MemoryError, RecursionError):
+                    raise

The pattern to apply globally is:

# Wrong (Python 2)
except MemoryError, RecursionError:

# Correct (Python 3)
except (MemoryError, RecursionError):

Prompt To Fix With AI

This is a comment left during a code review.
Path: src/ai_company/api/approval_store.py
Line: 172-173

Comment:
**Python 3 `except` tuple syntax missing parentheses — SyntaxError across all new files**

`except MemoryError, RecursionError:` is Python 2 syntax. In Python 3 the comma-separated form (without parentheses) is a `SyntaxError`; the tuple must be explicitly wrapped. Any module that contains this form will fail to import entirely.

The same bug appears throughout all the new files added in this PR:
- `src/ai_company/api/approval_store.py:172`
- `src/ai_company/api/controllers/approvals.py:100`
- `src/ai_company/engine/approval_gate.py:127, 149, 202, 216`
- `src/ai_company/engine/loop_helpers.py:69, 114, 186, 303, 387`
- `src/ai_company/tools/approval_tool.py:162, 238`
- `src/ai_company/tools/invoker.py:212, 295, 639`

Note: the existing (pre-PR) code in `invoker.py` already uses the correct form `except (MemoryError, RecursionError) as exc:` at lines 453, 539, 575, and 698 — confirming this is an inconsistency introduced only in the new additions.

Fix every occurrence by wrapping the two types in parentheses:

```suggestion
                except (MemoryError, RecursionError):
                    raise
```

The pattern to apply globally is:

```python
# Wrong (Python 2)
except MemoryError, RecursionError:

# Correct (Python 3)
except (MemoryError, RecursionError):
```

How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-14T08:15:06Z

+    else:
+        logger.debug(
+            APPROVAL_GATE_CONTEXT_PARKED,
+            approval_id=escalation.approval_id,
+            agent_id=agent_id,
+            note="No task_execution on context — task_id will be None",
+        )


APPROVAL_GATE_CONTEXT_PARKED emitted before parking has occurred

This debug log fires when ctx.task_execution is None — i.e. before park_context() is even called. APPROVAL_GATE_CONTEXT_PARKED implies the context has been parked, so any log aggregation or alerting rule watching for approval_gate.context.parked will see a spurious event even if the subsequent park_context() call fails.

An informational constant such as APPROVAL_GATE_ESCALATION_DETECTED (or a dedicated "parking skipped task_id" diagnostic) is more accurate here. The successful-park event is already emitted inside ApprovalGate.park_context(), so this line also produces a duplicate on the happy path.

Suggested change

-    else:
-        logger.debug(
-            APPROVAL_GATE_CONTEXT_PARKED,
-            approval_id=escalation.approval_id,
-            agent_id=agent_id,
-            note="No task_execution on context — task_id will be None",
-        )
+        logger.debug(
+            APPROVAL_GATE_ESCALATION_DETECTED,
+            approval_id=escalation.approval_id,
+            agent_id=agent_id,
+            note="No task_execution on context — task_id will be None",
+        )

Path: src/ai_company/engine/loop_helpers.py Line: 372-378

greptile-apps · 2026-03-14T08:15:07Z
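
The event-ordering concern in the comment above can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `park_with_events`, `park`, and `emit` are hypothetical names, the string value of `APPROVAL_GATE_ESCALATION_DETECTED` is assumed, and the parked event is emitted by the helper itself here, whereas the review says the real one is emitted inside `ApprovalGate.park_context()`.

```python
from typing import Callable, Optional

# "approval_gate.context.parked" is the event name quoted in the review;
# the escalation-detected value below is an assumption for illustration.
APPROVAL_GATE_ESCALATION_DETECTED = "approval_gate.escalation.detected"
APPROVAL_GATE_CONTEXT_PARKED = "approval_gate.context.parked"


def park_with_events(
    park: Callable[[str], None],
    emit: Callable[[str], None],
    approval_id: str,
    task_id: Optional[str] = None,
) -> None:
    """Emit a diagnostic before parking and the parked event only on success."""
    if task_id is None:
        # Diagnostic only: nothing has been parked at this point.
        emit(APPROVAL_GATE_ESCALATION_DETECTED)
    # May raise; if it does, the parked event below is never emitted.
    park(approval_id)
    emit(APPROVAL_GATE_CONTEXT_PARKED)
```

With this ordering, an alert keyed on the parked event fires only after `park` has returned, and a failed park leaves just the diagnostic event behind.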

+                  total.value++
+                } else {
+                  approvals.value = [item, ...approvals.value]
+                  total.value++
+                }
+              } catch (err) {
+                if (axios.isAxiosError(err) && (err.response?.status === 404 || err.response?.status === 410)) {
+                  // Item genuinely gone — skip
+                } else {
+                  console.warn('Failed to fetch approval:', approvalId, err)


total incremented unconditionally when filters are active, regardless of whether the new item matches them

When activeFilters.value is truthy, total.value++ is incremented for every approval.submitted event even though the newly submitted item may not satisfy the current filter (e.g. the user is filtering by status: 'rejected' while a new pending item arrives). This inflates the displayed count and can cause pagination to request pages that don't exist.

A stricter approach would be to skip the increment when the item's status/risk_level/action_type don't match the active filter criteria, or to do a quick attribute check on the already-fetched item object:

if (activeFilters.value) {
  // Only bump total if the item matches the active filter constraints
  const f = activeFilters.value
  const matches =
    (!f.status || item.status === f.status) &&
    (!f.risk_level || item.risk_level === f.risk_level) &&
    (!f.action_type || item.action_type === f.action_type)
  if (matches) total.value++
} else {
  approvals.value = [item, ...approvals.value]
  total.value++
}

Path: web/src/stores/approvals.ts Line: 82-91

Aureliolo added 2 commits March 13, 2026 23:16

fix: address review findings across approval workflow

b999b65

Copilot AI review requested due to automatic review settings March 13, 2026 22:42

Aureliolo temporarily deployed to ci March 13, 2026 22:42 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Aureliolo March 13, 2026 22:43

Aureliolo temporarily deployed to cloudflare-preview March 13, 2026 22:43 — with GitHub Actions Inactive

gemini-code-assist Bot reviewed Mar 13, 2026

greptile-apps Bot reviewed Mar 13, 2026

Comment thread src/ai_company/engine/approval_gate.py

Comment thread src/ai_company/tools/approval_tool.py

Comment thread src/ai_company/engine/loop_helpers.py

Copilot AI reviewed Mar 13, 2026

coderabbitai Bot reviewed Mar 13, 2026

Aureliolo temporarily deployed to ci March 13, 2026 23:35 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview March 13, 2026 23:35 — with GitHub Actions Inactive

fix: resolve mypy no-any-return in test fixture

0664529

Copilot AI review requested due to automatic review settings March 13, 2026 23:36

Aureliolo temporarily deployed to ci March 13, 2026 23:37 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Aureliolo March 13, 2026 23:37

Aureliolo temporarily deployed to cloudflare-preview March 13, 2026 23:37 — with GitHub Actions Inactive

coderabbitai Bot reviewed Mar 13, 2026

greptile-apps Bot reviewed Mar 13, 2026

Comment thread src/ai_company/api/controllers/approvals.py Outdated

Comment thread src/ai_company/api/controllers/approvals.py Outdated

Copilot AI reviewed Mar 13, 2026

Aureliolo temporarily deployed to ci March 14, 2026 08:08 — with GitHub Actions Inactive

Aureliolo had a problem deploying to cloudflare-preview March 14, 2026 08:09 — with GitHub Actions Error

Aureliolo closed this Mar 14, 2026

Aureliolo temporarily deployed to cloudflare-preview March 14, 2026 08:10 — with GitHub Actions Inactive

Aureliolo temporarily deployed to ci March 14, 2026 08:10 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview March 14, 2026 08:11 — with GitHub Actions Inactive

greptile-apps Bot reviewed Mar 14, 2026

		parking is needed. If so, the context is serialized via `ParkService`,
		persisted, and the loop returns a `PARKED` result.

		risk_level=ApprovalRiskLevel(
		result.metadata.get("risk_level", "high"),

Conversation

Aureliolo commented Mar 13, 2026

Summary

Review coverage

Test plan

github-actions Bot commented Mar 13, 2026

Dependency Review

Scanned Files

coderabbitai Bot commented Mar 13, 2026

Review failed

Summary by CodeRabbit

Release Notes

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Possibly related PRs

❌ Failed checks (1 warning)

gemini-code-assist Bot commented Mar 13, 2026

Summary of Changes

Highlights

Footnotes

gemini-code-assist Bot left a comment

Code Review

gemini-code-assist Bot Mar 13, 2026

greptile-apps Bot commented Mar 13, 2026

Greptile Summary

Confidence Score: 1/5

Important Files Changed

Sequence Diagram

Copilot AI left a comment

Pull request overview

Reviewed changes

coderabbitai Bot left a comment

coderabbitai Bot Mar 13, 2026

coderabbitai Bot Mar 13, 2026

coderabbitai Bot Mar 13, 2026

codecov Bot commented Mar 13, 2026

Codecov Report

coderabbitai Bot left a comment

coderabbitai Bot Mar 13, 2026
