Skip to content

feat: conversational interface v1 - 1:1 clarify + propose#2019

Merged
Aureliolo merged 11 commits into
mainfrom
feat/conversational-interface-v1
May 20, 2026
Merged

feat: conversational interface v1 - 1:1 clarify + propose#2019
Aureliolo merged 11 commits into
mainfrom
feat/conversational-interface-v1

Conversation

@Aureliolo
Copy link
Copy Markdown
Owner

Summary

EPIC #1967 keystone: a human talks to ONE fixed top agent (Chief of Staff) in natural language. The agent asks clarifying questions when the request is underspecified, then parks one or more concrete WorkItems in the human approval queue. On approval each item runs through the work pipeline; no autonomous acting (direct MCP acting is the separate child #1972).

The pipeline spine (build_work_pipeline, #1960) and approval producer (#1956) already landed and are ENFORCED in the ghost-wiring manifest. This PR drives them from a conversation.

What changed

Service + API

  • src/synthorg/meta/chief_of_staff/propose.py :: ChiefOfStaffProposer (single-responsibility; keeps chat.py explain-only). Structured clarify-or-propose LLM turn via core/json_parsing.py. SEC-1: human content wrapped via wrap_untrusted(TAG_TASK_DATA, ...).
  • POST /meta/chat/propose on MetaController with ConversationalProposeRequest args model, meta.chat.propose (5/60s) rate-limit, 503 dependency-matrix (proposer + work pipeline must be wired).
  • src/synthorg/api/controllers/_approval_review_gate.py :: try_conversational_intake_resume Flow 0 before parked-context / review-gate. On approve: reconstructs WorkItem from the proposal and drives app_state.work_pipeline.run. On reject: marks proposal REJECTED, pipeline untouched.
  • build_chief_of_staff_proposer factory + _wire_chief_of_staff_proposer on_startup hook + AppState set-once accessors (chief_of_staff_proposer, conversational_proposal_repo).
  • ChiefOfStaffConfig propose_* fields (opt-in, default off).

Data model

  • New ApprovalSource.CONVERSATIONAL_INTAKE discriminator + new ConversationRole / ConversationStatus / ConversationalProposalStatus enums.
  • Frozen Pydantic models: Conversation, ConversationTurn, ProposedWork, ProposeDecision (XOR validator), ConversationalProposal, ProposeArgs, ProposedApprovalSummary, ProposeResult.

Persistence (dual-backend)

  • New protocols: ConversationRepository (StatefulRepository), ConversationTurnRepository (AppendOnlyRepository), ConversationalProposalRepository (Stateful + FilteredQuery). Generic categories per ADR-0001; no bespoke methods.
  • Concrete sqlite + postgres impls, conversational_factory.build_conversational_repositories (backend-aware; degrades gracefully to None for absent / unknown backends so app boot stays clean).
  • One yoyo revision per backend (20260519000001_conversational_intake.sql). Postgres widens approvals.source CHECK via DROP/ADD CONSTRAINT. SQLite intentionally keeps the narrow CHECK (rebuild via writable_schema was perf-too-costly under the unit suite's --count=2 isolation gate; the v1 ApprovalStore is in-memory by default so CONVERSATIONAL_INTAKE rows never reach SQLite. Documented inline.).
  • schema_drift_baseline.txt grown by 4 canonical TEXT/TIMESTAMPTZ entries for the new timestamp columns (user-approved, identical to every other dual-backend table). Committed with ALLOW_BASELINE_GROWTH=1.

Errors / events

  • New <Domain><Condition>Error family in meta/errors.py: ConversationNotFoundError (404), ConversationClosedError (409), ConversationalProposeUnavailableError (503), ConversationalProposeResponseInvalidError (502). New ErrorCode entries 3017 / 4015 / 7010.
  • New event constants in observability/events/{chief_of_staff,approval_gate,persistence}.py (clarify/propose/cap/transition + dispatcher branch + persistence repo error/read events).

Manifest discipline (hard acceptance criterion)

  • scripts/_ghost_wiring_manifest.txt: new ENFORCED build_chief_of_staff_proposer #1968 line. Verified by scripts/check_no_ghost_wiring.py.

Web / generated types

  • Regenerated web/src/api/types/{openapi,dtos,enum-values,error-codes}.gen.ts for the new endpoint and enums.
  • Hand-written postChatPropose() client + ConversationalProposeResponse interface in web/src/api/endpoints/meta.ts; MSW handler in web/src/mocks/handlers/meta.ts (1:1 parity rule).

Docs

  • docs/design/self-improvement.md: /meta/chat/propose endpoint, propose.py + model trio in package inventory, three-flow approval-decision routing (Flow 0 conversational-intake), YAML defaults for propose_*.
  • README.md: v1 scope clarified (clarify + propose; no autonomous acting).
  • CLAUDE.md: one-liner on proposer wiring + 503 conditions.

Test plan

  • Unit (29 503 passed locally on -m unit): enums, models, errors, settings, proposer (clarify / propose / untrusted-wrap / parse-failure / clarification cap / project-resolution), wiring (build_chief_of_staff_proposer 503 matrix; backend-aware factory degradation), dispatch branch (approve-runs-pipeline, reject-skips-pipeline, missing-proposal-owned-but-noop, pipeline-unavailable-503, pipeline-failure-leaves-PENDING), API model validation + route + rate-limit-policy registration.
  • Dual-backend conformance (tests/conformance/persistence/, pytest.mark.integration): Conversation, ConversationTurn, ConversationalProposal against sqlite + postgres. Schema-drift gates green on both backends.
  • Simulation-harness e2e (tests/e2e/test_conversational_propose_e2e.py, pytest.mark.e2e): builds the REAL pipeline via build_runtime_services, REAL ChiefOfStaffProposer with a scripted clarify-then-propose provider, REAL ApprovalStore, REAL approval-decision seam (signal_resume_intent). Asserts: vague request → clarify → answer → PENDING approval in store → approve → Task created and advanced past CREATED by a real agent. Reject path: pipeline never runs.
  • All convention gates green (ghost-wiring, schema-drift, dual-backend parity, persistence-boundary, dto-types-ts-in-sync, error-codes-ts-in-sync, no-magic-numbers, no-review-origin, frozen-extra-forbid, ...).

Review coverage

11 specialist review agents ran pre-push (code, python, security, persistence, conventions, logging, async-concurrency, api-contract-drift, issue-resolution-verifier, docs-consistency, comment-quality-rot). 18 valid findings (7 High, 10 Medium, 1 doc polish) addressed in commit 80b34f9. 10 false positives dismissed with reasons in _audit/pre-pr-review/triage.md. issue-resolution-verifier returned PASS on every acceptance criterion and the parent epic's manifest-discipline rule.

Closes #1968

Aureliolo added 2 commits May 20, 2026 00:41
11-agent pre-pr-review surfaced findings; this round addresses 18 valid
items. Details in _audit/pre-pr-review/triage.md.

Critical / High:
- log via event constants (PERSISTENCE_CONVERSATIONAL_UNKNOWN_BACKEND,
  PERSISTENCE_CONVERSATIONAL_HANDLE_UNAVAILABLE) instead of string literals
- check transition_if return value in the approval dispatcher (both
  reject + execute paths) and surface a no-op as WARNING
- check conversation transition_if return in the proposer; warn when a
  concurrent propose-turn already flipped the status
- reorder _park_proposal: save proposal before adding approval to the
  queue so an approval-store failure leaves an invisible orphan
  proposal instead of a visible dangling queue item
- add postChatPropose() client + ConversationalProposeResponse type +
  MSW handler for /meta/chat/propose (1:1 MSW parity rule)

Medium:
- split _park_proposal: extract _build_work_item + _build_approval_item
  helpers (<50 line rule)
- emit COS_CONVERSATION_STATUS_TRANSITIONED INFO on ACTIVE to PROPOSED
  and ACTIVE to CLOSED transitions
- document the sequential user/assistant turn append cancellation
  window in converse()
- design doc: /meta/chat/propose endpoint, Conversation* models, Flow 0
  routing, propose_* YAML defaults
- README: clarify v1 covers clarify + propose only (no autonomous
  acting); CLAUDE.md: one-liner on proposer wiring + 503 conditions
- regenerate web TS DTO bundle for the new endpoint
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the 'conversational interface' v1, enabling users to interact with the Chief of Staff agent to propose work. The system handles natural language requests by either asking clarifying questions or parking concrete work items in the approval queue. Upon human approval, these items are automatically routed through the work pipeline, ensuring all autonomous actions remain under strict human governance.

Highlights

  • Conversational Propose Endpoint: Added POST /meta/chat/propose to allow natural language interaction with the Chief of Staff agent, supporting clarification or work proposal generation.
  • Approval Decision Routing: Implemented Flow 0 in the approval decision seam to handle conversational intake, enabling the pipeline to execute approved work items.
  • Persistence Layer: Introduced dual-backend (SQLite/Postgres) repositories for conversations, turns, and proposals, including schema migrations and ghost-wiring enforcement.
  • Safety and Governance: Enforced mandatory human approval for all conversational proposals; no autonomous acting is permitted.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 19, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: e9250de9-0db5-4831-a006-393cc22cf2b1

📥 Commits

Reviewing files that changed from the base of the PR and between 02ea50d and fd67914.

📒 Files selected for processing (6)
  • src/synthorg/api/approval_store.py
  • src/synthorg/approval/protocol.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • tests/unit/approval/test_protocol.py
  • tests/unit/hr/test_registry_autonomy.py
  • tests/unit/meta/chief_of_staff/test_propose.py
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Dashboard Test
  • GitHub Check: Test E2E
  • GitHub Check: Test Unit
  • GitHub Check: Test Integration
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics live in settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal); no other hardcoded values; enforced by scripts/check_no_magic_numbers.py

Type hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Errors: <Domain><Condition>Error from DomainError; never inherit Exception/RuntimeError/etc directly; enforced by check_domain_error_hierarchy.py

Pydantic v2 frozen + extra="forbid" on every frozen model project-wide (gate check_frozen_model_extra_forbid.py; @computed_field auto-exempt, per-line # lint-allow: frozen-extra-forbid -- <reason> for extra="allow"/"ignore" boundaries); @computed_field for derived; NotBlankStr for identifiers

Args models at every system boundary; parse_typed() for every external dict ingestion; enforced by check_boundary_typed.py

Immutability: model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Async: asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)

Untrusted content (SEC-1): wrap_untrusted() from engine.prompt_safety; HTMLParseGuard for HTML

Use from synthorg.observability import get_logger; variable always logger; never import logging / print() in app code; structured kwargs (logger.info(EVENT, key=value)); event names from observability.events.<domain> constants

Error paths log WARNING/ERROR with context before raising; state transitions log INFO via *_STATUS_TRANSITIONED AFTER persistence write

Secret-log redaction (SEC-1): never error=str(exc) or interpolate {exc}; use error_type=type(exc).__name__ + error=safe_error_description(exc); never exc_info=True; OTel: forbid span.record_exception(exc), use span.set_attribute("exception.message", safe_error_description(exc)) + `record_exception=Fal...

Files:

  • src/synthorg/approval/protocol.py
  • src/synthorg/api/approval_store.py
  • src/synthorg/meta/chief_of_staff/propose.py
{src/synthorg/**/*.py,tests/**/*.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic: NEVER use real vendor names in project code/tests; use example-provider, test-provider, example-{large,medium,small}-001; allowed in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Files:

  • src/synthorg/approval/protocol.py
  • tests/unit/hr/test_registry_autonomy.py
  • src/synthorg/api/approval_store.py
  • tests/unit/approval/test_protocol.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • tests/unit/meta/chief_of_staff/test_propose.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/approval/protocol.py
  • src/synthorg/api/approval_store.py
  • src/synthorg/meta/chief_of_staff/propose.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: ladder in conventions.md section 12.1; FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundary blocked by scripts/check_mock_spec.py

FakeClock and mock_of import from tests._shared; inject via clock= and the helper's spec subscript

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Flaky tests: NEVER skip/xfail; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/hr/test_registry_autonomy.py
  • tests/unit/approval/test_protocol.py
  • tests/unit/meta/chief_of_staff/test_propose.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/hr/test_registry_autonomy.py
  • tests/unit/approval/test_protocol.py
  • tests/unit/meta/chief_of_staff/test_propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: No region/currency/locale privileged; use metric units; use British English per docs/reference/regional-defaults.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: Every convention PR ships its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR use `/aurelio-review-pr` for external feedback; fix EVERYTHING valid, no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced; signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: Pre-commit/pre-push hooks: `.pre-commit-config.yaml`; tool-call gates: `.claude/settings.json` PreToolUse (`scripts/check_*.sh`/`.py`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: GitHub queries: `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T03:11:17.540Z
Learning: CLI: see `cli/CLAUDE.md` for CLI-specific guidelines; use `go -C cli`, never `cd cli`
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/approval/protocol.py
  • tests/unit/hr/test_registry_autonomy.py
  • src/synthorg/api/approval_store.py
  • tests/unit/approval/test_protocol.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • tests/unit/meta/chief_of_staff/test_propose.py
🔇 Additional comments (9)
src/synthorg/approval/protocol.py (1)

44-51: LGTM!

tests/unit/hr/test_registry_autonomy.py (1)

60-63: LGTM!

src/synthorg/api/approval_store.py (1)

220-251: LGTM!

tests/unit/approval/test_protocol.py (1)

66-66: LGTM!

src/synthorg/meta/chief_of_staff/propose.py (4)

246-254: LGTM!


431-476: LGTM!


575-645: LGTM!


647-687: LGTM!

tests/unit/meta/chief_of_staff/test_propose.py (1)

546-612: LGTM!


Walkthrough

This PR implements the conversational interface v1 feature — a clarifying-first "clarify-and-propose" flow for the Chief of Staff agent. It adds domain models and enums, persistence protocols and SQLite/Postgres implementations, a ChiefOfStaffProposer service with LLM prompting, approval-routing integration for conversational intake, a guarded POST /meta/chat/propose API, app startup wiring, rate-limit policy, observability constants, and extensive unit, conformance, and end-to-end tests.

@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 19, 2026 23:19 — with GitHub Actions Inactive
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 19, 2026

Merging this PR will not alter performance

✅ 54 untouched benchmarks


Comparing feat/conversational-interface-v1 (fd67914) with main (679c3bb)

Open in CodSpeed

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/app.py`:
- Around line 1196-1199: Currently the code only wires the proposal repository
when both proposer and repositories are present; instead, always attach the repo
if repositories is available so conversational approvals can route to it even
when proposer is absent: call
app_state.set_conversational_proposal_repo(repositories.proposal_repo) whenever
repositories is not None, and keep
app_state.set_chief_of_staff_proposer(proposer) gated only by proposer being not
None (refer to the proposer and repositories variables and the methods
app_state.set_chief_of_staff_proposer and
app_state.set_conversational_proposal_repo).

In `@src/synthorg/api/controllers/_approval_review_gate.py`:
- Around line 279-304: Concurrent requests can both call
app_state.work_pipeline.run(work_item) before the CAS transition via
repo.transition_if(proposal.id, ConversationalProposalStatus.PENDING,
ConversationalProposalStatus.EXECUTED), causing duplicate pipeline side-effects;
fix by adding an intermediate CAS transition to an EXECUTING state before
running the pipeline: perform repo.transition_if(proposal.id, PENDING,
ConversationalProposalStatus.EXECUTING) and only run
app_state.work_pipeline.run(work_item) if that CAS succeeds, and on pipeline
failure revert the state (transition back to PENDING) or to a FAILED state so
only the winner executes side-effects; alternatively, if you cannot add a state,
explicitly document and enforce idempotency of pipeline handlers, but prefer
adding the EXECUTING transition and handling revert paths around
work_pipeline.run and the subsequent final transition to EXECUTED.

In `@src/synthorg/api/controllers/meta.py`:
- Around line 467-473: The code passes raw user input (data.message) into domain
orchestration when building ProposeArgs for proposer.converse; wrap the
untrusted content by calling wrap_untrusted(TAG_TASK_DATA, data.message) and
pass that wrapped value into ProposeArgs (replace data.message) so the
conversation boundary enforces SEC-1; update the ProposeArgs construction used
by proposer.converse accordingly (reference: ProposeArgs, proposer.converse,
wrap_untrusted, TAG_TASK_DATA).

In `@src/synthorg/meta/chief_of_staff/config.py`:
- Around line 90-93: Hoist the magic-number defaults and bounds used in the
propose fields into module-level annotated constants and use those constants in
the Field(...) calls: introduce constants like PROPOSE_TEMPERATURE_DEFAULT:
float, PROPOSE_TEMPERATURE_MIN: float, PROPOSE_TEMPERATURE_MAX: float,
PROPOSE_MAX_TOKENS_DEFAULT: int, PROPOSE_MAX_TOKENS_MIN: int,
PROPOSE_MAX_PROPOSALS_DEFAULT: int, PROPOSE_MAX_PROPOSALS_MIN/MAX: int,
PROPOSE_MAX_CLARIFICATION_DEFAULT: int, PROPOSE_MAX_CLARIFICATION_MIN/MAX: int,
then update propose_temperature, propose_max_tokens,
propose_max_proposals_per_turn, and propose_max_clarification_turns to reference
those constants in their default, ge, and le parameters (preserve same numeric
values but no hardcoded literals in Field).

In `@src/synthorg/meta/chief_of_staff/models.py`:
- Around line 447-451: The ProposeResult model allows inconsistent states; add
an in-model invariant check (e.g., a pydantic root_validator or
__post_init__/validate method on ProposeResult) that enforces: when status ==
"needs_clarification" then clarifying_question must be non‑None, proposals must
be empty, and conversation_closed must be False; when status == "proposed" then
proposals must be non‑empty and clarifying_question must be None (and optionally
conversation_closed allowed only if proposals present). Implement this
validation inside the ProposeResult model so invalid combinations cannot be
instantiated or passed through.

In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 271-277: The current except Exception handler in propose.py logs
all exceptions via COS_PROPOSE_FAILED; add an earlier except clause to
immediately re-raise system errors so they don't get treated as operational
failures: insert an except (MemoryError, RecursionError): raise before the
existing except Exception as exc: block (the block that logs with logger.error
and calls safe_error_description). This ensures MemoryError and RecursionError
propagate unchanged while other exceptions continue to be logged and re-raised.
- Around line 238-242: Replace the hardcoded literal limit=1000 in the call to
self._turn_repo.query within propose.py by introducing a module-level annotated
constant (e.g. _MAX_TURNS_QUERY_LIMIT: int = 1000) near the top of the module
and use that constant in the query call (limit=_MAX_TURNS_QUERY_LIMIT); update
any imports/comments if needed and ensure the constant name is clear and
annotated with type int to satisfy the coding guideline.

In `@tests/unit/api/controllers/test_conversational_intake_resume.py`:
- Around line 168-239: Add a unit test that covers the early-return when the
proposal repository is missing: create an ApprovalStore with _approval("a1"),
add it, build a _FakeAppState with proposal_repo=None, pipeline=_FakePipeline(),
then call try_conversational_intake_resume(state, "a1", approved=True) and
assert the return is True; reference symbols to locate code:
try_conversational_intake_resume, _FakeAppState, ApprovalStore, _FakePipeline,
and _approval/_proposal helpers.

In `@web/src/api/types/openapi.gen.ts`:
- Line 17479: The chat_propose endpoint is currently typed to return
ApiResponse[dict[str, Any]] so the OpenAPI schema exposes a generic dict; update
the controller to return the concrete model: import ProposeResult from
synthorg.meta.chief_of_staff.models, change the function/method return
annotation from ApiResponse[dict[str, Any]] to ApiResponse[ProposeResult] (the
annotated function around the chat_propose handler), replace the manual dict
construction and serialization with a simple return ApiResponse(data=result)
where result is the ProposeResult returned by proposer.converse(), and then
regenerate TypeScript types by running the DTO generation script (uv run python
scripts/generate_dto_types_ts.py) so the frontend gets
ApiResponse[ProposeResult] types.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: c26197e4-4502-4425-b5b1-e238e23481e4

📥 Commits

Reviewing files that changed from the base of the PR and between 679c3bb and 80b34f9.

📒 Files selected for processing (50)
  • CLAUDE.md
  • README.md
  • docs/design/self-improvement.md
  • scripts/_ghost_wiring_manifest.txt
  • scripts/schema_drift_baseline.txt
  • src/synthorg/api/app.py
  • src/synthorg/api/app_builders.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/state.py
  • src/synthorg/api/state_services_facades.py
  • src/synthorg/api/state_services_facades_mcp3.py
  • src/synthorg/core/enums.py
  • src/synthorg/core/error_taxonomy.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/meta/chief_of_staff/models.py
  • src/synthorg/meta/chief_of_staff/prompts.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • src/synthorg/meta/errors.py
  • src/synthorg/observability/events/approval_gate.py
  • src/synthorg/observability/events/chief_of_staff.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/persistence/conversation_protocol.py
  • src/synthorg/persistence/conversational_factory.py
  • src/synthorg/persistence/conversational_proposal_protocol.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/persistence/postgres/conversational_proposal_repo.py
  • src/synthorg/persistence/postgres/revisions/20260519000001_conversational_intake.sql
  • src/synthorg/persistence/postgres/schema.sql
  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/persistence/sqlite/conversational_proposal_repo.py
  • src/synthorg/persistence/sqlite/revisions/20260519000001_conversational_intake.sql
  • src/synthorg/persistence/sqlite/schema.sql
  • tests/conformance/persistence/test_conversation_repository.py
  • tests/conformance/persistence/test_conversational_proposal_repository.py
  • tests/e2e/test_conversational_propose_e2e.py
  • tests/unit/api/controllers/test_conversational_intake_resume.py
  • tests/unit/meta/chief_of_staff/test_config.py
  • tests/unit/meta/chief_of_staff/test_conversational_models.py
  • tests/unit/meta/chief_of_staff/test_propose.py
  • tests/unit/meta/chief_of_staff/test_proposer_wiring.py
  • tests/unit/meta/test_api.py
  • tests/unit/meta/test_chief_of_staff_errors.py
  • web/src/api/endpoints/meta.ts
  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/enum-values.gen.ts
  • web/src/api/types/error-codes.gen.ts
  • web/src/api/types/openapi.gen.ts
  • web/src/mocks/handlers/meta.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test Integration
  • GitHub Check: Test Unit
  • GitHub Check: Test E2E
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (16)
web/src/**/*.{js,jsx,ts,tsx,mts}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{js,jsx,ts,tsx,mts}: Always use createLogger from @/lib/logger; never bare console.warn/console.error/console.debug in application code. Variable name must always be log. Only logger.ts itself may use bare console methods. Use log.debug() (DEV-only, stripped in production), log.warn(), log.error().
Pass dynamic/untrusted values as separate args to logger calls (not interpolated into the message string) so they go through sanitizeArg
Attacker-controlled fields inside structured objects must be wrapped in sanitizeForLog() before embedding in log calls
Error-code constants (MANDATORY): import ErrorCode and ErrorCategory from @/api/types/errors (re-exported from the generated web/src/api/types/error-codes.gen.ts). Discriminate on ErrorCode.<NAME>, never on raw integer literals.
Use @eslint-react/web-api-no-leaked-fetch to detect fetch() in effects without AbortController cleanup

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/enum-values.gen.ts
  • web/src/api/types/error-codes.gen.ts
  • web/src/mocks/handlers/meta.ts
  • web/src/api/endpoints/meta.ts
  • web/src/api/types/openapi.gen.ts
web/src/api/types/**/*.gen.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Generated DTO types (MANDATORY): NEVER hand-edit web/src/api/types/*.gen.ts. Regenerate with uv run python scripts/generate_dto_types_ts.py. Import DTOs via the barrel (import type { AgentConfig } from '@/api/types').

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/enum-values.gen.ts
  • web/src/api/types/error-codes.gen.ts
  • web/src/api/types/openapi.gen.ts
web/src/**/*.{ts,tsx,mts}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx,mts}: Use @typescript-eslint/no-floating-promises to forbid unawaited promises so async work cannot survive the test that scheduled it and trip the active-handle gate
Use @typescript-eslint/no-misused-promises (with checksVoidReturn: { attributes: false }) to forbid passing async functions where the callsite ignores the returned promise. React 19 async event handlers stay allowed via the attributes: false exemption.

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/enum-values.gen.ts
  • web/src/api/types/error-codes.gen.ts
  • web/src/mocks/handlers/meta.ts
  • web/src/api/endpoints/meta.ts
  • web/src/api/types/openapi.gen.ts
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL

No hardcoded numeric values outside settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants

No from __future__ import annotations (Python 3.14 has PEP 649)

Type hints on public functions; mypy strict mode required; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Error classes must follow <Domain><Condition>Error naming from DomainError base class; never inherit Exception/RuntimeError directly

Pydantic v2: use frozen + extra="forbid" on every frozen model project-wide; @computed_field for derived values; NotBlankStr for identifiers

Args models required at every system boundary; use parse_typed() for every external dict ingestion

Use model_copy(update=...) or copy.deepcopy() for immutability; deepcopy at system boundaries

Async code: use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception and re-raise MemoryError/RecursionError

Clock seam: use clock: Clock | None = None parameter; inject FakeClock in tests; lifecycle services own _lifecycle_lock

Untrusted content (SEC-1): wrap via wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML

Logging: import from synthorg.observability.get_logger; variable always logger; never use bare logging module or print() in app code

Event names must come from observability.events.<domain> constants; use structured kwargs; log state transitions AFTER persistence write with *_STATUS_TRANSITIONED

Secret-log redaction (SEC-1): never use error=str(exc) or interpolate {exc}; use error_type=type(exc).__name__ + safe_error_description(exc); never exc_info=True; OTel: forbid record_exception(), use set_attribute() with record_exception=False, set_status_on_exception=False

Telemetry: opt-in, off by default; every event property must be in _ALLOWED_PROPERTIES

V...

Files:

  • src/synthorg/api/state_services_facades_mcp3.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/observability/events/approval_gate.py
  • src/synthorg/api/state.py
  • src/synthorg/core/error_taxonomy.py
  • src/synthorg/observability/events/chief_of_staff.py
  • src/synthorg/persistence/conversational_factory.py
  • src/synthorg/api/app_builders.py
  • src/synthorg/meta/chief_of_staff/prompts.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/persistence/conversational_proposal_protocol.py
  • src/synthorg/core/enums.py
  • src/synthorg/api/app.py
  • src/synthorg/meta/errors.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/persistence/conversation_protocol.py
  • src/synthorg/api/state_services_facades.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/persistence/sqlite/conversational_proposal_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • src/synthorg/persistence/postgres/conversational_proposal_repo.py
  • src/synthorg/meta/chief_of_staff/models.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

API startup: two phases—construction (create_app) wires synchronous services; on_startup wires services needing connected persistence; ordering invariants: agent_registry before auto_wire_meetings, SettingsService before WorkflowExecutionObserver, OntologyService wires after persistence.connect()

WebSocket resilience: per-frame timeout closes silent peers (1008); revalidation saturation closes (4011)

Files:

  • src/synthorg/api/state_services_facades_mcp3.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/state.py
  • src/synthorg/api/app_builders.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/api/app.py
  • src/synthorg/api/state_services_facades.py
  • src/synthorg/api/controllers/_approval_review_gate.py
src/synthorg/{workers,api}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Runtime services: select behind ONE provider-present switch via build_runtime_services, returning RuntimeServices pair (worker execution service + coordinator); empty-company rejects task creation (4014) and /coordinate 503s; use swap_worker_execution_service / swap_coordinator / swap_provider_registry with lock synchronization

Files:

  • src/synthorg/api/state_services_facades_mcp3.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/state.py
  • src/synthorg/api/app_builders.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/api/app.py
  • src/synthorg/api/state_services_facades.py
  • src/synthorg/api/controllers/_approval_review_gate.py
src/synthorg/{api,meta}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Conversational propose: POST /meta/chat/propose is opt-in (meta.chief_of_staff.propose_enabled); ChiefOfStaffProposer built by build_chief_of_staff_proposer (ENFORCED manifest entry, EPIC #1955); 503s when ANY of provider, persistence, or work pipeline missing; wrap human content via wrap_untrusted(TAG_TASK_DATA, ...) (SEC-1)

Files:

  • src/synthorg/api/state_services_facades_mcp3.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/api/state.py
  • src/synthorg/api/app_builders.py
  • src/synthorg/meta/chief_of_staff/prompts.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/api/app.py
  • src/synthorg/meta/errors.py
  • src/synthorg/api/state_services_facades.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • src/synthorg/meta/chief_of_staff/models.py
  • src/synthorg/api/controllers/_approval_review_gate.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/state_services_facades_mcp3.py
  • src/synthorg/api/rate_limits/policies.py
  • src/synthorg/observability/events/approval_gate.py
  • src/synthorg/api/state.py
  • src/synthorg/core/error_taxonomy.py
  • src/synthorg/observability/events/chief_of_staff.py
  • src/synthorg/persistence/conversational_factory.py
  • src/synthorg/api/app_builders.py
  • src/synthorg/meta/chief_of_staff/prompts.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/persistence/conversational_proposal_protocol.py
  • src/synthorg/core/enums.py
  • src/synthorg/api/app.py
  • src/synthorg/meta/errors.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/persistence/conversation_protocol.py
  • src/synthorg/api/state_services_facades.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/persistence/sqlite/conversational_proposal_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • src/synthorg/persistence/postgres/conversational_proposal_repo.py
  • src/synthorg/meta/chief_of_staff/models.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
{README.md,docs/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README and public docs must be sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers

Files:

  • README.md
  • docs/design/self-improvement.md
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async tests use auto; global timeout 30s; coverage 80% min

Windows tests: unit tests use WindowsSelectorEventLoopPolicy; subprocess tests override back to handle 3.14 IOCP teardown race

Test doubles: follow ladder in conventions.md; use FakeClock for Clock seam, mock_of[T](**overrides) for typed boundaries, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries is blocked by check_mock_spec.py

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Flaky tests: NEVER skip/xfail; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/meta/test_chief_of_staff_errors.py
  • tests/unit/meta/test_api.py
  • tests/unit/meta/chief_of_staff/test_config.py
  • tests/unit/meta/chief_of_staff/test_proposer_wiring.py
  • tests/unit/meta/chief_of_staff/test_conversational_models.py
  • tests/conformance/persistence/test_conversation_repository.py
  • tests/unit/meta/chief_of_staff/test_propose.py
  • tests/unit/api/controllers/test_conversational_intake_resume.py
  • tests/e2e/test_conversational_propose_e2e.py
  • tests/conformance/persistence/test_conversational_proposal_repository.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/meta/test_chief_of_staff_errors.py
  • tests/unit/meta/test_api.py
  • tests/unit/meta/chief_of_staff/test_config.py
  • tests/unit/meta/chief_of_staff/test_proposer_wiring.py
  • tests/unit/meta/chief_of_staff/test_conversational_models.py
  • tests/conformance/persistence/test_conversation_repository.py
  • tests/unit/meta/chief_of_staff/test_propose.py
  • tests/unit/api/controllers/test_conversational_intake_resume.py
  • tests/e2e/test_conversational_propose_e2e.py
  • tests/conformance/persistence/test_conversational_proposal_repository.py
src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

New repository protocols must inherit from one or more generic categories in persistence/_generics.py (SingletonRepository, IdKeyedRepository, FilteredQueryRepository, AppendOnlyRepository, StatefulRepository, MVCCRepository); bespoke methods permitted only under ADR-0001 D7

Repository protocols must stay @runtime_checkable

Repository CRUD: use save(entity), get(id), delete(id) -> bool, list_items(...), query(...) returning tuples

Datetime in persistence: use parse_iso_utc / format_iso_utc from persistence._shared; reject naive datetimes; use normalize_utc for already-typed

Files:

  • src/synthorg/persistence/conversational_factory.py
  • src/synthorg/persistence/conversational_proposal_protocol.py
  • src/synthorg/persistence/conversation_protocol.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/persistence/sqlite/conversational_proposal_repo.py
  • src/synthorg/persistence/postgres/conversational_proposal_repo.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
web/src/mocks/handlers/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/mocks/handlers/**/*.ts: MSW handlers (MANDATORY): web/src/mocks/handlers/ must mirror web/src/api/endpoints/*.ts 1:1 with a default happy-path handler for every exported endpoint. Use onUnhandledRequest: 'error' in test setup; tests override per-case via server.use(...), never vi.mock('@/api/endpoints/*').
Use typed envelope helpers (successFor, paginatedFor, voidSuccess) to keep MSW handlers in lockstep with endpoint return types

Files:

  • web/src/mocks/handlers/meta.ts
web/src/{api/endpoints,stores}/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Cursor pagination (MANDATORY): list endpoints must use opaque cursor-based paging via PaginationMeta. Stores must keep nextCursor + hasMore in state (not offset arithmetic) and early-return when !hasMore || !nextCursor. Display counts must come from data.length.

Files:

  • web/src/api/endpoints/meta.ts
web/src/api/endpoints/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Health / readiness endpoints (MANDATORY): getLiveness() is always 200 while the process is alive; getReadiness() is 200 healthy / 503 unavailable (binary 'ok' | 'unavailable' outcome, no tri-state). Any new caller must handle the 503 path explicitly.

Files:

  • web/src/api/endpoints/meta.ts
docs/**/*.{d2,md}

📄 CodeRabbit inference engine (CLAUDE.md)

Use D2 for architecture/nested containers (theme 200, Dark Mauve, v0.7.1 pinned in CI); use Mermaid for flowcharts/sequence/pipelines; use Markdown tables for tabular data

Files:

  • docs/design/self-improvement.md
tests/conformance/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Dual-backend conformance: tests/conformance/persistence/ consumes backend fixture (SQLite + Postgres); enforced by check_dual_backend_test_parity.py

Files:

  • tests/conformance/persistence/test_conversation_repository.py
  • tests/conformance/persistence/test_conversational_proposal_repository.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: No region/currency/locale privilege; use metric units and British English
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: After implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` before creating PR; after PR: use `/aurelio-review-pr` for external feedback; fix all valid issues without deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: Branches: `<type>/<slug>` naming from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: After every squash merge, run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-19T23:18:33.420Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/api/state_services_facades_mcp3.py
  • src/synthorg/api/rate_limits/policies.py
  • tests/unit/meta/test_chief_of_staff_errors.py
  • src/synthorg/observability/events/approval_gate.py
  • src/synthorg/api/state.py
  • src/synthorg/core/error_taxonomy.py
  • src/synthorg/observability/events/chief_of_staff.py
  • src/synthorg/persistence/conversational_factory.py
  • src/synthorg/api/app_builders.py
  • tests/unit/meta/test_api.py
  • src/synthorg/meta/chief_of_staff/prompts.py
  • src/synthorg/api/controllers/meta.py
  • tests/unit/meta/chief_of_staff/test_config.py
  • tests/unit/meta/chief_of_staff/test_proposer_wiring.py
  • src/synthorg/persistence/conversational_proposal_protocol.py
  • src/synthorg/core/enums.py
  • src/synthorg/api/app.py
  • src/synthorg/meta/errors.py
  • src/synthorg/observability/events/persistence.py
  • src/synthorg/persistence/conversation_protocol.py
  • src/synthorg/api/state_services_facades.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/config.py
  • tests/unit/meta/chief_of_staff/test_conversational_models.py
  • src/synthorg/persistence/sqlite/conversational_proposal_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • src/synthorg/persistence/postgres/conversational_proposal_repo.py
  • tests/conformance/persistence/test_conversation_repository.py
  • src/synthorg/meta/chief_of_staff/models.py
  • tests/unit/meta/chief_of_staff/test_propose.py
  • tests/unit/api/controllers/test_conversational_intake_resume.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • tests/e2e/test_conversational_propose_e2e.py
  • tests/conformance/persistence/test_conversational_proposal_repository.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
📚 Learning: 2026-05-16T18:36:19.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/guides/contributing.md:95-95
Timestamp: 2026-05-16T18:36:19.195Z
Learning: In the SynthOrg repo, the “Doc Numeric Claims (MANDATORY)” RS-marker rule should be applied only to these docs: README.md; docs/index.md; docs/roadmap/index.md; docs/architecture/decisions.md; docs/reference/convention-gates.md. This rule is enforced by scripts/check_doc_numeric_macros.py (with runtime substitution by scripts/inject_runtime_stats.py), so reviewers should not flag similar numeric-claim issues in other paths (e.g., anything under docs/guides/). When checking those scoped files, the rule skips fenced code blocks and only flags digits that are adjacent to stat nouns (tests/providers/agents/stars/releases). Numeric CLI flags like “--num-workers=4” inside fenced bash code blocks are not subject to this rule.

Applied to files:

  • README.md
📚 Learning: 2026-05-16T18:36:31.446Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/reference/conventions.md:787-789
Timestamp: 2026-05-16T18:36:31.446Z
Learning: In Aureliolo/synthorg, follow the `Doc Numeric Claims (MANDATORY)` rule enforced by `scripts/check_doc_numeric_macros.py` only for these markdown files: `README.md`, `docs/index.md`, `docs/roadmap/index.md`, `docs/architecture/decisions.md`, and `docs/reference/convention-gates.md`. The gate flags digits that appear adjacent to the stat nouns `tests`, `providers`, `agents`, `stars`, and `releases`—those numeric claims must use the required `<!--RS:...-->` macro format. Do not apply this rule to prose that mentions Python version numbers (e.g., “Python 3.14” / “Python 3.15”); those should not be flagged as requiring `<!--RS:...-->`.

Applied to files:

  • README.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: In the synthorg repo, the “Doc Numeric Claims (MANDATORY)” RS-marker rule is enforced only for this exact set of Markdown files: README.md, docs/index.md, docs/roadmap/index.md, docs/architecture/decisions.md, and docs/reference/convention-gates.md. During code reviews, do not raise RS-marker/numeric-claims findings for numeric values in any other files (e.g., docs/getting_started.md, docs/guides/*, docs/reference/conventions.md), since they are not checked or injected by scripts/check_doc_numeric_macros.py or scripts/inject_runtime_stats.py.

Applied to files:

  • README.md
📚 Learning: 2026-05-16T18:36:31.446Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/reference/conventions.md:787-789
Timestamp: 2026-05-16T18:36:31.446Z
Learning: In Aureliolo/synthorg, do not require adding `<!--RS:...-->` “Doc Numeric Claims (MANDATORY)” numeric macros for Python version numbers mentioned in documentation prose (e.g., “Python 3.14”, “Python 3.15”). The `scripts/check_doc_numeric_macros.py` gate only applies to `README.md`, `docs/index.md`, `docs/roadmap/index.md`, `docs/architecture/decisions.md`, and `docs/reference/convention-gates.md`, and it only flags digits adjacent to specific stat nouns (tests/providers/agents/stars/releases), not language version mentions like “Python 3.14”.

Applied to files:

  • README.md
  • CLAUDE.md
  • docs/design/self-improvement.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing Markdown in the synthorg repo, account for the CI gate `check_doc_numeric_macros.py`: it skips fenced code blocks entirely, and it only flags digits that are adjacent to these stat nouns: `tests`, `providers`, `agents`, `stars`, `releases`. Therefore, numeric examples such as CLI flag values (e.g., `--num-workers=4` in fenced bash blocks) and prose version numbers (e.g., `3.14`/`3.15`) are not expected to trigger this check; prioritize changes only when digits appear next to one of the listed nouns (e.g., “5 tests”, “10 stars”, etc.).

Applied to files:

  • README.md
  • CLAUDE.md
  • docs/design/self-improvement.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing markdown files for the "Doc Numeric Claims (MANDATORY)" RS-marker rule, only require/flag missing RS markers in the files that are actually in-scope for the rule. The scope is enforced via an identical _SCOPED_FILES allowlist in scripts/check_doc_numeric_macros.py and scripts/inject_runtime_stats.py, and currently includes: README.md; docs/index.md; docs/roadmap/index.md; docs/architecture/decisions.md; docs/reference/convention-gates.md. For any other markdown files (e.g., docs/getting_started.md, docs/guides/*), missing RS markers for numeric claims are no-ops and should NOT be flagged.

Applied to files:

  • README.md
  • docs/design/self-improvement.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing Markdown in the synthorg repo against the `check_doc_numeric_macros.py` gate, account for its documented behavior: it skips fenced code blocks entirely, and it only flags digits that are adjacent to specific stat nouns (`tests`, `providers`, `agents`, `stars`, `releases`). As a result, CLI-style numbers (e.g., `--num-workers=4`) inside fenced bash code blocks should never be treated as violations of this gate; only non-fenced text needs checking, and only around those specific nouns.

Applied to files:

  • docs/design/self-improvement.md
🔇 Additional comments (82)
tests/e2e/test_conversational_propose_e2e.py (1)

89-530: LGTM!

CLAUDE.md (1)

90-90: LGTM!

README.md (1)

43-43: LGTM!

docs/design/self-improvement.md (1)

117-122: LGTM!

Also applies to: 221-223, 235-243, 280-289

scripts/_ghost_wiring_manifest.txt (1)

45-45: LGTM!

tests/unit/meta/chief_of_staff/test_conversational_models.py (1)

1-223: LGTM!

tests/unit/meta/chief_of_staff/test_propose.py (1)

1-466: LGTM!

tests/unit/meta/chief_of_staff/test_proposer_wiring.py (1)

1-114: LGTM!

tests/unit/meta/chief_of_staff/test_config.py (1)

54-85: LGTM!

tests/unit/meta/test_api.py (1)

4-12: LGTM!

Also applies to: 37-69

tests/unit/meta/test_chief_of_staff_errors.py (1)

1-62: LGTM!

src/synthorg/persistence/sqlite/schema.sql (2)

1075-1078: Same backend-compatibility concern as the SQLite migration file.


1127-1181: LGTM!

src/synthorg/persistence/conversation_protocol.py (1)

35-177: LGTM!

src/synthorg/persistence/conversational_factory.py (1)

52-120: LGTM!

src/synthorg/persistence/conversational_proposal_protocol.py (1)

29-147: LGTM!

src/synthorg/persistence/postgres/revisions/20260519000001_conversational_intake.sql (1)

16-56: LGTM!

src/synthorg/persistence/postgres/schema.sql (1)

1153-1153: LGTM!

Also applies to: 1198-1238

src/synthorg/persistence/sqlite/revisions/20260519000001_conversational_intake.sql (1)

32-74: LGTM!

scripts/schema_drift_baseline.txt (1)

50-53: LGTM!

src/synthorg/persistence/sqlite/conversation_repo.py (8)

1-65: LGTM!


67-95: LGTM!


97-145: LGTM!


148-217: LGTM!


219-286: LGTM!


288-354: LGTM!


356-384: LGTM!


386-546: LGTM!

src/synthorg/persistence/sqlite/conversational_proposal_repo.py (4)

1-121: LGTM!


123-210: LGTM!


212-319: LGTM!


321-404: LGTM!

src/synthorg/persistence/postgres/conversation_repo.py (4)

1-107: LGTM!


109-323: LGTM!


325-463: LGTM!


465-468: LGTM!

src/synthorg/persistence/postgres/conversational_proposal_repo.py (3)

1-95: LGTM!


97-377: LGTM!


379-380: LGTM!

tests/conformance/persistence/test_conversation_repository.py (3)

1-106: LGTM!


108-224: LGTM!


226-302: LGTM!

tests/conformance/persistence/test_conversational_proposal_repository.py (2)

1-70: LGTM!


72-218: LGTM!

src/synthorg/core/enums.py (1)

597-607: LGTM!

Also applies to: 609-648

src/synthorg/core/error_taxonomy.py (1)

93-93: LGTM!

Also applies to: 111-111, 136-136

src/synthorg/meta/chief_of_staff/config.py (1)

12-12: LGTM!

Also applies to: 44-55, 83-89, 94-95

src/synthorg/meta/chief_of_staff/models.py (1)

21-28: LGTM!

Also applies to: 233-445

src/synthorg/meta/errors.py (1)

62-145: LGTM!

src/synthorg/observability/events/approval_gate.py (1)

39-52: LGTM!

src/synthorg/observability/events/chief_of_staff.py (1)

46-56: LGTM!

src/synthorg/observability/events/persistence.py (1)

612-644: LGTM!

src/synthorg/meta/chief_of_staff/prompts.py (1)

222-273: LGTM!

src/synthorg/meta/chief_of_staff/propose.py (12)

1-91: LGTM!


94-103: LGTM!


105-143: LGTM!


145-206: LGTM!


208-227: LGTM!


298-328: LGTM!


330-397: LGTM!


399-421: LGTM!


423-448: LGTM!


450-495: LGTM!


497-533: LGTM!


536-537: LGTM!

src/synthorg/api/controllers/_approval_review_gate.py (2)

31-40: LGTM!


502-510: LGTM!

tests/unit/api/controllers/test_conversational_intake_resume.py (3)

1-26: LGTM!


43-116: LGTM!


119-166: LGTM!

src/synthorg/api/rate_limits/policies.py (1)

128-128: LGTM!

src/synthorg/api/app_builders.py (1)

37-49: LGTM!

Also applies to: 197-257

src/synthorg/api/state.py (1)

204-204: LGTM!

Also applies to: 211-211

src/synthorg/api/state_services_facades.py (1)

119-119: LGTM!

Also applies to: 177-178

src/synthorg/api/state_services_facades_mcp3.py (1)

34-42: LGTM!

Also applies to: 84-85, 234-283

web/src/api/endpoints/meta.ts (1)

8-8: LGTM!

Also applies to: 73-87, 120-147

web/src/api/types/dtos.gen.ts (1)

171-171: LGTM!

web/src/api/types/enum-values.gen.ts (1)

43-43: LGTM!

web/src/api/types/error-codes.gen.ts (1)

43-43: LGTM!

Also applies to: 59-59, 78-78

web/src/api/types/openapi.gen.ts (2)

2120-2136: LGTM!

Also applies to: 5809-5817, 6687-6692


7572-7572: LGTM!

web/src/mocks/handlers/meta.ts (1)

8-8: LGTM!

Also applies to: 36-72

Comment thread src/synthorg/api/app.py Outdated
Comment thread src/synthorg/api/controllers/_approval_review_gate.py
Comment thread src/synthorg/api/controllers/meta.py
Comment thread src/synthorg/meta/chief_of_staff/config.py Outdated
Comment thread src/synthorg/meta/chief_of_staff/models.py
Comment thread src/synthorg/meta/chief_of_staff/propose.py
Comment thread src/synthorg/meta/chief_of_staff/propose.py
Comment on lines +5 to +23
-- SQLite-only design note: this revision deliberately does NOT widen
-- the ``approvals.source`` CHECK to admit ``conversational_intake``.
-- The only mechanisms available to SQLite are (a) a full ``approvals``
-- table rebuild + 9 index recreations and (b) a
-- ``PRAGMA writable_schema`` patch + reload; both forced multiple
-- migration-fixture-timing tests across the unit suite over their
-- 8s budget on Windows (the conformance / multi-tenancy arms migrate
-- from scratch per test, and the pre-push isolation gate runs them
-- twice each). The Postgres sibling DOES widen the constraint, which
-- is the production target.
--
-- Practical impact on SQLite: the ``approvals`` table retains the
-- narrow CHECK ``source IN ('parked_context', 'review_gate')``. The
-- v1 conversational interface keeps the ``ApprovalStore`` in-memory
-- (no persistent ``ApprovalRepository`` is wired by default), so no
-- ``CONVERSATIONAL_INTAKE`` row ever reaches the SQLite ``approvals``
-- table. Deployments that wire a persistent SQLite ``ApprovalStore``
-- AND enable the conversational interface should switch to Postgres
-- (the codebase's production backend) or follow up with a rebuild.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

SQLite migration keeps approvals.source incompatible with conversational-intake writes.

Line 5 intentionally skips widening the SQLite approvals.source CHECK, so any deployment persisting approvals on SQLite can reject rows using source='conversational_intake'. This creates backend-dependent runtime behaviour for the same feature path.

Please either rebuild SQLite approvals to include conversational_intake, or hard-block that configuration combination at startup with an explicit unsupported-mode failure.

Comment on lines +168 to +239
class TestConversationalIntakeResume:
async def test_non_conversational_source_is_inert(self) -> None:
state, _ = await _seed(source=ApprovalSource.REVIEW_GATE)
handled = await try_conversational_intake_resume(
state, # type: ignore[arg-type]
"a1",
approved=True,
)
assert handled is False

async def test_approve_runs_pipeline_and_marks_executed(self) -> None:
pipeline = _FakePipeline()
state, repo = await _seed(pipeline=pipeline)
handled = await try_conversational_intake_resume(
state, # type: ignore[arg-type]
"a1",
approved=True,
)
assert handled is True
assert len(pipeline.calls) == 1
assert pipeline.calls[0].source is WorkSource.CONVERSATIONAL
assert repo.items["prop-a1"].status is ConversationalProposalStatus.EXECUTED

async def test_reject_skips_pipeline_and_marks_rejected(self) -> None:
pipeline = _FakePipeline()
state, repo = await _seed(pipeline=pipeline)
handled = await try_conversational_intake_resume(
state, # type: ignore[arg-type]
"a1",
approved=False,
)
assert handled is True
assert pipeline.calls == []
assert repo.items["prop-a1"].status is ConversationalProposalStatus.REJECTED

async def test_missing_proposal_is_owned_but_noop(self) -> None:
state, _ = await _seed(with_proposal=False)
handled = await try_conversational_intake_resume(
state, # type: ignore[arg-type]
"a1",
approved=True,
)
assert handled is True

async def test_pipeline_unavailable_raises(self) -> None:
store = ApprovalStore()
await store.add(_approval("a1"))
repo = _FakeProposalRepo()
prop = _proposal("a1")
repo.items[prop.id] = prop
state = _FakeAppState(
approval_store=store,
proposal_repo=repo,
pipeline=None,
)
with pytest.raises(ServiceUnavailableError):
await try_conversational_intake_resume(
state, # type: ignore[arg-type]
"a1",
approved=True,
)

async def test_pipeline_failure_leaves_proposal_pending(self) -> None:
pipeline = _FakePipeline(error=RuntimeError("boom"))
state, repo = await _seed(pipeline=pipeline)
handled = await try_conversational_intake_resume(
state, # type: ignore[arg-type]
"a1",
approved=True,
)
assert handled is True
assert repo.items["prop-a1"].status is ConversationalProposalStatus.PENDING
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider adding test coverage for missing proposal repository.

The implementation has an early-return path when has_conversational_proposal_repo is False (logs error and returns True). This edge case isn't currently covered by the tests.

🧪 Proposed additional test case
async def test_missing_proposal_repo_returns_true(self) -> None:
    store = ApprovalStore()
    await store.add(_approval("a1"))
    state = _FakeAppState(
        approval_store=store,
        proposal_repo=None,  # repo not wired
        pipeline=_FakePipeline(),
    )
    handled = await try_conversational_intake_resume(
        state,  # type: ignore[arg-type]
        "a1",
        approved=True,
    )
    assert handled is True
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/api/controllers/test_conversational_intake_resume.py` around lines
168 - 239, Add a unit test that covers the early-return when the proposal
repository is missing: create an ApprovalStore with _approval("a1"), add it,
build a _FakeAppState with proposal_repo=None, pipeline=_FakePipeline(), then
call try_conversational_intake_resume(state, "a1", approved=True) and assert the
return is True; reference symbols to locate code:
try_conversational_intake_resume, _FakeAppState, ApprovalStore, _FakePipeline,
and _approval/_proposal helpers.

Comment thread web/src/api/types/openapi.gen.ts Outdated
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the "Conversational Propose" feature for the Chief of Staff agent, enabling a natural language interface to clarify underspecified requests and park concrete work items in the human approval queue. The implementation includes new API endpoints, Pydantic models, and database repositories for SQLite and Postgres. Feedback identifies a potential race condition in the approval dispatch flow that could lead to duplicate work execution and suggests improving error handling consistency when repositories are missing.

Comment on lines +281 to +304
await app_state.work_pipeline.run(work_item)
except MemoryError, RecursionError:
raise
except ServiceUnavailableError:
raise
except Exception as exc:
# The approve decision is already persisted; a pipeline failure
# must not 5xx the response. Leave the proposal PENDING (not
# EXECUTED) so the failure is visible and not silently dropped.
logger.error(
APPROVAL_GATE_CONVERSATIONAL_FAILED,
approval_id=approval_id,
proposal_id=proposal.id,
error_type=type(exc).__name__,
error=safe_error_description(exc),
note="pipeline run failed; proposal left pending",
)
return True

transitioned = await repo.transition_if(
proposal.id,
ConversationalProposalStatus.PENDING,
ConversationalProposalStatus.EXECUTED,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There is a potential race condition here that could lead to duplicate work execution. The work_pipeline.run(work_item) call is awaited before the proposal status is transitioned to EXECUTED. If two concurrent approval decisions are processed for the same approval_id, both could reread the proposal as PENDING, both could successfully run the pipeline, and then both would attempt the CAS transition. While only one transition will succeed, the side-effectful work will have run twice. Consider performing a transition to an intermediate state like EXECUTING before running the pipeline, or ensuring the pipeline run itself is idempotent for a given proposal_id.

Comment on lines +223 to +229
if not app_state.has_conversational_proposal_repo:
logger.error(
APPROVAL_GATE_CONVERSATIONAL_FAILED,
approval_id=approval_id,
note="conversational proposal repo not wired",
)
return True
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Inconsistency in error handling: when the conversational_proposal_repo is missing, the function logs an error and returns True (swallowing the decision), but when the work_pipeline is missing (line 277), it raises a ServiceUnavailableError. For a hard misconfiguration like a missing repository, raising an error is generally preferred so the caller is aware that the approval decision could not be fully processed.

    if not app_state.has_conversational_proposal_repo:
        logger.error(
            APPROVAL_GATE_CONVERSATIONAL_FAILED,
            approval_id=approval_id,
            note="conversational proposal repo not wired",
        )
        msg = "Conversational proposal repository unavailable"
        raise ServiceUnavailableError(msg)

CI (Test Conformance SQLite + Test Integration, 18 fails, single root cause):
- Seed parent conversation row in every proposal-bearing conformance
  test so the fk_cp_conversation FK is satisfied before save().

CodeRabbit + Gemini inline feedback:
- app.py: wire conversational_proposal_repo independently of proposer
  availability so a queued conversational approval can still route to
  the repo when the proposer is absent this boot.
- _approval_review_gate.py: race-window fix; PENDING -> EXECUTING CAS
  acquired before work_pipeline.run, EXECUTING -> EXECUTED on success,
  EXECUTING -> PENDING revert on pipeline failure. Only the CAS winner
  ever drives the pipeline for a given proposal.
- _approval_review_gate.py: missing proposal repo now raises
  ServiceUnavailableError (Gemini-b) instead of swallowing the
  approval with log+return-True.
- meta.py: wrap chat_propose user message via
  wrap_untrusted(TAG_TASK_DATA, ...) at the API boundary (SEC-1).
- meta.py: change chat_propose return type from ApiResponse[dict[str,
  Any]] to ApiResponse[ProposeResult]; regenerate dtos.gen.ts +
  openapi.gen.ts; switch web meta.ts endpoint to the generated DTO.
- chief_of_staff/config.py: hoist propose-* Field literals to
  module-level annotated named constants.
- chief_of_staff/models.py: add model_validator to ProposeResult
  enforcing the needs_clarification vs proposed branch invariants.
- chief_of_staff/propose.py: re-raise MemoryError/RecursionError
  before the generic Exception handler in _run_decision.
- chief_of_staff/propose.py: extract limit=1000 to module-level
  _MAX_TURNS_QUERY_LIMIT.
- SQLite migration trap: hard-block propose_enabled + SQLite +
  persistent ApprovalRepository combination at startup so the narrow
  approvals.source CHECK cannot strand conversational_intake writes.

EXECUTING state plumbing (CAS dance prerequisite):
- ConversationalProposalStatus gets EXECUTING.
- conversational_proposals.status CHECK widened to include
  'executing' on both SQLite + Postgres migrations and schemas.

Tests:
- New test_missing_proposal_repo_raises asserts ServiceUnavailableError
  on missing repo (supersedes CR-9's recommendation, which CR matched
  to the old return-True behavior).
- New test_concurrent_acquire_only_one_runs_pipeline asserts the
  EXECUTING CAS makes the loser inert.
- Renamed test_pipeline_failure_leaves_proposal_pending ->
  test_pipeline_failure_reverts_executing_to_pending to match the
  new revert path.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

♻️ Duplicate comments (1)
src/synthorg/api/app.py (1)

1175-1230: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not return before wiring the proposal repository.

Line 1177 exits the hook before Lines 1222-1230 run, so a boot without provider_registry never attaches conversational_proposal_repo. That breaks approve/reject handling for already-parked conversational approvals, which is the exact fallback this hook says it should preserve.

Suggested fix
         if app_state.has_chief_of_staff_proposer:
             return
-        if provider_registry is None:
-            return
         from synthorg.meta.config import (  # noqa: PLC0415
             load_self_improvement_config,
         )
         from synthorg.persistence.conversational_factory import (  # noqa: PLC0415
             build_conversational_repositories,
         )
 
+        repositories = build_conversational_repositories(persistence)
+        if repositories is not None:
+            app_state.set_conversational_proposal_repo(repositories.proposal_repo)
+        if provider_registry is None:
+            return
+
         meta_self_improvement = await load_self_improvement_config(
             app_state.settings_service if app_state.has_settings_service else None,
         )
@@
-        repositories = build_conversational_repositories(persistence)
-        # Wire the proposal repo independently of the proposer: a
-        # conversational-intake approval that exists from a previous
-        # boot must still route through the repo even if the proposer
-        # itself is currently unavailable (e.g. provider absent this
-        # boot), or its decision would silently fail in
-        # ``try_conversational_intake_resume``.
-        if repositories is not None:
-            app_state.set_conversational_proposal_repo(repositories.proposal_repo)
         proposer = build_chief_of_staff_proposer(
             meta_self_improvement.chief_of_staff,
             provider_registry=provider_registry,
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/app.py` around lines 1175 - 1230, The early return when
provider_registry is None prevents wiring the conversational proposal repo;
remove or relocate the "if provider_registry is None: return" so that after
checking has_chief_of_staff_proposer you still call
build_conversational_repositories(persistence) and, if repositories is not None,
call app_state.set_conversational_proposal_repo(repositories.proposal_repo);
keep any provider_registry-dependent logic (e.g. proposer wiring) conditional
but ensure the repository wiring always runs so already-parked conversational
approvals remain handled.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/api/app.py`:
- Around line 1189-1201: Replace the migration-history framed comment with a
concise, invariant-focused explanation of the runtime guard: state that the code
hard-blocks the combination of SQLite + persistent ApprovalRepository when
propose_enabled is true because SQLite's current approvals.source CHECK
restricts allowed sources (e.g. only parked_context/review_gate), which would
cause a persistent ApprovalRepository to reject conversational_intake rows at
write time; note that Postgres does not have this restriction and is the
supported production path or else keep ApprovalStore in-memory on SQLite, and
that failing at startup is preferred to failing later on the first
/meta/chat/propose call.

In `@src/synthorg/api/approval_store.py`:
- Around line 123-131: Rewrite the property docstring to state current
behavior/rationale only (no migration history or reviewer/migration framing):
remove the historical/migration text about the "v1 SQLite migration", "Windows
8s-budget regression", and "workaround" and replace with a concise WHY
explanation that the persistent SQLite ApprovalRepository keeps approvals.source
narrow (allowed values: parked_context/review_gate) which prevents
conversational_intake rows from being accepted at write time, so wiring the
conversational proposer should be refused and Postgres is the sanctioned
production backend; reference the relevant symbols ApprovalRepository,
approvals.source, conversational_intake, conversational interface, proposer, and
Postgres in the docstring to make the rationale clear.

In `@src/synthorg/api/controllers/_approval_review_gate.py`:
- Around line 183-366: The function try_conversational_intake_resume is doing
too many responsibilities; split it into small branch-specific helpers: (1) a
loader/helper that calls _reread_approval_item and validates
ApprovalSource.CONVERSATIONAL_INTAKE and returns the proposal (use
ConversationalProposalFilterSpec and repo.query to locate the proposal), (2) a
reject handler that performs repo.transition_if(..., PENDING, REJECTED) and logs
the APPROVAL_GATE_CONVERSATIONAL_REJECTED/FAILED paths, (3) an
execute-acquire-and-run helper that performs repo.transition_if(..., PENDING,
EXECUTING), calls WorkItem.model_validate_json and app_state.work_pipeline.run,
and handles revert-on-exception via repo.transition_if(..., EXECUTING, PENDING)
with the same logging, and (4) a finalize helper that does the
EXECUTING->EXECUTED CAS and logs APPROVAL_GATE_CONVERSATIONAL_EXECUTED or the
CAS-failure warning; then make try_conversational_intake_resume orchestrate
these helpers and only keep high-level flow, preserving exact calls to
_reread_approval_item, repo.transition_if, WorkItem.model_validate_json,
app_state.work_pipeline.run and the same ServiceUnavailableError raises and log
messages.
- Around line 219-221: The current check conflates a transient read (item is
None) with a legitimate non-conversational approval and returns False, allowing
signal_resume_intent() to finish and leave a conversational approval stranded;
change the logic in the approval routing code that calls _reread_approval_item
so that item == None is treated as a transient/error case (do NOT return False)
— instead raise a transient/read error (or return a distinct sentinel that
causes the caller to retry) and only return False when item is present and
item.source != ApprovalSource.CONVERSATIONAL_INTAKE; reference
_reread_approval_item, ApprovalSource.CONVERSATIONAL_INTAKE and
signal_resume_intent so reviewers can find and update the conditional
accordingly.

In `@src/synthorg/meta/chief_of_staff/config.py`:
- Around line 16-25: The comments in src/synthorg/meta/chief_of_staff/config.py
reference real vendor/model names (e.g., "Claude/LiteLLM provider" and
"Claude/GPT-4o class models"); update those comments to use project placeholders
per guidelines: replace vendor mentions with "example-provider" (or
"test-provider" where appropriate) and model-class mentions with
"example-large-001" (or example-{large,medium,small}-001 as suitable) so the
comment near _PROPOSE_TEMPERATURE_DEFAULT and the token-size remark no longer
contains real vendor/model names.

In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 177-189: The current sequence allocation in propose.converse (it
reads prior_turns via _ordered_turns and sets sequence=len(prior_turns) before
calling _turn_repo.append) has a TOCTOU race causing uq_ct_conversation_sequence
collisions; fix by moving sequence assignment into the repository write path so
the DB computes the next sequence atomically (e.g., INSERT ... SELECT
COALESCE(MAX(sequence), -1)+1 or use a DB sequence) in the implementation of
_turn_repo.append, or alternatively catch the DB unique-constraint error for
uq_ct_conversation_sequence inside _turn_repo.append (or the converse caller)
and retry the append flow deterministically (re-fetch latest sequence and
re-attempt a limited number of times), ensuring ConversationTurn sequence is
assigned only once the atomic write succeeds.
- Around line 227-232: After fetching existing via
self._conversation_repo.get(args.conversation_id) add an authorization check
that the conversation owner matches the caller (e.g., compare
existing.created_by to args.user_id or args.created_by); if they differ, reject
reuse by raising an error (preferably
ConversationNotFoundError(conversation_id=args.conversation_id) to avoid leaking
existence, or a ConversationAccessDenied/Authorization error if that type
exists). Place this check right after retrieving existing and before the status
check (i.e., between the get() call and the ConversationStatus.CLOSED check) so
propose.py enforces that only the creator can resume a conversation.

---

Duplicate comments:
In `@src/synthorg/api/app.py`:
- Around line 1175-1230: The early return when provider_registry is None
prevents wiring the conversational proposal repo; remove or relocate the "if
provider_registry is None: return" so that after checking
has_chief_of_staff_proposer you still call
build_conversational_repositories(persistence) and, if repositories is not None,
call app_state.set_conversational_proposal_repo(repositories.proposal_repo);
keep any provider_registry-dependent logic (e.g. proposer wiring) conditional
but ensure the repository wiring always runs so already-parked conversational
approvals remain handled.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8cecb2ee-83f1-402c-9332-f0c0bef5e02a

📥 Commits

Reviewing files that changed from the base of the PR and between 80b34f9 and 05b3e84.

📒 Files selected for processing (17)
  • src/synthorg/api/app.py
  • src/synthorg/api/approval_store.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/core/enums.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/meta/chief_of_staff/models.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • src/synthorg/persistence/postgres/revisions/20260519000001_conversational_intake.sql
  • src/synthorg/persistence/postgres/schema.sql
  • src/synthorg/persistence/sqlite/revisions/20260519000001_conversational_intake.sql
  • src/synthorg/persistence/sqlite/schema.sql
  • tests/conformance/persistence/test_conversational_proposal_repository.py
  • tests/unit/api/controllers/test_conversational_intake_resume.py
  • web/src/api/endpoints/meta.ts
  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/openapi.gen.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Deploy Preview
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test Integration
  • GitHub Check: Test Unit
  • GitHub Check: Test E2E
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (9)
web/src/**/*.{js,jsx,ts,tsx,mts}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{js,jsx,ts,tsx,mts}: Always use createLogger from @/lib/logger; never bare console.warn/console.error/console.debug in application code. Variable name must always be log. Only logger.ts itself may use bare console methods. Use log.debug() (DEV-only, stripped in production), log.warn(), log.error().
Pass dynamic/untrusted values as separate args to logger calls (not interpolated into the message string) so they go through sanitizeArg
Attacker-controlled fields inside structured objects must be wrapped in sanitizeForLog() before embedding in log calls
Error-code constants (MANDATORY): import ErrorCode and ErrorCategory from @/api/types/errors (re-exported from the generated web/src/api/types/error-codes.gen.ts). Discriminate on ErrorCode.<NAME>, never on raw integer literals.
Use @eslint-react/web-api-no-leaked-fetch to detect fetch() in effects without AbortController cleanup

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/endpoints/meta.ts
  • web/src/api/types/openapi.gen.ts
web/src/api/types/**/*.gen.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Generated DTO types (MANDATORY): NEVER hand-edit web/src/api/types/*.gen.ts. Regenerate with uv run python scripts/generate_dto_types_ts.py. Import DTOs via the barrel (import type { AgentConfig } from '@/api/types').

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/types/openapi.gen.ts
web/src/**/*.{ts,tsx,mts}

📄 CodeRabbit inference engine (web/CLAUDE.md)

web/src/**/*.{ts,tsx,mts}: Use @typescript-eslint/no-floating-promises to forbid unawaited promises so async work cannot survive the test that scheduled it and trip the active-handle gate
Use @typescript-eslint/no-misused-promises (with checksVoidReturn: { attributes: false }) to forbid passing async functions where the callsite ignores the returned promise. React 19 async event handlers stay allowed via the attributes: false exemption.

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/endpoints/meta.ts
  • web/src/api/types/openapi.gen.ts
web/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Web: see web/CLAUDE.md for design system and component guidelines

Files:

  • web/src/api/types/dtos.gen.ts
  • web/src/api/endpoints/meta.ts
  • web/src/api/types/openapi.gen.ts
web/src/{api/endpoints,stores}/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Cursor pagination (MANDATORY): list endpoints must use opaque cursor-based paging via PaginationMeta. Stores must keep nextCursor + hasMore in state (not offset arithmetic) and early-return when !hasMore || !nextCursor. Display counts must come from data.length.

Files:

  • web/src/api/endpoints/meta.ts
web/src/api/endpoints/**/*.ts

📄 CodeRabbit inference engine (web/CLAUDE.md)

Health / readiness endpoints (MANDATORY): getLiveness() is always 200 while the process is alive; getReadiness() is 200 healthy / 503 unavailable (binary 'ok' | 'unavailable' outcome, no tri-state). Any new caller must handle the 503 path explicitly.

Files:

  • web/src/api/endpoints/meta.ts
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics must live in settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal); enforced by scripts/check_no_magic_numbers.py

No from __future__ import annotations (3.14 has PEP 649); PEP 758 except: except A, B: no parens unless binding

Type hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Errors must follow <Domain><Condition>Error naming from DomainError base; never inherit Exception/RuntimeError directly; enforced by check_domain_error_hierarchy.py

Pydantic v2: frozen + extra="forbid" on every frozen model; @computed_field auto-exempt; per-line # lint-allow: frozen-extra-forbid -- <reason> for boundaries; enforced by check_frozen_model_extra_forbid.py

Args models at every system boundary; parse_typed() for every external dict ingestion; enforced by check_boundary_typed.py

Immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Async: use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)

Clock seam: clock: Clock | None = None; tests inject FakeClock; services own _lifecycle_lock; timed-out stops mark unrestartable

Untrusted content (SEC-1): wrap via wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML

Use from synthorg.observability import get_logger; variable name always logger; never import logging or print() in app code

Event names from observability.events.<domain> constants; use structured kwargs (logger.info(EVENT, key=value))

Error paths log WARNING/ERROR with context before raising; state transitions log INFO via *_STATUS_TRANSITIONED after persistence write

Secret-log redaction (SEC-1): never use error=str(exc) or interpolate {exc}; use error_type=type(exc).__name__ + `error=safe...

Files:

  • src/synthorg/api/approval_store.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/core/enums.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/meta/chief_of_staff/models.py
  • src/synthorg/meta/chief_of_staff/propose.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/approval_store.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/core/enums.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/meta/chief_of_staff/models.py
  • src/synthorg/meta/chief_of_staff/propose.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min

Windows: unit tests use WindowsSelectorEventLoopPolicy; subprocess tests override back

Test doubles: use FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries is blocked by scripts/check_mock_spec.py

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)); never skip/xfail flaky tests; fix fundamentally

Files:

  • tests/conformance/persistence/test_conversational_proposal_repository.py
  • tests/unit/api/controllers/test_conversational_intake_resume.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/conformance/persistence/test_conversational_proposal_repository.py
  • tests/unit/api/controllers/test_conversational_intake_resume.py
tests/conformance/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Dual-backend conformance: tests/conformance/persistence/ consumes backend fixture (SQLite + Postgres); enforced by check_dual_backend_test_parity.py

Files:

  • tests/conformance/persistence/test_conversational_proposal_repository.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Read `docs/design/` page before implementing; deviations need approval (see DESIGN_SPEC.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: No region/currency/locale privileged; use metric units; British English (see docs/reference/regional-defaults.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Every convention PR must ship its enforcement gate (see docs/reference/convention-gates.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: After issue implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` then `/aurelio-review-pr` for feedback; fix everything valid before merging
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: xdist: `-n 8 --dist=loadfile` auto-applied via pyproject `addopts`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Pre-commit/pre-push hooks: use `.pre-commit-config.yaml`; tool-call gates in `.claude/settings.json` PreToolUse
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: Use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:13:21.876Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/api/approval_store.py
  • src/synthorg/api/controllers/meta.py
  • src/synthorg/core/enums.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/api/app.py
  • tests/conformance/persistence/test_conversational_proposal_repository.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/meta/chief_of_staff/models.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • tests/unit/api/controllers/test_conversational_intake_resume.py
🔇 Additional comments (8)
web/src/api/types/dtos.gen.ts (1)

78-78: LGTM!

Also applies to: 172-172, 355-356

web/src/api/endpoints/meta.ts (1)

8-12: LGTM!

Also applies to: 16-20, 114-141

src/synthorg/api/controllers/meta.py (1)

45-52: LGTM!

Also applies to: 406-479

src/synthorg/persistence/postgres/schema.sql (1)

1153-1153: LGTM!

Also applies to: 1198-1238

src/synthorg/persistence/sqlite/schema.sql (1)

1075-1078: LGTM!

Also applies to: 1127-1181

src/synthorg/core/enums.py (1)

597-606: LGTM!

Also applies to: 609-654

src/synthorg/meta/chief_of_staff/config.py (1)

71-82: LGTM!

Also applies to: 110-137

web/src/api/types/openapi.gen.ts (1)

2120-2136: LGTM!

Also applies to: 5229-5236, 5817-5825, 6695-6700, 7580-7580, 9972-9990, 17487-17517

Comment thread src/synthorg/api/app.py Outdated
Comment thread src/synthorg/api/approval_store.py Outdated
Comment thread src/synthorg/api/controllers/_approval_review_gate.py Outdated
Comment thread src/synthorg/api/controllers/_approval_review_gate.py Outdated
Comment thread src/synthorg/meta/chief_of_staff/config.py Outdated
Comment thread src/synthorg/meta/chief_of_staff/propose.py
Comment thread src/synthorg/meta/chief_of_staff/propose.py
@codecov
Copy link
Copy Markdown

codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 74.54044% with 277 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.01%. Comparing base (679c3bb) to head (fd67914).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...c/synthorg/persistence/sqlite/conversation_repo.py 62.89% 56 Missing and 3 partials ⚠️
...synthorg/persistence/postgres/conversation_repo.py 65.27% 47 Missing and 3 partials ⚠️
...persistence/sqlite/conversational_proposal_repo.py 65.57% 41 Missing and 1 partial ⚠️
...rsistence/postgres/conversational_proposal_repo.py 70.00% 32 Missing and 1 partial ⚠️
src/synthorg/meta/chief_of_staff/propose.py 86.74% 20 Missing and 2 partials ⚠️
src/synthorg/persistence/conversational_factory.py 54.28% 15 Missing and 1 partial ⚠️
src/synthorg/api/controllers/meta.py 45.83% 13 Missing ⚠️
src/synthorg/meta/chief_of_staff/models.py 80.32% 8 Missing and 4 partials ⚠️
src/synthorg/api/approval_store.py 44.44% 9 Missing and 1 partial ⚠️
src/synthorg/api/app.py 55.00% 5 Missing and 4 partials ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2019      +/-   ##
==========================================
- Coverage   85.11%   85.01%   -0.11%     
==========================================
  Files        1918     1926       +8     
  Lines      113601   114686    +1085     
  Branches     9692     9763      +71     
==========================================
+ Hits        96690    97498     +808     
- Misses      14534    14789     +255     
- Partials     2377     2399      +22     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

SynthOrg added 2 commits May 20, 2026 02:42
Round 2 covers CodeRabbit's re-review of round 1's commit.

Comment / docstring hygiene (CR-#1, #2, #5):
- app.py: rewrite the SQLite+persistent-ApprovalStore hard-block
  comment without migration framing; describe the invariant the
  guard enforces, not the migration's history.
- approval_store.py: trim ``has_persistent_repo`` docstring to
  current-state behaviour only.
- chief_of_staff/config.py: replace real vendor names
  (OpenAI/Claude/GPT-4o/LiteLLM) in propose-* constant comments
  with project-approved generic wording.

App startup ordering (CR-#8 duplicate):
- app.py: move conversational_proposal_repo wiring above the
  ``provider_registry is None: return`` early-return so the repo
  attaches even on boots without an LLM provider (preserves the
  approve/reject path for already-parked conversational approvals,
  which was the stated rationale for the round-1 decoupling).

Conversational-intake resume (CR-#3 + CR-#4):
- _approval_review_gate.py: split try_conversational_intake_resume
  (187 lines, well over the <50 line budget) into orchestrator +
  three branch-specific helpers:
    * _load_conversational_proposal: source check + repo + lookup.
    * _reject_conversational_proposal: PENDING -> REJECTED CAS.
    * _execute_conversational_proposal: PENDING -> EXECUTING ->
      EXECUTED CAS with revert-on-failure.
  The orchestrator is now 43 lines and reads as a flow table.
- _approval_review_gate.py: ``_reread_approval_item`` returning
  ``None`` for a conversational approval now raises
  ServiceUnavailableError. Returning False would route the call
  through to non-conversational flows and silently strand the
  decided approval. Docstring updated to spell out the
  flow-specific semantics.

Cross-tenant privacy (CR-#7, Critical):
- propose.py: ``_resolve_conversation`` now rejects conversation-id
  reuse across users. A caller who learns another conversation_id
  could otherwise append turns and have prior history fed back to
  the model -- a cross-tenant break. Mapped to NotFound so the
  endpoint cannot be used to probe existence either.

TOCTOU on sequence allocation (CR-#6, Heavy lift):
- conversation_repo.py (SQLite + Postgres): bounded
  retry-on-IntegrityError in ``append`` when the
  (conversation_id, sequence) UNIQUE constraint fires. The repo
  re-queries the live max sequence, model_copies the turn, and
  retries up to 3 times. Other constraint violations (FK miss,
  CHECK on content/role) still translate directly to
  ConstraintViolationError without retry.

Tests:
- test_propose.py: new test_other_user_resume_blocked.
- test_conversational_intake_resume.py: new
  test_unreadable_approval_item_raises (covers the new
  raise-on-None path).
- test_conversation_repository.py: renamed
  test_append_duplicate_sequence_rejected ->
  test_append_duplicate_sequence_resequenced; asserts the new
  race-safe semantics (second append lands at sequence + 1).
Round 2's first commit applied CodeRabbit's suggested raise-on-None
in ``_load_conversational_proposal``, intending to surface a
transient approval-store read failure on a CONVERSATIONAL_INTAKE
approval. The pre-push pytest-affected gate caught the regression:
``test_flow1_existence_check_error_returns_early`` constructs a
``_store(None)`` (item is None) and expects Flow 0 to yield so
Flow 1 can run its parked-context probe. The new raise short-circuits
that fallback for every approval whose store row is unreadable,
not just CONVERSATIONAL_INTAKE ones (the source is exactly what we
cannot determine when item is None).

The original behaviour -- ``if item is None or item.source is not
CONVERSATIONAL_INTAKE: return False, None`` -- is correct:
- Flow 1 (``try_mid_execution_resume``) owns the parked-context
  probe for the item-is-None case and routes legitimate parked
  approvals correctly.
- Flow 2 (review-gate) handles its own missing-task_id no-op.
- A CONVERSATIONAL_INTAKE approval whose store row is unreadable
  is an acknowledged residual edge case; the right fix is at the
  approval-store layer (differentiate "transient error" from
  "row missing") and is out of scope for this PR.

Reverts:
- ``_load_conversational_proposal``: drop the raise, restore the
  yield-when-item-is-None semantics.
- ``_reread_approval_item`` docstring: remove the misleading
  "Flow 0 raises" wording.
- Drop ``test_unreadable_approval_item_raises`` (asserted the
  removed behaviour).

Round-2 history will record CR-#4 as a documented skip.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/meta/chief_of_staff/propose.py (1)

177-210: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Serialise converse() per conversation before the LLM turn.

prior_turns and next_sequence are snapped before this request is durably linearised. The repo-side resequencing only avoids sequence collisions; two concurrent converse() calls on the same conversation_id still run the model against different histories and can persist clarifications/proposals in an order neither call saw. A double-submit or retry race can therefore produce incoherent 1:1 conversation state.

A conversation-scoped lock or optimistic CAS around the whole turn pipeline (load turns -> append user -> run model -> append assistant/proposals -> update conversation) would keep the transcript linear.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/meta/chief_of_staff/propose.py` around lines 177 - 210, The
propose pipeline snaps prior_turns and next_sequence before the request is
linearised, allowing concurrent converse() calls to race; wrap the entire turn
pipeline (load conversation/_resolve_conversation, fetch _ordered_turns,
create-and-append user ConversationTurn via _turn_repo.append, run
_run_decision, then call _record_clarification or _record_proposals) in a
conversation-scoped critical section or optimistic CAS: either acquire a
per-conversation async lock (e.g. self._conversation_locks[conversation.id])
around that sequence or perform the work in a DB transaction/CAS that verifies
and increments a conversation version, recomputing next_sequence after the
lock/transaction and retrying on conflict, so the model always runs against the
canonical linearised history before persisting assistant turns.
♻️ Duplicate comments (1)
src/synthorg/api/controllers/_approval_review_gate.py (1)

216-218: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't treat an unreadable approval as “not conversational”.

_reread_approval_item() returns None on store failures too. Returning (False, None) here lets signal_resume_intent() fall through to the parked-context / review-gate paths and complete without ever rejecting or executing the conversational proposal, leaving an already-decided conversational approval stranded.

Suggested fix
     item = await _reread_approval_item(app_state, approval_id)
-    if item is None or item.source is not ApprovalSource.CONVERSATIONAL_INTAKE:
+    if item is None:
+        logger.error(
+            APPROVAL_GATE_CONVERSATIONAL_FAILED,
+            approval_id=approval_id,
+            note="approval state unavailable",
+        )
+        msg = "Approval state unavailable"
+        raise ServiceUnavailableError(msg)
+    if item.source is not ApprovalSource.CONVERSATIONAL_INTAKE:
         return False, None
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/api/controllers/_approval_review_gate.py` around lines 216 -
218, The current check conflates a store-read failure with a non-conversational
approval; change the logic so that _reread_approval_item(app_state, approval_id)
being None is treated as a read failure and propagated (e.g., return (None,
None) or raise) instead of returning (False, None), and only evaluate
item.source != ApprovalSource.CONVERSATIONAL_INTAKE when item is not None; this
ensures signal_resume_intent() can detect read failures rather than mistakenly
treating them as non-conversational approvals.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/persistence/postgres/conversation_repo.py`:
- Around line 392-405: The resequencing read using _TURN_NEXT_SEQUENCE_SQL
inside append() is not wrapped in the repository's QueryError contract; catch
database exceptions around the async with block that executes
_TURN_NEXT_SEQUENCE_SQL (the cursor.execute / cur.fetchone section) and rethrow
them as the repository's QueryError so callers don't observe raw psycopg
exceptions; ensure you import/use QueryError and wrap the entire resequencing
block (including await cur.execute and await cur.fetchone) to raise QueryError
with a helpful message and the original exception attached.

In `@src/synthorg/persistence/sqlite/conversation_repo.py`:
- Around line 470-479: The fallback SELECT used when sequence_race is true (the
query executed with _TURN_NEXT_SEQUENCE_SQL inside the append() flow) can raise
raw sqlite exceptions; catch exceptions from the await self._db.execute(...) /
fetchone() and rethrow them as QueryError (or pass them through the same
SQLite-to-QueryError mapping used elsewhere in append()), so the read that
computes next_sequence is wrapped in the same error mapping as the rest of
append() and will yield a QueryError instead of leaking sqlite3/aiosqlite
exceptions.

---

Outside diff comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 177-210: The propose pipeline snaps prior_turns and next_sequence
before the request is linearised, allowing concurrent converse() calls to race;
wrap the entire turn pipeline (load conversation/_resolve_conversation, fetch
_ordered_turns, create-and-append user ConversationTurn via _turn_repo.append,
run _run_decision, then call _record_clarification or _record_proposals) in a
conversation-scoped critical section or optimistic CAS: either acquire a
per-conversation async lock (e.g. self._conversation_locks[conversation.id])
around that sequence or perform the work in a DB transaction/CAS that verifies
and increments a conversation version, recomputing next_sequence after the
lock/transaction and retrying on conflict, so the model always runs against the
canonical linearised history before persisting assistant turns.

---

Duplicate comments:
In `@src/synthorg/api/controllers/_approval_review_gate.py`:
- Around line 216-218: The current check conflates a store-read failure with a
non-conversational approval; change the logic so that
_reread_approval_item(app_state, approval_id) being None is treated as a read
failure and propagated (e.g., return (None, None) or raise) instead of returning
(False, None), and only evaluate item.source !=
ApprovalSource.CONVERSATIONAL_INTAKE when item is not None; this ensures
signal_resume_intent() can detect read failures rather than mistakenly treating
them as non-conversational approvals.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: ca38dcdc-d999-4279-8896-dcdfaea74643

📥 Commits

Reviewing files that changed from the base of the PR and between 05b3e84 and 83e3921.

📒 Files selected for processing (9)
  • src/synthorg/api/app.py
  • src/synthorg/api/approval_store.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
  • tests/conformance/persistence/test_conversation_repository.py
  • tests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test Unit
  • GitHub Check: Test E2E
  • GitHub Check: Test Integration
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics live in settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants per check_no_magic_numbers.py

Comments explain WHY only; no reviewer citations / issue back-refs / migration framing per check_no_review_origin_in_code.py

No from __future__ import annotations; use PEP 758 syntax

Type hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Errors use <Domain><Condition>Error naming from DomainError base; never inherit Exception/RuntimeError directly per check_domain_error_hierarchy.py

Pydantic v2 frozen + extra="forbid" on every frozen model project-wide; @computed_field for derived; NotBlankStr for identifiers per check_frozen_model_extra_forbid.py

Args models at every system boundary; parse_typed() for every external dict ingestion per check_boundary_typed.py

Immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Async code: use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)

Clock seam: clock: Clock | None = None; tests inject FakeClock; lifecycle: services own _lifecycle_lock; timed-out stops mark unrestartable

Untrusted content (SEC-1): use wrap_untrusted() from engine.prompt_safety; HTMLParseGuard for HTML

Logging: use from synthorg.observability import get_logger; variable always logger; never import logging / print() in app code

Event names from observability.events.<domain> constants; structured kwargs (logger.info(EVENT, key=value))

Secret-log redaction (SEC-1): never use error=str(exc) or interpolate {exc}; use error_type=type(exc).__name__ + error=safe_error_description(exc); never exc_info=True; OTel: forbid span.record_exception(exc) per check_logger_exception_str_exc.py

Never use real vendor names in project code/tests; use example-provider, `test-...

Files:

  • src/synthorg/api/approval_store.py
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/api/approval_store.py
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/api/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

API startup: construction phase wires synchronous services; on_startup wires services needing connected persistence; agent_registry before auto_wire_meetings; tunnel_provider always wired; SettingsService before WorkflowExecutionObserver; OntologyService after persistence.connect()

Files:

  • src/synthorg/api/approval_store.py
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/_approval_review_gate.py
src/synthorg/{workers,api}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Runtime services: build from shared AgentEngine via build_runtime_services; install via _install_runtime_services; swap functions hold lock; empty company rejects with 4014 error, /coordinate honestly 503s

Files:

  • src/synthorg/api/approval_store.py
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/_approval_review_gate.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; timeout 30s global; coverage 80% min

Windows: unit tests use WindowsSelectorEventLoopPolicy; subprocess tests override back

Test doubles: FakeClock for Clock seam; mock_of[T](**overrides) for typed-boundary substitutions; SimpleNamespace for attribute-bags; bare MagicMock at typed boundary blocked by check_mock_spec.py

FakeClock and mock_of import from tests._shared; inject via clock= and helper's spec subscript

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Flaky tests: never skip/xfail; fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

  • tests/conformance/persistence/test_conversation_repository.py
  • tests/unit/meta/chief_of_staff/test_propose.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/conformance/persistence/test_conversation_repository.py
  • tests/unit/meta/chief_of_staff/test_propose.py
tests/conformance/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Dual-backend conformance: tests/conformance/persistence/ consumes backend fixture (SQLite + Postgres) per check_dual_backend_test_parity.py

Files:

  • tests/conformance/persistence/test_conversation_repository.py
src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL; new repositories must inherit from generic categories in persistence/_generics.py per ADR-0001

Repository CRUD: use save(entity), get(id), delete(id) -> bool, list_items(...), query(...) returning tuples

Datetime in persistence: use parse_iso_utc / format_iso_utc from persistence._shared (reject naive); normalize_utc for already-typed

Files:

  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: No region/currency/locale privilege; use metric units and British English per regional-defaults.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Every convention PR ships its enforcement gate per convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR: `/aurelio-review-pr`; fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Pre-commit/pre-push hooks in `.pre-commit-config.yaml`; tool-call gates in `.claude/settings.json` PreToolUse
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Squash merge with PR body as commit message; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T00:58:20.575Z
Learning: Use Go via `go -C cli` command, never `cd cli`; see `cli/CLAUDE.md`
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • src/synthorg/api/approval_store.py
  • src/synthorg/api/app.py
  • src/synthorg/api/controllers/_approval_review_gate.py
  • tests/conformance/persistence/test_conversation_repository.py
  • src/synthorg/meta/chief_of_staff/config.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py
  • tests/unit/meta/chief_of_staff/test_propose.py
🔇 Additional comments (1)
tests/unit/meta/chief_of_staff/test_propose.py (1)

351-373: LGTM!

Comment thread src/synthorg/persistence/postgres/conversation_repo.py
Comment thread src/synthorg/persistence/sqlite/conversation_repo.py
CodeRabbit re-review of round 2 (head 83e3921) surfaced 3 valid
findings plus 1 duplicate re-flag (replied + flagged for resolve).

QueryError wrapping on the resequencing read (CR-A1, CR-A2):
- conversation_repo.py (SQLite + Postgres): wrap the
  ``_TURN_NEXT_SEQUENCE_SQL`` lookup that runs inside the
  ``append`` retry branch in its own try/except. SQLite catches
  ``(sqlite3.Error, aiosqlite.Error)``; Postgres catches
  ``psycopg.Error``. Both translate to ``QueryError`` with the
  same logging shape the rest of ``append`` already uses, so
  callers no longer see raw driver exceptions leak from the
  resequencing path.

Conversation-scoped turn pipeline serialisation (CR-A3, Heavy lift):
- propose.py: add per-conversation ``asyncio.Lock`` registry to
  ``ChiefOfStaffProposer``. ``converse()`` now resolves the
  conversation first (so a fresh ``conversation_id`` doesn't need
  locking) and then runs the rest of the turn pipeline under the
  per-conversation lock; the body is extracted into ``_run_turn``.
  This closes the broader TOCTOU window CR-A6 (round 2) only
  addressed at the sequence-collision layer: two concurrent
  ``converse()`` calls on the same conversation now serialise
  fully, so the model always runs against the canonical
  linearised history and assistant turns commit in order.
  ``_lock_for`` uses a lazy ``_conversation_locks_guard`` so the
  lock binds to the live request-handling loop rather than the
  loop that constructed the proposer. The lock dict grows with
  conversation count; for v1's short-thread expectation this is
  acceptable, follow-up if scale demands it.

CodeRabbit duplicate re-flag (round 2 CR-#4):
- Replied on inline thread 3270499316 explaining the disproof
  recorded in round 2's history (the raise-on-None fix breaks
  test_flow1_existence_check_error_returns_early because Flow 1's
  parked-context probe owns the item-None fallback for
  non-conversational approvals). Asked @coderabbitai to resolve
  the thread.

Tests:
- test_propose.py: new test_per_conversation_lock_serialises_turns
  asserts two concurrent ``converse()`` calls on the same id land
  4 distinct turns at sequences 0/1/2/3 (would collide on 0 absent
  the lock).
- test_propose.py: new test_locks_isolated_per_conversation asserts
  ``_lock_for`` returns the same instance per id and a fresh
  instance per distinct id.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 150-182: The _conversation_locks dict is unbounded; add a cleanup
to remove the per-conversation lock when a conversation enters a terminal state:
implement a new helper (e.g. _remove_conversation_lock(conversation_id: str))
that acquires _conversation_locks_guard and pops
self._conversation_locks[conversation_id] if present, and call this helper from
the code paths that mark a conversation terminal (e.g. inside _cap_conversation
and/or _record_proposals) after the final commit; keep using
_conversation_locks_guard to synchronise removal so _lock_for and removals
cannot race.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: c6250a49-78e4-4126-88de-3a6b6f1e1586

📥 Commits

Reviewing files that changed from the base of the PR and between 83e3921 and 82efbe2.

📒 Files selected for processing (4)
  • src/synthorg/meta/chief_of_staff/propose.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
  • tests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Test E2E
  • GitHub Check: Dashboard Test
  • GitHub Check: Test Integration
  • GitHub Check: Test Unit
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}. Async auto. Timeout 30s global. Coverage 80% min. xdist -n 8 --dist=loadfile auto-applied via pyproject addopts

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race). Subprocess tests override back

Test doubles: use FakeClock for Clock seam, mock_ofT for typed-boundary substitutions, SimpleNamespace for attribute-bags. Bare MagicMock at typed boundary is blocked by check_mock_spec.py (zero-tolerance)

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)). Never skip/xfail flaky tests; fix fundamentally. Use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py
{src,tests}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic test data: NEVER use real vendor names in project code/tests. Use example-provider, test-provider, example-{large,medium,small}-001. Allowed in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/persistence/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

New repository protocols must inherit from one or more generic categories in persistence/_generics.py (SingletonRepository, IdKeyedRepository, FilteredQueryRepository, AppendOnlyRepository, StatefulRepository, MVCCRepository); bespoke methods allowed only under ADR-0001 D7

Repository protocols must stay @runtime_checkable

Repository CRUD methods: save(entity), get(id), delete(id) -> bool, list_items(...), query(...) returning tuples

Datetime in persistence: use parse_iso_utc / format_iso_utc from persistence._shared (reject naive); use normalize_utc for already-typed

Files:

  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/persistence/postgres/conversation_repo.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

No hardcoded numeric values; numerics must live in settings/definitions/. Allowlist: 0, 1, -1, HTTP codes, hex masks, powers-of-2, module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal). Enforced by scripts/check_no_magic_numbers.py

Use from synthorg.observability import get_logger with variable name logger; never import logging or print() in app code

Event names must come from observability.events.<domain> constants; use structured kwargs (logger.info(EVENT, key=value))

Error paths log WARNING/ERROR with context before raising; state transitions log INFO via *_STATUS_TRANSITIONED after persistence write

Never use error=str(exc) or interpolate {exc} in logs; use error_type=type(exc).__name__ + error=safe_error_description(exc). Never exc_info=True. For OTel: forbid span.record_exception(exc); use span.set_attribute("exception.message", safe_error_description(exc)) with record_exception=False, set_status_on_exception=False. Enforced by check_logger_exception_str_exc.py

Comments must explain WHY only; no reviewer citations, issue back-refs, or migration framing. Enforced by check_no_review_origin_in_code.py + check_no_migration_framing.py

No from __future__ import annotations in Python 3.14 (has PEP 649). Use PEP 758 except: except A, B: no parens unless binding

Type hints required on public functions; mypy strict. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines

Errors must use format <Domain><Condition>Error inheriting from DomainError, never directly from Exception/RuntimeError/etc. Enforced by check_domain_error_hierarchy.py

Every Pydantic v2 frozen model project-wide must use extra="forbid". Gate: check_frozen_model_extra_forbid.py. @computed_field auto-exempt; use # lint-allow: frozen-extra-forbid -- <reason> for extra="allow"/"ignore" boundaries. Use @computed_field for derived fields. Use NotBlankStr for identifiers

Args models requ...

Files:

  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/!(persistence)/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/meta/chief_of_staff/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Conversational propose: POST /meta/chat/propose is opt-in (meta.chief_of_staff.propose_enabled); ChiefOfStaffProposer built by build_chief_of_staff_proposer (ENFORCED manifest entry) and 503s when provider, persistence, or work pipeline missing. Human content wrapped via wrap_untrusted(TAG_TASK_DATA, ...) (SEC-1)

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: Read design spec from `docs/design/` page before implementing; deviations require approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: No region/currency/locale should be privileged; use metric units; use British English in all text
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: Configuration precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env. YAML is company-template ingestion only, not a precedence tier
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: Timeout/slow test failures indicate source-code regression; never edit tests/baselines/unit_timing.json or scripts/*_baseline.{txt,json} / scripts/_*_baseline.py. Both families PreToolUse-blocked. Per-invocation bypass requires explicit approval: ALLOW_BASELINE_GROWTH=1 git commit
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR merge use `/aurelio-review-pr` for external feedback. Fix all valid issues; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced. Signed commits required on protected refs. Branches: `<type>/<slug>` from main. Squash merge. PR body becomes squash commit; trailers (Release-As, Closes `#N`) in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:27:43.039Z
Learning: After every squash merge → `/post-merge-cleanup`
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/meta/chief_of_staff/test_propose.py
  • src/synthorg/persistence/sqlite/conversation_repo.py
  • src/synthorg/persistence/postgres/conversation_repo.py
  • src/synthorg/meta/chief_of_staff/propose.py
🔇 Additional comments (4)
tests/unit/meta/chief_of_staff/test_propose.py (1)

1-544: LGTM!

src/synthorg/persistence/sqlite/conversation_repo.py (1)

1-614: LGTM!

src/synthorg/persistence/postgres/conversation_repo.py (1)

1-532: LGTM!

src/synthorg/meta/chief_of_staff/propose.py (1)

1-149: LGTM!

Also applies to: 183-603

Comment thread src/synthorg/meta/chief_of_staff/propose.py
Round 4 covers CodeRabbit's re-review of round 3's commit. CR-#4
(round 2 duplicate) was closed: CR replied accepting the disproof
and resolved the thread.

Per-conversation lock cleanup (CR-C1, Nitpick):
- propose.py: ``_release_conversation_lock`` helper pops the dict
  entry under ``_conversation_locks_guard`` so a removal cannot
  race with a concurrent ``_lock_for`` resolving the same id.
  Called from ``_record_proposals`` (after the conversation
  transitions to PROPOSED, the v1 1:1 terminal state) and
  ``_cap_conversation`` (after force-closing to CLOSED). The
  ``_record_clarification`` path is NOT a terminal state -- the
  conversation stays ACTIVE for the next turn -- so the lock is
  retained.

Tests:
- test_propose.py::test_lock_released_on_terminal_propose asserts
  the dict drops the conversation id after a successful propose.
- test_propose.py::test_lock_released_on_clarification_cap asserts
  the same after the clarification cap force-closes the thread.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 184-197: The cleanup currently performed in
_release_conversation_lock removes the mapping while the old per-conversation
lock may still be held, allowing a race where a new converse() creates a fresh
lock and runs _run_turn() concurrently; instead, stop popping the entry inside
_release_conversation_lock and move that removal into converse() after the outer
"async with <that conversation's lock>" has fully exited, checking that
self._conversation_locks.get(conversation_id) is still the same lock instance
before popping it; apply the same change for the other similar sites (around
lines ~475-478 and ~614-617) and consider updating _resolve_conversation() to
treat ConversationStatus.PROPOSED as terminal if v1 semantics require rejecting
PROPOSED transitions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 97985311-bc21-44a2-a896-49638c90e764

📥 Commits

Reviewing files that changed from the base of the PR and between 82efbe2 and 3f7c8f1.

📒 Files selected for processing (2)
  • src/synthorg/meta/chief_of_staff/propose.py
  • tests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Build Backend
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test E2E
  • GitHub Check: Test Unit
  • GitHub Check: Test Integration
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env at the boot site. No os.environ.get outside startup; pre-init Cat-2 reads use settings.bootstrap_resolver.resolve_init_value

Numerics live in settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants. Enforced by scripts/check_no_magic_numbers.py

Comments should explain WHY only; no reviewer citations / issue back-refs / migration framing. Enforced by check_no_review_origin_in_code.py + check_no_migration_framing.py

No from __future__ import annotations (3.14 has PEP 649). Use PEP 758 except: except A, B: no parens unless binding

Errors: <Domain><Condition>Error from DomainError; never inherit Exception/RuntimeError/etc directly. Enforced by check_domain_error_hierarchy.py

Pydantic v2 frozen + extra="forbid" on every frozen model; @computed_field for derived; NotBlankStr for identifiers. Per-line # lint-allow: frozen-extra-forbid -- <reason> for exceptions. Enforced by check_frozen_model_extra_forbid.py

Args models at every system boundary; parse_typed() for every external dict ingestion. Enforced by check_boundary_typed.py

Immutability: use model_copy(update=...) or copy.deepcopy(); deepcopy at system boundaries

Async: use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)

Untrusted content (SEC-1): use wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML

Telemetry: opt-in, off by default. Every event property must be in _ALLOWED_PROPERTIES. See telemetry.md

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py
  • src/synthorg/meta/chief_of_staff/propose.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}. Async auto. Timeout 30s global. Coverage 80% min

xdist -n 8 --dist=loadfile auto-applied via pyproject addopts; Windows unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race)

Test doubles: FakeClock for Clock seam, mock_of[T](**overrides) for typed-boundary substitutions, SimpleNamespace for attribute-bags. Bare MagicMock at typed boundaries forbidden by scripts/check_mock_spec.py

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)). Never skip/xfail on flaky tests; fix fundamentally. Use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py
{src/**/*.py,tests/**/*.py}

📄 CodeRabbit inference engine (CLAUDE.md)

Vendor-agnostic: NEVER use real vendor names in project code/tests. Use example-provider, test-provider, example-{large,medium,small}-001. Allowed in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py
  • src/synthorg/meta/chief_of_staff/propose.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Type hints on public functions; mypy strict. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines

Clock seam: clock: Clock | None = None; tests inject FakeClock. Lifecycle: services own _lifecycle_lock; timed-out stops mark unrestartable

Use from synthorg.observability import get_logger; variable always logger. Never import logging / print() in app code

Event names from observability.events.<domain> constants; structured kwargs (logger.info(EVENT, key=value))

Error paths log WARNING/ERROR with context before raising; state transitions log INFO via *_STATUS_TRANSITIONED after persistence write

Secret-log redaction (SEC-1): never error=str(exc) or interpolate {exc}; use error_type=type(exc).__name__ + error=safe_error_description(exc). Never exc_info=True. OTel: forbid span.record_exception(exc). Enforced by check_logger_exception_str_exc.py

Files:

  • src/synthorg/meta/chief_of_staff/propose.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/meta/chief_of_staff/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Conversational propose: POST /meta/chat/propose is opt-in (meta.chief_of_staff.propose_enabled); ChiefOfStaffProposer built by build_chief_of_staff_proposer (ENFORCED manifest entry); 503 when ANY of provider, persistence, or work pipeline missing. Human content wrapped via wrap_untrusted(TAG_TASK_DATA, ...)

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: Read `docs/design/` page before implementing; deviations need approval. See DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: No region/currency/locale privileged; use metric units; British English for regional defaults
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: Every convention PR ships its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR: `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced. Signed commits required on protected refs
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: Branches: `<type>/<slug>` from main. Pre-commit/pre-push hooks: `.pre-commit-config.yaml`. Tool-call gates: `.claude/settings.json` PreToolUse
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: Squash merge. PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T01:51:19.899Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/meta/chief_of_staff/test_propose.py
  • src/synthorg/meta/chief_of_staff/propose.py

Comment thread src/synthorg/meta/chief_of_staff/propose.py Outdated
Round 5 covers CodeRabbit's re-review of the round-4 lock cleanup.

Lock cleanup race (CR-D1, Critical):
- propose.py: round 4 invoked ``_release_conversation_lock`` from
  inside ``_record_proposals`` / ``_cap_conversation`` -- i.e. WHILE
  the per-conversation lock was still held by the current
  ``converse()`` caller. That window let a concurrent ``converse()``
  on the same conversation observe an empty dict slot, mint a
  fresh lock, and run ``_run_turn`` in parallel with the
  still-unwinding first caller. The lock invariant the round-3 PR
  introduced was therefore defeated by the round-4 cleanup.

Fix:
- ``_release_conversation_lock`` now takes a keyword-only
  ``expected_lock: asyncio.Lock`` and only pops the dict entry
  when ``self._conversation_locks.get(conversation_id) is
  expected_lock``. The identity check is essential -- if a
  concurrent caller has already replaced the entry with a fresh
  lock, popping it would invalidate that caller's serialisation.
- ``converse()`` captures the lock instance from ``_lock_for``,
  runs the turn pipeline inside ``async with lock``, and only
  AFTER the ``async with`` releases calls
  ``_release_conversation_lock`` with ``expected_lock=lock``.
  Cleanup is gated on the result shape: ``result.status ==
  "proposed"`` (terminal -> PROPOSED) or
  ``result.conversation_closed`` (terminal -> CLOSED).
- The in-lock cleanup calls from ``_record_proposals`` and
  ``_cap_conversation`` are removed entirely. Single-source the
  release in ``converse()``; only one ``_release_conversation_lock``
  call site remains.

Note on PROPOSED resumability: CR also raised whether
``_resolve_conversation`` should treat PROPOSED as terminal and
reject resumes. Documented skip -- that's a separate design call
(a user can legitimately want to refine a parked-but-rejected
conversation), and the lock + identity-check guarantee no
parallel ``_run_turn`` regardless of PROPOSED resumability.

Tests:
- test_propose.py: new test_release_identity_check_skips_replaced_lock
  asserts the new ``expected_lock`` parameter -- releasing under
  the OLD lock identity is a no-op when the slot has already been
  replaced; releasing under the CURRENT identity actually pops.
- Existing test_lock_released_on_terminal_propose +
  test_lock_released_on_clarification_cap still pass under the
  new post-lock cleanup wiring.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
src/synthorg/meta/chief_of_staff/propose.py (2)

242-244: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Revalidate the conversation inside the lock before running the turn.

_resolve_conversation() completed before Line 242, so a second request can read ACTIVE, wait behind another request that moves the thread to PROPOSED or CLOSED, and then still append turns once it acquires this lock. That lets a stale caller park extra approvals against a terminal conversation. Reload or at least re-check existing conversations after entering the locked section.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/meta/chief_of_staff/propose.py` around lines 242 - 244, After
acquiring the per-conversation lock (via self._lock_for(conversation.id))
revalidate the conversation state before calling self._run_turn: reload the
conversation (e.g., call self._resolve_conversation(conversation.id) or fetch
the latest object) inside the async with lock block and verify it is still in
the expected ACTIVE state; if it has moved to PROPOSED or CLOSED,
abort/raise/return early instead of calling _run_turn to prevent appending turns
to a stale/terminal conversation.

203-204: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Queued waiters still make this lock cleanup unsafe.

The identity check only covers the “dict entry was replaced with a different lock object” case. It does not cover callers that already captured expected_lock and are queued on it: A can release, B can acquire the same lock, then Line 203-Line 204 pops the registry entry and C mints a fresh lock that runs concurrently with B. Unless this registry tracks outstanding users/waiters, the safe immediate fix is to stop deleting terminal locks here.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/meta/chief_of_staff/propose.py` around lines 203 - 204, The
current cleanup that does self._conversation_locks.pop(conversation_id, None)
while checking identity against expected_lock is unsafe because queued waiters
might still be using expected_lock; stop deleting terminal locks here — remove
that pop so locks remain registered, or implement a proper
reference-count/waiter-tracking scheme (increment when a caller acquires or
queues on a lock and decrement on release, and only remove the dict entry when
the count reaches zero and there are no waiters). Update the code paths that
create/return locks (the routines that access self._conversation_locks and
expected_lock) to either no-op the pop or to adjust the refcount/waiter tracking
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 242-244: After acquiring the per-conversation lock (via
self._lock_for(conversation.id)) revalidate the conversation state before
calling self._run_turn: reload the conversation (e.g., call
self._resolve_conversation(conversation.id) or fetch the latest object) inside
the async with lock block and verify it is still in the expected ACTIVE state;
if it has moved to PROPOSED or CLOSED, abort/raise/return early instead of
calling _run_turn to prevent appending turns to a stale/terminal conversation.
- Around line 203-204: The current cleanup that does
self._conversation_locks.pop(conversation_id, None) while checking identity
against expected_lock is unsafe because queued waiters might still be using
expected_lock; stop deleting terminal locks here — remove that pop so locks
remain registered, or implement a proper reference-count/waiter-tracking scheme
(increment when a caller acquires or queues on a lock and decrement on release,
and only remove the dict entry when the count reaches zero and there are no
waiters). Update the code paths that create/return locks (the routines that
access self._conversation_locks and expected_lock) to either no-op the pop or to
adjust the refcount/waiter tracking accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: c055e8e4-a741-4af6-9354-b1c712d67f67

📥 Commits

Reviewing files that changed from the base of the PR and between 3f7c8f1 and 97bb2cd.

📒 Files selected for processing (2)
  • src/synthorg/meta/chief_of_staff/propose.py
  • tests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Dashboard Test
  • GitHub Check: Test Unit
  • GitHub Check: Test Integration
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Lighthouse Site
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Test markers: @pytest.mark.{unit,integration,e2e,slow}; async auto; global timeout 30s; minimum coverage 80%

Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back

Test doubles: FakeClock for Clock seam, mock_ofT for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundary blocked by scripts/check_mock_spec.py (zero-tolerance, no baseline)

FakeClock and mock_of imported from tests._shared; inject via clock= and helper spec subscript

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...))

Flaky tests: NEVER skip/xfail; fix fundamentally; use asyncio.Event().wait() instead of sleep(large)

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Follow configuration precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env at boot site; no os.environ.get outside startup; pre-init Cat-2 reads use settings.bootstrap_resolver.resolve_init_value (see docs/reference/configuration-precedence.md)

No hardcoded numeric values; numerics live in settings/definitions/; allowlist only: 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal); enforced by scripts/check_no_magic_numbers.py

Comments must explain WHY only; no reviewer citations, issue back-refs, or migration framing; enforced by check_no_review_origin_in_code.py and check_no_migration_framing.py

No from __future__ import annotations (Python 3.14 has PEP 649); use PEP 758 syntax except except A, B: requires parens when binding

Type hints on public functions; mypy strict mode; Google-style docstrings; line length 88; functions <50 lines; files <800 lines

Error classes must use naming pattern Error and inherit from DomainError, never directly from Exception/RuntimeError; enforced by check_domain_error_hierarchy.py

Every Pydantic v2 frozen model must have extra="forbid" project-wide; gate check_frozen_model_extra_forbid.py; @computed_field auto-exempt; per-line lint-allow comments for extra="allow"/"ignore" at boundaries

Use @computed_field for derived properties in Pydantic models

Use NotBlankStr for identifiers in Pydantic models

Args models at every system boundary; use parse_typed() for every external dict ingestion; enforced by check_boundary_typed.py

Use model_copy(update=...) or copy.deepcopy() for immutability; deepcopy at system boundaries

Use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)

Use Clock seam pattern: clock: Clock | None = None; tests inject FakeClock; service...

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/meta/chief_of_staff/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Conversational propose: POST /meta/chat/propose is opt-in (meta.chief_of_staff.propose_enabled); ChiefOfStaffProposer built by build_chief_of_staff_proposer (ENFORCED manifest entry, EPIC #1955); 503s when ANY of provider, connected persistence, or work pipeline missing; human content wrapped via wrap_untrusted(TAG_TASK_DATA, ...) (SEC-1); approval decisions route through Flow 0 of signal_resume_intent

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: Read design specification before implementing; deviations require approval (see docs/DESIGN_SPEC.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: Reuse web/src/components/ui/ design system components; use design tokens only (see web/CLAUDE.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: No region/currency/locale privilege; use metric units; use British English (see docs/reference/regional-defaults.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: Every convention PR ships its enforcement gate (see docs/reference/convention-gates.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: After issue: create branch + commit + push (no auto-PR); use /pre-pr-review before creating PR; gh pr create is blocked; after PR: use /aurelio-review-pr for external feedback; fix all valid issues before merging
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: Git commits: <type>: <description> (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via synthorg-repo-bot)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: Branches: <type>/<slug> from main
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: Squash merge; PR body becomes squash commit; trailers (Release-As, Closes `#N`) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: GitHub queries: use gh issue list via Bash, NOT MCP list_issues
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: After every squash merge → /post-merge-cleanup
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API (see cli/CLAUDE.md)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:15:09.416Z
Learning: For web development: see web/CLAUDE.md; for CLI: see cli/CLAUDE.md (use go -C cli, never cd cli); for shell scripts: see ~/.claude/rules/common/bash.md
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/meta/chief_of_staff/test_propose.py
  • src/synthorg/meta/chief_of_staff/propose.py
🔇 Additional comments (1)
tests/unit/meta/chief_of_staff/test_propose.py (1)

610-635: LGTM!

Round 6 covers CodeRabbit's re-review of round 5. CR identified two
distinct concurrency issues. One required a real correctness fix
(inside-lock revalidation). The other is an honest revert: the lock
cleanup chain CR has been pushing on since round 4 cannot be made
race-free without refcount + waiter tracking, which v1 doesn't need
at its expected scale.

Inside-lock conversation revalidation (CR-E1, Critical):
- propose.py: ``_run_turn`` now re-reads the conversation under the
  lock and raises ``ConversationClosedError`` if the status is no
  longer ACTIVE. Without this, two concurrent ``converse()`` calls
  could both read ACTIVE in ``_resolve_conversation`` outside the
  lock; A acquires the lock, transitions to PROPOSED in
  ``_record_proposals``, releases; B then acquires the lock and
  silently parks extra approvals against a now-PROPOSED conversation
  (``_park_proposal`` runs before ``transition_if`` no-ops on the
  stale ACTIVE->PROPOSED transition, so the approvals land in the
  queue even though the status flip warns). The outside-lock
  ``_resolve_conversation`` pre-check is kept as a fast-path for
  obvious CLOSED rejections; the inside-lock check is now the
  authoritative gate.

Lock cleanup reverted (CR-E2, Critical):
- propose.py: removed ``_release_conversation_lock`` and the
  ``converse()`` post-lock cleanup added in round 4 / refined in
  round 5. CR's CR-E2 walks the remaining race: even with the
  identity check, callers already queued on ``expected_lock`` can
  acquire the lock AFTER its owner exits the ``async with`` but
  BEFORE the cleanup pop runs; the pop then leaves a slot for a
  third caller to mint a fresh lock and run concurrently with the
  queued waiter. Closing this race needs a refcount of in-flight +
  queued users, which v1 does not need: the dict at v1's 1:1
  per-user scale is bounded by active human users, well below the
  threshold that would matter for memory.

  The round-4 nitpick (unbounded dict growth) was explicitly labeled
  "Trivial / Low value" by CR. Round 5 + 6 demonstrate it cannot be
  fixed simply; the safe path is to keep the locks registered for
  the process lifetime. Revisit with refcount tracking if a future
  scale requirement demands it.

Tests:
- test_propose.py: removed test_lock_released_on_terminal_propose,
  test_lock_released_on_clarification_cap, and
  test_release_identity_check_skips_replaced_lock -- they asserted
  behaviour that is no longer the design.
- test_propose.py: new
  test_run_turn_aborts_if_conversation_terminal_under_lock -- patches
  the fake conversation repo's ``get`` so the second call (the
  inside-lock re-fetch) returns PROPOSED; asserts converse raises
  ConversationClosedError and that no turns were appended.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/synthorg/meta/chief_of_staff/propose.py (1)

424-436: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make multi-proposal parking atomic or compensating.

This loop commits proposals one at a time. If _park_proposal() fails on the 2nd+ item, this method raises with the conversation still ACTIVE and no assistant summary turn, but the earlier approvals are already visible in the queue. A retry can then park duplicate approvals for the items that succeeded before the failure. Please add batch-level idempotency or compensation here instead of treating each proposal independently.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/synthorg/meta/chief_of_staff/propose.py` around lines 424 - 436, The loop
over decision.proposals calls _park_proposal repeatedly, causing partial commits
if one call fails; make the multi-proposal operation atomic by first validating
all proposals and preparing a batch (e.g., gather project for each item and
pre-check idempotency keys) and then either (a) perform a single transactional
batch persist of all proposals or (b) implement compensation: track created
proposal IDs from _park_proposal and on any failure call a new helper to
reverse/unpark those created items before re-raising; ensure you use the same
idempotency key or check existence inside _park_proposal to avoid duplicates and
update the propose flow to only mark the conversation/assistant summary after
the entire batch succeeds (references: decision.proposals, _park_proposal, raise
ConversationalProposeResponseInvalidError, conversation state/summary update).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 244-246: Before raising ConversationClosedError in the propose
flow, log a warning with the conversation context: after fetching current via
self._conversation_repo.get(conversation.id) and detecting current is None or
current.status != ConversationStatus.ACTIVE, call self._logger.warning (or the
class logger used elsewhere) including conversation.id and any relevant current
state, then raise ConversationClosedError(conversation_id=conversation.id);
reference the conditional that checks current and ConversationStatus.ACTIVE and
the ConversationClosedError symbol so the log is emitted immediately before the
raise.

---

Outside diff comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 424-436: The loop over decision.proposals calls _park_proposal
repeatedly, causing partial commits if one call fails; make the multi-proposal
operation atomic by first validating all proposals and preparing a batch (e.g.,
gather project for each item and pre-check idempotency keys) and then either (a)
perform a single transactional batch persist of all proposals or (b) implement
compensation: track created proposal IDs from _park_proposal and on any failure
call a new helper to reverse/unpark those created items before re-raising;
ensure you use the same idempotency key or check existence inside _park_proposal
to avoid duplicates and update the propose flow to only mark the
conversation/assistant summary after the entire batch succeeds (references:
decision.proposals, _park_proposal, raise
ConversationalProposeResponseInvalidError, conversation state/summary update).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: 92c268d5-82fb-4474-b600-d5a34bfad2a2

📥 Commits

Reviewing files that changed from the base of the PR and between 97bb2cd and 02ea50d.

📒 Files selected for processing (2)
  • src/synthorg/meta/chief_of_staff/propose.py
  • tests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Lighthouse Site
  • GitHub Check: Lighthouse Dashboard
  • GitHub Check: Dashboard Test
  • GitHub Check: Test Integration
  • GitHub Check: Test Unit
  • GitHub Check: Test E2E
  • GitHub Check: Build Web Assets (melange)
  • GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{py,pyi}

📄 CodeRabbit inference engine (CLAUDE.md)

Read docs/design/ page before implementing; deviations need approval per DESIGN_SPEC.md

Only src/synthorg/persistence/ may import sqlite/psycopg or emit raw SQL; new repository protocols must inherit from generic categories in persistence/_generics.py (SingletonRepository, IdKeyedRepository, FilteredQueryRepository, AppendOnlyRepository, StatefulRepository, MVCCRepository); protocols stay @runtime_checkable

Use SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init) for configuration precedence; Cat-3 bootstrap secrets are pure env at boot site; no os.environ.get outside startup; use settings.bootstrap_resolver.resolve_init_value for pre-init Cat-2 reads

Do not hardcode numerics; numerics live in settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of form NAME: int|float|Final|Final[int]|Final[float] = literal; enforced by scripts/check_no_magic_numbers.py

Never import logging or use print() in app code; use from synthorg.observability import get_logger with variable name logger; emit event names from observability.events.<domain> constants via structured kwargs

On error paths, log WARNING/ERROR with context before raising; on state transitions, log INFO via *_STATUS_TRANSITIONED AFTER persistence write; never interpolate exceptions (error=str(exc)) or use exc_info=True; use error_type=type(exc).__name__ + error=safe_error_description(exc) (SEC-1)

Errors must be <Domain><Condition>Error inheriting from DomainError, never directly from Exception/RuntimeError/etc; enforced by check_domain_error_hierarchy.py

Use Pydantic v2 frozen models with extra="forbid" on every frozen model project-wide (gate check_frozen_model_extra_forbid.py); @computed_field auto-exempt; use per-line # lint-allow: frozen-extra-forbid -- <reason> for extra="allow"/"ignore" boundaries

Use @computed_field ...

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py
  • src/synthorg/meta/chief_of_staff/propose.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use test markers: @pytest.mark.{unit,integration,e2e,slow}; async tests use auto mode; 30s global timeout; coverage 80% min

xdist -n 8 --dist=loadfile auto-applied via pyproject addopts (loadfile prevents Python 3.14+ Windows ProactorEventLoop leak)

Windows unit tests use WindowsSelectorEventLoopPolicy (Python 3.14 IOCP teardown race); subprocess tests override back

Use FakeClock from tests._shared for Clock seam injection; use mock_of[T](**overrides) from tests._shared for typed-boundary substitutions; use SimpleNamespace for attribute-bags; bare MagicMock at typed boundaries forbidden by scripts/check_mock_spec.py

Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add @example(...)); never skip/xfail flaky tests—fix fundamentally; use asyncio.Event().wait() not sleep(large)

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py

⚙️ CodeRabbit configuration file

Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare @settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which @given() honors automatically.

Files:

  • tests/unit/meta/chief_of_staff/test_propose.py
src/synthorg/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Wrap untrusted content via wrap_untrusted() from engine.prompt_safety; use HTMLParseGuard for HTML content (SEC-1)

Never use real vendor names in project code/tests; use example-provider, test-provider, example-{large,medium,small}-001; real names allowed only in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/

Telemetry is opt-in and off by default; every event property must be in _ALLOWED_PROPERTIES; see telemetry.md

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
src/**/*.py

⚙️ CodeRabbit configuration file

This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.

Files:

  • src/synthorg/meta/chief_of_staff/propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:39:17.304Z
Learning: Every convention PR ships its enforcement gate; see `docs/reference/convention-gates.md`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:39:17.304Z
Learning: Never edit `tests/baselines/unit_timing.json` or any `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`; both families are PreToolUse-blocked; per-invocation bypass: `ALLOW_BASELINE_GROWTH=1 git commit ...` (requires explicit user approval)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:39:17.304Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review` (`gh pr create` is blocked by `scripts/check_no_pr_create.sh`); after PR: `/aurelio-review-pr` for external feedback; fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:39:17.304Z
Learning: Present every plan for accept/deny before coding (MANDATORY planning guideline)
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:39:17.304Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-20T02:39:17.304Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

  • tests/unit/meta/chief_of_staff/test_propose.py
  • src/synthorg/meta/chief_of_staff/propose.py

Comment thread src/synthorg/meta/chief_of_staff/propose.py
SynthOrg added 2 commits May 20, 2026 05:02
Logging on inside-lock terminal abort (CR-F1, Major Quick win):
- propose.py _run_turn: emit a structured warning before raising
  ConversationClosedError on the inside-lock conversation-status
  recheck. Captures the conversation id and the observed
  (post-flip) status so operators can correlate the race in logs.

Multi-proposal parking atomicity (CR-F2, Major Heavy lift):
- _park_proposal is now self-atomic: if the approval-store add
  fails after the proposal row was committed, the proposal row is
  removed before re-raising. The cleanup is best-effort -- the
  original exception is preserved even if the proposal delete
  itself fails (logged under park_proposal_cleanup_failed).
- _record_proposals tracks every successful park in the batch and
  unwinds them on any later failure via the new
  _unwind_parked_proposal helper. Unwind order is reverse of
  _park_proposal (approval first, proposal second) so no caller
  ever sees a dangling approval pointing at a deleted proposal.
  Unwind failures are logged but never mask the original
  exception.
- New ApprovalStoreProtocol.delete + ApprovalStore.delete: returns
  bool, removes from in-memory cache + persistent repo (if wired).
  Repo failures restore the cache entry and re-raise; pure-cache
  removals succeed without touching a non-existent repo.

Without compensation, a partial commit leaves earlier approvals
visible in the queue but the conversation stays ACTIVE with no
proposal_id for the client retry to dedupe against; the retry
would then double-park those items.

Tests:
- test_propose.py::test_record_proposals_unwinds_on_partial_park_failure:
  scripted two-proposal response + proposal_repo.save patched to
  raise on the second call; asserts converse raises, proposal repo
  ends empty, approval store ends empty, conversation status stays
  ACTIVE.
- test_protocol.py: ApprovalStoreProtocol surface now includes
  'delete'; the surface-stability assertion is updated.
Pre-push mypy caught a missing protocol member on the
``_RecordingApprovalStore`` test stub in
``tests/unit/hr/test_registry_autonomy.py``. Round 7 added
``delete`` to ``ApprovalStoreProtocol`` to enable the
multi-proposal parking compensation; the test stub was hand-rolled
(no ``AsyncMock(spec=...)``) so it needs the new method
explicitly.

Implementation mirrors the in-memory shape of the other stub
methods: filter the recorded list by id, return whether anything
was removed. Other test stubs that depend on the protocol use
``AsyncMock(spec=ApprovalStoreProtocol)`` or
``mock_of[ApprovalStoreProtocol](...)`` which pick the new
``delete`` up automatically; no other test files needed changes.
@Aureliolo Aureliolo deployed to lighthouse May 20, 2026 03:10 — with GitHub Actions Active
@Aureliolo Aureliolo temporarily deployed to cloudflare-preview May 20, 2026 03:11 — with GitHub Actions Inactive
@Aureliolo Aureliolo merged commit 5ad3c46 into main May 20, 2026
82 checks passed
@Aureliolo Aureliolo deleted the feat/conversational-interface-v1 branch May 20, 2026 04:57
@Aureliolo Aureliolo deployed to cloudflare-preview May 20, 2026 04:57 — with GitHub Actions Active
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Conversational interface v1: 1:1 clarify + propose

1 participant