feat: conversational interface v1 - 1:1 clarify + propose#2019
Conversation
11-agent pre-pr-review surfaced findings; this round addresses 18 valid items. Details in _audit/pre-pr-review/triage.md. Critical / High: - log via event constants (PERSISTENCE_CONVERSATIONAL_UNKNOWN_BACKEND, PERSISTENCE_CONVERSATIONAL_HANDLE_UNAVAILABLE) instead of string literals - check transition_if return value in the approval dispatcher (both reject + execute paths) and surface a no-op as WARNING - check conversation transition_if return in the proposer; warn when a concurrent propose-turn already flipped the status - reorder _park_proposal: save proposal before adding approval to the queue so an approval-store failure leaves an invisible orphan proposal instead of a visible dangling queue item - add postChatPropose() client + ConversationalProposeResponse type + MSW handler for /meta/chat/propose (1:1 MSW parity rule) Medium: - split _park_proposal: extract _build_work_item + _build_approval_item helpers (<50 line rule) - emit COS_CONVERSATION_STATUS_TRANSITIONED INFO on ACTIVE to PROPOSED and ACTIVE to CLOSED transitions - document the sequential user/assistant turn append cancellation window in converse() - design doc: /meta/chat/propose endpoint, Conversation* models, Flow 0 routing, propose_* YAML defaults - README: clarify v1 covers clarify + propose only (no autonomous acting); CLAUDE.md: one-liner on proposer wiring + 503 conditions - regenerate web TS DTO bundle for the new endpoint
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces the 'conversational interface' v1, enabling users to interact with the Chief of Staff agent to propose work. The system handles natural language requests by either asking clarifying questions or parking concrete work items in the approval queue. Upon human approval, these items are automatically routed through the work pipeline, ensuring all autonomous actions remain under strict human governance. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI (base), Organization UI (inherited) Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (6)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
🧰 Additional context used📓 Path-based instructions (4)src/synthorg/**/*.py📄 CodeRabbit inference engine (CLAUDE.md)
Files:
{src/synthorg/**/*.py,tests/**/*.py}📄 CodeRabbit inference engine (CLAUDE.md)
Files:
src/**/*.py⚙️ CodeRabbit configuration file
Files:
tests/**/*.py📄 CodeRabbit inference engine (CLAUDE.md)
Files:
⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (2)📓 Common learnings📚 Learning: 2026-05-05T09:04:46.195ZApplied to files:
🔇 Additional comments (9)
WalkthroughThis PR implements the conversational interface v1 feature — a clarifying-first "clarify-and-propose" flow for the Chief of Staff agent. It adds domain models and enums, persistence protocols and SQLite/Postgres implementations, a ChiefOfStaffProposer service with LLM prompting, approval-routing integration for conversational intake, a guarded POST /meta/chat/propose API, app startup wiring, rate-limit policy, observability constants, and extensive unit, conformance, and end-to-end tests. |
There was a problem hiding this comment.
Actionable comments posted: 10
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/api/app.py`:
- Around line 1196-1199: Currently the code only wires the proposal repository
when both proposer and repositories are present; instead, always attach the repo
if repositories is available so conversational approvals can route to it even
when proposer is absent: call
app_state.set_conversational_proposal_repo(repositories.proposal_repo) whenever
repositories is not None, and keep
app_state.set_chief_of_staff_proposer(proposer) gated only by proposer being not
None (refer to the proposer and repositories variables and the methods
app_state.set_chief_of_staff_proposer and
app_state.set_conversational_proposal_repo).
In `@src/synthorg/api/controllers/_approval_review_gate.py`:
- Around line 279-304: Concurrent requests can both call
app_state.work_pipeline.run(work_item) before the CAS transition via
repo.transition_if(proposal.id, ConversationalProposalStatus.PENDING,
ConversationalProposalStatus.EXECUTED), causing duplicate pipeline side-effects;
fix by adding an intermediate CAS transition to an EXECUTING state before
running the pipeline: perform repo.transition_if(proposal.id, PENDING,
ConversationalProposalStatus.EXECUTING) and only run
app_state.work_pipeline.run(work_item) if that CAS succeeds, and on pipeline
failure revert the state (transition back to PENDING) or to a FAILED state so
only the winner executes side-effects; alternatively, if you cannot add a state,
explicitly document and enforce idempotency of pipeline handlers, but prefer
adding the EXECUTING transition and handling revert paths around
work_pipeline.run and the subsequent final transition to EXECUTED.
In `@src/synthorg/api/controllers/meta.py`:
- Around line 467-473: The code passes raw user input (data.message) into domain
orchestration when building ProposeArgs for proposer.converse; wrap the
untrusted content by calling wrap_untrusted(TAG_TASK_DATA, data.message) and
pass that wrapped value into ProposeArgs (replace data.message) so the
conversation boundary enforces SEC-1; update the ProposeArgs construction used
by proposer.converse accordingly (reference: ProposeArgs, proposer.converse,
wrap_untrusted, TAG_TASK_DATA).
In `@src/synthorg/meta/chief_of_staff/config.py`:
- Around line 90-93: Hoist the magic-number defaults and bounds used in the
propose fields into module-level annotated constants and use those constants in
the Field(...) calls: introduce constants like PROPOSE_TEMPERATURE_DEFAULT:
float, PROPOSE_TEMPERATURE_MIN: float, PROPOSE_TEMPERATURE_MAX: float,
PROPOSE_MAX_TOKENS_DEFAULT: int, PROPOSE_MAX_TOKENS_MIN: int,
PROPOSE_MAX_PROPOSALS_DEFAULT: int, PROPOSE_MAX_PROPOSALS_MIN/MAX: int,
PROPOSE_MAX_CLARIFICATION_DEFAULT: int, PROPOSE_MAX_CLARIFICATION_MIN/MAX: int,
then update propose_temperature, propose_max_tokens,
propose_max_proposals_per_turn, and propose_max_clarification_turns to reference
those constants in their default, ge, and le parameters (preserve same numeric
values but no hardcoded literals in Field).
In `@src/synthorg/meta/chief_of_staff/models.py`:
- Around line 447-451: The ProposeResult model allows inconsistent states; add
an in-model invariant check (e.g., a pydantic root_validator or
__post_init__/validate method on ProposeResult) that enforces: when status ==
"needs_clarification" then clarifying_question must be non‑None, proposals must
be empty, and conversation_closed must be False; when status == "proposed" then
proposals must be non‑empty and clarifying_question must be None (and optionally
conversation_closed allowed only if proposals present). Implement this
validation inside the ProposeResult model so invalid combinations cannot be
instantiated or passed through.
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 271-277: The current except Exception handler in propose.py logs
all exceptions via COS_PROPOSE_FAILED; add an earlier except clause to
immediately re-raise system errors so they don't get treated as operational
failures: insert an except (MemoryError, RecursionError): raise before the
existing except Exception as exc: block (the block that logs with logger.error
and calls safe_error_description). This ensures MemoryError and RecursionError
propagate unchanged while other exceptions continue to be logged and re-raised.
- Around line 238-242: Replace the hardcoded literal limit=1000 in the call to
self._turn_repo.query within propose.py by introducing a module-level annotated
constant (e.g. _MAX_TURNS_QUERY_LIMIT: int = 1000) near the top of the module
and use that constant in the query call (limit=_MAX_TURNS_QUERY_LIMIT); update
any imports/comments if needed and ensure the constant name is clear and
annotated with type int to satisfy the coding guideline.
In `@tests/unit/api/controllers/test_conversational_intake_resume.py`:
- Around line 168-239: Add a unit test that covers the early-return when the
proposal repository is missing: create an ApprovalStore with _approval("a1"),
add it, build a _FakeAppState with proposal_repo=None, pipeline=_FakePipeline(),
then call try_conversational_intake_resume(state, "a1", approved=True) and
assert the return is True; reference symbols to locate code:
try_conversational_intake_resume, _FakeAppState, ApprovalStore, _FakePipeline,
and _approval/_proposal helpers.
In `@web/src/api/types/openapi.gen.ts`:
- Line 17479: The chat_propose endpoint is currently typed to return
ApiResponse[dict[str, Any]] so the OpenAPI schema exposes a generic dict; update
the controller to return the concrete model: import ProposeResult from
synthorg.meta.chief_of_staff.models, change the function/method return
annotation from ApiResponse[dict[str, Any]] to ApiResponse[ProposeResult] (the
annotated function around the chat_propose handler), replace the manual dict
construction and serialization with a simple return ApiResponse(data=result)
where result is the ProposeResult returned by proposer.converse(), and then
regenerate TypeScript types by running the DTO generation script (uv run python
scripts/generate_dto_types_ts.py) so the frontend gets
ApiResponse[ProposeResult] types.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: c26197e4-4502-4425-b5b1-e238e23481e4
📒 Files selected for processing (50)
CLAUDE.mdREADME.mddocs/design/self-improvement.mdscripts/_ghost_wiring_manifest.txtscripts/schema_drift_baseline.txtsrc/synthorg/api/app.pysrc/synthorg/api/app_builders.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/api/state.pysrc/synthorg/api/state_services_facades.pysrc/synthorg/api/state_services_facades_mcp3.pysrc/synthorg/core/enums.pysrc/synthorg/core/error_taxonomy.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/meta/chief_of_staff/models.pysrc/synthorg/meta/chief_of_staff/prompts.pysrc/synthorg/meta/chief_of_staff/propose.pysrc/synthorg/meta/errors.pysrc/synthorg/observability/events/approval_gate.pysrc/synthorg/observability/events/chief_of_staff.pysrc/synthorg/observability/events/persistence.pysrc/synthorg/persistence/conversation_protocol.pysrc/synthorg/persistence/conversational_factory.pysrc/synthorg/persistence/conversational_proposal_protocol.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/persistence/postgres/conversational_proposal_repo.pysrc/synthorg/persistence/postgres/revisions/20260519000001_conversational_intake.sqlsrc/synthorg/persistence/postgres/schema.sqlsrc/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/persistence/sqlite/conversational_proposal_repo.pysrc/synthorg/persistence/sqlite/revisions/20260519000001_conversational_intake.sqlsrc/synthorg/persistence/sqlite/schema.sqltests/conformance/persistence/test_conversation_repository.pytests/conformance/persistence/test_conversational_proposal_repository.pytests/e2e/test_conversational_propose_e2e.pytests/unit/api/controllers/test_conversational_intake_resume.pytests/unit/meta/chief_of_staff/test_config.pytests/unit/meta/chief_of_staff/test_conversational_models.pytests/unit/meta/chief_of_staff/test_propose.pytests/unit/meta/chief_of_staff/test_proposer_wiring.pytests/unit/meta/test_api.pytests/unit/meta/test_chief_of_staff_errors.pyweb/src/api/endpoints/meta.tsweb/src/api/types/dtos.gen.tsweb/src/api/types/enum-values.gen.tsweb/src/api/types/error-codes.gen.tsweb/src/api/types/openapi.gen.tsweb/src/mocks/handlers/meta.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Lighthouse Site
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Dashboard Test
- GitHub Check: Test Integration
- GitHub Check: Test Unit
- GitHub Check: Test E2E
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (16)
web/src/**/*.{js,jsx,ts,tsx,mts}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{js,jsx,ts,tsx,mts}: Always usecreateLoggerfrom@/lib/logger; never bareconsole.warn/console.error/console.debugin application code. Variable name must always belog. Onlylogger.tsitself may use bare console methods. Uselog.debug()(DEV-only, stripped in production),log.warn(),log.error().
Pass dynamic/untrusted values as separate args to logger calls (not interpolated into the message string) so they go throughsanitizeArg
Attacker-controlled fields inside structured objects must be wrapped insanitizeForLog()before embedding in log calls
Error-code constants (MANDATORY): importErrorCodeandErrorCategoryfrom@/api/types/errors(re-exported from the generatedweb/src/api/types/error-codes.gen.ts). Discriminate onErrorCode.<NAME>, never on raw integer literals.
Use@eslint-react/web-api-no-leaked-fetchto detectfetch()in effects withoutAbortControllercleanup
Files:
web/src/api/types/dtos.gen.tsweb/src/api/types/enum-values.gen.tsweb/src/api/types/error-codes.gen.tsweb/src/mocks/handlers/meta.tsweb/src/api/endpoints/meta.tsweb/src/api/types/openapi.gen.ts
web/src/api/types/**/*.gen.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Generated DTO types (MANDATORY): NEVER hand-edit
web/src/api/types/*.gen.ts. Regenerate withuv run python scripts/generate_dto_types_ts.py. Import DTOs via the barrel (import type { AgentConfig } from '@/api/types').
Files:
web/src/api/types/dtos.gen.tsweb/src/api/types/enum-values.gen.tsweb/src/api/types/error-codes.gen.tsweb/src/api/types/openapi.gen.ts
web/src/**/*.{ts,tsx,mts}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx,mts}: Use@typescript-eslint/no-floating-promisesto forbid unawaited promises so async work cannot survive the test that scheduled it and trip the active-handle gate
Use@typescript-eslint/no-misused-promises(withchecksVoidReturn: { attributes: false }) to forbid passing async functions where the callsite ignores the returned promise. React 19asyncevent handlers stay allowed via theattributes: falseexemption.
Files:
web/src/api/types/dtos.gen.tsweb/src/api/types/enum-values.gen.tsweb/src/api/types/error-codes.gen.tsweb/src/mocks/handlers/meta.tsweb/src/api/endpoints/meta.tsweb/src/api/types/openapi.gen.ts
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Only
src/synthorg/persistence/may import sqlite/psycopg or emit raw SQLNo hardcoded numeric values outside
settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constantsNo
from __future__ import annotations(Python 3.14 has PEP 649)Type hints on public functions; mypy strict mode required; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Error classes must follow
<Domain><Condition>Errornaming fromDomainErrorbase class; never inheritException/RuntimeErrordirectlyPydantic v2: use frozen +
extra="forbid"on every frozen model project-wide;@computed_fieldfor derived values;NotBlankStrfor identifiersArgs models required at every system boundary; use
parse_typed()for every external dict ingestionUse
model_copy(update=...)orcopy.deepcopy()for immutability; deepcopy at system boundariesAsync code: use
asyncio.TaskGroupfor fan-out/fan-in; helpers catchExceptionand re-raiseMemoryError/RecursionErrorClock seam: use
clock: Clock | None = Noneparameter; injectFakeClockin tests; lifecycle services own_lifecycle_lockUntrusted content (SEC-1): wrap via
wrap_untrusted()fromengine.prompt_safety; useHTMLParseGuardfor HTMLLogging: import from
synthorg.observability.get_logger; variable alwayslogger; never use bareloggingmodule orprint()in app codeEvent names must come from
observability.events.<domain>constants; use structured kwargs; log state transitions AFTER persistence write with*_STATUS_TRANSITIONEDSecret-log redaction (SEC-1): never use
error=str(exc)or interpolate{exc}; useerror_type=type(exc).__name__+safe_error_description(exc); neverexc_info=True; OTel: forbidrecord_exception(), useset_attribute()withrecord_exception=False, set_status_on_exception=FalseTelemetry: opt-in, off by default; every event property must be in
_ALLOWED_PROPERTIESV...
Files:
src/synthorg/api/state_services_facades_mcp3.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/observability/events/approval_gate.pysrc/synthorg/api/state.pysrc/synthorg/core/error_taxonomy.pysrc/synthorg/observability/events/chief_of_staff.pysrc/synthorg/persistence/conversational_factory.pysrc/synthorg/api/app_builders.pysrc/synthorg/meta/chief_of_staff/prompts.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/persistence/conversational_proposal_protocol.pysrc/synthorg/core/enums.pysrc/synthorg/api/app.pysrc/synthorg/meta/errors.pysrc/synthorg/observability/events/persistence.pysrc/synthorg/persistence/conversation_protocol.pysrc/synthorg/api/state_services_facades.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/persistence/sqlite/conversational_proposal_repo.pysrc/synthorg/meta/chief_of_staff/propose.pysrc/synthorg/persistence/postgres/conversational_proposal_repo.pysrc/synthorg/meta/chief_of_staff/models.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/persistence/sqlite/conversation_repo.py
src/synthorg/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
API startup: two phases—construction (
create_app) wires synchronous services; on_startup wires services needing connected persistence; ordering invariants:agent_registrybeforeauto_wire_meetings,SettingsServicebeforeWorkflowExecutionObserver,OntologyServicewires afterpersistence.connect()WebSocket resilience: per-frame timeout closes silent peers (1008); revalidation saturation closes (4011)
Files:
src/synthorg/api/state_services_facades_mcp3.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/api/state.pysrc/synthorg/api/app_builders.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/api/app.pysrc/synthorg/api/state_services_facades.pysrc/synthorg/api/controllers/_approval_review_gate.py
src/synthorg/{workers,api}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Runtime services: select behind ONE provider-present switch via
build_runtime_services, returningRuntimeServicespair (worker execution service + coordinator); empty-company rejects task creation (4014) and/coordinate503s; useswap_worker_execution_service/swap_coordinator/swap_provider_registrywith lock synchronization
Files:
src/synthorg/api/state_services_facades_mcp3.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/api/state.pysrc/synthorg/api/app_builders.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/api/app.pysrc/synthorg/api/state_services_facades.pysrc/synthorg/api/controllers/_approval_review_gate.py
src/synthorg/{api,meta}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Conversational propose:
POST /meta/chat/proposeis opt-in (meta.chief_of_staff.propose_enabled);ChiefOfStaffProposerbuilt bybuild_chief_of_staff_proposer(ENFORCED manifest entry, EPIC#1955); 503s when ANY of provider, persistence, or work pipeline missing; wrap human content viawrap_untrusted(TAG_TASK_DATA, ...)(SEC-1)
Files:
src/synthorg/api/state_services_facades_mcp3.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/api/state.pysrc/synthorg/api/app_builders.pysrc/synthorg/meta/chief_of_staff/prompts.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/api/app.pysrc/synthorg/meta/errors.pysrc/synthorg/api/state_services_facades.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/meta/chief_of_staff/propose.pysrc/synthorg/meta/chief_of_staff/models.pysrc/synthorg/api/controllers/_approval_review_gate.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/api/state_services_facades_mcp3.pysrc/synthorg/api/rate_limits/policies.pysrc/synthorg/observability/events/approval_gate.pysrc/synthorg/api/state.pysrc/synthorg/core/error_taxonomy.pysrc/synthorg/observability/events/chief_of_staff.pysrc/synthorg/persistence/conversational_factory.pysrc/synthorg/api/app_builders.pysrc/synthorg/meta/chief_of_staff/prompts.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/persistence/conversational_proposal_protocol.pysrc/synthorg/core/enums.pysrc/synthorg/api/app.pysrc/synthorg/meta/errors.pysrc/synthorg/observability/events/persistence.pysrc/synthorg/persistence/conversation_protocol.pysrc/synthorg/api/state_services_facades.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/persistence/sqlite/conversational_proposal_repo.pysrc/synthorg/meta/chief_of_staff/propose.pysrc/synthorg/persistence/postgres/conversational_proposal_repo.pysrc/synthorg/meta/chief_of_staff/models.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/persistence/sqlite/conversation_repo.py
{README.md,docs/**/*.md}
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics in README and public docs must be sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers
Files:
README.mddocs/design/self-improvement.md
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}; async tests useauto; global timeout 30s; coverage 80% minWindows tests: unit tests use
WindowsSelectorEventLoopPolicy; subprocess tests override back to handle 3.14 IOCP teardown raceTest doubles: follow ladder in conventions.md; use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed boundaries,SimpleNamespacefor attribute-bags; bareMagicMockat typed boundaries is blocked bycheck_mock_spec.pyHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...))Flaky tests: NEVER skip/xfail; fix fundamentally; use
asyncio.Event().wait()notsleep(large)
Files:
tests/unit/meta/test_chief_of_staff_errors.pytests/unit/meta/test_api.pytests/unit/meta/chief_of_staff/test_config.pytests/unit/meta/chief_of_staff/test_proposer_wiring.pytests/unit/meta/chief_of_staff/test_conversational_models.pytests/conformance/persistence/test_conversation_repository.pytests/unit/meta/chief_of_staff/test_propose.pytests/unit/api/controllers/test_conversational_intake_resume.pytests/e2e/test_conversational_propose_e2e.pytests/conformance/persistence/test_conversational_proposal_repository.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/meta/test_chief_of_staff_errors.pytests/unit/meta/test_api.pytests/unit/meta/chief_of_staff/test_config.pytests/unit/meta/chief_of_staff/test_proposer_wiring.pytests/unit/meta/chief_of_staff/test_conversational_models.pytests/conformance/persistence/test_conversation_repository.pytests/unit/meta/chief_of_staff/test_propose.pytests/unit/api/controllers/test_conversational_intake_resume.pytests/e2e/test_conversational_propose_e2e.pytests/conformance/persistence/test_conversational_proposal_repository.py
src/synthorg/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
New repository protocols must inherit from one or more generic categories in
persistence/_generics.py(SingletonRepository,IdKeyedRepository,FilteredQueryRepository,AppendOnlyRepository,StatefulRepository,MVCCRepository); bespoke methods permitted only under ADR-0001 D7Repository protocols must stay
@runtime_checkableRepository CRUD: use
save(entity),get(id),delete(id) -> bool,list_items(...),query(...)returning tuplesDatetime in persistence: use
parse_iso_utc/format_iso_utcfrompersistence._shared; reject naive datetimes; usenormalize_utcfor already-typed
Files:
src/synthorg/persistence/conversational_factory.pysrc/synthorg/persistence/conversational_proposal_protocol.pysrc/synthorg/persistence/conversation_protocol.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/persistence/sqlite/conversational_proposal_repo.pysrc/synthorg/persistence/postgres/conversational_proposal_repo.pysrc/synthorg/persistence/sqlite/conversation_repo.py
web/src/mocks/handlers/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/mocks/handlers/**/*.ts: MSW handlers (MANDATORY):web/src/mocks/handlers/must mirrorweb/src/api/endpoints/*.ts1:1 with a default happy-path handler for every exported endpoint. UseonUnhandledRequest: 'error'in test setup; tests override per-case viaserver.use(...), nevervi.mock('@/api/endpoints/*').
Use typed envelope helpers (successFor,paginatedFor,voidSuccess) to keep MSW handlers in lockstep with endpoint return types
Files:
web/src/mocks/handlers/meta.ts
web/src/{api/endpoints,stores}/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Cursor pagination (MANDATORY): list endpoints must use opaque cursor-based paging via
PaginationMeta. Stores must keepnextCursor+hasMorein state (not offset arithmetic) and early-return when!hasMore || !nextCursor. Display counts must come fromdata.length.
Files:
web/src/api/endpoints/meta.ts
web/src/api/endpoints/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Health / readiness endpoints (MANDATORY):
getLiveness()is always 200 while the process is alive;getReadiness()is 200 healthy / 503 unavailable (binary'ok' | 'unavailable'outcome, no tri-state). Any new caller must handle the 503 path explicitly.
Files:
web/src/api/endpoints/meta.ts
docs/**/*.{d2,md}
📄 CodeRabbit inference engine (CLAUDE.md)
Use D2 for architecture/nested containers (theme 200, Dark Mauve, v0.7.1 pinned in CI); use Mermaid for flowcharts/sequence/pipelines; use Markdown tables for tabular data
Files:
docs/design/self-improvement.md
tests/conformance/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Dual-backend conformance:
tests/conformance/persistence/consumesbackendfixture (SQLite + Postgres); enforced bycheck_dual_backend_test_parity.py
Files:
tests/conformance/persistence/test_conversation_repository.pytests/conformance/persistence/test_conversational_proposal_repository.py
🧠 Learnings (9)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: No region/currency/locale privilege; use metric units and British English
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: After implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` before creating PR; after PR: use `/aurelio-review-pr` for external feedback; fix all valid issues without deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: Branches: `<type>/<slug>` naming from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: After every squash merge, run `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-19T23:18:33.420Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
src/synthorg/api/state_services_facades_mcp3.pysrc/synthorg/api/rate_limits/policies.pytests/unit/meta/test_chief_of_staff_errors.pysrc/synthorg/observability/events/approval_gate.pysrc/synthorg/api/state.pysrc/synthorg/core/error_taxonomy.pysrc/synthorg/observability/events/chief_of_staff.pysrc/synthorg/persistence/conversational_factory.pysrc/synthorg/api/app_builders.pytests/unit/meta/test_api.pysrc/synthorg/meta/chief_of_staff/prompts.pysrc/synthorg/api/controllers/meta.pytests/unit/meta/chief_of_staff/test_config.pytests/unit/meta/chief_of_staff/test_proposer_wiring.pysrc/synthorg/persistence/conversational_proposal_protocol.pysrc/synthorg/core/enums.pysrc/synthorg/api/app.pysrc/synthorg/meta/errors.pysrc/synthorg/observability/events/persistence.pysrc/synthorg/persistence/conversation_protocol.pysrc/synthorg/api/state_services_facades.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/meta/chief_of_staff/config.pytests/unit/meta/chief_of_staff/test_conversational_models.pysrc/synthorg/persistence/sqlite/conversational_proposal_repo.pysrc/synthorg/meta/chief_of_staff/propose.pysrc/synthorg/persistence/postgres/conversational_proposal_repo.pytests/conformance/persistence/test_conversation_repository.pysrc/synthorg/meta/chief_of_staff/models.pytests/unit/meta/chief_of_staff/test_propose.pytests/unit/api/controllers/test_conversational_intake_resume.pysrc/synthorg/api/controllers/_approval_review_gate.pytests/e2e/test_conversational_propose_e2e.pytests/conformance/persistence/test_conversational_proposal_repository.pysrc/synthorg/persistence/sqlite/conversation_repo.py
📚 Learning: 2026-05-16T18:36:19.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/guides/contributing.md:95-95
Timestamp: 2026-05-16T18:36:19.195Z
Learning: In the SynthOrg repo, the “Doc Numeric Claims (MANDATORY)” RS-marker rule should be applied only to these docs: README.md; docs/index.md; docs/roadmap/index.md; docs/architecture/decisions.md; docs/reference/convention-gates.md. This rule is enforced by scripts/check_doc_numeric_macros.py (with runtime substitution by scripts/inject_runtime_stats.py), so reviewers should not flag similar numeric-claim issues in other paths (e.g., anything under docs/guides/). When checking those scoped files, the rule skips fenced code blocks and only flags digits that are adjacent to stat nouns (tests/providers/agents/stars/releases). Numeric CLI flags like “--num-workers=4” inside fenced bash code blocks are not subject to this rule.
Applied to files:
README.md
📚 Learning: 2026-05-16T18:36:31.446Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/reference/conventions.md:787-789
Timestamp: 2026-05-16T18:36:31.446Z
Learning: In Aureliolo/synthorg, follow the `Doc Numeric Claims (MANDATORY)` rule enforced by `scripts/check_doc_numeric_macros.py` only for these markdown files: `README.md`, `docs/index.md`, `docs/roadmap/index.md`, `docs/architecture/decisions.md`, and `docs/reference/convention-gates.md`. The gate flags digits that appear adjacent to the stat nouns `tests`, `providers`, `agents`, `stars`, and `releases`—those numeric claims must use the required `<!--RS:...-->` macro format. Do not apply this rule to prose that mentions Python version numbers (e.g., “Python 3.14” / “Python 3.15”); those should not be flagged as requiring `<!--RS:...-->`.
Applied to files:
README.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: In the synthorg repo, the “Doc Numeric Claims (MANDATORY)” RS-marker rule is enforced only for this exact set of Markdown files: README.md, docs/index.md, docs/roadmap/index.md, docs/architecture/decisions.md, and docs/reference/convention-gates.md. During code reviews, do not raise RS-marker/numeric-claims findings for numeric values in any other files (e.g., docs/getting_started.md, docs/guides/*, docs/reference/conventions.md), since they are not checked or injected by scripts/check_doc_numeric_macros.py or scripts/inject_runtime_stats.py.
Applied to files:
README.md
📚 Learning: 2026-05-16T18:36:31.446Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/reference/conventions.md:787-789
Timestamp: 2026-05-16T18:36:31.446Z
Learning: In Aureliolo/synthorg, do not require adding `<!--RS:...-->` “Doc Numeric Claims (MANDATORY)” numeric macros for Python version numbers mentioned in documentation prose (e.g., “Python 3.14”, “Python 3.15”). The `scripts/check_doc_numeric_macros.py` gate only applies to `README.md`, `docs/index.md`, `docs/roadmap/index.md`, `docs/architecture/decisions.md`, and `docs/reference/convention-gates.md`, and it only flags digits adjacent to specific stat nouns (tests/providers/agents/stars/releases), not language version mentions like “Python 3.14”.
Applied to files:
README.mdCLAUDE.mddocs/design/self-improvement.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing Markdown in the synthorg repo, account for the CI gate `check_doc_numeric_macros.py`: it skips fenced code blocks entirely, and it only flags digits that are adjacent to these stat nouns: `tests`, `providers`, `agents`, `stars`, `releases`. Therefore, numeric examples such as CLI flag values (e.g., `--num-workers=4` in fenced bash blocks) and prose version numbers (e.g., `3.14`/`3.15`) are not expected to trigger this check; prioritize changes only when digits appear next to one of the listed nouns (e.g., “5 tests”, “10 stars”, etc.).
Applied to files:
README.mdCLAUDE.mddocs/design/self-improvement.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing markdown files for the "Doc Numeric Claims (MANDATORY)" RS-marker rule, only require/flag missing RS markers in the files that are actually in-scope for the rule. The scope is enforced via an identical _SCOPED_FILES allowlist in scripts/check_doc_numeric_macros.py and scripts/inject_runtime_stats.py, and currently includes: README.md; docs/index.md; docs/roadmap/index.md; docs/architecture/decisions.md; docs/reference/convention-gates.md. For any other markdown files (e.g., docs/getting_started.md, docs/guides/*), missing RS markers for numeric claims are no-ops and should NOT be flagged.
Applied to files:
README.mddocs/design/self-improvement.md
📚 Learning: 2026-05-16T18:36:35.250Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1944
File: docs/getting_started.md:109-109
Timestamp: 2026-05-16T18:36:35.250Z
Learning: When reviewing Markdown in the synthorg repo against the `check_doc_numeric_macros.py` gate, account for its documented behavior: it skips fenced code blocks entirely, and it only flags digits that are adjacent to specific stat nouns (`tests`, `providers`, `agents`, `stars`, `releases`). As a result, CLI-style numbers (e.g., `--num-workers=4`) inside fenced bash code blocks should never be treated as violations of this gate; only non-fenced text needs checking, and only around those specific nouns.
Applied to files:
docs/design/self-improvement.md
🔇 Additional comments (82)
tests/e2e/test_conversational_propose_e2e.py (1)
89-530: LGTM!CLAUDE.md (1)
90-90: LGTM!README.md (1)
43-43: LGTM!docs/design/self-improvement.md (1)
117-122: LGTM!Also applies to: 221-223, 235-243, 280-289
scripts/_ghost_wiring_manifest.txt (1)
45-45: LGTM!tests/unit/meta/chief_of_staff/test_conversational_models.py (1)
1-223: LGTM!tests/unit/meta/chief_of_staff/test_propose.py (1)
1-466: LGTM!tests/unit/meta/chief_of_staff/test_proposer_wiring.py (1)
1-114: LGTM!tests/unit/meta/chief_of_staff/test_config.py (1)
54-85: LGTM!tests/unit/meta/test_api.py (1)
4-12: LGTM!Also applies to: 37-69
tests/unit/meta/test_chief_of_staff_errors.py (1)
1-62: LGTM!src/synthorg/persistence/sqlite/schema.sql (2)
1075-1078: Same backend-compatibility concern as the SQLite migration file.
1127-1181: LGTM!src/synthorg/persistence/conversation_protocol.py (1)
35-177: LGTM!src/synthorg/persistence/conversational_factory.py (1)
52-120: LGTM!src/synthorg/persistence/conversational_proposal_protocol.py (1)
29-147: LGTM!src/synthorg/persistence/postgres/revisions/20260519000001_conversational_intake.sql (1)
16-56: LGTM!src/synthorg/persistence/postgres/schema.sql (1)
1153-1153: LGTM!Also applies to: 1198-1238
src/synthorg/persistence/sqlite/revisions/20260519000001_conversational_intake.sql (1)
32-74: LGTM!scripts/schema_drift_baseline.txt (1)
50-53: LGTM!src/synthorg/persistence/sqlite/conversation_repo.py (8)
1-65: LGTM!
67-95: LGTM!
97-145: LGTM!
148-217: LGTM!
219-286: LGTM!
288-354: LGTM!
356-384: LGTM!
386-546: LGTM!src/synthorg/persistence/sqlite/conversational_proposal_repo.py (4)
1-121: LGTM!
123-210: LGTM!
212-319: LGTM!
321-404: LGTM!src/synthorg/persistence/postgres/conversation_repo.py (4)
1-107: LGTM!
109-323: LGTM!
325-463: LGTM!
465-468: LGTM!src/synthorg/persistence/postgres/conversational_proposal_repo.py (3)
1-95: LGTM!
97-377: LGTM!
379-380: LGTM!tests/conformance/persistence/test_conversation_repository.py (3)
1-106: LGTM!
108-224: LGTM!
226-302: LGTM!tests/conformance/persistence/test_conversational_proposal_repository.py (2)
1-70: LGTM!
72-218: LGTM!src/synthorg/core/enums.py (1)
597-607: LGTM!Also applies to: 609-648
src/synthorg/core/error_taxonomy.py (1)
93-93: LGTM!Also applies to: 111-111, 136-136
src/synthorg/meta/chief_of_staff/config.py (1)
12-12: LGTM!Also applies to: 44-55, 83-89, 94-95
src/synthorg/meta/chief_of_staff/models.py (1)
21-28: LGTM!Also applies to: 233-445
src/synthorg/meta/errors.py (1)
62-145: LGTM!src/synthorg/observability/events/approval_gate.py (1)
39-52: LGTM!src/synthorg/observability/events/chief_of_staff.py (1)
46-56: LGTM!src/synthorg/observability/events/persistence.py (1)
612-644: LGTM!src/synthorg/meta/chief_of_staff/prompts.py (1)
222-273: LGTM!src/synthorg/meta/chief_of_staff/propose.py (12)
1-91: LGTM!
94-103: LGTM!
105-143: LGTM!
145-206: LGTM!
208-227: LGTM!
298-328: LGTM!
330-397: LGTM!
399-421: LGTM!
423-448: LGTM!
450-495: LGTM!
497-533: LGTM!
536-537: LGTM!src/synthorg/api/controllers/_approval_review_gate.py (2)
31-40: LGTM!
502-510: LGTM!tests/unit/api/controllers/test_conversational_intake_resume.py (3)
1-26: LGTM!
43-116: LGTM!
119-166: LGTM!src/synthorg/api/rate_limits/policies.py (1)
128-128: LGTM!src/synthorg/api/app_builders.py (1)
37-49: LGTM!Also applies to: 197-257
src/synthorg/api/state.py (1)
204-204: LGTM!Also applies to: 211-211
src/synthorg/api/state_services_facades.py (1)
119-119: LGTM!Also applies to: 177-178
src/synthorg/api/state_services_facades_mcp3.py (1)
34-42: LGTM!Also applies to: 84-85, 234-283
web/src/api/endpoints/meta.ts (1)
8-8: LGTM!Also applies to: 73-87, 120-147
web/src/api/types/dtos.gen.ts (1)
171-171: LGTM!web/src/api/types/enum-values.gen.ts (1)
43-43: LGTM!web/src/api/types/error-codes.gen.ts (1)
43-43: LGTM!Also applies to: 59-59, 78-78
web/src/api/types/openapi.gen.ts (2)
2120-2136: LGTM!Also applies to: 5809-5817, 6687-6692
7572-7572: LGTM!web/src/mocks/handlers/meta.ts (1)
8-8: LGTM!Also applies to: 36-72
| -- SQLite-only design note: this revision deliberately does NOT widen | ||
| -- the ``approvals.source`` CHECK to admit ``conversational_intake``. | ||
| -- The only mechanisms available to SQLite are (a) a full ``approvals`` | ||
| -- table rebuild + 9 index recreations and (b) a | ||
| -- ``PRAGMA writable_schema`` patch + reload; both forced multiple | ||
| -- migration-fixture-timing tests across the unit suite over their | ||
| -- 8s budget on Windows (the conformance / multi-tenancy arms migrate | ||
| -- from scratch per test, and the pre-push isolation gate runs them | ||
| -- twice each). The Postgres sibling DOES widen the constraint, which | ||
| -- is the production target. | ||
| -- | ||
| -- Practical impact on SQLite: the ``approvals`` table retains the | ||
| -- narrow CHECK ``source IN ('parked_context', 'review_gate')``. The | ||
| -- v1 conversational interface keeps the ``ApprovalStore`` in-memory | ||
| -- (no persistent ``ApprovalRepository`` is wired by default), so no | ||
| -- ``CONVERSATIONAL_INTAKE`` row ever reaches the SQLite ``approvals`` | ||
| -- table. Deployments that wire a persistent SQLite ``ApprovalStore`` | ||
| -- AND enable the conversational interface should switch to Postgres | ||
| -- (the codebase's production backend) or follow up with a rebuild. |
There was a problem hiding this comment.
SQLite migration keeps approvals.source incompatible with conversational-intake writes.
Line 5 intentionally skips widening the SQLite approvals.source CHECK, so any deployment persisting approvals on SQLite can reject rows using source='conversational_intake'. This creates backend-dependent runtime behaviour for the same feature path.
Please either rebuild SQLite approvals to include conversational_intake, or hard-block that configuration combination at startup with an explicit unsupported-mode failure.
| class TestConversationalIntakeResume: | ||
| async def test_non_conversational_source_is_inert(self) -> None: | ||
| state, _ = await _seed(source=ApprovalSource.REVIEW_GATE) | ||
| handled = await try_conversational_intake_resume( | ||
| state, # type: ignore[arg-type] | ||
| "a1", | ||
| approved=True, | ||
| ) | ||
| assert handled is False | ||
|
|
||
| async def test_approve_runs_pipeline_and_marks_executed(self) -> None: | ||
| pipeline = _FakePipeline() | ||
| state, repo = await _seed(pipeline=pipeline) | ||
| handled = await try_conversational_intake_resume( | ||
| state, # type: ignore[arg-type] | ||
| "a1", | ||
| approved=True, | ||
| ) | ||
| assert handled is True | ||
| assert len(pipeline.calls) == 1 | ||
| assert pipeline.calls[0].source is WorkSource.CONVERSATIONAL | ||
| assert repo.items["prop-a1"].status is ConversationalProposalStatus.EXECUTED | ||
|
|
||
| async def test_reject_skips_pipeline_and_marks_rejected(self) -> None: | ||
| pipeline = _FakePipeline() | ||
| state, repo = await _seed(pipeline=pipeline) | ||
| handled = await try_conversational_intake_resume( | ||
| state, # type: ignore[arg-type] | ||
| "a1", | ||
| approved=False, | ||
| ) | ||
| assert handled is True | ||
| assert pipeline.calls == [] | ||
| assert repo.items["prop-a1"].status is ConversationalProposalStatus.REJECTED | ||
|
|
||
| async def test_missing_proposal_is_owned_but_noop(self) -> None: | ||
| state, _ = await _seed(with_proposal=False) | ||
| handled = await try_conversational_intake_resume( | ||
| state, # type: ignore[arg-type] | ||
| "a1", | ||
| approved=True, | ||
| ) | ||
| assert handled is True | ||
|
|
||
| async def test_pipeline_unavailable_raises(self) -> None: | ||
| store = ApprovalStore() | ||
| await store.add(_approval("a1")) | ||
| repo = _FakeProposalRepo() | ||
| prop = _proposal("a1") | ||
| repo.items[prop.id] = prop | ||
| state = _FakeAppState( | ||
| approval_store=store, | ||
| proposal_repo=repo, | ||
| pipeline=None, | ||
| ) | ||
| with pytest.raises(ServiceUnavailableError): | ||
| await try_conversational_intake_resume( | ||
| state, # type: ignore[arg-type] | ||
| "a1", | ||
| approved=True, | ||
| ) | ||
|
|
||
| async def test_pipeline_failure_leaves_proposal_pending(self) -> None: | ||
| pipeline = _FakePipeline(error=RuntimeError("boom")) | ||
| state, repo = await _seed(pipeline=pipeline) | ||
| handled = await try_conversational_intake_resume( | ||
| state, # type: ignore[arg-type] | ||
| "a1", | ||
| approved=True, | ||
| ) | ||
| assert handled is True | ||
| assert repo.items["prop-a1"].status is ConversationalProposalStatus.PENDING |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Consider adding test coverage for missing proposal repository.
The implementation has an early-return path when has_conversational_proposal_repo is False (logs error and returns True). This edge case isn't currently covered by the tests.
🧪 Proposed additional test case
async def test_missing_proposal_repo_returns_true(self) -> None:
store = ApprovalStore()
await store.add(_approval("a1"))
state = _FakeAppState(
approval_store=store,
proposal_repo=None, # repo not wired
pipeline=_FakePipeline(),
)
handled = await try_conversational_intake_resume(
state, # type: ignore[arg-type]
"a1",
approved=True,
)
assert handled is True🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/unit/api/controllers/test_conversational_intake_resume.py` around lines
168 - 239, Add a unit test that covers the early-return when the proposal
repository is missing: create an ApprovalStore with _approval("a1"), add it,
build a _FakeAppState with proposal_repo=None, pipeline=_FakePipeline(), then
call try_conversational_intake_resume(state, "a1", approved=True) and assert the
return is True; reference symbols to locate code:
try_conversational_intake_resume, _FakeAppState, ApprovalStore, _FakePipeline,
and _approval/_proposal helpers.
There was a problem hiding this comment.
Code Review
This pull request introduces the "Conversational Propose" feature for the Chief of Staff agent, enabling a natural language interface to clarify underspecified requests and park concrete work items in the human approval queue. The implementation includes new API endpoints, Pydantic models, and database repositories for SQLite and Postgres. Feedback identifies a potential race condition in the approval dispatch flow that could lead to duplicate work execution and suggests improving error handling consistency when repositories are missing.
| await app_state.work_pipeline.run(work_item) | ||
| except MemoryError, RecursionError: | ||
| raise | ||
| except ServiceUnavailableError: | ||
| raise | ||
| except Exception as exc: | ||
| # The approve decision is already persisted; a pipeline failure | ||
| # must not 5xx the response. Leave the proposal PENDING (not | ||
| # EXECUTED) so the failure is visible and not silently dropped. | ||
| logger.error( | ||
| APPROVAL_GATE_CONVERSATIONAL_FAILED, | ||
| approval_id=approval_id, | ||
| proposal_id=proposal.id, | ||
| error_type=type(exc).__name__, | ||
| error=safe_error_description(exc), | ||
| note="pipeline run failed; proposal left pending", | ||
| ) | ||
| return True | ||
|
|
||
| transitioned = await repo.transition_if( | ||
| proposal.id, | ||
| ConversationalProposalStatus.PENDING, | ||
| ConversationalProposalStatus.EXECUTED, | ||
| ) |
There was a problem hiding this comment.
There is a potential race condition here that could lead to duplicate work execution. The work_pipeline.run(work_item) call is awaited before the proposal status is transitioned to EXECUTED. If two concurrent approval decisions are processed for the same approval_id, both could reread the proposal as PENDING, both could successfully run the pipeline, and then both would attempt the CAS transition. While only one transition will succeed, the side-effectful work will have run twice. Consider performing a transition to an intermediate state like EXECUTING before running the pipeline, or ensuring the pipeline run itself is idempotent for a given proposal_id.
| if not app_state.has_conversational_proposal_repo: | ||
| logger.error( | ||
| APPROVAL_GATE_CONVERSATIONAL_FAILED, | ||
| approval_id=approval_id, | ||
| note="conversational proposal repo not wired", | ||
| ) | ||
| return True |
There was a problem hiding this comment.
Inconsistency in error handling: when the conversational_proposal_repo is missing, the function logs an error and returns True (swallowing the decision), but when the work_pipeline is missing (line 277), it raises a ServiceUnavailableError. For a hard misconfiguration like a missing repository, raising an error is generally preferred so the caller is aware that the approval decision could not be fully processed.
if not app_state.has_conversational_proposal_repo:
logger.error(
APPROVAL_GATE_CONVERSATIONAL_FAILED,
approval_id=approval_id,
note="conversational proposal repo not wired",
)
msg = "Conversational proposal repository unavailable"
raise ServiceUnavailableError(msg)CI (Test Conformance SQLite + Test Integration, 18 fails, single root cause): - Seed parent conversation row in every proposal-bearing conformance test so the fk_cp_conversation FK is satisfied before save(). CodeRabbit + Gemini inline feedback: - app.py: wire conversational_proposal_repo independently of proposer availability so a queued conversational approval can still route to the repo when the proposer is absent this boot. - _approval_review_gate.py: race-window fix; PENDING -> EXECUTING CAS acquired before work_pipeline.run, EXECUTING -> EXECUTED on success, EXECUTING -> PENDING revert on pipeline failure. Only the CAS winner ever drives the pipeline for a given proposal. - _approval_review_gate.py: missing proposal repo now raises ServiceUnavailableError (Gemini-b) instead of swallowing the approval with log+return-True. - meta.py: wrap chat_propose user message via wrap_untrusted(TAG_TASK_DATA, ...) at the API boundary (SEC-1). - meta.py: change chat_propose return type from ApiResponse[dict[str, Any]] to ApiResponse[ProposeResult]; regenerate dtos.gen.ts + openapi.gen.ts; switch web meta.ts endpoint to the generated DTO. - chief_of_staff/config.py: hoist propose-* Field literals to module-level annotated named constants. - chief_of_staff/models.py: add model_validator to ProposeResult enforcing the needs_clarification vs proposed branch invariants. - chief_of_staff/propose.py: re-raise MemoryError/RecursionError before the generic Exception handler in _run_decision. - chief_of_staff/propose.py: extract limit=1000 to module-level _MAX_TURNS_QUERY_LIMIT. - SQLite migration trap: hard-block propose_enabled + SQLite + persistent ApprovalRepository combination at startup so the narrow approvals.source CHECK cannot strand conversational_intake writes. EXECUTING state plumbing (CAS dance prerequisite): - ConversationalProposalStatus gets EXECUTING. - conversational_proposals.status CHECK widened to include 'executing' on both SQLite + Postgres migrations and schemas. Tests: - New test_missing_proposal_repo_raises asserts ServiceUnavailableError on missing repo (supersedes CR-9's recommendation, which CR matched to the old return-True behavior). - New test_concurrent_acquire_only_one_runs_pipeline asserts the EXECUTING CAS makes the loser inert. - Renamed test_pipeline_failure_leaves_proposal_pending -> test_pipeline_failure_reverts_executing_to_pending to match the new revert path.
There was a problem hiding this comment.
Actionable comments posted: 7
♻️ Duplicate comments (1)
src/synthorg/api/app.py (1)
1175-1230:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDo not return before wiring the proposal repository.
Line 1177 exits the hook before Lines 1222-1230 run, so a boot without
provider_registrynever attachesconversational_proposal_repo. That breaks approve/reject handling for already-parked conversational approvals, which is the exact fallback this hook says it should preserve.Suggested fix
if app_state.has_chief_of_staff_proposer: return - if provider_registry is None: - return from synthorg.meta.config import ( # noqa: PLC0415 load_self_improvement_config, ) from synthorg.persistence.conversational_factory import ( # noqa: PLC0415 build_conversational_repositories, ) + repositories = build_conversational_repositories(persistence) + if repositories is not None: + app_state.set_conversational_proposal_repo(repositories.proposal_repo) + if provider_registry is None: + return + meta_self_improvement = await load_self_improvement_config( app_state.settings_service if app_state.has_settings_service else None, ) @@ - repositories = build_conversational_repositories(persistence) - # Wire the proposal repo independently of the proposer: a - # conversational-intake approval that exists from a previous - # boot must still route through the repo even if the proposer - # itself is currently unavailable (e.g. provider absent this - # boot), or its decision would silently fail in - # ``try_conversational_intake_resume``. - if repositories is not None: - app_state.set_conversational_proposal_repo(repositories.proposal_repo) proposer = build_chief_of_staff_proposer( meta_self_improvement.chief_of_staff, provider_registry=provider_registry,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/synthorg/api/app.py` around lines 1175 - 1230, The early return when provider_registry is None prevents wiring the conversational proposal repo; remove or relocate the "if provider_registry is None: return" so that after checking has_chief_of_staff_proposer you still call build_conversational_repositories(persistence) and, if repositories is not None, call app_state.set_conversational_proposal_repo(repositories.proposal_repo); keep any provider_registry-dependent logic (e.g. proposer wiring) conditional but ensure the repository wiring always runs so already-parked conversational approvals remain handled.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/api/app.py`:
- Around line 1189-1201: Replace the migration-history framed comment with a
concise, invariant-focused explanation of the runtime guard: state that the code
hard-blocks the combination of SQLite + persistent ApprovalRepository when
propose_enabled is true because SQLite's current approvals.source CHECK
restricts allowed sources (e.g. only parked_context/review_gate), which would
cause a persistent ApprovalRepository to reject conversational_intake rows at
write time; note that Postgres does not have this restriction and is the
supported production path or else keep ApprovalStore in-memory on SQLite, and
that failing at startup is preferred to failing later on the first
/meta/chat/propose call.
In `@src/synthorg/api/approval_store.py`:
- Around line 123-131: Rewrite the property docstring to state current
behavior/rationale only (no migration history or reviewer/migration framing):
remove the historical/migration text about the "v1 SQLite migration", "Windows
8s-budget regression", and "workaround" and replace with a concise WHY
explanation that the persistent SQLite ApprovalRepository keeps approvals.source
narrow (allowed values: parked_context/review_gate) which prevents
conversational_intake rows from being accepted at write time, so wiring the
conversational proposer should be refused and Postgres is the sanctioned
production backend; reference the relevant symbols ApprovalRepository,
approvals.source, conversational_intake, conversational interface, proposer, and
Postgres in the docstring to make the rationale clear.
In `@src/synthorg/api/controllers/_approval_review_gate.py`:
- Around line 183-366: The function try_conversational_intake_resume is doing
too many responsibilities; split it into small branch-specific helpers: (1) a
loader/helper that calls _reread_approval_item and validates
ApprovalSource.CONVERSATIONAL_INTAKE and returns the proposal (use
ConversationalProposalFilterSpec and repo.query to locate the proposal), (2) a
reject handler that performs repo.transition_if(..., PENDING, REJECTED) and logs
the APPROVAL_GATE_CONVERSATIONAL_REJECTED/FAILED paths, (3) an
execute-acquire-and-run helper that performs repo.transition_if(..., PENDING,
EXECUTING), calls WorkItem.model_validate_json and app_state.work_pipeline.run,
and handles revert-on-exception via repo.transition_if(..., EXECUTING, PENDING)
with the same logging, and (4) a finalize helper that does the
EXECUTING->EXECUTED CAS and logs APPROVAL_GATE_CONVERSATIONAL_EXECUTED or the
CAS-failure warning; then make try_conversational_intake_resume orchestrate
these helpers and only keep high-level flow, preserving exact calls to
_reread_approval_item, repo.transition_if, WorkItem.model_validate_json,
app_state.work_pipeline.run and the same ServiceUnavailableError raises and log
messages.
- Around line 219-221: The current check conflates a transient read (item is
None) with a legitimate non-conversational approval and returns False, allowing
signal_resume_intent() to finish and leave a conversational approval stranded;
change the logic in the approval routing code that calls _reread_approval_item
so that item == None is treated as a transient/error case (do NOT return False)
— instead raise a transient/read error (or return a distinct sentinel that
causes the caller to retry) and only return False when item is present and
item.source != ApprovalSource.CONVERSATIONAL_INTAKE; reference
_reread_approval_item, ApprovalSource.CONVERSATIONAL_INTAKE and
signal_resume_intent so reviewers can find and update the conditional
accordingly.
In `@src/synthorg/meta/chief_of_staff/config.py`:
- Around line 16-25: The comments in src/synthorg/meta/chief_of_staff/config.py
reference real vendor/model names (e.g., "Claude/LiteLLM provider" and
"Claude/GPT-4o class models"); update those comments to use project placeholders
per guidelines: replace vendor mentions with "example-provider" (or
"test-provider" where appropriate) and model-class mentions with
"example-large-001" (or example-{large,medium,small}-001 as suitable) so the
comment near _PROPOSE_TEMPERATURE_DEFAULT and the token-size remark no longer
contains real vendor/model names.
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 177-189: The current sequence allocation in propose.converse (it
reads prior_turns via _ordered_turns and sets sequence=len(prior_turns) before
calling _turn_repo.append) has a TOCTOU race causing uq_ct_conversation_sequence
collisions; fix by moving sequence assignment into the repository write path so
the DB computes the next sequence atomically (e.g., INSERT ... SELECT
COALESCE(MAX(sequence), -1)+1 or use a DB sequence) in the implementation of
_turn_repo.append, or alternatively catch the DB unique-constraint error for
uq_ct_conversation_sequence inside _turn_repo.append (or the converse caller)
and retry the append flow deterministically (re-fetch latest sequence and
re-attempt a limited number of times), ensuring ConversationTurn sequence is
assigned only once the atomic write succeeds.
- Around line 227-232: After fetching existing via
self._conversation_repo.get(args.conversation_id) add an authorization check
that the conversation owner matches the caller (e.g., compare
existing.created_by to args.user_id or args.created_by); if they differ, reject
reuse by raising an error (preferably
ConversationNotFoundError(conversation_id=args.conversation_id) to avoid leaking
existence, or a ConversationAccessDenied/Authorization error if that type
exists). Place this check right after retrieving existing and before the status
check (i.e., between the get() call and the ConversationStatus.CLOSED check) so
propose.py enforces that only the creator can resume a conversation.
---
Duplicate comments:
In `@src/synthorg/api/app.py`:
- Around line 1175-1230: The early return when provider_registry is None
prevents wiring the conversational proposal repo; remove or relocate the "if
provider_registry is None: return" so that after checking
has_chief_of_staff_proposer you still call
build_conversational_repositories(persistence) and, if repositories is not None,
call app_state.set_conversational_proposal_repo(repositories.proposal_repo);
keep any provider_registry-dependent logic (e.g. proposer wiring) conditional
but ensure the repository wiring always runs so already-parked conversational
approvals remain handled.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8cecb2ee-83f1-402c-9332-f0c0bef5e02a
📒 Files selected for processing (17)
src/synthorg/api/app.pysrc/synthorg/api/approval_store.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/core/enums.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/meta/chief_of_staff/models.pysrc/synthorg/meta/chief_of_staff/propose.pysrc/synthorg/persistence/postgres/revisions/20260519000001_conversational_intake.sqlsrc/synthorg/persistence/postgres/schema.sqlsrc/synthorg/persistence/sqlite/revisions/20260519000001_conversational_intake.sqlsrc/synthorg/persistence/sqlite/schema.sqltests/conformance/persistence/test_conversational_proposal_repository.pytests/unit/api/controllers/test_conversational_intake_resume.pyweb/src/api/endpoints/meta.tsweb/src/api/types/dtos.gen.tsweb/src/api/types/openapi.gen.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Deploy Preview
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Dashboard Test
- GitHub Check: Test Integration
- GitHub Check: Test Unit
- GitHub Check: Test E2E
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (9)
web/src/**/*.{js,jsx,ts,tsx,mts}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{js,jsx,ts,tsx,mts}: Always usecreateLoggerfrom@/lib/logger; never bareconsole.warn/console.error/console.debugin application code. Variable name must always belog. Onlylogger.tsitself may use bare console methods. Uselog.debug()(DEV-only, stripped in production),log.warn(),log.error().
Pass dynamic/untrusted values as separate args to logger calls (not interpolated into the message string) so they go throughsanitizeArg
Attacker-controlled fields inside structured objects must be wrapped insanitizeForLog()before embedding in log calls
Error-code constants (MANDATORY): importErrorCodeandErrorCategoryfrom@/api/types/errors(re-exported from the generatedweb/src/api/types/error-codes.gen.ts). Discriminate onErrorCode.<NAME>, never on raw integer literals.
Use@eslint-react/web-api-no-leaked-fetchto detectfetch()in effects withoutAbortControllercleanup
Files:
web/src/api/types/dtos.gen.tsweb/src/api/endpoints/meta.tsweb/src/api/types/openapi.gen.ts
web/src/api/types/**/*.gen.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Generated DTO types (MANDATORY): NEVER hand-edit
web/src/api/types/*.gen.ts. Regenerate withuv run python scripts/generate_dto_types_ts.py. Import DTOs via the barrel (import type { AgentConfig } from '@/api/types').
Files:
web/src/api/types/dtos.gen.tsweb/src/api/types/openapi.gen.ts
web/src/**/*.{ts,tsx,mts}
📄 CodeRabbit inference engine (web/CLAUDE.md)
web/src/**/*.{ts,tsx,mts}: Use@typescript-eslint/no-floating-promisesto forbid unawaited promises so async work cannot survive the test that scheduled it and trip the active-handle gate
Use@typescript-eslint/no-misused-promises(withchecksVoidReturn: { attributes: false }) to forbid passing async functions where the callsite ignores the returned promise. React 19asyncevent handlers stay allowed via theattributes: falseexemption.
Files:
web/src/api/types/dtos.gen.tsweb/src/api/endpoints/meta.tsweb/src/api/types/openapi.gen.ts
web/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Web: see
web/CLAUDE.mdfor design system and component guidelines
Files:
web/src/api/types/dtos.gen.tsweb/src/api/endpoints/meta.tsweb/src/api/types/openapi.gen.ts
web/src/{api/endpoints,stores}/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Cursor pagination (MANDATORY): list endpoints must use opaque cursor-based paging via
PaginationMeta. Stores must keepnextCursor+hasMorein state (not offset arithmetic) and early-return when!hasMore || !nextCursor. Display counts must come fromdata.length.
Files:
web/src/api/endpoints/meta.ts
web/src/api/endpoints/**/*.ts
📄 CodeRabbit inference engine (web/CLAUDE.md)
Health / readiness endpoints (MANDATORY):
getLiveness()is always 200 while the process is alive;getReadiness()is 200 healthy / 503 unavailable (binary'ok' | 'unavailable'outcome, no tri-state). Any new caller must handle the 503 path explicitly.
Files:
web/src/api/endpoints/meta.ts
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics must live in
settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal); enforced byscripts/check_no_magic_numbers.pyNo
from __future__ import annotations(3.14 has PEP 649); PEP 758 except:except A, B:no parens unless bindingType hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Errors must follow
<Domain><Condition>Errornaming fromDomainErrorbase; never inheritException/RuntimeErrordirectly; enforced bycheck_domain_error_hierarchy.pyPydantic v2: frozen +
extra="forbid"on every frozen model;@computed_fieldauto-exempt; per-line# lint-allow: frozen-extra-forbid -- <reason>for boundaries; enforced bycheck_frozen_model_extra_forbid.pyArgs models at every system boundary;
parse_typed()for every external dict ingestion; enforced bycheck_boundary_typed.pyImmutability: use
model_copy(update=...)orcopy.deepcopy(); deepcopy at system boundariesAsync: use
asyncio.TaskGroupfor fan-out/fan-in; helpers catchException(re-raiseMemoryError/RecursionError)Clock seam:
clock: Clock | None = None; tests injectFakeClock; services own_lifecycle_lock; timed-out stops mark unrestartableUntrusted content (SEC-1): wrap via
wrap_untrusted()fromengine.prompt_safety; useHTMLParseGuardfor HTMLUse
from synthorg.observability import get_logger; variable name alwayslogger; neverimport loggingorprint()in app codeEvent names from
observability.events.<domain>constants; use structured kwargs (logger.info(EVENT, key=value))Error paths log WARNING/ERROR with context before raising; state transitions log INFO via
*_STATUS_TRANSITIONEDafter persistence writeSecret-log redaction (SEC-1): never use
error=str(exc)or interpolate{exc}; useerror_type=type(exc).__name__+ `error=safe...
Files:
src/synthorg/api/approval_store.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/core/enums.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/api/app.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/meta/chief_of_staff/models.pysrc/synthorg/meta/chief_of_staff/propose.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/api/approval_store.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/core/enums.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/api/app.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/meta/chief_of_staff/models.pysrc/synthorg/meta/chief_of_staff/propose.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}; asyncauto; timeout 30s global; coverage 80% minWindows: unit tests use
WindowsSelectorEventLoopPolicy; subprocess tests override backTest doubles: use
FakeClockfor Clock seam,mock_of[T](**overrides)for typed-boundary substitutions,SimpleNamespacefor attribute-bags; bareMagicMockat typed boundaries is blocked byscripts/check_mock_spec.pyHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)); never skip/xfail flaky tests; fix fundamentally
Files:
tests/conformance/persistence/test_conversational_proposal_repository.pytests/unit/api/controllers/test_conversational_intake_resume.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/conformance/persistence/test_conversational_proposal_repository.pytests/unit/api/controllers/test_conversational_intake_resume.py
tests/conformance/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Dual-backend conformance:
tests/conformance/persistence/consumesbackendfixture (SQLite + Postgres); enforced bycheck_dual_backend_test_parity.py
Files:
tests/conformance/persistence/test_conversational_proposal_repository.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Read `docs/design/` page before implementing; deviations need approval (see DESIGN_SPEC.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: No region/currency/locale privileged; use metric units; British English (see docs/reference/regional-defaults.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Every convention PR must ship its enforcement gate (see docs/reference/convention-gates.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: After issue implementation: branch + commit + push (no auto-PR); use `/pre-pr-review` then `/aurelio-review-pr` for feedback; fix everything valid before merging
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: xdist: `-n 8 --dist=loadfile` auto-applied via pyproject `addopts`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Pre-commit/pre-push hooks: use `.pre-commit-config.yaml`; tool-call gates in `.claude/settings.json` PreToolUse
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Squash merge; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: Use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:13:21.876Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
src/synthorg/api/approval_store.pysrc/synthorg/api/controllers/meta.pysrc/synthorg/core/enums.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/api/app.pytests/conformance/persistence/test_conversational_proposal_repository.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/meta/chief_of_staff/models.pysrc/synthorg/meta/chief_of_staff/propose.pytests/unit/api/controllers/test_conversational_intake_resume.py
🔇 Additional comments (8)
web/src/api/types/dtos.gen.ts (1)
78-78: LGTM!Also applies to: 172-172, 355-356
web/src/api/endpoints/meta.ts (1)
8-12: LGTM!Also applies to: 16-20, 114-141
src/synthorg/api/controllers/meta.py (1)
45-52: LGTM!Also applies to: 406-479
src/synthorg/persistence/postgres/schema.sql (1)
1153-1153: LGTM!Also applies to: 1198-1238
src/synthorg/persistence/sqlite/schema.sql (1)
1075-1078: LGTM!Also applies to: 1127-1181
src/synthorg/core/enums.py (1)
597-606: LGTM!Also applies to: 609-654
src/synthorg/meta/chief_of_staff/config.py (1)
71-82: LGTM!Also applies to: 110-137
web/src/api/types/openapi.gen.ts (1)
2120-2136: LGTM!Also applies to: 5229-5236, 5817-5825, 6695-6700, 7580-7580, 9972-9990, 17487-17517
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2019 +/- ##
==========================================
- Coverage 85.11% 85.01% -0.11%
==========================================
Files 1918 1926 +8
Lines 113601 114686 +1085
Branches 9692 9763 +71
==========================================
+ Hits 96690 97498 +808
- Misses 14534 14789 +255
- Partials 2377 2399 +22 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Round 2 covers CodeRabbit's re-review of round 1's commit. Comment / docstring hygiene (CR-#1, #2, #5): - app.py: rewrite the SQLite+persistent-ApprovalStore hard-block comment without migration framing; describe the invariant the guard enforces, not the migration's history. - approval_store.py: trim ``has_persistent_repo`` docstring to current-state behaviour only. - chief_of_staff/config.py: replace real vendor names (OpenAI/Claude/GPT-4o/LiteLLM) in propose-* constant comments with project-approved generic wording. App startup ordering (CR-#8 duplicate): - app.py: move conversational_proposal_repo wiring above the ``provider_registry is None: return`` early-return so the repo attaches even on boots without an LLM provider (preserves the approve/reject path for already-parked conversational approvals, which was the stated rationale for the round-1 decoupling). Conversational-intake resume (CR-#3 + CR-#4): - _approval_review_gate.py: split try_conversational_intake_resume (187 lines, well over the <50 line budget) into orchestrator + three branch-specific helpers: * _load_conversational_proposal: source check + repo + lookup. * _reject_conversational_proposal: PENDING -> REJECTED CAS. * _execute_conversational_proposal: PENDING -> EXECUTING -> EXECUTED CAS with revert-on-failure. The orchestrator is now 43 lines and reads as a flow table. - _approval_review_gate.py: ``_reread_approval_item`` returning ``None`` for a conversational approval now raises ServiceUnavailableError. Returning False would route the call through to non-conversational flows and silently strand the decided approval. Docstring updated to spell out the flow-specific semantics. Cross-tenant privacy (CR-#7, Critical): - propose.py: ``_resolve_conversation`` now rejects conversation-id reuse across users. A caller who learns another conversation_id could otherwise append turns and have prior history fed back to the model -- a cross-tenant break. Mapped to NotFound so the endpoint cannot be used to probe existence either. TOCTOU on sequence allocation (CR-#6, Heavy lift): - conversation_repo.py (SQLite + Postgres): bounded retry-on-IntegrityError in ``append`` when the (conversation_id, sequence) UNIQUE constraint fires. The repo re-queries the live max sequence, model_copies the turn, and retries up to 3 times. Other constraint violations (FK miss, CHECK on content/role) still translate directly to ConstraintViolationError without retry. Tests: - test_propose.py: new test_other_user_resume_blocked. - test_conversational_intake_resume.py: new test_unreadable_approval_item_raises (covers the new raise-on-None path). - test_conversation_repository.py: renamed test_append_duplicate_sequence_rejected -> test_append_duplicate_sequence_resequenced; asserts the new race-safe semantics (second append lands at sequence + 1).
Round 2's first commit applied CodeRabbit's suggested raise-on-None in ``_load_conversational_proposal``, intending to surface a transient approval-store read failure on a CONVERSATIONAL_INTAKE approval. The pre-push pytest-affected gate caught the regression: ``test_flow1_existence_check_error_returns_early`` constructs a ``_store(None)`` (item is None) and expects Flow 0 to yield so Flow 1 can run its parked-context probe. The new raise short-circuits that fallback for every approval whose store row is unreadable, not just CONVERSATIONAL_INTAKE ones (the source is exactly what we cannot determine when item is None). The original behaviour -- ``if item is None or item.source is not CONVERSATIONAL_INTAKE: return False, None`` -- is correct: - Flow 1 (``try_mid_execution_resume``) owns the parked-context probe for the item-is-None case and routes legitimate parked approvals correctly. - Flow 2 (review-gate) handles its own missing-task_id no-op. - A CONVERSATIONAL_INTAKE approval whose store row is unreadable is an acknowledged residual edge case; the right fix is at the approval-store layer (differentiate "transient error" from "row missing") and is out of scope for this PR. Reverts: - ``_load_conversational_proposal``: drop the raise, restore the yield-when-item-is-None semantics. - ``_reread_approval_item`` docstring: remove the misleading "Flow 0 raises" wording. - Drop ``test_unreadable_approval_item_raises`` (asserted the removed behaviour). Round-2 history will record CR-#4 as a documented skip.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/synthorg/meta/chief_of_staff/propose.py (1)
177-210:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftSerialise
converse()per conversation before the LLM turn.
prior_turnsandnext_sequenceare snapped before this request is durably linearised. The repo-side resequencing only avoidssequencecollisions; two concurrentconverse()calls on the sameconversation_idstill run the model against different histories and can persist clarifications/proposals in an order neither call saw. A double-submit or retry race can therefore produce incoherent 1:1 conversation state.A conversation-scoped lock or optimistic CAS around the whole turn pipeline (
load turns -> append user -> run model -> append assistant/proposals -> update conversation) would keep the transcript linear.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/synthorg/meta/chief_of_staff/propose.py` around lines 177 - 210, The propose pipeline snaps prior_turns and next_sequence before the request is linearised, allowing concurrent converse() calls to race; wrap the entire turn pipeline (load conversation/_resolve_conversation, fetch _ordered_turns, create-and-append user ConversationTurn via _turn_repo.append, run _run_decision, then call _record_clarification or _record_proposals) in a conversation-scoped critical section or optimistic CAS: either acquire a per-conversation async lock (e.g. self._conversation_locks[conversation.id]) around that sequence or perform the work in a DB transaction/CAS that verifies and increments a conversation version, recomputing next_sequence after the lock/transaction and retrying on conflict, so the model always runs against the canonical linearised history before persisting assistant turns.
♻️ Duplicate comments (1)
src/synthorg/api/controllers/_approval_review_gate.py (1)
216-218:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon't treat an unreadable approval as “not conversational”.
_reread_approval_item()returnsNoneon store failures too. Returning(False, None)here letssignal_resume_intent()fall through to the parked-context / review-gate paths and complete without ever rejecting or executing the conversational proposal, leaving an already-decided conversational approval stranded.Suggested fix
item = await _reread_approval_item(app_state, approval_id) - if item is None or item.source is not ApprovalSource.CONVERSATIONAL_INTAKE: + if item is None: + logger.error( + APPROVAL_GATE_CONVERSATIONAL_FAILED, + approval_id=approval_id, + note="approval state unavailable", + ) + msg = "Approval state unavailable" + raise ServiceUnavailableError(msg) + if item.source is not ApprovalSource.CONVERSATIONAL_INTAKE: return False, None🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/synthorg/api/controllers/_approval_review_gate.py` around lines 216 - 218, The current check conflates a store-read failure with a non-conversational approval; change the logic so that _reread_approval_item(app_state, approval_id) being None is treated as a read failure and propagated (e.g., return (None, None) or raise) instead of returning (False, None), and only evaluate item.source != ApprovalSource.CONVERSATIONAL_INTAKE when item is not None; this ensures signal_resume_intent() can detect read failures rather than mistakenly treating them as non-conversational approvals.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/persistence/postgres/conversation_repo.py`:
- Around line 392-405: The resequencing read using _TURN_NEXT_SEQUENCE_SQL
inside append() is not wrapped in the repository's QueryError contract; catch
database exceptions around the async with block that executes
_TURN_NEXT_SEQUENCE_SQL (the cursor.execute / cur.fetchone section) and rethrow
them as the repository's QueryError so callers don't observe raw psycopg
exceptions; ensure you import/use QueryError and wrap the entire resequencing
block (including await cur.execute and await cur.fetchone) to raise QueryError
with a helpful message and the original exception attached.
In `@src/synthorg/persistence/sqlite/conversation_repo.py`:
- Around line 470-479: The fallback SELECT used when sequence_race is true (the
query executed with _TURN_NEXT_SEQUENCE_SQL inside the append() flow) can raise
raw sqlite exceptions; catch exceptions from the await self._db.execute(...) /
fetchone() and rethrow them as QueryError (or pass them through the same
SQLite-to-QueryError mapping used elsewhere in append()), so the read that
computes next_sequence is wrapped in the same error mapping as the rest of
append() and will yield a QueryError instead of leaking sqlite3/aiosqlite
exceptions.
---
Outside diff comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 177-210: The propose pipeline snaps prior_turns and next_sequence
before the request is linearised, allowing concurrent converse() calls to race;
wrap the entire turn pipeline (load conversation/_resolve_conversation, fetch
_ordered_turns, create-and-append user ConversationTurn via _turn_repo.append,
run _run_decision, then call _record_clarification or _record_proposals) in a
conversation-scoped critical section or optimistic CAS: either acquire a
per-conversation async lock (e.g. self._conversation_locks[conversation.id])
around that sequence or perform the work in a DB transaction/CAS that verifies
and increments a conversation version, recomputing next_sequence after the
lock/transaction and retrying on conflict, so the model always runs against the
canonical linearised history before persisting assistant turns.
---
Duplicate comments:
In `@src/synthorg/api/controllers/_approval_review_gate.py`:
- Around line 216-218: The current check conflates a store-read failure with a
non-conversational approval; change the logic so that
_reread_approval_item(app_state, approval_id) being None is treated as a read
failure and propagated (e.g., return (None, None) or raise) instead of returning
(False, None), and only evaluate item.source !=
ApprovalSource.CONVERSATIONAL_INTAKE when item is not None; this ensures
signal_resume_intent() can detect read failures rather than mistakenly treating
them as non-conversational approvals.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: ca38dcdc-d999-4279-8896-dcdfaea74643
📒 Files selected for processing (9)
src/synthorg/api/app.pysrc/synthorg/api/approval_store.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/meta/chief_of_staff/propose.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/persistence/sqlite/conversation_repo.pytests/conformance/persistence/test_conversation_repository.pytests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build Backend
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Dashboard Test
- GitHub Check: Test Unit
- GitHub Check: Test E2E
- GitHub Check: Test Integration
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics live in
settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants per check_no_magic_numbers.pyComments explain WHY only; no reviewer citations / issue back-refs / migration framing per check_no_review_origin_in_code.py
No
from __future__ import annotations; use PEP 758 syntaxType hints on public functions; mypy strict; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Errors use
<Domain><Condition>Errornaming fromDomainErrorbase; never inheritException/RuntimeErrordirectly per check_domain_error_hierarchy.pyPydantic v2 frozen +
extra="forbid"on every frozen model project-wide;@computed_fieldfor derived;NotBlankStrfor identifiers per check_frozen_model_extra_forbid.pyArgs models at every system boundary;
parse_typed()for every external dict ingestion per check_boundary_typed.pyImmutability: use
model_copy(update=...)orcopy.deepcopy(); deepcopy at system boundariesAsync code: use
asyncio.TaskGroupfor fan-out/fan-in; helpers catchException(re-raiseMemoryError/RecursionError)Clock seam:
clock: Clock | None = None; tests injectFakeClock; lifecycle: services own_lifecycle_lock; timed-out stops mark unrestartableUntrusted content (SEC-1): use
wrap_untrusted()fromengine.prompt_safety;HTMLParseGuardfor HTMLLogging: use
from synthorg.observability import get_logger; variable alwayslogger; neverimport logging/print()in app codeEvent names from
observability.events.<domain>constants; structured kwargs (logger.info(EVENT, key=value))Secret-log redaction (SEC-1): never use
error=str(exc)or interpolate{exc}; useerror_type=type(exc).__name__+error=safe_error_description(exc); neverexc_info=True; OTel: forbidspan.record_exception(exc)per check_logger_exception_str_exc.pyNever use real vendor names in project code/tests; use
example-provider, `test-...
Files:
src/synthorg/api/approval_store.pysrc/synthorg/api/app.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/meta/chief_of_staff/propose.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/api/approval_store.pysrc/synthorg/api/app.pysrc/synthorg/api/controllers/_approval_review_gate.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/meta/chief_of_staff/propose.py
src/synthorg/api/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
API startup: construction phase wires synchronous services; on_startup wires services needing connected persistence; agent_registry before auto_wire_meetings; tunnel_provider always wired; SettingsService before WorkflowExecutionObserver; OntologyService after persistence.connect()
Files:
src/synthorg/api/approval_store.pysrc/synthorg/api/app.pysrc/synthorg/api/controllers/_approval_review_gate.py
src/synthorg/{workers,api}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Runtime services: build from shared AgentEngine via
build_runtime_services; install via_install_runtime_services; swap functions hold lock; empty company rejects with 4014 error, /coordinate honestly 503s
Files:
src/synthorg/api/approval_store.pysrc/synthorg/api/app.pysrc/synthorg/api/controllers/_approval_review_gate.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}; asyncauto; timeout 30s global; coverage 80% minWindows: unit tests use
WindowsSelectorEventLoopPolicy; subprocess tests override backTest doubles: FakeClock for Clock seam;
mock_of[T](**overrides)for typed-boundary substitutions; SimpleNamespace for attribute-bags; bare MagicMock at typed boundary blocked by check_mock_spec.pyFakeClock and
mock_ofimport fromtests._shared; inject viaclock=and helper's spec subscriptHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...))Flaky tests: never skip/xfail; fix fundamentally; use
asyncio.Event().wait()notsleep(large)
Files:
tests/conformance/persistence/test_conversation_repository.pytests/unit/meta/chief_of_staff/test_propose.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/conformance/persistence/test_conversation_repository.pytests/unit/meta/chief_of_staff/test_propose.py
tests/conformance/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Dual-backend conformance:
tests/conformance/persistence/consumesbackendfixture (SQLite + Postgres) per check_dual_backend_test_parity.py
Files:
tests/conformance/persistence/test_conversation_repository.py
src/synthorg/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Only
src/synthorg/persistence/may import sqlite/psycopg or emit raw SQL; new repositories must inherit from generic categories inpersistence/_generics.pyper ADR-0001Repository CRUD: use
save(entity),get(id),delete(id) -> bool,list_items(...),query(...)returning tuplesDatetime in persistence: use
parse_iso_utc/format_iso_utcfrompersistence._shared(reject naive);normalize_utcfor already-typed
Files:
src/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/persistence/sqlite/conversation_repo.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: No region/currency/locale privilege; use metric units and British English per regional-defaults.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Every convention PR ships its enforcement gate per convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR: `/aurelio-review-pr`; fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Pre-commit/pre-push hooks in `.pre-commit-config.yaml`; tool-call gates in `.claude/settings.json` PreToolUse
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Squash merge with PR body as commit message; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T00:58:20.575Z
Learning: Use Go via `go -C cli` command, never `cd cli`; see `cli/CLAUDE.md`
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
src/synthorg/api/approval_store.pysrc/synthorg/api/app.pysrc/synthorg/api/controllers/_approval_review_gate.pytests/conformance/persistence/test_conversation_repository.pysrc/synthorg/meta/chief_of_staff/config.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/meta/chief_of_staff/propose.pytests/unit/meta/chief_of_staff/test_propose.py
🔇 Additional comments (1)
tests/unit/meta/chief_of_staff/test_propose.py (1)
351-373: LGTM!
CodeRabbit re-review of round 2 (head 83e3921) surfaced 3 valid findings plus 1 duplicate re-flag (replied + flagged for resolve). QueryError wrapping on the resequencing read (CR-A1, CR-A2): - conversation_repo.py (SQLite + Postgres): wrap the ``_TURN_NEXT_SEQUENCE_SQL`` lookup that runs inside the ``append`` retry branch in its own try/except. SQLite catches ``(sqlite3.Error, aiosqlite.Error)``; Postgres catches ``psycopg.Error``. Both translate to ``QueryError`` with the same logging shape the rest of ``append`` already uses, so callers no longer see raw driver exceptions leak from the resequencing path. Conversation-scoped turn pipeline serialisation (CR-A3, Heavy lift): - propose.py: add per-conversation ``asyncio.Lock`` registry to ``ChiefOfStaffProposer``. ``converse()`` now resolves the conversation first (so a fresh ``conversation_id`` doesn't need locking) and then runs the rest of the turn pipeline under the per-conversation lock; the body is extracted into ``_run_turn``. This closes the broader TOCTOU window CR-A6 (round 2) only addressed at the sequence-collision layer: two concurrent ``converse()`` calls on the same conversation now serialise fully, so the model always runs against the canonical linearised history and assistant turns commit in order. ``_lock_for`` uses a lazy ``_conversation_locks_guard`` so the lock binds to the live request-handling loop rather than the loop that constructed the proposer. The lock dict grows with conversation count; for v1's short-thread expectation this is acceptable, follow-up if scale demands it. CodeRabbit duplicate re-flag (round 2 CR-#4): - Replied on inline thread 3270499316 explaining the disproof recorded in round 2's history (the raise-on-None fix breaks test_flow1_existence_check_error_returns_early because Flow 1's parked-context probe owns the item-None fallback for non-conversational approvals). Asked @coderabbitai to resolve the thread. Tests: - test_propose.py: new test_per_conversation_lock_serialises_turns asserts two concurrent ``converse()`` calls on the same id land 4 distinct turns at sequences 0/1/2/3 (would collide on 0 absent the lock). - test_propose.py: new test_locks_isolated_per_conversation asserts ``_lock_for`` returns the same instance per id and a fresh instance per distinct id.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 150-182: The _conversation_locks dict is unbounded; add a cleanup
to remove the per-conversation lock when a conversation enters a terminal state:
implement a new helper (e.g. _remove_conversation_lock(conversation_id: str))
that acquires _conversation_locks_guard and pops
self._conversation_locks[conversation_id] if present, and call this helper from
the code paths that mark a conversation terminal (e.g. inside _cap_conversation
and/or _record_proposals) after the final commit; keep using
_conversation_locks_guard to synchronise removal so _lock_for and removals
cannot race.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: c6250a49-78e4-4126-88de-3a6b6f1e1586
📒 Files selected for processing (4)
src/synthorg/meta/chief_of_staff/propose.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/persistence/sqlite/conversation_repo.pytests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Test E2E
- GitHub Check: Dashboard Test
- GitHub Check: Test Integration
- GitHub Check: Test Unit
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Lighthouse Site
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}. Async auto. Timeout 30s global. Coverage 80% min. xdist -n 8 --dist=loadfile auto-applied via pyproject addoptsWindows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race). Subprocess tests override back
Test doubles: use FakeClock for Clock seam, mock_ofT for typed-boundary substitutions, SimpleNamespace for attribute-bags. Bare MagicMock at typed boundary is blocked by check_mock_spec.py (zero-tolerance)
Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)). Never skip/xfail flaky tests; fix fundamentally. Use asyncio.Event().wait() not sleep(large)
Files:
tests/unit/meta/chief_of_staff/test_propose.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/meta/chief_of_staff/test_propose.py
{src,tests}/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Vendor-agnostic test data: NEVER use real vendor names in project code/tests. Use example-provider, test-provider, example-{large,medium,small}-001. Allowed in .claude/, third-party imports, providers/presets.py, web/public/provider-logos/
Files:
tests/unit/meta/chief_of_staff/test_propose.pysrc/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/meta/chief_of_staff/propose.py
src/synthorg/persistence/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
New repository protocols must inherit from one or more generic categories in
persistence/_generics.py(SingletonRepository, IdKeyedRepository, FilteredQueryRepository, AppendOnlyRepository, StatefulRepository, MVCCRepository); bespoke methods allowed only under ADR-0001 D7Repository protocols must stay
@runtime_checkableRepository CRUD methods:
save(entity),get(id),delete(id) -> bool,list_items(...),query(...)returning tuplesDatetime in persistence: use
parse_iso_utc/format_iso_utcfrompersistence._shared(reject naive); usenormalize_utcfor already-typed
Files:
src/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/persistence/postgres/conversation_repo.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
No hardcoded numeric values; numerics must live in
settings/definitions/. Allowlist: 0, 1, -1, HTTP codes, hex masks, powers-of-2, module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal). Enforced by scripts/check_no_magic_numbers.pyUse
from synthorg.observability import get_loggerwith variable namelogger; neverimport loggingorprint()in app codeEvent names must come from
observability.events.<domain>constants; use structured kwargs (logger.info(EVENT, key=value))Error paths log WARNING/ERROR with context before raising; state transitions log INFO via *_STATUS_TRANSITIONED after persistence write
Never use
error=str(exc)or interpolate{exc}in logs; useerror_type=type(exc).__name__+error=safe_error_description(exc). Neverexc_info=True. For OTel: forbidspan.record_exception(exc); usespan.set_attribute("exception.message", safe_error_description(exc))withrecord_exception=False, set_status_on_exception=False. Enforced by check_logger_exception_str_exc.pyComments must explain WHY only; no reviewer citations, issue back-refs, or migration framing. Enforced by check_no_review_origin_in_code.py + check_no_migration_framing.py
No
from __future__ import annotationsin Python 3.14 (has PEP 649). Use PEP 758 except:except A, B:no parens unless bindingType hints required on public functions; mypy strict. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines
Errors must use format
<Domain><Condition>Errorinheriting fromDomainError, never directly from Exception/RuntimeError/etc. Enforced by check_domain_error_hierarchy.pyEvery Pydantic v2 frozen model project-wide must use
extra="forbid". Gate: check_frozen_model_extra_forbid.py.@computed_fieldauto-exempt; use# lint-allow: frozen-extra-forbid -- <reason>forextra="allow"/"ignore"boundaries. Use@computed_fieldfor derived fields. UseNotBlankStrfor identifiersArgs models requ...
Files:
src/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/meta/chief_of_staff/propose.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/meta/chief_of_staff/propose.py
src/synthorg/!(persistence)/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Only
src/synthorg/persistence/may import sqlite/psycopg or emit raw SQL
Files:
src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/meta/chief_of_staff/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Conversational propose: POST /meta/chat/propose is opt-in (meta.chief_of_staff.propose_enabled); ChiefOfStaffProposer built by build_chief_of_staff_proposer (ENFORCED manifest entry) and 503s when provider, persistence, or work pipeline missing. Human content wrapped via wrap_untrusted(TAG_TASK_DATA, ...) (SEC-1)
Files:
src/synthorg/meta/chief_of_staff/propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: Read design spec from `docs/design/` page before implementing; deviations require approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: No region/currency/locale should be privileged; use metric units; use British English in all text
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: Configuration precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env. YAML is company-template ingestion only, not a precedence tier
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: Timeout/slow test failures indicate source-code regression; never edit tests/baselines/unit_timing.json or scripts/*_baseline.{txt,json} / scripts/_*_baseline.py. Both families PreToolUse-blocked. Per-invocation bypass requires explicit approval: ALLOW_BASELINE_GROWTH=1 git commit
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR merge use `/aurelio-review-pr` for external feedback. Fix all valid issues; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced. Signed commits required on protected refs. Branches: `<type>/<slug>` from main. Squash merge. PR body becomes squash commit; trailers (Release-As, Closes `#N`) in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:27:43.039Z
Learning: After every squash merge → `/post-merge-cleanup`
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/meta/chief_of_staff/test_propose.pysrc/synthorg/persistence/sqlite/conversation_repo.pysrc/synthorg/persistence/postgres/conversation_repo.pysrc/synthorg/meta/chief_of_staff/propose.py
🔇 Additional comments (4)
tests/unit/meta/chief_of_staff/test_propose.py (1)
1-544: LGTM!src/synthorg/persistence/sqlite/conversation_repo.py (1)
1-614: LGTM!src/synthorg/persistence/postgres/conversation_repo.py (1)
1-532: LGTM!src/synthorg/meta/chief_of_staff/propose.py (1)
1-149: LGTM!Also applies to: 183-603
Round 4 covers CodeRabbit's re-review of round 3's commit. CR-#4 (round 2 duplicate) was closed: CR replied accepting the disproof and resolved the thread. Per-conversation lock cleanup (CR-C1, Nitpick): - propose.py: ``_release_conversation_lock`` helper pops the dict entry under ``_conversation_locks_guard`` so a removal cannot race with a concurrent ``_lock_for`` resolving the same id. Called from ``_record_proposals`` (after the conversation transitions to PROPOSED, the v1 1:1 terminal state) and ``_cap_conversation`` (after force-closing to CLOSED). The ``_record_clarification`` path is NOT a terminal state -- the conversation stays ACTIVE for the next turn -- so the lock is retained. Tests: - test_propose.py::test_lock_released_on_terminal_propose asserts the dict drops the conversation id after a successful propose. - test_propose.py::test_lock_released_on_clarification_cap asserts the same after the clarification cap force-closes the thread.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 184-197: The cleanup currently performed in
_release_conversation_lock removes the mapping while the old per-conversation
lock may still be held, allowing a race where a new converse() creates a fresh
lock and runs _run_turn() concurrently; instead, stop popping the entry inside
_release_conversation_lock and move that removal into converse() after the outer
"async with <that conversation's lock>" has fully exited, checking that
self._conversation_locks.get(conversation_id) is still the same lock instance
before popping it; apply the same change for the other similar sites (around
lines ~475-478 and ~614-617) and consider updating _resolve_conversation() to
treat ConversationStatus.PROPOSED as terminal if v1 semantics require rejecting
PROPOSED transitions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 97985311-bc21-44a2-a896-49638c90e764
📒 Files selected for processing (2)
src/synthorg/meta/chief_of_staff/propose.pytests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Build Backend
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Dashboard Test
- GitHub Check: Test E2E
- GitHub Check: Test Unit
- GitHub Check: Test Integration
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use
SettingsService/ConfigResolver(Cat-1) or env > code default (Cat-2,read_only_post_init); Cat-3 bootstrap secrets are pure env at the boot site. Noos.environ.getoutside startup; pre-init Cat-2 reads usesettings.bootstrap_resolver.resolve_init_valueNumerics live in
settings/definitions/; allowlist 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants. Enforced byscripts/check_no_magic_numbers.pyComments should explain WHY only; no reviewer citations / issue back-refs / migration framing. Enforced by
check_no_review_origin_in_code.py+check_no_migration_framing.pyNo
from __future__ import annotations(3.14 has PEP 649). Use PEP 758 except:except A, B:no parens unless bindingErrors:
<Domain><Condition>ErrorfromDomainError; never inheritException/RuntimeError/etc directly. Enforced bycheck_domain_error_hierarchy.pyPydantic v2 frozen +
extra="forbid"on every frozen model;@computed_fieldfor derived;NotBlankStrfor identifiers. Per-line# lint-allow: frozen-extra-forbid -- <reason>for exceptions. Enforced bycheck_frozen_model_extra_forbid.pyArgs models at every system boundary;
parse_typed()for every external dict ingestion. Enforced bycheck_boundary_typed.pyImmutability: use
model_copy(update=...)orcopy.deepcopy(); deepcopy at system boundariesAsync: use
asyncio.TaskGroupfor fan-out/fan-in; helpers catchException(re-raiseMemoryError/RecursionError)Untrusted content (SEC-1): use
wrap_untrusted()fromengine.prompt_safety; useHTMLParseGuardfor HTMLTelemetry: opt-in, off by default. Every event property must be in
_ALLOWED_PROPERTIES. See telemetry.md
Files:
tests/unit/meta/chief_of_staff/test_propose.pysrc/synthorg/meta/chief_of_staff/propose.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}. Asyncauto. Timeout 30s global. Coverage 80% minxdist
-n 8 --dist=loadfileauto-applied via pyprojectaddopts; Windows unit tests useWindowsSelectorEventLoopPolicy(3.14 IOCP teardown race)Test doubles: FakeClock for Clock seam,
mock_of[T](**overrides)for typed-boundary substitutions, SimpleNamespace for attribute-bags. BareMagicMockat typed boundaries forbidden byscripts/check_mock_spec.pyHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)). Never skip/xfail on flaky tests; fix fundamentally. Useasyncio.Event().wait()notsleep(large)
Files:
tests/unit/meta/chief_of_staff/test_propose.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/meta/chief_of_staff/test_propose.py
{src/**/*.py,tests/**/*.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Vendor-agnostic: NEVER use real vendor names in project code/tests. Use
example-provider,test-provider,example-{large,medium,small}-001. Allowed in.claude/, third-party imports,providers/presets.py,web/public/provider-logos/
Files:
tests/unit/meta/chief_of_staff/test_propose.pysrc/synthorg/meta/chief_of_staff/propose.py
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Type hints on public functions; mypy strict. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines
Clock seam:
clock: Clock | None = None; tests injectFakeClock. Lifecycle: services own_lifecycle_lock; timed-out stops mark unrestartableUse
from synthorg.observability import get_logger; variable alwayslogger. Neverimport logging/print()in app codeEvent names from
observability.events.<domain>constants; structured kwargs (logger.info(EVENT, key=value))Error paths log WARNING/ERROR with context before raising; state transitions log INFO via
*_STATUS_TRANSITIONEDafter persistence writeSecret-log redaction (SEC-1): never
error=str(exc)or interpolate{exc}; useerror_type=type(exc).__name__+error=safe_error_description(exc). Neverexc_info=True. OTel: forbidspan.record_exception(exc). Enforced bycheck_logger_exception_str_exc.py
Files:
src/synthorg/meta/chief_of_staff/propose.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/meta/chief_of_staff/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Conversational propose:
POST /meta/chat/proposeis opt-in (meta.chief_of_staff.propose_enabled);ChiefOfStaffProposerbuilt bybuild_chief_of_staff_proposer(ENFORCED manifest entry); 503 when ANY of provider, persistence, or work pipeline missing. Human content wrapped viawrap_untrusted(TAG_TASK_DATA, ...)
Files:
src/synthorg/meta/chief_of_staff/propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: Read `docs/design/` page before implementing; deviations need approval. See DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: No region/currency/locale privileged; use metric units; British English for regional defaults
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: Every convention PR ships its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR: `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: Commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced. Signed commits required on protected refs
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: Branches: `<type>/<slug>` from main. Pre-commit/pre-push hooks: `.pre-commit-config.yaml`. Tool-call gates: `.claude/settings.json` PreToolUse
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: Squash merge. PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T01:51:19.899Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/meta/chief_of_staff/test_propose.pysrc/synthorg/meta/chief_of_staff/propose.py
Round 5 covers CodeRabbit's re-review of the round-4 lock cleanup. Lock cleanup race (CR-D1, Critical): - propose.py: round 4 invoked ``_release_conversation_lock`` from inside ``_record_proposals`` / ``_cap_conversation`` -- i.e. WHILE the per-conversation lock was still held by the current ``converse()`` caller. That window let a concurrent ``converse()`` on the same conversation observe an empty dict slot, mint a fresh lock, and run ``_run_turn`` in parallel with the still-unwinding first caller. The lock invariant the round-3 PR introduced was therefore defeated by the round-4 cleanup. Fix: - ``_release_conversation_lock`` now takes a keyword-only ``expected_lock: asyncio.Lock`` and only pops the dict entry when ``self._conversation_locks.get(conversation_id) is expected_lock``. The identity check is essential -- if a concurrent caller has already replaced the entry with a fresh lock, popping it would invalidate that caller's serialisation. - ``converse()`` captures the lock instance from ``_lock_for``, runs the turn pipeline inside ``async with lock``, and only AFTER the ``async with`` releases calls ``_release_conversation_lock`` with ``expected_lock=lock``. Cleanup is gated on the result shape: ``result.status == "proposed"`` (terminal -> PROPOSED) or ``result.conversation_closed`` (terminal -> CLOSED). - The in-lock cleanup calls from ``_record_proposals`` and ``_cap_conversation`` are removed entirely. Single-source the release in ``converse()``; only one ``_release_conversation_lock`` call site remains. Note on PROPOSED resumability: CR also raised whether ``_resolve_conversation`` should treat PROPOSED as terminal and reject resumes. Documented skip -- that's a separate design call (a user can legitimately want to refine a parked-but-rejected conversation), and the lock + identity-check guarantee no parallel ``_run_turn`` regardless of PROPOSED resumability. Tests: - test_propose.py: new test_release_identity_check_skips_replaced_lock asserts the new ``expected_lock`` parameter -- releasing under the OLD lock identity is a no-op when the slot has already been replaced; releasing under the CURRENT identity actually pops. - Existing test_lock_released_on_terminal_propose + test_lock_released_on_clarification_cap still pass under the new post-lock cleanup wiring.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
src/synthorg/meta/chief_of_staff/propose.py (2)
242-244:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winRevalidate the conversation inside the lock before running the turn.
_resolve_conversation()completed before Line 242, so a second request can readACTIVE, wait behind another request that moves the thread toPROPOSEDorCLOSED, and then still append turns once it acquires this lock. That lets a stale caller park extra approvals against a terminal conversation. Reload or at least re-check existing conversations after entering the locked section.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/synthorg/meta/chief_of_staff/propose.py` around lines 242 - 244, After acquiring the per-conversation lock (via self._lock_for(conversation.id)) revalidate the conversation state before calling self._run_turn: reload the conversation (e.g., call self._resolve_conversation(conversation.id) or fetch the latest object) inside the async with lock block and verify it is still in the expected ACTIVE state; if it has moved to PROPOSED or CLOSED, abort/raise/return early instead of calling _run_turn to prevent appending turns to a stale/terminal conversation.
203-204:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winQueued waiters still make this lock cleanup unsafe.
The identity check only covers the “dict entry was replaced with a different lock object” case. It does not cover callers that already captured
expected_lockand are queued on it: A can release, B can acquire the same lock, then Line 203-Line 204 pops the registry entry and C mints a fresh lock that runs concurrently with B. Unless this registry tracks outstanding users/waiters, the safe immediate fix is to stop deleting terminal locks here.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/synthorg/meta/chief_of_staff/propose.py` around lines 203 - 204, The current cleanup that does self._conversation_locks.pop(conversation_id, None) while checking identity against expected_lock is unsafe because queued waiters might still be using expected_lock; stop deleting terminal locks here — remove that pop so locks remain registered, or implement a proper reference-count/waiter-tracking scheme (increment when a caller acquires or queues on a lock and decrement on release, and only remove the dict entry when the count reaches zero and there are no waiters). Update the code paths that create/return locks (the routines that access self._conversation_locks and expected_lock) to either no-op the pop or to adjust the refcount/waiter tracking accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 242-244: After acquiring the per-conversation lock (via
self._lock_for(conversation.id)) revalidate the conversation state before
calling self._run_turn: reload the conversation (e.g., call
self._resolve_conversation(conversation.id) or fetch the latest object) inside
the async with lock block and verify it is still in the expected ACTIVE state;
if it has moved to PROPOSED or CLOSED, abort/raise/return early instead of
calling _run_turn to prevent appending turns to a stale/terminal conversation.
- Around line 203-204: The current cleanup that does
self._conversation_locks.pop(conversation_id, None) while checking identity
against expected_lock is unsafe because queued waiters might still be using
expected_lock; stop deleting terminal locks here — remove that pop so locks
remain registered, or implement a proper reference-count/waiter-tracking scheme
(increment when a caller acquires or queues on a lock and decrement on release,
and only remove the dict entry when the count reaches zero and there are no
waiters). Update the code paths that create/return locks (the routines that
access self._conversation_locks and expected_lock) to either no-op the pop or to
adjust the refcount/waiter tracking accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: c055e8e4-a741-4af6-9354-b1c712d67f67
📒 Files selected for processing (2)
src/synthorg/meta/chief_of_staff/propose.pytests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Dashboard Test
- GitHub Check: Test Unit
- GitHub Check: Test Integration
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Test markers:
@pytest.mark.{unit,integration,e2e,slow}; async auto; global timeout 30s; minimum coverage 80%Windows: unit tests use WindowsSelectorEventLoopPolicy (3.14 IOCP teardown race); subprocess tests override back
Test doubles: FakeClock for Clock seam, mock_ofT for typed-boundary substitutions, SimpleNamespace for attribute-bags; bare MagicMock at typed boundary blocked by scripts/check_mock_spec.py (zero-tolerance, no baseline)
FakeClock and mock_of imported from tests._shared; inject via clock= and helper spec subscript
Hypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...))Flaky tests: NEVER skip/xfail; fix fundamentally; use asyncio.Event().wait() instead of sleep(large)
Files:
tests/unit/meta/chief_of_staff/test_propose.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/meta/chief_of_staff/test_propose.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Follow configuration precedence: DB > env > code default via SettingsService/ConfigResolver (Cat-1) or env > code default (Cat-2, read_only_post_init); Cat-3 bootstrap secrets are pure env at boot site; no os.environ.get outside startup; pre-init Cat-2 reads use settings.bootstrap_resolver.resolve_init_value (see docs/reference/configuration-precedence.md)
No hardcoded numeric values; numerics live in settings/definitions/; allowlist only: 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants (NAME: int|float|Final|Final[int]|Final[float] = literal); enforced by scripts/check_no_magic_numbers.py
Comments must explain WHY only; no reviewer citations, issue back-refs, or migration framing; enforced by check_no_review_origin_in_code.py and check_no_migration_framing.py
No
from __future__ import annotations(Python 3.14 has PEP 649); use PEP 758 syntax exceptexcept A, B:requires parens when bindingType hints on public functions; mypy strict mode; Google-style docstrings; line length 88; functions <50 lines; files <800 lines
Error classes must use naming pattern Error and inherit from DomainError, never directly from Exception/RuntimeError; enforced by check_domain_error_hierarchy.py
Every Pydantic v2 frozen model must have extra="forbid" project-wide; gate check_frozen_model_extra_forbid.py;
@computed_fieldauto-exempt; per-line lint-allow comments for extra="allow"/"ignore" at boundariesUse
@computed_fieldfor derived properties in Pydantic modelsUse NotBlankStr for identifiers in Pydantic models
Args models at every system boundary; use parse_typed() for every external dict ingestion; enforced by check_boundary_typed.py
Use model_copy(update=...) or copy.deepcopy() for immutability; deepcopy at system boundaries
Use asyncio.TaskGroup for fan-out/fan-in; helpers catch Exception (re-raise MemoryError/RecursionError)
Use Clock seam pattern: clock: Clock | None = None; tests inject FakeClock; service...
Files:
src/synthorg/meta/chief_of_staff/propose.py
src/synthorg/meta/chief_of_staff/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Conversational propose: POST /meta/chat/propose is opt-in (meta.chief_of_staff.propose_enabled); ChiefOfStaffProposer built by build_chief_of_staff_proposer (ENFORCED manifest entry, EPIC
#1955); 503s when ANY of provider, connected persistence, or work pipeline missing; human content wrapped via wrap_untrusted(TAG_TASK_DATA, ...) (SEC-1); approval decisions route through Flow 0 of signal_resume_intent
Files:
src/synthorg/meta/chief_of_staff/propose.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/meta/chief_of_staff/propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: Read design specification before implementing; deviations require approval (see docs/DESIGN_SPEC.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: Reuse web/src/components/ui/ design system components; use design tokens only (see web/CLAUDE.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: No region/currency/locale privilege; use metric units; use British English (see docs/reference/regional-defaults.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: Every convention PR ships its enforcement gate (see docs/reference/convention-gates.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: After issue: create branch + commit + push (no auto-PR); use /pre-pr-review before creating PR; gh pr create is blocked; after PR: use /aurelio-review-pr for external feedback; fix all valid issues before merging
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: Git commits: <type>: <description> (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via synthorg-repo-bot)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: Branches: <type>/<slug> from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: Squash merge; PR body becomes squash commit; trailers (Release-As, Closes `#N`) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: GitHub queries: use gh issue list via Bash, NOT MCP list_issues
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: After every squash merge → /post-merge-cleanup
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API (see cli/CLAUDE.md)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:15:09.416Z
Learning: For web development: see web/CLAUDE.md; for CLI: see cli/CLAUDE.md (use go -C cli, never cd cli); for shell scripts: see ~/.claude/rules/common/bash.md
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/meta/chief_of_staff/test_propose.pysrc/synthorg/meta/chief_of_staff/propose.py
🔇 Additional comments (1)
tests/unit/meta/chief_of_staff/test_propose.py (1)
610-635: LGTM!
Round 6 covers CodeRabbit's re-review of round 5. CR identified two distinct concurrency issues. One required a real correctness fix (inside-lock revalidation). The other is an honest revert: the lock cleanup chain CR has been pushing on since round 4 cannot be made race-free without refcount + waiter tracking, which v1 doesn't need at its expected scale. Inside-lock conversation revalidation (CR-E1, Critical): - propose.py: ``_run_turn`` now re-reads the conversation under the lock and raises ``ConversationClosedError`` if the status is no longer ACTIVE. Without this, two concurrent ``converse()`` calls could both read ACTIVE in ``_resolve_conversation`` outside the lock; A acquires the lock, transitions to PROPOSED in ``_record_proposals``, releases; B then acquires the lock and silently parks extra approvals against a now-PROPOSED conversation (``_park_proposal`` runs before ``transition_if`` no-ops on the stale ACTIVE->PROPOSED transition, so the approvals land in the queue even though the status flip warns). The outside-lock ``_resolve_conversation`` pre-check is kept as a fast-path for obvious CLOSED rejections; the inside-lock check is now the authoritative gate. Lock cleanup reverted (CR-E2, Critical): - propose.py: removed ``_release_conversation_lock`` and the ``converse()`` post-lock cleanup added in round 4 / refined in round 5. CR's CR-E2 walks the remaining race: even with the identity check, callers already queued on ``expected_lock`` can acquire the lock AFTER its owner exits the ``async with`` but BEFORE the cleanup pop runs; the pop then leaves a slot for a third caller to mint a fresh lock and run concurrently with the queued waiter. Closing this race needs a refcount of in-flight + queued users, which v1 does not need: the dict at v1's 1:1 per-user scale is bounded by active human users, well below the threshold that would matter for memory. The round-4 nitpick (unbounded dict growth) was explicitly labeled "Trivial / Low value" by CR. Round 5 + 6 demonstrate it cannot be fixed simply; the safe path is to keep the locks registered for the process lifetime. Revisit with refcount tracking if a future scale requirement demands it. Tests: - test_propose.py: removed test_lock_released_on_terminal_propose, test_lock_released_on_clarification_cap, and test_release_identity_check_skips_replaced_lock -- they asserted behaviour that is no longer the design. - test_propose.py: new test_run_turn_aborts_if_conversation_terminal_under_lock -- patches the fake conversation repo's ``get`` so the second call (the inside-lock re-fetch) returns PROPOSED; asserts converse raises ConversationClosedError and that no turns were appended.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/synthorg/meta/chief_of_staff/propose.py (1)
424-436:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftMake multi-proposal parking atomic or compensating.
This loop commits proposals one at a time. If
_park_proposal()fails on the 2nd+ item, this method raises with the conversation stillACTIVEand no assistant summary turn, but the earlier approvals are already visible in the queue. A retry can then park duplicate approvals for the items that succeeded before the failure. Please add batch-level idempotency or compensation here instead of treating each proposal independently.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/synthorg/meta/chief_of_staff/propose.py` around lines 424 - 436, The loop over decision.proposals calls _park_proposal repeatedly, causing partial commits if one call fails; make the multi-proposal operation atomic by first validating all proposals and preparing a batch (e.g., gather project for each item and pre-check idempotency keys) and then either (a) perform a single transactional batch persist of all proposals or (b) implement compensation: track created proposal IDs from _park_proposal and on any failure call a new helper to reverse/unpark those created items before re-raising; ensure you use the same idempotency key or check existence inside _park_proposal to avoid duplicates and update the propose flow to only mark the conversation/assistant summary after the entire batch succeeds (references: decision.proposals, _park_proposal, raise ConversationalProposeResponseInvalidError, conversation state/summary update).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 244-246: Before raising ConversationClosedError in the propose
flow, log a warning with the conversation context: after fetching current via
self._conversation_repo.get(conversation.id) and detecting current is None or
current.status != ConversationStatus.ACTIVE, call self._logger.warning (or the
class logger used elsewhere) including conversation.id and any relevant current
state, then raise ConversationClosedError(conversation_id=conversation.id);
reference the conditional that checks current and ConversationStatus.ACTIVE and
the ConversationClosedError symbol so the log is emitted immediately before the
raise.
---
Outside diff comments:
In `@src/synthorg/meta/chief_of_staff/propose.py`:
- Around line 424-436: The loop over decision.proposals calls _park_proposal
repeatedly, causing partial commits if one call fails; make the multi-proposal
operation atomic by first validating all proposals and preparing a batch (e.g.,
gather project for each item and pre-check idempotency keys) and then either (a)
perform a single transactional batch persist of all proposals or (b) implement
compensation: track created proposal IDs from _park_proposal and on any failure
call a new helper to reverse/unpark those created items before re-raising;
ensure you use the same idempotency key or check existence inside _park_proposal
to avoid duplicates and update the propose flow to only mark the
conversation/assistant summary after the entire batch succeeds (references:
decision.proposals, _park_proposal, raise
ConversationalProposeResponseInvalidError, conversation state/summary update).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 92c268d5-82fb-4474-b600-d5a34bfad2a2
📒 Files selected for processing (2)
src/synthorg/meta/chief_of_staff/propose.pytests/unit/meta/chief_of_staff/test_propose.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Lighthouse Site
- GitHub Check: Lighthouse Dashboard
- GitHub Check: Dashboard Test
- GitHub Check: Test Integration
- GitHub Check: Test Unit
- GitHub Check: Test E2E
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{py,pyi}
📄 CodeRabbit inference engine (CLAUDE.md)
Read
docs/design/page before implementing; deviations need approval per DESIGN_SPEC.mdOnly
src/synthorg/persistence/may import sqlite/psycopg or emit raw SQL; new repository protocols must inherit from generic categories inpersistence/_generics.py(SingletonRepository,IdKeyedRepository,FilteredQueryRepository,AppendOnlyRepository,StatefulRepository,MVCCRepository); protocols stay@runtime_checkableUse
SettingsService/ConfigResolver(Cat-1) or env > code default (Cat-2,read_only_post_init) for configuration precedence; Cat-3 bootstrap secrets are pure env at boot site; noos.environ.getoutside startup; usesettings.bootstrap_resolver.resolve_init_valuefor pre-init Cat-2 readsDo not hardcode numerics; numerics live in
settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2, and module-level annotated named constants of formNAME: int|float|Final|Final[int]|Final[float] = literal; enforced byscripts/check_no_magic_numbers.pyNever import
loggingor useprint()in app code; usefrom synthorg.observability import get_loggerwith variable namelogger; emit event names fromobservability.events.<domain>constants via structured kwargsOn error paths, log WARNING/ERROR with context before raising; on state transitions, log INFO via
*_STATUS_TRANSITIONEDAFTER persistence write; never interpolate exceptions (error=str(exc)) or useexc_info=True; useerror_type=type(exc).__name__+error=safe_error_description(exc)(SEC-1)Errors must be
<Domain><Condition>Errorinheriting fromDomainError, never directly fromException/RuntimeError/etc; enforced bycheck_domain_error_hierarchy.pyUse Pydantic v2 frozen models with
extra="forbid"on every frozen model project-wide (gatecheck_frozen_model_extra_forbid.py);@computed_fieldauto-exempt; use per-line# lint-allow: frozen-extra-forbid -- <reason>forextra="allow"/"ignore"boundariesUse
@computed_field...
Files:
tests/unit/meta/chief_of_staff/test_propose.pysrc/synthorg/meta/chief_of_staff/propose.py
tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use test markers:
@pytest.mark.{unit,integration,e2e,slow}; async tests useautomode; 30s global timeout; coverage 80% minxdist
-n 8 --dist=loadfileauto-applied via pyprojectaddopts(loadfile prevents Python 3.14+ Windows ProactorEventLoop leak)Windows unit tests use
WindowsSelectorEventLoopPolicy(Python 3.14 IOCP teardown race); subprocess tests override backUse
FakeClockfromtests._sharedfor Clock seam injection; usemock_of[T](**overrides)fromtests._sharedfor typed-boundary substitutions; useSimpleNamespacefor attribute-bags; bareMagicMockat typed boundaries forbidden byscripts/check_mock_spec.pyHypothesis: 10 deterministic CI examples; failures are real bugs (fix + add
@example(...)); never skip/xfail flaky tests—fix fundamentally; useasyncio.Event().wait()notsleep(large)
Files:
tests/unit/meta/chief_of_staff/test_propose.py
⚙️ CodeRabbit configuration file
Test files do not require Google-style docstrings on classes or functions -- ruff D rules are only enforced on src/. A bare
@settings() decorator with no arguments on Hypothesis property tests is a no-op and should not be suggested -- the HYPOTHESIS_PROFILE env var controls example counts via registered profiles, which@given() honors automatically.
Files:
tests/unit/meta/chief_of_staff/test_propose.py
src/synthorg/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Wrap untrusted content via
wrap_untrusted()fromengine.prompt_safety; useHTMLParseGuardfor HTML content (SEC-1)Never use real vendor names in project code/tests; use
example-provider,test-provider,example-{large,medium,small}-001; real names allowed only in.claude/, third-party imports,providers/presets.py,web/public/provider-logos/Telemetry is opt-in and off by default; every event property must be in
_ALLOWED_PROPERTIES; seetelemetry.md
Files:
src/synthorg/meta/chief_of_staff/propose.py
src/**/*.py
⚙️ CodeRabbit configuration file
This project uses Python 3.14+ with PEP 758 except syntax: "except A, B:" (comma-separated, no parentheses) is correct and mandatory -- do NOT flag it as a typo or suggest parenthesized form. The "except builtins.MemoryError, RecursionError: raise" pattern is intentional project convention for system-error propagation. When evaluating the 50-line function limit, count only the function body excluding the signature lines, decorators, and docstring. Functions 1-5 lines over due to docstrings or multi-line signatures should not be flagged. Do not suggest extracting single-use helper functions called exactly once -- this reduces readability without improving maintainability.
Files:
src/synthorg/meta/chief_of_staff/propose.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:39:17.304Z
Learning: Every convention PR ships its enforcement gate; see `docs/reference/convention-gates.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:39:17.304Z
Learning: Never edit `tests/baselines/unit_timing.json` or any `scripts/*_baseline.{txt,json}` / `scripts/_*_baseline.py`; both families are PreToolUse-blocked; per-invocation bypass: `ALLOW_BASELINE_GROWTH=1 git commit ...` (requires explicit user approval)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:39:17.304Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review` (`gh pr create` is blocked by `scripts/check_no_pr_create.sh`); after PR: `/aurelio-review-pr` for external feedback; fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:39:17.304Z
Learning: Present every plan for accept/deny before coding (MANDATORY planning guideline)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:39:17.304Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-20T02:39:17.304Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
tests/unit/meta/chief_of_staff/test_propose.pysrc/synthorg/meta/chief_of_staff/propose.py
Logging on inside-lock terminal abort (CR-F1, Major Quick win): - propose.py _run_turn: emit a structured warning before raising ConversationClosedError on the inside-lock conversation-status recheck. Captures the conversation id and the observed (post-flip) status so operators can correlate the race in logs. Multi-proposal parking atomicity (CR-F2, Major Heavy lift): - _park_proposal is now self-atomic: if the approval-store add fails after the proposal row was committed, the proposal row is removed before re-raising. The cleanup is best-effort -- the original exception is preserved even if the proposal delete itself fails (logged under park_proposal_cleanup_failed). - _record_proposals tracks every successful park in the batch and unwinds them on any later failure via the new _unwind_parked_proposal helper. Unwind order is reverse of _park_proposal (approval first, proposal second) so no caller ever sees a dangling approval pointing at a deleted proposal. Unwind failures are logged but never mask the original exception. - New ApprovalStoreProtocol.delete + ApprovalStore.delete: returns bool, removes from in-memory cache + persistent repo (if wired). Repo failures restore the cache entry and re-raise; pure-cache removals succeed without touching a non-existent repo. Without compensation, a partial commit leaves earlier approvals visible in the queue but the conversation stays ACTIVE with no proposal_id for the client retry to dedupe against; the retry would then double-park those items. Tests: - test_propose.py::test_record_proposals_unwinds_on_partial_park_failure: scripted two-proposal response + proposal_repo.save patched to raise on the second call; asserts converse raises, proposal repo ends empty, approval store ends empty, conversation status stays ACTIVE. - test_protocol.py: ApprovalStoreProtocol surface now includes 'delete'; the surface-stability assertion is updated.
Pre-push mypy caught a missing protocol member on the ``_RecordingApprovalStore`` test stub in ``tests/unit/hr/test_registry_autonomy.py``. Round 7 added ``delete`` to ``ApprovalStoreProtocol`` to enable the multi-proposal parking compensation; the test stub was hand-rolled (no ``AsyncMock(spec=...)``) so it needs the new method explicitly. Implementation mirrors the in-memory shape of the other stub methods: filter the recorded list by id, return whether anything was removed. Other test stubs that depend on the protocol use ``AsyncMock(spec=ApprovalStoreProtocol)`` or ``mock_of[ApprovalStoreProtocol](...)`` which pick the new ``delete`` up automatically; no other test files needed changes.
Summary
EPIC #1967 keystone: a human talks to ONE fixed top agent (Chief of Staff) in natural language. The agent asks clarifying questions when the request is underspecified, then parks one or more concrete
WorkItems in the human approval queue. On approval each item runs through the work pipeline; no autonomous acting (direct MCP acting is the separate child #1972).The pipeline spine (
build_work_pipeline, #1960) and approval producer (#1956) already landed and are ENFORCED in the ghost-wiring manifest. This PR drives them from a conversation.What changed
Service + API
src/synthorg/meta/chief_of_staff/propose.py::ChiefOfStaffProposer(single-responsibility; keepschat.pyexplain-only). Structured clarify-or-propose LLM turn viacore/json_parsing.py. SEC-1: human content wrapped viawrap_untrusted(TAG_TASK_DATA, ...).POST /meta/chat/proposeonMetaControllerwithConversationalProposeRequestargs model,meta.chat.propose(5/60s) rate-limit, 503 dependency-matrix (proposer + work pipeline must be wired).src/synthorg/api/controllers/_approval_review_gate.py::try_conversational_intake_resumeFlow 0 before parked-context / review-gate. On approve: reconstructsWorkItemfrom the proposal and drivesapp_state.work_pipeline.run. On reject: marks proposal REJECTED, pipeline untouched.build_chief_of_staff_proposerfactory +_wire_chief_of_staff_proposeron_startup hook + AppState set-once accessors (chief_of_staff_proposer,conversational_proposal_repo).ChiefOfStaffConfigpropose_* fields (opt-in, default off).Data model
ApprovalSource.CONVERSATIONAL_INTAKEdiscriminator + newConversationRole/ConversationStatus/ConversationalProposalStatusenums.Conversation,ConversationTurn,ProposedWork,ProposeDecision(XOR validator),ConversationalProposal,ProposeArgs,ProposedApprovalSummary,ProposeResult.Persistence (dual-backend)
ConversationRepository(StatefulRepository),ConversationTurnRepository(AppendOnlyRepository),ConversationalProposalRepository(Stateful + FilteredQuery). Generic categories per ADR-0001; no bespoke methods.conversational_factory.build_conversational_repositories(backend-aware; degrades gracefully to None for absent / unknown backends so app boot stays clean).20260519000001_conversational_intake.sql). Postgres widensapprovals.sourceCHECK viaDROP/ADD CONSTRAINT. SQLite intentionally keeps the narrow CHECK (rebuild via writable_schema was perf-too-costly under the unit suite's--count=2isolation gate; the v1 ApprovalStore is in-memory by default soCONVERSATIONAL_INTAKErows never reach SQLite. Documented inline.).schema_drift_baseline.txtgrown by 4 canonical TEXT/TIMESTAMPTZ entries for the new timestamp columns (user-approved, identical to every other dual-backend table). Committed withALLOW_BASELINE_GROWTH=1.Errors / events
<Domain><Condition>Errorfamily inmeta/errors.py:ConversationNotFoundError(404),ConversationClosedError(409),ConversationalProposeUnavailableError(503),ConversationalProposeResponseInvalidError(502). NewErrorCodeentries 3017 / 4015 / 7010.observability/events/{chief_of_staff,approval_gate,persistence}.py(clarify/propose/cap/transition + dispatcher branch + persistence repo error/read events).Manifest discipline (hard acceptance criterion)
scripts/_ghost_wiring_manifest.txt: newENFORCED build_chief_of_staff_proposer #1968line. Verified byscripts/check_no_ghost_wiring.py.Web / generated types
web/src/api/types/{openapi,dtos,enum-values,error-codes}.gen.tsfor the new endpoint and enums.postChatPropose()client +ConversationalProposeResponseinterface inweb/src/api/endpoints/meta.ts; MSW handler inweb/src/mocks/handlers/meta.ts(1:1 parity rule).Docs
docs/design/self-improvement.md:/meta/chat/proposeendpoint,propose.py+ model trio in package inventory, three-flow approval-decision routing (Flow 0 conversational-intake), YAML defaults forpropose_*.README.md: v1 scope clarified (clarify + propose; no autonomous acting).CLAUDE.md: one-liner on proposer wiring + 503 conditions.Test plan
-m unit): enums, models, errors, settings, proposer (clarify / propose / untrusted-wrap / parse-failure / clarification cap / project-resolution), wiring (build_chief_of_staff_proposer503 matrix; backend-aware factory degradation), dispatch branch (approve-runs-pipeline, reject-skips-pipeline, missing-proposal-owned-but-noop, pipeline-unavailable-503, pipeline-failure-leaves-PENDING), API model validation + route + rate-limit-policy registration.tests/conformance/persistence/,pytest.mark.integration):Conversation,ConversationTurn,ConversationalProposalagainst sqlite + postgres. Schema-drift gates green on both backends.tests/e2e/test_conversational_propose_e2e.py,pytest.mark.e2e): builds the REAL pipeline viabuild_runtime_services, REALChiefOfStaffProposerwith a scripted clarify-then-propose provider, REALApprovalStore, REAL approval-decision seam (signal_resume_intent). Asserts: vague request → clarify → answer → PENDING approval in store → approve →Taskcreated and advanced past CREATED by a real agent. Reject path: pipeline never runs.Review coverage
11 specialist review agents ran pre-push (code, python, security, persistence, conventions, logging, async-concurrency, api-contract-drift, issue-resolution-verifier, docs-consistency, comment-quality-rot). 18 valid findings (7 High, 10 Medium, 1 doc polish) addressed in commit 80b34f9. 10 false positives dismissed with reasons in
_audit/pre-pr-review/triage.md.issue-resolution-verifierreturned PASS on every acceptance criterion and the parent epic's manifest-discipline rule.Closes #1968