From 27a47929fae1f960c647d79f1bc14f0c0174ccd6 Mon Sep 17 00:00:00 2001
From: Sami Rusani
Date: Mon, 16 Mar 2026 22:55:30 +0100
Subject: [PATCH] Sprint 5S: project truth synchronization after Gmail auth hardening

---
 .ai/handoff/CURRENT_STATE.md |  29 ++++---
 ARCHITECTURE.md              |  33 ++++---
 BUILD_REPORT.md              | 163 +++++++++++------------------------
 REVIEW_REPORT.md             |  37 ++++----
 ROADMAP.md                   |  26 +++---
 5 files changed, 116 insertions(+), 172 deletions(-)

diff --git a/.ai/handoff/CURRENT_STATE.md b/.ai/handoff/CURRENT_STATE.md
index 319fc38..aa6ef14 100644
--- a/.ai/handoff/CURRENT_STATE.md
+++ b/.ai/handoff/CURRENT_STATE.md
@@ -2,29 +2,32 @@

 ## Canonical Truth

-- The working repo state is current through Sprint 5J, including compile-path semantic artifact retrieval and deterministic hybrid lexical-plus-semantic artifact merge in compile.
+- The working repo state is current through Sprint 5R, including narrow PDF/DOCX/RFC822 artifact ingestion, read-only Gmail account persistence and single-message ingestion, protected Gmail credential storage, refresh-token renewal, and rotated refresh-token persistence.
 - Use [PRODUCT_BRIEF.md](/Users/samirusani/Desktop/Codex/AliceBot/PRODUCT_BRIEF.md) for product scope, [ARCHITECTURE.md](/Users/samirusani/Desktop/Codex/AliceBot/ARCHITECTURE.md) for implemented technical boundaries, [ROADMAP.md](/Users/samirusani/Desktop/Codex/AliceBot/ROADMAP.md) for forward planning, and [RULES.md](/Users/samirusani/Desktop/Codex/AliceBot/RULES.md) for durable operating rules.
 - Historical build and review reports remain the source of sprint-by-sprint detail; the active handoff should stay compact and current.

 ## Implemented Repo Slice

-- `apps/api` is the only shipped product surface. It implements continuity, tracing, deterministic context compilation, governed memory admission and review, embeddings, semantic retrieval, entities, policy and tool governance, approval persistence and resolution, approved-only `proxy.echo` execution, execution budgets, task/task-step lifecycle reads and mutations, explicit manual continuation lineage, explicit task-step linkage for approval and execution synchronization, deterministic rooted local task-workspace provisioning, explicit task-artifact registration, narrow local text-artifact ingestion into durable chunk rows, artifact-chunk embeddings, direct lexical and semantic artifact retrieval, compile-path semantic artifact retrieval, and deterministic hybrid lexical-plus-semantic artifact merge inside the compile response.
+- `apps/api` is the only shipped product surface. It implements continuity, tracing, deterministic context compilation, governed memory admission and review, embeddings, semantic retrieval, entities, policy and tool governance, approval persistence and resolution, approved-only `proxy.echo` execution, execution budgets, task/task-step lifecycle reads and mutations, explicit manual continuation lineage, explicit task-step linkage for approval and execution synchronization, deterministic rooted local task-workspace provisioning, explicit task-artifact registration, narrow local text/PDF/DOCX/RFC822 artifact ingestion into durable chunk rows, artifact-chunk embeddings, direct lexical and semantic artifact retrieval, compile-path semantic artifact retrieval, deterministic hybrid lexical-plus-semantic artifact merge inside the compile response, and a narrow read-only Gmail seam with protected credential storage plus single-message ingestion into the existing RFC822 artifact path.
 - The live schema includes continuity, trace, memory, embedding, entity, governance, `tasks`, `task_steps`, `task_workspaces`, `task_artifacts`, `task_artifact_chunks`, and `task_artifact_chunk_embeddings` tables with row-level security on user-owned data.
+- The live schema also includes `gmail_accounts` and `gmail_account_credentials` for the shipped Gmail seam; account metadata reads stay secret-free while protected credentials remain isolated to the credential table.
 - `apps/web` and `workers` remain starter scaffolds only.

 ## Current Boundaries

 - Task workspaces are implemented only as deterministic rooted local directories plus durable `task_workspaces` records.
-- Task artifacts are implemented only as explicit rooted local-file registrations under those workspaces plus narrow deterministic ingestion for `text/plain` and `text/markdown`.
+- Task artifacts are implemented only as explicit rooted local-file registrations under those workspaces plus narrow deterministic ingestion for `text/plain`, `text/markdown`, `application/pdf`, DOCX text from `word/document.xml`, and `message/rfc822`.
 - Artifact retrieval operates only over persisted chunk rows and persisted chunk embeddings for one visible task or one visible artifact at a time; compile does not read raw files directly.
 - Compile can now include artifact chunks from lexical retrieval, semantic retrieval, or a deterministic hybrid merge of both into one artifact section with explicit per-chunk source provenance.
 - The shipped multi-step task path is still explicit and narrow: later steps are appended manually with lineage, while approval and execution synchronization use explicit linked `task_step_id` references.
+- The shipped Gmail path is still explicit and narrow: one read-only account seam, one selected-message ingestion path, secret-free account reads, protected credential renewal for expired refresh-capable credentials, and rotated refresh-token persistence only inside `gmail_account_credentials`.
 - The only execution handler in the repo is the in-process no-external-I/O `proxy.echo` path.

 ## Not Implemented

-- Rich document parsing beyond the narrow local text ingestion seam.
-- Read-only Gmail or Calendar connectors.
+- Rich document parsing beyond the shipped narrow local text/PDF/DOCX/RFC822 ingestion seam.
+- Gmail search, mailbox sync, attachment ingestion, write-capable Gmail actions, and Calendar connectors.
+- External secret-manager integration for Gmail protected credentials.
 - Runner-style orchestration or automatic multi-step progression.
 - Artifact reranking or weighted fusion beyond the current lexical-first hybrid compile merge.
 - Auth beyond the current database user-context model.
@@ -33,17 +36,17 @@

 - Memory extraction and retrieval quality remain the main product risk.
 - Auth is still incomplete beyond database user context.
-- Artifact ingestion and retrieval are intentionally narrow and local; richer document parsing, connectors, and any retrieval changes beyond the shipped hybrid compile contract still need their own accepted seams.
+- Artifact ingestion and retrieval are intentionally narrow and local; broader document parsing, broader Gmail scope, external secret storage, and any retrieval changes beyond the shipped hybrid compile contract still need their own accepted seams.

 ## Latest Accepted Verification

-- Latest accepted runtime verification totals for the shipped Sprint 5J seams were:
-  - `./.venv/bin/python -m pytest tests/unit` -> `380 passed`
-  - `./.venv/bin/python -m pytest tests/integration` -> `118 passed`
-- Sprint 5K is documentation-only truth synchronization; it does not change runtime, schema, or API behavior.
+- Latest accepted runtime verification totals for the shipped Sprint 5R seams were:
+  - `./.venv/bin/python -m pytest tests/unit` -> `437 passed`
+  - `./.venv/bin/python -m pytest tests/integration` -> `139 passed`
+- Sprint 5S is documentation-only truth synchronization; it does not change runtime, schema, or API behavior.

 ## Planning Guardrails

-- Plan from the implemented Sprint 5J repo state, not from older milestone narratives.
-- Do not describe richer document parsing, connectors, runner work, UI work, or artifact reranking beyond the current lexical-first hybrid compile merge as shipped.
-- The immediate next move after this truth-sync sprint is a narrow richer-document-parsing sprint that builds on the existing rooted workspace, durable chunk, and shipped hybrid artifact compile baseline.
+- Plan from the implemented Sprint 5R repo state, not from older milestone narratives.
+- Do not describe broader Gmail scope, Calendar work, runner work, UI work, external secret-manager integration, or artifact reranking beyond the current lexical-first hybrid compile merge as shipped.
+- The immediate next move after this truth-sync sprint is one more narrow Gmail auth-adjacent sprint, most likely external secret-manager integration for the existing protected credential seam, without combining it with search, sync, Calendar, or UI expansion.
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index 3e7ad09..c9b66d1 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -2,17 +2,17 @@

 ## Current Implemented Slice

-AliceBot now implements the accepted repo slice through Sprint 5Q. The shipped backend includes:
+AliceBot now implements the accepted repo slice through Sprint 5R. The shipped backend includes:

 - foundation continuity storage over `users`, `threads`, `sessions`, and append-only `events`
 - deterministic tracing and context compilation over durable continuity, memory, entity, and entity-edge records
 - governed memory admission, explicit-preference extraction, memory review labels, review queue reads, evaluation summary reads, explicit embedding config and memory-embedding storage, direct semantic retrieval, and deterministic hybrid compile-path memory merge
 - deterministic prompt assembly and one no-tools response path that persists assistant replies as immutable continuity events
 - user-scoped consents, policies, policy evaluation, tool registry, allowlist evaluation, tool routing, approval request persistence, approval resolution, approved-only proxy execution through the in-process `proxy.echo` handler, durable execution review, and execution-budget lifecycle plus enforcement
-- a narrow read-only Gmail connector seam with user-scoped `gmail_accounts` metadata persistence, separate user-scoped `gmail_account_credentials` protected credential storage, deterministic account reads without secret exposure, refresh-token-capable protected credential renewal for expired access tokens, and one explicit selected-message ingestion path that materializes one Gmail message as a rooted `.eml` task artifact and then reuses the existing RFC822 artifact ingestion pipeline
+- a narrow read-only Gmail connector seam with user-scoped `gmail_accounts` metadata persistence, separate user-scoped `gmail_account_credentials` protected credential storage, deterministic account reads without secret exposure, refresh-token-capable protected credential renewal for expired access tokens, rotated refresh-token persistence when the provider returns a replacement token, and one explicit selected-message ingestion path that materializes one Gmail message as a rooted `.eml` task artifact and then reuses the existing RFC822 artifact ingestion pipeline
 - durable `tasks`, `task_steps`, `task_workspaces`, `task_artifacts`, `task_artifact_chunks`, and `task_artifact_chunk_embeddings`, deterministic task-step sequencing, explicit task-step transitions, explicit manual continuation with lineage through `parent_step_id`, `source_approval_id`, and `source_execution_id`, explicit `tool_executions.task_step_id` linkage for execution synchronization, deterministic rooted local task-workspace provisioning, explicit rooted local artifact registration, deterministic local plain-text, markdown, narrow PDF text, narrow DOCX text, and narrow RFC822 email text ingestion into durable chunk rows, deterministic lexical artifact-chunk retrieval over durable chunk rows, explicit user-scoped artifact-chunk embedding persistence tied to existing embedding configs, explicit task-scoped or artifact-scoped semantic artifact-chunk retrieval over those durable embeddings, and compile-path artifact retrieval that can include lexical results, semantic results, or one deterministic hybrid lexical-plus-semantic merged artifact section with per-chunk source provenance

-The current multi-step boundary is narrow and explicit. Manual continuation is implemented and review-passed. Approval resolution and proxy execution now both use explicit task-step linkage rather than first-step inference. Task workspaces are now implemented only as deterministic rooted local boundaries, and task artifacts are now implemented only as explicit rooted local-file registrations, narrow deterministic artifact ingestion under those workspaces, lexical retrieval over persisted chunk rows, explicit artifact-chunk embedding storage tied to existing embedding configs, direct semantic retrieval over those durable artifact-chunk embeddings for one visible task or one visible artifact at a time, and compile-path artifact retrieval that deterministically merges lexical and semantic candidates into one artifact section when both are requested for the same scope. The live richer-document boundary is still intentionally narrow: plain text and markdown ingest directly, PDF support is limited to narrow local text extraction, DOCX support is limited to narrow local text extraction from `word/document.xml`, RFC822 email support is limited to top-level selected headers plus extractable plain-text body content while excluding nested `message/rfc822` content, and the live connector boundary is limited to one read-only Gmail account seam plus one explicit selected-message ingestion path into the rooted RFC822 artifact pipeline, with credentials resolved through a dedicated protected credential seam, renewed through one explicit refresh path when an expired refresh-capable credential is present, and never exposed on the normal account metadata table surface. OCR, image extraction, layout reconstruction, Gmail search, mailbox sync, attachments, Calendar connectors, reranking beyond the current lexical-first hybrid merge, and new side-effect surfaces are still planned later and must not be described as live behavior.
+The current multi-step boundary is narrow and explicit. Manual continuation is implemented and review-passed. Approval resolution and proxy execution now both use explicit task-step linkage rather than first-step inference. Task workspaces are now implemented only as deterministic rooted local boundaries, and task artifacts are now implemented only as explicit rooted local-file registrations, narrow deterministic artifact ingestion under those workspaces, lexical retrieval over persisted chunk rows, explicit artifact-chunk embedding storage tied to existing embedding configs, direct semantic retrieval over those durable artifact-chunk embeddings for one visible task or one visible artifact at a time, and compile-path artifact retrieval that deterministically merges lexical and semantic candidates into one artifact section when both are requested for the same scope. The live richer-document boundary is still intentionally narrow: plain text and markdown ingest directly, PDF support is limited to narrow local text extraction, DOCX support is limited to narrow local text extraction from `word/document.xml`, RFC822 email support is limited to top-level selected headers plus extractable plain-text body content while excluding nested `message/rfc822` content, and the live connector boundary is limited to one read-only Gmail account seam plus one explicit selected-message ingestion path into the rooted RFC822 artifact pipeline, with credentials resolved through a dedicated protected credential seam, renewed through one explicit refresh path when an expired refresh-capable credential is present, any provider-returned rotated refresh token persisted back through that same protected credential seam before Gmail fetches continue, and never exposed on the normal account metadata table surface. OCR, image extraction, layout reconstruction, Gmail search, mailbox sync, attachments, Calendar connectors, reranking beyond the current lexical-first hybrid merge, and new side-effect surfaces are still planned later and must not be described as live behavior.

 ## Implemented Now

@@ -60,11 +60,11 @@ The current multi-step boundary is narrow and explicit. Manual continuation is i

 ### Repo Boundaries In This Slice

-- `apps/api`: implemented API, store, contracts, service logic, and migrations for continuity, tracing, memory, embeddings, entities, policies, tools, approvals, proxy execution, execution budgets, a narrow read-only Gmail connector seam with protected refresh-token lifecycle support, tasks, task steps, task workspaces, task artifacts, artifact-chunk embeddings, deterministic lexical artifact chunk retrieval, deterministic semantic artifact chunk retrieval over durable embeddings, compile-path semantic artifact retrieval, and deterministic hybrid lexical-plus-semantic artifact merge in compile.
+- `apps/api`: implemented API, store, contracts, service logic, and migrations for continuity, tracing, memory, embeddings, entities, policies, tools, approvals, proxy execution, execution budgets, a narrow read-only Gmail connector seam with protected refresh-token lifecycle and refresh-token rotation handling support, tasks, task steps, task workspaces, task artifacts, artifact-chunk embeddings, deterministic lexical artifact chunk retrieval, deterministic semantic artifact chunk retrieval over durable embeddings, compile-path semantic artifact retrieval, and deterministic hybrid lexical-plus-semantic artifact merge in compile.
 - `apps/web`: minimal shell only; no shipped workflow UI.
 - `workers`: scaffold only; no background jobs or runner logic are implemented.
 - `infra`: local development bootstrap assets only.
-- `tests`: unit and Postgres-backed integration coverage for the shipped seams above, including Sprint 4O task-step lineage/manual continuation, Sprint 4S step-linked execution synchronization, Sprint 5A task-workspace provisioning, Sprint 5C task-artifact registration, Sprint 5D local artifact ingestion plus chunk reads, Sprint 5E lexical artifact-chunk retrieval, Sprint 5F compile-path artifact chunk integration, Sprint 5G artifact-chunk embedding persistence and reads, Sprint 5H direct semantic artifact-chunk retrieval, Sprint 5I compile-path semantic artifact retrieval, Sprint 5J deterministic hybrid lexical-plus-semantic artifact merge in compile, Sprint 5L narrow PDF artifact ingestion, Sprint 5M narrow DOCX artifact ingestion, Sprint 5N narrow RFC822 email artifact ingestion, Sprint 5O read-only Gmail account plus single-message ingestion coverage, Sprint 5P Gmail credential hardening coverage, and Sprint 5Q Gmail refresh-token lifecycle coverage.
+- `tests`: unit and Postgres-backed integration coverage for the shipped seams above, including Sprint 4O task-step lineage/manual continuation, Sprint 4S step-linked execution synchronization, Sprint 5A task-workspace provisioning, Sprint 5C task-artifact registration, Sprint 5D local artifact ingestion plus chunk reads, Sprint 5E lexical artifact-chunk retrieval, Sprint 5F compile-path artifact chunk integration, Sprint 5G artifact-chunk embedding persistence and reads, Sprint 5H direct semantic artifact-chunk retrieval, Sprint 5I compile-path semantic artifact retrieval, Sprint 5J deterministic hybrid lexical-plus-semantic artifact merge in compile, Sprint 5L narrow PDF artifact ingestion, Sprint 5M narrow DOCX artifact ingestion, Sprint 5N narrow RFC822 email artifact ingestion, Sprint 5O read-only Gmail account plus single-message ingestion coverage, Sprint 5P Gmail credential hardening coverage, Sprint 5Q Gmail refresh-token lifecycle coverage, and Sprint 5R Gmail refresh-token rotation handling coverage.

 ## Core Flows Implemented Now

@@ -100,12 +100,13 @@ The current multi-step boundary is narrow and explicit. Manual continuation is i
 4. Expose deterministic user-scoped Gmail account list and detail reads without secret material.
 5. Accept a user-scoped `POST /v0/gmail-accounts/{gmail_account_id}/messages/{provider_message_id}/ingest` request for one visible Gmail account and one visible task workspace.
 6. Resolve the Gmail access token through the protected credential seam before any Gmail fetch, file write, or artifact registration, and renew it first through one explicit refresh path when the visible protected credential is refresh-capable and expired.
-7. Derive one deterministic workspace-relative artifact path as `gmail//.eml`.
-8. Reject duplicate `(task_workspace_id, relative_path)` collisions before any Gmail fetch or file write.
-9. Fetch exactly one selected Gmail message through the read-only Gmail API path `users/me/messages/{provider_message_id}?format=raw`.
-10. Require Gmail to return RFC822 `raw` content, validate it against the existing narrow `message/rfc822` extraction rules, and reject unsupported content deterministically.
-11. Materialize the message as one rooted `.eml` file inside the selected task workspace and then reuse the existing task-artifact registration plus artifact-ingestion seam.
-12. Persist only the resulting `task_artifacts` and `task_artifact_chunks` rows; account-wide sync, search, attachments, Calendar, and write-capable actions remain out of scope.
+7. When the refresh provider returns a replacement refresh token, persist that rotated token back through the same protected credential seam before Gmail fetches continue.
+8. Derive one deterministic workspace-relative artifact path as `gmail//.eml`.
+9. Reject duplicate `(task_workspace_id, relative_path)` collisions before any Gmail fetch or file write.
+10. Fetch exactly one selected Gmail message through the read-only Gmail API path `users/me/messages/{provider_message_id}?format=raw`.
+11. Require Gmail to return RFC822 `raw` content, validate it against the existing narrow `message/rfc822` extraction rules, and reject unsupported content deterministically.
+12. Materialize the message as one rooted `.eml` file inside the selected task workspace and then reuse the existing task-artifact registration plus artifact-ingestion seam.
+13. Persist only the resulting `task_artifacts` and `task_artifact_chunks` rows; account-wide sync, search, attachments, Calendar, and write-capable actions remain out of scope.

 ### Governed Memory And Retrieval

@@ -263,7 +264,7 @@ The current multi-step boundary is narrow and explicit. Manual continuation is i
 ## Testing Coverage Implemented Now

 - Unit and integration tests cover continuity, compiler, response generation, memory admission, review labels, review queue, embeddings, semantic retrieval, compile-path hybrid memory retrieval, artifact lexical retrieval, artifact semantic retrieval, compile-path semantic artifact retrieval, hybrid artifact compile merge, entities, policies, tools, approvals, proxy execution, execution budgets, and execution review.
-- Sprints 4O through 5J added explicit task lifecycle and artifact retrieval coverage:
+- Sprints 4O through 5R added explicit task lifecycle, artifact retrieval, richer-document ingestion, and narrow Gmail coverage:
   - migrations for `tasks`, `task_steps`, and task-step lineage
   - staged/backfilled migration coverage for `tool_executions.task_step_id`
   - task and task-step store contracts
@@ -289,6 +290,14 @@ The current multi-step boundary is narrow and explicit. Manual continuation is i
   - direct semantic artifact-chunk retrieval by task and by artifact
   - compile-path semantic artifact retrieval including trace visibility, exclusion rules, and scope isolation
   - deterministic hybrid artifact compile merge with dual-source provenance, deduplication, lexical-first precedence, and shared limit enforcement
+  - narrow PDF ingestion success and failure paths
+  - narrow DOCX ingestion success and failure paths
+  - narrow RFC822 ingestion success and failure paths
+  - read-only Gmail account connect/list/detail coverage with secret-free reads
+  - selected Gmail message ingestion through the rooted RFC822 artifact path
+  - protected Gmail credential storage isolation in `gmail_account_credentials`
+  - refresh-token renewal for expired refresh-capable Gmail credentials
+  - rotated refresh-token persistence when the provider returns a replacement token
   - task-artifact and task-artifact-chunk per-user isolation
   - trace visibility for continuation and transition events
   - user isolation for task and task-step reads and mutations
diff --git a/BUILD_REPORT.md b/BUILD_REPORT.md
index 58ae38d..fc1a684 100644
--- a/BUILD_REPORT.md
+++ b/BUILD_REPORT.md
@@ -2,142 +2,75 @@

 ## sprint objective

-Implement Sprint 5R: extend the existing Gmail refresh path so single-message ingestion can persist and use a provider-rotated refresh token without exposing secrets in Gmail account reads or expanding into broader Gmail, Calendar, secret-manager, runner, compile-contract, or UI scope.
+Implement Sprint 5S: synchronize `ARCHITECTURE.md`, `ROADMAP.md`, and `.ai/handoff/CURRENT_STATE.md` with the accepted repo state through Sprint 5R so planning and handoff start from the shipped Gmail/document baseline instead of stale Sprint 5J-era truth.

 ## completed work

-- Added rotation-aware Gmail refresh handling in `apps/api/src/alicebot_api/gmail.py`.
-- Changed the refresh helper to capture an optional provider-returned replacement `refresh_token` alongside the renewed access token and expiry.
-- Applied the protected-credential replacement rule:
-  - if Google returns a non-empty `refresh_token`, persist that replacement token
-  - otherwise keep the existing stored `refresh_token`
-- Kept renewal writes inside the existing `gmail_account_credentials` seam and continued using the existing single-message ingestion path after a successful protected-credential update.
-- Added a deterministic Gmail-specific persistence failure when renewed protected credentials cannot be written back, and mapped that failure to the existing `409` error envelope in `apps/api/src/alicebot_api/main.py`.
-- Preserved secret-free Gmail account reads; no secret fields were added to list/detail/connect/ingest responses.
-- Added unit coverage for:
-  - renewal without refresh-token rotation
-  - renewal with refresh-token rotation
-  - deterministic failure when protected-credential persistence fails
-  - raw refresh helper parsing of provider-returned rotated refresh tokens
-- Added integration coverage for:
-  - renewal success without refresh-token rotation
-  - renewal success with refresh-token rotation
-  - deterministic failure when rotated credentials cannot be persisted
-  - unchanged secret-free response shape during the rotation-capable path
+- Updated `ARCHITECTURE.md` to advance the documented implemented slice from Sprint 5Q to Sprint 5R.
+- Corrected the Gmail architecture description to state the shipped seam is read-only, single-message-only, protected-credential-backed, refresh-token-capable, and refresh-token-rotation-capable.
+- Corrected `ROADMAP.md` so Milestone 5 now reflects shipped narrow PDF, DOCX, and RFC822 ingestion plus the shipped read-only Gmail seam and auth hardening through Sprint 5R.
+- Reframed `ROADMAP.md` next-focus language away from stale richer-document-parsing-first planning and toward the next narrow Gmail auth-adjacent seam on top of the shipped baseline.
+- Updated `.ai/handoff/CURRENT_STATE.md` so canonical truth, implemented slice, current boundaries, not-implemented scope, verification totals, and planning guardrails all match the repo through Sprint 5R.
+- Replaced the stale Sprint 5R implementation report in `BUILD_REPORT.md` with this Sprint 5S truth-synchronization report.

 ## incomplete work

-- None inside Sprint 5R scope.
+- None inside Sprint 5S scope.

 ## files changed

-- `apps/api/src/alicebot_api/gmail.py`
-- `apps/api/src/alicebot_api/main.py`
-- `tests/unit/test_gmail.py`
-- `tests/unit/test_gmail_main.py`
-- `tests/unit/test_gmail_refresh.py`
-- `tests/integration/test_gmail_accounts_api.py`
+- `ARCHITECTURE.md`
+- `ROADMAP.md`
+- `.ai/handoff/CURRENT_STATE.md`
 - `BUILD_REPORT.md`

 ## tests run

-- `./.venv/bin/python -m pytest tests/unit`
-  - Result: `437 passed in 0.64s`
-- `./.venv/bin/python -m pytest tests/integration`
-  - Result: `139 passed in 45.52s`
-
-## exact commands run
-
-- `./.venv/bin/python -m pytest tests/unit/test_gmail.py tests/unit/test_gmail_refresh.py tests/unit/test_gmail_main.py`
-- `./.venv/bin/python -m pytest tests/unit/test_gmail.py tests/unit/test_gmail_refresh.py tests/unit/test_gmail_main.py tests/integration/test_gmail_accounts_api.py -k 'gmail'`
-- `./.venv/bin/python -m pytest tests/unit`
-- `./.venv/bin/python -m pytest tests/integration`
-
-## unit and integration test results
-
-- Gmail-focused unit tests passed after updating the remaining stale test double to the new refresh return shape.
-- One intermediate sandboxed combined Gmail run could not reach the local Postgres fixture on `localhost:5432`; this was an environment restriction, not an application test failure.
-- Required final verification passed:
-  - `tests/unit`: `437 passed in 0.64s`
-  - `tests/integration`: `139 passed in 45.52s`
-
-## one example Gmail account response proving secret-free reads remain intact
-
-```json
-{
-  "account": {
-    "id": "",
-    "provider": "gmail",
-    "auth_kind": "oauth_access_token",
-    "provider_account_id": "acct-owner-rotated-001",
-    "email_address": "owner@gmail.example",
-    "display_name": "Owner",
-    "scope": "https://www.googleapis.com/auth/gmail.readonly",
-    "created_at": "",
-    "updated_at": ""
-  }
-}
-```
-
-## one example Gmail ingestion response through the rotation-capable path
-
-```json
-{
-  "account": {
-    "id": "",
-    "provider": "gmail",
-    "auth_kind": "oauth_access_token",
-    "provider_account_id": "acct-owner-rotated-001",
-    "email_address": "owner@gmail.example",
-    "display_name": "Owner",
-    "scope": "https://www.googleapis.com/auth/gmail.readonly",
-    "created_at": "",
-    "updated_at": ""
-  },
-  "message": {
-    "provider_message_id": "msg-001",
-    "artifact_relative_path": "gmail/acct-owner-rotated-001/msg-001.eml",
-    "media_type": "message/rfc822"
-  },
-  "artifact": {
-    "id": "",
-    "task_id": "",
-    "task_workspace_id": "",
-    "status": "registered",
-    "ingestion_status": "ingested",
-    "relative_path": "gmail/acct-owner-rotated-001/msg-001.eml",
-    "media_type_hint": "message/rfc822",
-    "created_at": "",
-    "updated_at": ""
-  },
-  "summary": {
-    "total_count": "",
-    "total_characters": "",
-    "media_type": "message/rfc822",
-    "chunking_rule": "normalized_utf8_text_fixed_window_1000_chars_v1",
-    "order": ["sequence_no_asc", "id_asc"]
-  }
-}
-```
+- `git diff --name-only -- ARCHITECTURE.md ROADMAP.md .ai/handoff/CURRENT_STATE.md BUILD_REPORT.md`
+  - Result: only the named truth artifacts are changed in this sprint diff.
+- `git diff --check -- ARCHITECTURE.md ROADMAP.md .ai/handoff/CURRENT_STATE.md BUILD_REPORT.md`
+  - Result: no diff formatting errors.
+
+## evidence used
+
+- Repo implementation anchors:
+  - `apps/api/src/alicebot_api/artifacts.py`
+  - `apps/api/src/alicebot_api/gmail.py`
+  - `apps/api/alembic/versions/20260316_0026_gmail_accounts.py`
+  - `apps/api/alembic/versions/20260316_0027_gmail_account_credentials.py`
+  - `apps/api/alembic/versions/20260316_0028_gmail_refresh_token_lifecycle.py`
+- Accepted verification and sprint truth anchors:
+  - `BUILD_REPORT.md` from Sprint 5R before this update
+  - `REVIEW_REPORT.md` showing Sprint 5R `PASS`
+  - `tests/integration/test_task_artifacts_api.py`
+  - `tests/integration/test_gmail_accounts_api.py`
+  - `tests/unit/test_gmail.py`
+  - `tests/unit/test_gmail_refresh.py`
+
+## specific stale statements corrected
+
+- `ROADMAP.md` previously said the accepted repo state was current only through Sprint 5J.
+- `ROADMAP.md` previously described richer document parsing as the next pending step even though narrow PDF, DOCX, and RFC822 ingestion are already shipped.
+- `.ai/handoff/CURRENT_STATE.md` previously said canonical truth was current only through Sprint 5J.
+- `.ai/handoff/CURRENT_STATE.md` previously listed read-only Gmail as not implemented even though the repo ships Gmail account persistence and selected-message ingestion.
+- `.ai/handoff/CURRENT_STATE.md` previously limited artifact ingestion to `text/plain` and `text/markdown` even though the repo also ships narrow PDF, DOCX, and RFC822 ingestion.
+- `ARCHITECTURE.md` previously stopped its top-level version marker at Sprint 5Q and its testing summary at Sprint 5Q even though Sprint 5R rotation handling is implemented and accepted.

 ## blockers/issues

-- No remaining implementation blockers.
-- Integration verification required elevated access because the default sandbox blocked connections to the local Postgres test fixture on `localhost:5432`.
+- No implementation blockers.
+- No runtime or schema changes were made; this sprint stayed documentation-only by design.
-## what remains intentionally deferred to later milestones +## what remains intentionally deferred after truth synchronization -- Gmail search -- mailbox sync or backfill jobs -- attachment ingestion -- write-capable Gmail actions +- Gmail search, mailbox sync, attachment ingestion, and write-capable Gmail actions - Calendar connector scope -- OAuth UI or callback handling -- external secret-manager integration -- compile-contract changes +- external secret-manager integration for Gmail protected credentials +- richer document parsing beyond the shipped narrow PDF/DOCX/RFC822 seams - runner-style orchestration - UI work +- auth beyond the current database user-context model ## recommended next step -Keep the next Gmail sprint narrow around one adjacent auth seam only, such as external secret-manager integration for the existing protected credential store, without combining it with search, sync, Calendar, or UI expansion. +Open one more narrow Gmail auth-adjacent sprint, most likely external secret-manager integration for the existing `gmail_account_credentials` seam, without combining it with broader connector breadth, search, sync, Calendar, runner, or UI scope. diff --git a/REVIEW_REPORT.md b/REVIEW_REPORT.md index 0cc5b9c..85d7a32 100644 --- a/REVIEW_REPORT.md +++ b/REVIEW_REPORT.md @@ -6,20 +6,15 @@ PASS ## criteria met -- Sprint stayed narrow. The code changes are limited to the Gmail renewal seam, the ingest endpoint error mapping, Gmail-focused tests, and `BUILD_REPORT.md`. -- Rotated refresh tokens are now handled in the protected credential seam. `apps/api/src/alicebot_api/gmail.py` captures an optional provider-returned `refresh_token` during renewal and persists it back through `gmail_account_credentials`. -- The replacement rule matches the packet: - - if the provider returns a non-empty replacement `refresh_token`, persist it - - otherwise keep the existing stored `refresh_token` -- Gmail account reads remain secret-free. 
No secret fields were added to list/detail/connect/ingest responses, and existing account list/detail isolation tests still pass. -- Single-message Gmail ingestion still works for both cases: - - stable refresh-token renewal - - rotated refresh-token renewal -- Rotated-credential persistence failures are deterministic and happen before Gmail fetch/artifact writes. The new error is mapped to the existing `409` envelope, and tests verify no artifact or credential corruption when persistence fails. -- Required verification passed: - - `./.venv/bin/python -m pytest tests/unit` -> `437 passed in 0.63s` - - `./.venv/bin/python -m pytest tests/integration` -> `139 passed in 42.27s` -- No out-of-scope Gmail search, sync, attachments, write actions, Calendar, external secret-manager, compile-contract, runner, or UI work entered the sprint. +- Sprint stayed documentation-only. The review diff contains updates to `ARCHITECTURE.md`, `ROADMAP.md`, `.ai/handoff/CURRENT_STATE.md`, and `BUILD_REPORT.md`; no runtime, schema, API, connector-breadth, runner, or UI code changes were introduced. +- `ARCHITECTURE.md` now matches the shipped Gmail seam through Sprint 5R: read-only account persistence, secret-free reads, protected credentials in `gmail_account_credentials`, refresh-token renewal, rotated refresh-token persistence, and one explicit selected-message ingestion path into the RFC822 artifact pipeline. +- `ROADMAP.md` no longer describes the repo as current only through Sprint 5J and no longer treats richer document parsing as the next pending shipped baseline. It now reflects the accepted Milestone 5 state through Sprint 5R and frames the next sprint from that actual baseline. +- `.ai/handoff/CURRENT_STATE.md` no longer stops at Sprint 5J. It now reflects the shipped narrow PDF, DOCX, RFC822, and Gmail auth seams, current verification totals, and the immediate next narrow boundary. 
+- The updated truth artifacts clearly separate implemented seams from deferred work such as richer parsing, Gmail search/sync/attachments, Calendar, external secret-manager integration, runner work, and UI work. +- `BUILD_REPORT.md` includes the required truth-sync contents: exact truth artifacts updated, evidence used, stale statements corrected, confirmation that no runtime or schema changes were made, and intentionally deferred follow-up scope. +- Verification succeeded: + - `./.venv/bin/python -m pytest tests/unit` -> `437 passed in 0.96s` + - `./.venv/bin/python -m pytest tests/integration` -> `139 passed in 40.02s` ## criteria missed @@ -27,24 +22,26 @@ PASS ## quality issues -- None material for Sprint 5R. +- No material implementation-quality issues for Sprint 5S. +- Minor rigor note: the `BUILD_REPORT.md` diff check command is scoped to the named truth files, so that command alone does not prove the whole worktree is documentation-only. The actual repo diff does satisfy the sprint boundary, so this does not block approval. ## regression risks -- Low. The change is localized to the existing Gmail renewal path and is covered by both unit and Postgres-backed integration tests for stable-token renewal, rotated-token renewal, failure handling, secret-free responses, and user isolation. +- Low. This sprint is documentation-only, and the live codebase and tests support the updated statements about narrow document ingestion and the Gmail credential/rotation seam. +- Integration verification depends on local Postgres access on `localhost:5432`; inside the default sandbox it fails with an environment permission error, but it passes when run with the required local access. ## docs issues -- None. `BUILD_REPORT.md` includes the rotation change summary, replacement rule, commands run, test results, secret-free account example, rotation-capable ingestion example, and deferred scope. +- None blocking. 
The updated docs are materially aligned with the implemented repo state through Sprint 5R. ## should anything be added to RULES.md? -- No. This is a narrow connector implementation detail, not a new repository-wide rule. +- No. This sprint does not establish a new repo-wide operating rule. ## should anything update ARCHITECTURE.md? -- No. The sprint does not introduce a new architectural boundary or subsystem; it hardens the existing Gmail protected-credential seam. +- No further update is needed beyond the changes already made in this sprint. ## recommended next action -- Accept Sprint 5R and move to the next narrow auth-adjacent milestone without broadening scope. +- Accept Sprint 5S and open the next narrow Gmail auth-adjacent sprint from this synchronized baseline, with external secret-manager integration as the strongest next candidate if Control Tower still wants that seam next. diff --git a/ROADMAP.md b/ROADMAP.md index a3d9e21..936fca2 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -2,27 +2,29 @@ ## Current Position -- The accepted repo state is current through Sprint 5J. -- Milestone 5 now ships the rooted local workspace and artifact baseline end to end: workspace provisioning, artifact registration, narrow text ingestion, durable chunk storage, lexical artifact retrieval, compile-path artifact inclusion, artifact-chunk embeddings, direct semantic artifact retrieval, compile-path semantic artifact retrieval, and deterministic hybrid lexical-plus-semantic artifact merge in compile. +- The accepted repo state is current through Sprint 5R. 
+- Milestone 5 now ships the rooted local workspace and artifact baseline end to end: workspace provisioning, artifact registration, narrow text ingestion, narrow PDF/DOCX/RFC822 ingestion, durable chunk storage, lexical artifact retrieval, compile-path artifact inclusion, artifact-chunk embeddings, direct semantic artifact retrieval, compile-path semantic artifact retrieval, and deterministic hybrid lexical-plus-semantic artifact merge in compile. +- The same milestone also now ships the narrow Gmail seam: read-only Gmail account persistence, secret-free account reads, protected credential storage in `gmail_account_credentials`, refresh-token renewal for expired access tokens, rotated refresh-token persistence when the provider returns a replacement token, and one explicit selected-message ingestion path that lands in the existing RFC822 artifact pipeline. - This roadmap is future-facing from that shipped baseline; historical sprint-by-sprint detail lives in accepted build and review artifacts, not here. ## Next Delivery Focus -### Open Richer Document Parsing On Top Of The Shipped Artifact Retrieval Baseline +### Open One More Narrow Gmail Auth Seam On Top Of The Shipped Baseline -- Extend ingestion beyond the current `text/plain` and `text/markdown` seam without changing the rooted `task_workspaces` and durable `task_artifact_chunks` contracts. -- Keep retrieval building on persisted chunk rows and persisted embeddings; new parsing work should feed the existing compile-path lexical/semantic/hybrid artifact retrieval seam rather than inventing a parallel context path. -- Keep the next sprint narrow: richer document parsing first, then reassess connectors only after the parsing seam is accepted. +- Keep the next sprint auth-adjacent and narrow, building on the shipped protected-credential-backed Gmail seam rather than widening connector breadth. 
+- The next best seam is external secret-manager integration for the existing `gmail_account_credentials` boundary, without changing the read-only account contract or the single-message ingestion contract. +- Do not combine that work with Gmail search, mailbox sync, attachment ingestion, Calendar scope, UI work, or broader connector orchestration. -### Preserve Current Compile, Governance, And Task Guarantees +### Preserve Current Document, Compile, Governance, And Task Guarantees +- Keep the shipped PDF, DOCX, and RFC822 ingestion seams narrow and deterministic; richer parsing, OCR, layout reconstruction, attachment handling, and broader email processing still need separate later seams. - Keep approvals, execution budgets, task/task-step state, and trace visibility deterministic as Milestone 5 continues. - Preserve the shipped compile contract of one merged artifact section with explicit source provenance, deterministic lexical-first precedence, and trace-visible inclusion and exclusion decisions. -- Do not widen the current no-external-I/O proxy surface or introduce runner, connector, or UI scope until those areas are explicitly opened. +- Do not widen the current no-external-I/O proxy surface or introduce broader connector, runner, or UI scope until those areas are explicitly opened. ## After The Next Narrow Sprint -- Open read-only connector work only after richer document parsing remains deterministic under the current artifact and governance seams. +- Reassess broader connector work only after the current Gmail protected-credential boundary remains stable under externalized secret storage and the truth artifacts stay synchronized. - Revisit workflow UI only after backend document and connector seams are accepted and the truth artifacts stay current. - Revisit broader task orchestration only after the current explicit task-step seams remain stable under workspace, artifact, document, and connector flows. 
- Continue to defer broader tool execution breadth and production auth/deployment hardening until the current governed surface remains stable. @@ -30,12 +32,12 @@ ## Dependencies - Live truth docs must stay synchronized with accepted repo state so sprint planning does not start from stale assumptions. -- Rich document parsing should build on the shipped rooted local workspace, durable artifact chunk, and hybrid compile retrieval contracts. -- Connector work should remain read-only, approval-aware, and downstream of the document parsing seam. +- Rich document parsing work should continue to build on the shipped rooted local workspace, durable artifact chunk, and hybrid compile retrieval contracts. +- Connector work should remain read-only, single-message-only, approval-aware, and protected-credential-backed until a later sprint explicitly opens broader scope. - Runner-style orchestration should stay deferred until the repo no longer depends on narrow current-step assumptions for safety and explainability. ## Ongoing Risks - Memory extraction and retrieval quality remain the largest product risk. - Auth beyond database user context is still missing. -- Milestone 5 can drift if richer document parsing, connectors, UI, and orchestration work are mixed into one sprint instead of landing as narrow seams on top of the shipped artifact retrieval baseline. +- Milestone 5 can drift if Gmail auth hardening, broader connector breadth, UI, richer parsing, and orchestration work are mixed into one sprint instead of landing as narrow seams on top of the shipped document-ingestion and Gmail baseline.