diff --git a/.ai/active/SPRINT_PACKET.md b/.ai/active/SPRINT_PACKET.md
index 9ca8895..78bacec 100644
--- a/.ai/active/SPRINT_PACKET.md
+++ b/.ai/active/SPRINT_PACKET.md
@@ -2,133 +2,123 @@
 ## Sprint Title
-Sprint 5O: Read-Only Gmail Connection and Single-Message Ingestion V0
+Sprint 5P: Gmail Credential Hardening
 ## Sprint Type
-feature
+repair
 ## Sprint Reason
-Sprint 5N proved the Gmail-adjacent document seam locally by ingesting RFC822 email artifacts into the existing chunk substrate. The next safe step is the first live read-only Gmail slice, but only enough to connect one account and ingest one visible Gmail message through that same RFC822-to-chunk seam. This opens connector work without collapsing into search, sync, Calendar, UI, or write-capable behavior.
+Sprint 5O successfully opened the first live read-only Gmail seam, but it left a known security debt: Gmail access tokens are still persisted in plaintext on `gmail_accounts`. Before any broader Gmail auth lifecycle, search, sync, Calendar, or UI work, that gap needs to be closed.
 ## Sprint Intent
-Add the first live read-only Gmail connector seam by supporting user-scoped Gmail connection metadata plus ingestion of one selected Gmail message into the existing artifact-ingestion pipeline as an RFC822-style artifact, without adding write actions, background sync, Calendar, or UI.
+Harden the narrow Gmail connector seam by replacing plaintext credential storage with an explicit protected credential mechanism and by removing credential exposure from normal account surfaces, without widening into Gmail search, sync, write actions, Calendar, or UI.
 ## Git Instructions
-- Branch Name: `codex/sprint-5o-gmail-connection-single-message-ingestion`
+- Branch Name: `codex/sprint-5p-gmail-credential-hardening`
 - Base Branch: `main`
 - PR Strategy: one sprint branch, one PR, no stacked PRs unless Control Tower explicitly opens a follow-up sprint
 - Merge Policy: squash merge only after reviewer `PASS` and explicit Control Tower merge approval
 ## Why This Sprint
-- Sprint 5A shipped deterministic rooted task-workspace provisioning.
-- Sprint 5C through 5J shipped the durable artifact, chunk, lexical, semantic, and hybrid compile substrate.
-- Sprint 5N shipped narrow local RFC822 parsing on that same artifact-ingestion seam.
-- The product brief requires read-only Gmail connectors in v1.
-- The narrowest safe connector step is not mailbox sync or UI; it is a user-scoped read-only Gmail connection plus one explicit message ingestion path that reuses the already-accepted RFC822 extraction seam.
+- Sprint 5O shipped the first narrow read-only Gmail account and single-message ingestion seam.
+- The accepted review explicitly calls out plaintext Gmail credential storage as acceptable only for that narrow prototype slice and recommends hardening before broader connector rollout.
+- Connector security is the immediate risk, not more Gmail breadth.
+- The narrowest safe next step is credential hardening only, keeping the current single-message Gmail ingestion seam otherwise intact.
 ## In Scope
-- Add schema and migration support for:
-  - `gmail_accounts`
-- Define typed contracts for:
-  - Gmail account create or connect requests
-  - Gmail account list and detail responses
-  - single-message Gmail ingestion requests
-  - single-message Gmail ingestion responses
-- Implement a narrow Gmail connector seam that:
-  - stores one user-scoped read-only Gmail account connection record with only the metadata needed for later reads
-  - uses one explicit Gmail read-only auth/config path only
-  - fetches one selected Gmail message by explicit provider message id
-  - converts that message into the existing artifact registration plus RFC822-style ingestion pipeline
-  - persists the resulting artifact under one visible task workspace
-  - reuses the existing `task_artifacts` and `task_artifact_chunks` contracts
-  - preserves per-user isolation throughout account read and message ingestion flows
-- Implement the minimal API or service paths needed for:
-  - connecting one Gmail account
-  - listing Gmail accounts
-  - reading one Gmail account
-  - ingesting one Gmail message into one visible task workspace
+- Add schema and migration support only as needed to remove plaintext credential storage from `gmail_accounts`, for example:
+  - protected credential blob or reference fields
+  - optional token metadata fields if needed for the narrow hardened seam
+- Define typed contract changes for:
+  - Gmail account connect requests if a hardened credential payload shape is required
+  - Gmail account responses with secrets removed
+  - any narrow Gmail ingestion error shape changes needed for hardened credential lookup failures
+- Implement a narrow Gmail credential seam that:
+  - persists Gmail credentials through one explicit protected storage mechanism
+  - does not return secrets through account list or detail responses
+  - lets the existing single-message Gmail ingestion path resolve credentials through the hardened mechanism
+  - preserves deterministic account reads and per-user isolation
 - Add unit and integration tests for:
-  - Gmail account persistence
-  - deterministic account listing
-  - single-message Gmail ingestion through the existing artifact seam
-  - rejection of cross-user workspace access
-  - rejection of unsupported or missing Gmail messages
+  - protected credential persistence
+  - absence of secret material in Gmail account responses
+  - successful single-message Gmail ingestion using the hardened credential path
+  - deterministic failure when required credentials are missing or invalid
+  - per-user isolation
   - stable response shape
 ## Out of Scope
-- No Gmail message search.
+- No Gmail search.
 - No mailbox sync or backfill jobs.
 - No attachment ingestion.
 - No write-capable Gmail actions.
 - No Calendar connector scope.
-- No OAuth UX or web callback UI beyond the minimal backend contract needed to represent a connected account.
+- No OAuth UX or callback UI.
 - No compile contract changes.
 - No runner-style orchestration.
 - No UI work.
 ## Required Deliverables
-- Migration for `gmail_accounts`.
-- Stable contracts for Gmail account connect/list/detail and single-message ingestion.
-- Minimal read-only Gmail account persistence seam.
-- Minimal explicit single-message Gmail ingestion path that feeds the existing artifact and RFC822 chunk seams.
-- Unit and integration coverage for persistence, isolation, ingestion routing, and response stability.
+- Migration removing plaintext credential storage from the live Gmail seam in favor of a protected credential mechanism.
+- Stable Gmail account contracts that no longer expose secret material.
+- Updated single-message Gmail ingestion path that resolves credentials through the hardened mechanism.
+- Unit and integration coverage for credential protection, ingestion continuity, failure handling, and isolation.
 - Updated `BUILD_REPORT.md` with exact verification results and explicit deferred scope.
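Reviewer note (not part of the patch): the deliverable "contracts that no longer expose secret material" can be pictured as a write-side shape that accepts a token and a read-side shape that has no field for one. The sketch below is only an illustration under that assumption. The names `GmailAccountConnectInput`, `GmailAccountMetadata`, `split_connect_input`, and `serialize_account` are hypothetical and are not taken from `contracts.py` or `gmail.py`.

```python
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GmailAccountConnectInput:
    """Hypothetical write-side shape: the only place a token is accepted."""

    provider_account_id: str
    email_address: str
    display_name: Optional[str]
    scope: str
    access_token: str


@dataclass(frozen=True)
class GmailAccountMetadata:
    """Hypothetical read-side shape: deliberately has no secret field."""

    provider_account_id: str
    email_address: str
    display_name: Optional[str]
    scope: str


def split_connect_input(
    request: GmailAccountConnectInput,
) -> Tuple[GmailAccountMetadata, str]:
    """Separate non-secret metadata from the secret at the write boundary."""
    metadata = GmailAccountMetadata(
        provider_account_id=request.provider_account_id,
        email_address=request.email_address,
        display_name=request.display_name,
        scope=request.scope,
    )
    # The token is returned separately so it can go to protected storage only.
    return metadata, request.access_token


def serialize_account(metadata: GmailAccountMetadata) -> dict:
    """Build a read response from metadata only; no token can leak from here."""
    return asdict(metadata)
```

Because the read-side type has no `access_token` attribute, a serializer built on it cannot return the secret by accident, which is easier to review than masking a field after the fact.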
 ## Acceptance Criteria
-- A client can persist one user-scoped read-only Gmail account connection record.
-- A client can list and read Gmail account records deterministically.
-- A client can ingest one selected Gmail message into one visible task workspace through the existing artifact-ingestion seam.
-- Gmail message ingestion results in durable `task_artifacts` and `task_artifact_chunks` rows compatible with existing retrieval and compile behavior.
-- Cross-user account and workspace access is rejected deterministically.
+- Gmail account records no longer persist plaintext access tokens in the normal application table surface.
+- Gmail account list and detail responses do not expose secret material.
+- The existing single-message Gmail ingestion path continues to work through the hardened credential mechanism.
+- Missing or invalid credentials fail deterministically and do not corrupt task workspace or artifact state.
 - `./.venv/bin/python -m pytest tests/unit` passes.
 - `./.venv/bin/python -m pytest tests/integration` passes.
-- No Gmail search, mailbox sync, attachments, write actions, Calendar, compile-contract, runner, or UI scope enters the sprint.
+- No Gmail search, sync, attachments, write actions, Calendar, compile-contract, runner, or UI scope enters the sprint.
 ## Implementation Constraints
-- Keep connector work narrow and boring.
-- Reuse the existing rooted workspace, artifact, and RFC822 chunk seams rather than creating a parallel email-content store.
-- Keep Gmail handling explicitly read-only.
-- Support one explicit selected-message ingestion path only; do not introduce account-wide sync or search in the same sprint.
-- Preserve existing retrieval and compile contracts by feeding the already-shipped chunk substrate.
+- Keep the repair narrow and boring.
+- Do not broaden the Gmail feature surface while fixing credential storage.
+- Prefer one explicit protected credential mechanism over ad hoc masking or partial hiding.
+- Preserve the current Gmail account and single-message ingestion seams as much as possible outside the security fix.
+- If credential hardening needs one minimal rule added to `RULES.md`, keep it scoped to connector-secret handling only.
 ## Suggested Work Breakdown
-1. Add `gmail_accounts` schema and migration.
-2. Define Gmail account and single-message ingestion contracts.
-3. Implement deterministic Gmail account create, list, and detail behavior.
-4. Implement explicit selected-message Gmail ingestion into the existing artifact and RFC822 ingestion seam.
+1. Add the schema/migration changes required for protected Gmail credential storage.
+2. Update Gmail account contracts so secrets are accepted only on write and never returned on read.
+3. Route single-message Gmail ingestion through the hardened credential lookup path.
+4. Add deterministic failure handling for missing or invalid protected credentials.
 5. Add unit and integration tests.
 6. Update `BUILD_REPORT.md` with executed verification.
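Reviewer note (not part of the patch): steps 3 and 4 of the new breakdown amount to an ordering rule, namely that credentials are resolved first and any failure happens before a Gmail fetch or a workspace write. A minimal sketch of that ordering follows. It assumes an in-memory mapping in place of the real credential store, and `GmailCredentialError`, `resolve_access_token`, `ingest_message`, `fetch_raw`, and `store_artifact` are hypothetical names. It also leaves out the duplicate-path check that the real seam performs.

```python
from typing import Callable, Dict


class GmailCredentialError(Exception):
    """Hypothetical error for missing or unusable protected credentials."""


def resolve_access_token(credentials: Dict[str, str], gmail_account_id: str) -> str:
    """Look up the token for one account and fail deterministically if unusable."""
    token = credentials.get(gmail_account_id)
    if token is None:
        raise GmailCredentialError("gmail credentials are missing")
    if not isinstance(token, str) or not token.strip():
        raise GmailCredentialError("gmail credentials are invalid")
    return token


def ingest_message(
    credentials: Dict[str, str],
    gmail_account_id: str,
    provider_message_id: str,
    fetch_raw: Callable[[str, str], bytes],
    store_artifact: Callable[[str, bytes], object],
) -> object:
    # Step 1: resolve credentials. Nothing with side effects has run yet,
    # so a failure here cannot leave partial workspace or artifact state.
    token = resolve_access_token(credentials, gmail_account_id)
    # Step 2: only now talk to the provider.
    raw = fetch_raw(token, provider_message_id)
    # Step 3: only now write anything.
    return store_artifact(provider_message_id, raw)
```

Passing `fetch_raw` and `store_artifact` in as callables is a choice made for the sketch: it lets a test record calls and assert that neither ran when the credential lookup failed.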
 ## Build Report Requirements
 `BUILD_REPORT.md` must include:
-- the exact Gmail account and single-message ingestion contract changes introduced
-- the Gmail message-to-artifact conversion rule used
+- the exact Gmail credential contract and schema changes introduced
+- the protected credential storage mechanism used
 - exact commands run
 - unit and integration test results
-- one example Gmail account response
-- one example single-message ingestion response
+- one example Gmail account response proving secret removal
+- one example Gmail ingestion response through the hardened path
 - what remains intentionally deferred to later milestones
 ## Review Focus
 `REVIEW_REPORT.md` should verify:
-- the sprint stayed limited to read-only Gmail connection metadata and single-message ingestion
-- Gmail message ingestion reuses the existing rooted workspace, artifact, and RFC822 chunk seams
-- persistence, isolation, and ingestion determinism are test-backed
-- no hidden Gmail search, mailbox sync, attachments, write actions, Calendar, compile-contract, runner, or UI scope entered the sprint
+- the sprint stayed limited to Gmail credential hardening
+- plaintext credential persistence is removed from the normal `gmail_accounts` surface
+- Gmail account reads no longer expose secrets
+- the existing single-message Gmail ingestion path still works through the hardened seam
+- no hidden Gmail search, sync, attachments, write actions, Calendar, compile-contract, runner, or UI scope entered the sprint
 ## Exit Condition
-This sprint is complete when the repo can persist deterministic user-scoped read-only Gmail account records and ingest one selected Gmail message into the existing artifact/chunk seam with passing Postgres-backed tests, while still deferring broader Gmail connector behavior, Calendar, and UI.
+This sprint is complete when the repo no longer stores plaintext Gmail credentials in the normal application table surface, the existing read-only single-message Gmail ingestion seam still works through the hardened credential path, and the full path is verified with Postgres-backed tests while broader Gmail and Calendar connector work remains deferred.
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
index 007de79..6a76ea1 100644
--- a/ARCHITECTURE.md
+++ b/ARCHITECTURE.md
@@ -2,17 +2,17 @@
 ## Current Implemented Slice
-AliceBot now implements the accepted repo slice through Sprint 5O. The shipped backend includes:
+AliceBot now implements the accepted repo slice through Sprint 5P. The shipped backend includes:
 - foundation continuity storage over `users`, `threads`, `sessions`, and append-only `events`
 - deterministic tracing and context compilation over durable continuity, memory, entity, and entity-edge records
 - governed memory admission, explicit-preference extraction, memory review labels, review queue reads, evaluation summary reads, explicit embedding config and memory-embedding storage, direct semantic retrieval, and deterministic hybrid compile-path memory merge
 - deterministic prompt assembly and one no-tools response path that persists assistant replies as immutable continuity events
 - user-scoped consents, policies, policy evaluation, tool registry, allowlist evaluation, tool routing, approval request persistence, approval resolution, approved-only proxy execution through the in-process `proxy.echo` handler, durable execution review, and execution-budget lifecycle plus enforcement
-- a narrow read-only Gmail connector seam with user-scoped `gmail_accounts` persistence, deterministic account reads, and one explicit selected-message ingestion path that materializes one Gmail message as a rooted `.eml` task artifact and then reuses the existing RFC822 artifact ingestion pipeline
+- a narrow read-only Gmail connector seam with user-scoped `gmail_accounts` metadata persistence, separate user-scoped `gmail_account_credentials` protected credential storage, deterministic account reads without secret exposure, and one explicit selected-message ingestion path that materializes one Gmail message as a rooted `.eml` task artifact and then reuses the existing RFC822 artifact ingestion pipeline
 - durable `tasks`, `task_steps`, `task_workspaces`, `task_artifacts`, `task_artifact_chunks`, and `task_artifact_chunk_embeddings`, deterministic task-step sequencing, explicit task-step transitions, explicit manual continuation with lineage through `parent_step_id`, `source_approval_id`, and `source_execution_id`, explicit `tool_executions.task_step_id` linkage for execution synchronization, deterministic rooted local task-workspace provisioning, explicit rooted local artifact registration, deterministic local plain-text, markdown, narrow PDF text, narrow DOCX text, and narrow RFC822 email text ingestion into durable chunk rows, deterministic lexical artifact-chunk retrieval over durable chunk rows, explicit user-scoped artifact-chunk embedding persistence tied to existing embedding configs, explicit task-scoped or artifact-scoped semantic artifact-chunk retrieval over those durable embeddings, and compile-path artifact retrieval that can include lexical results, semantic results, or one deterministic hybrid lexical-plus-semantic merged artifact section with per-chunk source provenance
-The current multi-step boundary is narrow and explicit. Manual continuation is implemented and review-passed. Approval resolution and proxy execution now both use explicit task-step linkage rather than first-step inference. Task workspaces are now implemented only as deterministic rooted local boundaries, and task artifacts are now implemented only as explicit rooted local-file registrations, narrow deterministic artifact ingestion under those workspaces, lexical retrieval over persisted chunk rows, explicit artifact-chunk embedding storage tied to existing embedding configs, direct semantic retrieval over those durable artifact-chunk embeddings for one visible task or one visible artifact at a time, and compile-path artifact retrieval that deterministically merges lexical and semantic candidates into one artifact section when both are requested for the same scope. The live richer-document boundary is still intentionally narrow: plain text and markdown ingest directly, PDF support is limited to narrow local text extraction, DOCX support is limited to narrow local text extraction from `word/document.xml`, RFC822 email support is limited to top-level selected headers plus extractable plain-text body content while excluding nested `message/rfc822` content, and the live connector boundary is limited to one read-only Gmail account seam plus one explicit selected-message ingestion path into the rooted RFC822 artifact pipeline. OCR, image extraction, layout reconstruction, Gmail search, mailbox sync, attachments, Calendar connectors, reranking beyond the current lexical-first hybrid merge, and new side-effect surfaces are still planned later and must not be described as live behavior.
+The current multi-step boundary is narrow and explicit. Manual continuation is implemented and review-passed. Approval resolution and proxy execution now both use explicit task-step linkage rather than first-step inference. Task workspaces are now implemented only as deterministic rooted local boundaries, and task artifacts are now implemented only as explicit rooted local-file registrations, narrow deterministic artifact ingestion under those workspaces, lexical retrieval over persisted chunk rows, explicit artifact-chunk embedding storage tied to existing embedding configs, direct semantic retrieval over those durable artifact-chunk embeddings for one visible task or one visible artifact at a time, and compile-path artifact retrieval that deterministically merges lexical and semantic candidates into one artifact section when both are requested for the same scope. The live richer-document boundary is still intentionally narrow: plain text and markdown ingest directly, PDF support is limited to narrow local text extraction, DOCX support is limited to narrow local text extraction from `word/document.xml`, RFC822 email support is limited to top-level selected headers plus extractable plain-text body content while excluding nested `message/rfc822` content, and the live connector boundary is limited to one read-only Gmail account seam plus one explicit selected-message ingestion path into the rooted RFC822 artifact pipeline, with credentials resolved through a dedicated protected credential seam instead of the normal account metadata table surface. OCR, image extraction, layout reconstruction, Gmail search, mailbox sync, attachments, Calendar connectors, reranking beyond the current lexical-first hybrid merge, and new side-effect surfaces are still planned later and must not be described as live behavior.
 ## Implemented Now
@@ -38,7 +38,7 @@ The current multi-step boundary is narrow and explicit. Manual continuation is i
   - memory and retrieval tables: `memories`, `memory_revisions`, `memory_review_labels`, `embedding_configs`, `memory_embeddings`
   - graph tables: `entities`, `entity_edges`
   - governance tables: `consents`, `policies`, `tools`, `approvals`, `tool_executions`, `execution_budgets`
-  - connector tables: `gmail_accounts`
+  - connector tables: `gmail_accounts`, `gmail_account_credentials`
   - task lifecycle tables: `tasks`, `task_steps`, `task_workspaces`, `task_artifacts`, `task_artifact_chunks`, `task_artifact_chunk_embeddings`
 - `events`, `trace_events`, and `memory_revisions` are append-only by application contract and database enforcement.
 - `memory_review_labels` are append-only by database enforcement.
@@ -64,7 +64,7 @@ The current multi-step boundary is narrow and explicit. Manual continuation is i
 - `apps/web`: minimal shell only; no shipped workflow UI.
 - `workers`: scaffold only; no background jobs or runner logic are implemented.
 - `infra`: local development bootstrap assets only.
-- `tests`: unit and Postgres-backed integration coverage for the shipped seams above, including Sprint 4O task-step lineage/manual continuation, Sprint 4S step-linked execution synchronization, Sprint 5A task-workspace provisioning, Sprint 5C task-artifact registration, Sprint 5D local artifact ingestion plus chunk reads, Sprint 5E lexical artifact-chunk retrieval, Sprint 5F compile-path artifact chunk integration, Sprint 5G artifact-chunk embedding persistence and reads, Sprint 5H direct semantic artifact-chunk retrieval, Sprint 5I compile-path semantic artifact retrieval, Sprint 5J deterministic hybrid lexical-plus-semantic artifact merge in compile, Sprint 5L narrow PDF artifact ingestion, Sprint 5M narrow DOCX artifact ingestion, Sprint 5N narrow RFC822 email artifact ingestion, and Sprint 5O read-only Gmail account plus single-message ingestion coverage.
+- `tests`: unit and Postgres-backed integration coverage for the shipped seams above, including Sprint 4O task-step lineage/manual continuation, Sprint 4S step-linked execution synchronization, Sprint 5A task-workspace provisioning, Sprint 5C task-artifact registration, Sprint 5D local artifact ingestion plus chunk reads, Sprint 5E lexical artifact-chunk retrieval, Sprint 5F compile-path artifact chunk integration, Sprint 5G artifact-chunk embedding persistence and reads, Sprint 5H direct semantic artifact-chunk retrieval, Sprint 5I compile-path semantic artifact retrieval, Sprint 5J deterministic hybrid lexical-plus-semantic artifact merge in compile, Sprint 5L narrow PDF artifact ingestion, Sprint 5M narrow DOCX artifact ingestion, Sprint 5N narrow RFC822 email artifact ingestion, Sprint 5O read-only Gmail account plus single-message ingestion coverage, and Sprint 5P Gmail credential hardening coverage.
 ## Core Flows Implemented Now
@@ -95,15 +95,17 @@ The current multi-step boundary is narrow and explicit. Manual continuation is i
 ### Narrow Gmail Connector
 1. Accept a user-scoped `POST /v0/gmail-accounts` request for one read-only Gmail account metadata record.
-2. Persist exactly the narrow connector metadata currently required for reads later: `provider_account_id`, `email_address`, optional `display_name`, the fixed Gmail read-only scope, and one access token.
-3. Expose deterministic user-scoped Gmail account list and detail reads.
-4. Accept a user-scoped `POST /v0/gmail-accounts/{gmail_account_id}/messages/{provider_message_id}/ingest` request for one visible Gmail account and one visible task workspace.
-5. Derive one deterministic workspace-relative artifact path as `gmail//.eml`.
-6. Reject duplicate `(task_workspace_id, relative_path)` collisions before any Gmail fetch or file write.
-7. Fetch exactly one selected Gmail message through the read-only Gmail API path `users/me/messages/{provider_message_id}?format=raw`.
-8. Require Gmail to return RFC822 `raw` content, validate it against the existing narrow `message/rfc822` extraction rules, and reject unsupported content deterministically.
-9. Materialize the message as one rooted `.eml` file inside the selected task workspace and then reuse the existing task-artifact registration plus artifact-ingestion seam.
-10. Persist only the resulting `task_artifacts` and `task_artifact_chunks` rows; account-wide sync, search, attachments, Calendar, and write-capable actions remain out of scope.
+2. Persist exactly the narrow connector metadata required for later reads on `gmail_accounts`: `provider_account_id`, `email_address`, optional `display_name`, and the fixed Gmail read-only scope.
+3. Persist the Gmail access token only in the dedicated `gmail_account_credentials` protected credential seam bound to the same visible user/account ownership scope.
+4. Expose deterministic user-scoped Gmail account list and detail reads without secret material.
+5. Accept a user-scoped `POST /v0/gmail-accounts/{gmail_account_id}/messages/{provider_message_id}/ingest` request for one visible Gmail account and one visible task workspace.
+6. Resolve the Gmail access token through the protected credential seam before any Gmail fetch, file write, or artifact registration.
+7. Derive one deterministic workspace-relative artifact path as `gmail//.eml`.
+8. Reject duplicate `(task_workspace_id, relative_path)` collisions before any Gmail fetch or file write.
+9. Fetch exactly one selected Gmail message through the read-only Gmail API path `users/me/messages/{provider_message_id}?format=raw`.
+10. Require Gmail to return RFC822 `raw` content, validate it against the existing narrow `message/rfc822` extraction rules, and reject unsupported content deterministically.
+11. Materialize the message as one rooted `.eml` file inside the selected task workspace and then reuse the existing task-artifact registration plus artifact-ingestion seam.
+12. Persist only the resulting `task_artifacts` and `task_artifact_chunks` rows; account-wide sync, search, attachments, Calendar, and write-capable actions remain out of scope.
 ### Governed Memory And Retrieval
diff --git a/BUILD_REPORT.md b/BUILD_REPORT.md
index 02a3efd..da20535 100644
--- a/BUILD_REPORT.md
+++ b/BUILD_REPORT.md
@@ -2,134 +2,122 @@
 ## sprint objective
-Implement Sprint 5O: a narrow read-only Gmail connector seam that can persist user-scoped Gmail account metadata and ingest one explicitly selected Gmail message into the existing task artifact and RFC822 chunk pipeline, without adding search, sync, attachments, write actions, Calendar, compile changes, runner work, or UI.
+Implement Sprint 5P: harden the existing narrow Gmail connector seam so Gmail access tokens are no longer stored on the normal `gmail_accounts` table surface, Gmail account reads never expose secrets, and the existing single-message Gmail ingestion path continues to work through an explicit protected credential lookup.
 ## completed work
-- Added a new `gmail_accounts` table and migration with user-scoped row-level security and deterministic listing order.
-- Added stable Gmail contracts for:
-  - account connect
-  - account list
-  - account detail
-  - single-message ingestion
-- Implemented `apps/api/src/alicebot_api/gmail.py` with:
-  - Gmail account persistence
-  - deterministic account serialization
-  - a single explicit Gmail read-only fetch path using `users/me/messages/{message_id}?format=raw`
-  - pre-ingestion RFC822 validation against the existing artifact rules
-  - deterministic Gmail-message-to-artifact path generation
-  - duplicate artifact-path rejection before any Gmail fetch or filesystem write
-  - workspace artifact locking aligned with the normal artifact registration seam so duplicate detection, file checks, and write attempts occur inside the same serialized critical section
-  - reuse of existing `register_task_artifact_record()` and `ingest_task_artifact_record()`
-- Added API endpoints for:
-  - `POST /v0/gmail-accounts`
-  - `GET /v0/gmail-accounts`
-  - `GET /v0/gmail-accounts/{gmail_account_id}`
-  - `POST /v0/gmail-accounts/{gmail_account_id}/messages/{provider_message_id}/ingest`
-- Added a reusable byte-level artifact extraction helper so Gmail ingestion can validate raw RFC822 content before persisting anything through the artifact seam.
+- Added migration `20260316_0027_gmail_account_credentials.py` to move Gmail tokens out of `gmail_accounts`.
+- Introduced a protected Gmail credential table with:
+  - row-level security
+  - `gmail_account_id` ownership binding
+  - a checked credential blob shape
+  - backfill from existing `gmail_accounts.access_token`
+- Removed plaintext `access_token` storage from the normal `gmail_accounts` table surface by dropping the column in the new migration.
+- Kept the Gmail connect write contract narrow:
+  - connect still accepts `access_token` on write
+  - account list/detail responses still return the same stable metadata shape without secret material
+- Updated the Gmail service seam to:
+  - persist account metadata and protected credentials separately
+  - resolve the access token through the protected credential lookup during ingestion
+  - fail deterministically when protected credentials are missing or malformed
+  - fail before Gmail fetches, artifact registration, or filesystem writes when credentials are unusable
 - Added unit and integration coverage for:
-  - Gmail account persistence
-  - deterministic listing
-  - stable response shapes
-  - single-message ingestion through the existing artifact and chunk seam
-  - sanitized path collision rejection without overwriting the existing `.eml`
-  - cross-user workspace rejection
-  - missing Gmail message rejection
-  - unsupported Gmail message rejection
-
-## exact Gmail account and single-message ingestion contract changes introduced
-
-- Gmail account connect request:
+  - protected credential persistence
+  - secret removal from Gmail account responses
+  - hardened single-message ingestion success
+  - deterministic missing/invalid credential failures
+  - per-user isolation
+  - stable response shape
+
+## exact Gmail credential contract and schema changes introduced
+
+- Gmail account connect request remains:
   - `user_id: UUID`
   - `provider_account_id: str`
   - `email_address: str`
   - `display_name: str | null`
   - `scope: "https://www.googleapis.com/auth/gmail.readonly"`
   - `access_token: str`
-- Gmail account record response:
-  - `id: str`
-  - `provider: "gmail"`
-  - `auth_kind: "oauth_access_token"`
-  - `provider_account_id: str`
-  - `email_address: str`
-  - `display_name: str | null`
-  - `scope: "https://www.googleapis.com/auth/gmail.readonly"`
-  - `created_at: str`
-  - `updated_at: str`
-- Gmail account list response:
-  - `items: GmailAccountRecord[]`
-  - `summary: { total_count, order }`
-- Gmail account detail response:
-  - `account: GmailAccountRecord`
-- Single-message ingestion request:
-  - path params: `gmail_account_id`, `provider_message_id`
-  - body: `user_id: UUID`, `task_workspace_id: UUID`
-- Single-message ingestion response:
-  - `account: GmailAccountRecord`
-  - `message: { provider_message_id, artifact_relative_path, media_type }`
-  - `artifact: TaskArtifactRecord`
-  - `summary: TaskArtifactChunkListSummary`
-
-## Gmail message-to-artifact conversion rule used
-
-- Fetch Gmail message raw bytes from the read-only Gmail API path using the stored access token.
-- Require Gmail to return RFC822 `raw` content.
-- Validate the raw bytes against the existing `message/rfc822` artifact extraction rules before registration.
-- Materialize the message inside the selected visible task workspace at:
-  - `gmail//.eml`
-- Register that `.eml` file as a `message/rfc822` task artifact.
-- Ingest it through the existing artifact pipeline so chunks land in `task_artifact_chunks`.
+- Gmail account read responses remain secret-free:
+  - `id`
+  - `provider`
+  - `auth_kind`
+  - `provider_account_id`
+  - `email_address`
+  - `display_name`
+  - `scope`
+  - `created_at`
+  - `updated_at`
+- `gmail_accounts` schema change:
+  - dropped plaintext column `access_token`
+- New `gmail_account_credentials` schema:
+  - `gmail_account_id uuid primary key references gmail_accounts(id) on delete cascade`
+  - `user_id uuid not null`
+  - `auth_kind text not null check = 'oauth_access_token'`
+  - `credential_blob jsonb not null`
+  - `created_at timestamptz not null`
+  - `updated_at timestamptz not null`
+  - composite ownership FK to `gmail_accounts (id, user_id)`
+  - RLS owner policy
+- Protected credential blob shape:
+  - `{"credential_kind": "gmail_oauth_access_token_v1", "access_token": ""}`
+
+## protected credential storage mechanism used
+
+Gmail credentials are now stored in a dedicated `gmail_account_credentials` table guarded by row-level security and ownership checks, with the Gmail account record carrying only non-secret metadata. The ingestion path resolves the token through that separate protected table instead of reading it from `gmail_accounts`.
 ## incomplete work
-- None inside Sprint 5O scope.
+- None inside Sprint 5P scope.
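Reviewer note (not part of the patch): the "checked credential blob shape" listed above could be enforced on the application side roughly as follows. This is a sketch under assumptions. The `credential_kind` value and the two key names come from the report, but the function names, the `MalformedCredentialBlob` error, and the decision to reject extra keys are hypothetical and may differ from what `gmail.py` and the migration actually check.

```python
from typing import Any, Dict

# Value quoted from the build report's blob shape.
CREDENTIAL_KIND = "gmail_oauth_access_token_v1"


class MalformedCredentialBlob(ValueError):
    """Hypothetical error type for blobs that fail the shape check."""


def build_credential_blob(access_token: str) -> Dict[str, str]:
    """Wrap a token in the versioned blob shape before it is persisted."""
    if not isinstance(access_token, str) or not access_token:
        raise MalformedCredentialBlob("access_token must be a non-empty string")
    return {"credential_kind": CREDENTIAL_KIND, "access_token": access_token}


def read_access_token(blob: Any) -> str:
    """Validate a stored blob and return its token, or fail deterministically."""
    if not isinstance(blob, dict):
        raise MalformedCredentialBlob("credential blob must be a JSON object")
    # Assumption: extra or missing keys are treated as malformed.
    if set(blob) != {"credential_kind", "access_token"}:
        raise MalformedCredentialBlob("credential blob has unexpected keys")
    if blob["credential_kind"] != CREDENTIAL_KIND:
        raise MalformedCredentialBlob("unsupported credential_kind")
    token = blob["access_token"]
    if not isinstance(token, str) or not token:
        raise MalformedCredentialBlob("access_token must be a non-empty string")
    return token
```

A versioned `credential_kind` leaves room for a later shape (for example one carrying a refresh token) without guessing at the meaning of older rows.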
## files changed -- `apps/api/alembic/versions/20260316_0026_gmail_accounts.py` +- `apps/api/alembic/versions/20260316_0027_gmail_account_credentials.py` - `ARCHITECTURE.md` -- `apps/api/src/alicebot_api/artifacts.py` +- `RULES.md` - `apps/api/src/alicebot_api/contracts.py` - `apps/api/src/alicebot_api/gmail.py` - `apps/api/src/alicebot_api/main.py` - `apps/api/src/alicebot_api/store.py` - `tests/integration/test_gmail_accounts_api.py` -- `tests/unit/test_20260316_0026_gmail_accounts.py` +- `tests/integration/test_migrations.py` +- `tests/unit/test_20260316_0027_gmail_account_credentials.py` - `tests/unit/test_gmail.py` - `tests/unit/test_gmail_main.py` - `BUILD_REPORT.md` ## exact commands run -- `./.venv/bin/python -m pytest tests/unit/test_gmail.py tests/unit/test_gmail_main.py tests/unit/test_20260316_0026_gmail_accounts.py` -- `./.venv/bin/python -m pytest tests/integration/test_gmail_accounts_api.py` -- `./.venv/bin/python -m pytest tests/unit/test_gmail.py tests/unit/test_gmail_main.py` +- `./.venv/bin/python -m pytest tests/unit/test_gmail.py` +- `./.venv/bin/python -m pytest tests/unit/test_gmail_main.py tests/unit/test_20260316_0026_gmail_accounts.py tests/unit/test_20260316_0027_gmail_account_credentials.py` - `./.venv/bin/python -m pytest tests/integration/test_gmail_accounts_api.py` +- `./.venv/bin/python -m pytest tests/integration/test_migrations.py` - `./.venv/bin/python -m pytest tests/unit` - `./.venv/bin/python -m pytest tests/integration` ## tests run -- `./.venv/bin/python -m pytest tests/unit/test_gmail.py tests/unit/test_gmail_main.py tests/unit/test_20260316_0026_gmail_accounts.py` - - Result: `14 passed in 0.57s` -- `./.venv/bin/python -m pytest tests/unit/test_gmail.py tests/unit/test_gmail_main.py` - - Result: `12 passed in 0.28s` -- `./.venv/bin/python -m pytest tests/integration/test_gmail_accounts_api.py` - - Result: `4 passed in 1.47s` +- `./.venv/bin/python -m pytest tests/unit/test_gmail.py` + - Result: `12 passed in 0.11s` +- 
`./.venv/bin/python -m pytest tests/unit/test_gmail_main.py tests/unit/test_20260316_0026_gmail_accounts.py tests/unit/test_20260316_0027_gmail_account_credentials.py` + - Result: `11 passed in 0.55s` - `./.venv/bin/python -m pytest tests/integration/test_gmail_accounts_api.py` - - Result: `5 passed in 1.62s` + - Result: `6 passed in 2.10s` +- `./.venv/bin/python -m pytest tests/integration/test_migrations.py` + - Result: `3 passed in 1.35s` - `./.venv/bin/python -m pytest tests/unit` - - Result: `409 passed in 0.67s` + - Result: `417 passed in 0.67s` - `./.venv/bin/python -m pytest tests/integration` - - Result: `132 passed in 39.29s` + - Result: `134 passed in 41.74s` ## unit and integration test results -- Full unit suite passed. -- Full integration suite passed. -- Gmail-specific unit and integration coverage passed independently before the full-suite runs. +- Required unit suite passed. +- Required integration suite passed. +- Gmail-focused unit and integration coverage passed independently before the full-suite runs. +- Migration round-trip coverage now explicitly verifies Gmail credential backfill on upgrade and token restoration on downgrade for revision `20260316_0027`. 
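The round-trip property verified by the migration coverage above (tokens backfilled on upgrade, restored on downgrade) can be illustrated without a database. This is a rough sketch only, assuming in-memory dictionaries stand in for the two tables; `upgrade` and `downgrade` here are hypothetical pure functions and not the Alembic entry points of revision `20260316_0027`.

```python
# Value taken from the protected credential blob shape in this report.
CREDENTIAL_KIND = "gmail_oauth_access_token_v1"


def upgrade(accounts: dict[str, dict]) -> tuple[dict[str, dict], dict[str, dict]]:
    """Move each plaintext token out of the account row into a credential blob."""
    stripped: dict[str, dict] = {}
    credentials: dict[str, dict] = {}
    for account_id, row in accounts.items():
        row = dict(row)  # copy so the caller's rows are not mutated
        token = row.pop("access_token")
        credentials[account_id] = {
            "credential_kind": CREDENTIAL_KIND,
            "access_token": token,
        }
        stripped[account_id] = row
    return stripped, credentials


def downgrade(accounts: dict[str, dict], credentials: dict[str, dict]) -> dict[str, dict]:
    """Restore the plaintext column from the credential blobs."""
    return {
        account_id: {**row, "access_token": credentials[account_id]["access_token"]}
        for account_id, row in accounts.items()
    }
```

Applying `downgrade` to the output of `upgrade` should return rows equal to the originals, which is the same invariant the integration test asserts against Postgres.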
-## one example Gmail account response +## one example Gmail account response proving secret removal ```json { @@ -147,7 +135,7 @@ Implement Sprint 5O: a narrow read-only Gmail connector seam that can persist us } ``` -## one example single-message ingestion response +## one example Gmail ingestion response through the hardened path ```json { @@ -180,7 +168,7 @@ Implement Sprint 5O: a narrow read-only Gmail connector seam that can persist us }, "summary": { "total_count": 1, - "total_characters": 90, + "total_characters": 19, "media_type": "message/rfc822", "chunking_rule": "normalized_utf8_text_fixed_window_1000_chars_v1", "order": ["sequence_no_asc", "id_asc"] @@ -190,8 +178,8 @@ Implement Sprint 5O: a narrow read-only Gmail connector seam that can persist us ## blockers/issues -- No code blockers remained. -- Full integration verification required local Postgres access outside the default sandbox. +- No remaining code blockers. +- Integration verification required local Postgres access outside the default sandbox. ## what remains intentionally deferred to later milestones @@ -200,11 +188,12 @@ Implement Sprint 5O: a narrow read-only Gmail connector seam that can persist us - attachment ingestion - write-capable Gmail actions - Calendar connector scope -- OAuth UX or callback UI +- OAuth UI or callback handling +- refresh-token lifecycle work - compile-contract changes -- runner-style orchestration +- runner orchestration - UI work ## recommended next step -Open a follow-up sprint for credential hardening and a fuller Gmail auth lifecycle if the product needs more than this narrow single-message read-only ingestion path. +Add the next narrow Gmail auth milestone only if needed: refresh-token or external secret-manager support, without broadening into search, sync, Calendar, or UI in the same change. 
diff --git a/REVIEW_REPORT.md b/REVIEW_REPORT.md index f174cd2..79d764a 100644 --- a/REVIEW_REPORT.md +++ b/REVIEW_REPORT.md @@ -6,17 +6,20 @@ PASS ## criteria met -- Sprint 5O stays within scope. I found no Gmail search, mailbox sync, attachment ingestion, write-capable Gmail actions, Calendar scope, compile-contract changes, runner work, or UI work. -- The Gmail connector seam remains narrow and reuses the existing rooted workspace, artifact registration, and RFC822 chunk-ingestion pipeline. -- The earlier sanitized-path collision issue is now fixed for both sequential and concurrent access paths. Gmail ingestion takes `lock_task_artifacts()` before duplicate detection, file existence checks, and writes, matching the normal artifact registration critical section. -- Regression coverage was tightened so the duplicate-path unit test now proves lock acquisition happens before duplicate lookup, and the integration coverage still proves the second colliding request returns `409` without overwriting the first `.eml`. -- `ARCHITECTURE.md` now reflects Sprint 5O, the live `gmail_accounts` schema, the Gmail endpoints, and the narrow read-only Gmail connector flow. -- `BUILD_REPORT.md` matches the current implementation and verification state. +- Sprint 5P remains limited to Gmail credential hardening. I found no Gmail search, sync, attachments, write actions, Calendar, compile-contract, runner, or UI scope. +- Plaintext Gmail access tokens are removed from the normal `gmail_accounts` table surface. Revision `20260316_0027` backfills tokens into `gmail_account_credentials` and drops `gmail_accounts.access_token`. +- Gmail account list and detail reads remain secret-free. The Gmail account response shape is stable and the integration suite asserts `access_token` is absent from connect, list, and detail payloads. +- The existing single-message Gmail ingestion seam still works through the hardened path. 
Ingestion now resolves the token through the protected credential seam before Gmail fetches and then reuses the existing RFC822 artifact pipeline. +- Missing or invalid protected credentials fail deterministically before Gmail fetches, workspace writes, or artifact registration, with unit and integration coverage for the no-side-effects path. +- Per-user isolation remains intact through user-scoped connections, RLS on `gmail_account_credentials`, and the ownership FK binding credentials to the visible Gmail account row. +- `BUILD_REPORT.md` now matches the implemented files and current verification state. +- `ARCHITECTURE.md` now reflects Sprint 5P, the `gmail_account_credentials` table, and the protected credential lookup in the narrow Gmail flow. +- `RULES.md` now includes the narrow connector-secret handling rule. +- Migration coverage now includes a Gmail-specific round-trip test proving `20260316_0027` backfills tokens on upgrade and restores `gmail_accounts.access_token` on downgrade. - Verified locally: - - `./.venv/bin/python -m pytest tests/unit/test_gmail.py tests/unit/test_gmail_main.py` -> `12 passed in 0.27s` - - `./.venv/bin/python -m pytest tests/integration/test_gmail_accounts_api.py` -> `5 passed in 1.89s` - - `./.venv/bin/python -m pytest tests/unit` -> `409 passed in 0.63s` - - `./.venv/bin/python -m pytest tests/integration` -> `132 passed in 39.72s` + - `./.venv/bin/python -m pytest tests/unit` -> `417 passed in 0.55s` + - `./.venv/bin/python -m pytest tests/integration/test_migrations.py` -> `3 passed in 1.46s` + - `./.venv/bin/python -m pytest tests/integration` -> `134 passed in 39.86s` ## criteria missed @@ -24,12 +27,12 @@ PASS ## quality issues -- No blocking implementation, regression, or scope issues remain for Sprint 5O. +- No blocking implementation, regression, or scope issues remain for Sprint 5P. 
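The criterion that missing credentials fail before Gmail fetches or workspace writes comes down to ordering: resolve the token first, and only then perform side effects. A minimal sketch of that ordering follows, assuming the lookup, fetch, and write steps are passed in as callables. The names `ingest_message` and `CredentialMissing` are hypothetical and do not appear in this change.

```python
from typing import Callable, Optional


class CredentialMissing(RuntimeError):
    """Hypothetical error type used only in this sketch."""


def ingest_message(
    lookup_token: Callable[[str], Optional[str]],
    fetch_raw: Callable[[str, str], bytes],
    write_artifact: Callable[[str, bytes], None],
    account_id: str,
    message_id: str,
) -> int:
    """Resolve the credential before any fetch or write happens."""
    token = lookup_token(account_id)
    if token is None:
        # Nothing has been fetched or written at this point.
        raise CredentialMissing(
            f"gmail account {account_id} is missing protected credentials"
        )
    raw = fetch_raw(token, message_id)
    write_artifact(message_id, raw)
    return len(raw)
```

A test can then pass fetch and write callables that record their calls and assert that the record stays empty when the lookup returns `None`, which mirrors the no-side-effects integration test in this change.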
## regression risks -- Residual risk is limited to intentionally deferred areas already called out by the sprint packet: Gmail search, sync, attachments, write actions, broader auth lifecycle hardening, Calendar, and UI. -- Gmail access tokens are still persisted in plaintext on `gmail_accounts`. That is acceptable for this narrow sprint slice but should be hardened before broader connector rollout. +- Residual risk is limited to intentionally deferred work already called out by the sprint packet: refresh-token lifecycle, external secret-manager support, Gmail search, sync, attachments, write actions, Calendar, compile-contract changes, runner orchestration, and UI. +- The protected credential mechanism is still database-local storage rather than encryption or an external secret manager. That matches Sprint 5P and should not be described more broadly than that. ## docs issues @@ -37,7 +40,7 @@ PASS ## should anything be added to RULES.md? -- Yes. Add a durable connector-security rule stating that connector credentials should not remain in plaintext application tables beyond explicitly temporary prototype scope. +- No further rule change is required for this sprint. ## should anything update ARCHITECTURE.md? @@ -45,4 +48,4 @@ PASS ## recommended next action -- Accept Sprint 5O as complete and merge after the normal approval flow. Open a follow-up sprint for credential hardening and a fuller Gmail auth lifecycle when the product needs to move beyond this narrow read-only ingestion seam. +- Accept Sprint 5P as complete and merge after the normal approval flow. Any next Gmail work should stay narrow and focus on refresh-token or secret-manager evolution without broadening into search, sync, Calendar, or UI in the same change. diff --git a/RULES.md b/RULES.md index ad493d8..5956d87 100644 --- a/RULES.md +++ b/RULES.md @@ -24,6 +24,7 @@ - Treat Postgres as the v1 system of record unless measured constraints justify a change. 
- Task-step lineage and execution linkage must stay explicit; do not reconstruct them heuristically from broader task history. - Enforce row-level security on every user-owned table. +- Connector secrets must not be stored on normal metadata tables or exposed on read surfaces; they must use a dedicated protected storage seam. - Default memory admission to `NOOP`; promote only evidence-backed changes and preserve revision history for non-`NOOP` updates. - Apply domain and sensitivity filters before semantic retrieval. diff --git a/apps/api/alembic/versions/20260316_0027_gmail_account_credentials.py b/apps/api/alembic/versions/20260316_0027_gmail_account_credentials.py new file mode 100644 index 0000000..8f5dd87 --- /dev/null +++ b/apps/api/alembic/versions/20260316_0027_gmail_account_credentials.py @@ -0,0 +1,128 @@ +"""Move Gmail access tokens into a protected credential table.""" + +from __future__ import annotations + +from alembic import op + + +revision = "20260316_0027" +down_revision = "20260316_0026" +branch_labels = None +depends_on = None + +GMAIL_AUTH_KIND_OAUTH_ACCESS_TOKEN = "oauth_access_token" +GMAIL_PROTECTED_CREDENTIAL_KIND = "gmail_oauth_access_token_v1" + +_RLS_TABLES = ("gmail_account_credentials",) + +_UPGRADE_SCHEMA_STATEMENT = f""" + CREATE TABLE gmail_account_credentials ( + gmail_account_id uuid PRIMARY KEY REFERENCES gmail_accounts(id) ON DELETE CASCADE, + user_id uuid NOT NULL, + auth_kind text NOT NULL, + credential_blob jsonb NOT NULL, + created_at timestamptz NOT NULL DEFAULT now(), + updated_at timestamptz NOT NULL DEFAULT now(), + FOREIGN KEY (gmail_account_id, user_id) + REFERENCES gmail_accounts (id, user_id) + ON DELETE CASCADE, + CONSTRAINT gmail_account_credentials_auth_kind_check + CHECK (auth_kind = '{GMAIL_AUTH_KIND_OAUTH_ACCESS_TOKEN}'), + CONSTRAINT gmail_account_credentials_blob_shape_check + CHECK ( + jsonb_typeof(credential_blob) = 'object' + AND credential_blob ? 'credential_kind' + AND credential_blob ? 
'access_token' + AND credential_blob ->> 'credential_kind' = '{GMAIL_PROTECTED_CREDENTIAL_KIND}' + AND jsonb_typeof(credential_blob -> 'access_token') = 'string' + AND length(credential_blob ->> 'access_token') > 0 + ) + ); + + CREATE INDEX gmail_account_credentials_user_created_idx + ON gmail_account_credentials (user_id, created_at, gmail_account_id); + """ + +_UPGRADE_BACKFILL_STATEMENT = f""" + INSERT INTO gmail_account_credentials ( + gmail_account_id, + user_id, + auth_kind, + credential_blob, + created_at, + updated_at + ) + SELECT + id, + user_id, + '{GMAIL_AUTH_KIND_OAUTH_ACCESS_TOKEN}', + jsonb_build_object( + 'credential_kind', '{GMAIL_PROTECTED_CREDENTIAL_KIND}', + 'access_token', access_token + ), + created_at, + updated_at + FROM gmail_accounts; + """ + +_UPGRADE_DROP_PLAINTEXT_STATEMENTS = ( + "ALTER TABLE gmail_accounts DROP CONSTRAINT gmail_accounts_access_token_nonempty_check", + "ALTER TABLE gmail_accounts DROP COLUMN access_token", +) + +_UPGRADE_GRANT_STATEMENTS = ( + "GRANT SELECT, INSERT ON gmail_account_credentials TO alicebot_app", +) + +_UPGRADE_POLICY_STATEMENT = """ + CREATE POLICY gmail_account_credentials_is_owner ON gmail_account_credentials + USING (user_id = app.current_user_id()) + WITH CHECK (user_id = app.current_user_id()); + """ + +_DOWNGRADE_ADD_PLAINTEXT_STATEMENTS = ( + "ALTER TABLE gmail_accounts ADD COLUMN access_token text", +) + +_DOWNGRADE_BACKFILL_STATEMENT = """ + UPDATE gmail_accounts AS accounts + SET access_token = credentials.credential_blob ->> 'access_token' + FROM gmail_account_credentials AS credentials + WHERE credentials.gmail_account_id = accounts.id + """ + +_DOWNGRADE_RESTORE_CONSTRAINT_STATEMENTS = ( + "ALTER TABLE gmail_accounts ALTER COLUMN access_token SET NOT NULL", + """ + ALTER TABLE gmail_accounts + ADD CONSTRAINT gmail_accounts_access_token_nonempty_check + CHECK (length(access_token) > 0) + """, + "DROP TABLE IF EXISTS gmail_account_credentials", +) + + +def _execute_statements(statements: 
tuple[str, ...]) -> None: + for statement in statements: + op.execute(statement) + + +def _enable_row_level_security() -> None: + for table_name in _RLS_TABLES: + op.execute(f"ALTER TABLE {table_name} ENABLE ROW LEVEL SECURITY") + op.execute(f"ALTER TABLE {table_name} FORCE ROW LEVEL SECURITY") + + +def upgrade() -> None: + op.execute(_UPGRADE_SCHEMA_STATEMENT) + op.execute(_UPGRADE_BACKFILL_STATEMENT) + _execute_statements(_UPGRADE_DROP_PLAINTEXT_STATEMENTS) + _execute_statements(_UPGRADE_GRANT_STATEMENTS) + _enable_row_level_security() + op.execute(_UPGRADE_POLICY_STATEMENT) + + +def downgrade() -> None: + _execute_statements(_DOWNGRADE_ADD_PLAINTEXT_STATEMENTS) + op.execute(_DOWNGRADE_BACKFILL_STATEMENT) + _execute_statements(_DOWNGRADE_RESTORE_CONSTRAINT_STATEMENTS) diff --git a/apps/api/src/alicebot_api/contracts.py b/apps/api/src/alicebot_api/contracts.py index d614b16..035273e 100644 --- a/apps/api/src/alicebot_api/contracts.py +++ b/apps/api/src/alicebot_api/contracts.py @@ -178,6 +178,7 @@ GMAIL_PROVIDER = "gmail" GMAIL_AUTH_KIND_OAUTH_ACCESS_TOKEN = "oauth_access_token" GMAIL_READONLY_SCOPE = "https://www.googleapis.com/auth/gmail.readonly" +GMAIL_PROTECTED_CREDENTIAL_KIND = "gmail_oauth_access_token_v1" TASK_STEP_SEQUENCE_VERSION_V0 = "task_step_sequence_v0" TRACE_KIND_TASK_STEP_SEQUENCE = "task.step.sequence" TASK_STEP_CONTINUATION_VERSION_V0 = "task_step_continuation_v0" diff --git a/apps/api/src/alicebot_api/gmail.py b/apps/api/src/alicebot_api/gmail.py index 6243916..9d908d1 100644 --- a/apps/api/src/alicebot_api/gmail.py +++ b/apps/api/src/alicebot_api/gmail.py @@ -23,6 +23,7 @@ from alicebot_api.contracts import ( GMAIL_ACCOUNT_LIST_ORDER, GMAIL_AUTH_KIND_OAUTH_ACCESS_TOKEN, + GMAIL_PROTECTED_CREDENTIAL_KIND, GMAIL_PROVIDER, GMAIL_READONLY_SCOPE, GmailAccountConnectInput, @@ -63,6 +64,14 @@ class GmailMessageFetchError(RuntimeError): """Raised when the Gmail API call fails for non-deterministic upstream reasons.""" +class 
GmailCredentialNotFoundError(RuntimeError): + """Raised when Gmail protected credentials are missing for a visible account.""" + + +class GmailCredentialInvalidError(RuntimeError): + """Raised when Gmail protected credentials are malformed for a visible account.""" + + def serialize_gmail_account_row(row: GmailAccountRow) -> GmailAccountRecord: return { "id": str(row["id"]), @@ -77,6 +86,49 @@ def serialize_gmail_account_row(row: GmailAccountRow) -> GmailAccountRecord: } +def build_gmail_protected_credential_blob(*, access_token: str) -> dict[str, str]: + return { + "credential_kind": GMAIL_PROTECTED_CREDENTIAL_KIND, + "access_token": access_token, + } + + +def resolve_gmail_access_token( + store: ContinuityStore, + *, + gmail_account_id: UUID, +) -> str: + credential = store.get_gmail_account_credential_optional(gmail_account_id) + if credential is None: + raise GmailCredentialNotFoundError( + f"gmail account {gmail_account_id} is missing protected credentials" + ) + + if credential["auth_kind"] != GMAIL_AUTH_KIND_OAUTH_ACCESS_TOKEN: + raise GmailCredentialInvalidError( + f"gmail account {gmail_account_id} has invalid protected credentials" + ) + + credential_blob = credential["credential_blob"] + if not isinstance(credential_blob, dict): + raise GmailCredentialInvalidError( + f"gmail account {gmail_account_id} has invalid protected credentials" + ) + + credential_kind = credential_blob.get("credential_kind") + access_token = credential_blob.get("access_token") + if ( + credential_kind != GMAIL_PROTECTED_CREDENTIAL_KIND + or not isinstance(access_token, str) + or access_token == "" + ): + raise GmailCredentialInvalidError( + f"gmail account {gmail_account_id} has invalid protected credentials" + ) + + return access_token + + def create_gmail_account_record( store: ContinuityStore, *, @@ -97,7 +149,13 @@ def create_gmail_account_record( email_address=request.email_address, display_name=request.display_name, scope=request.scope, - access_token=request.access_token, 
+ ) + store.create_gmail_account_credential( + gmail_account_id=row["id"], + auth_kind=GMAIL_AUTH_KIND_OAUTH_ACCESS_TOKEN, + credential_blob=build_gmail_protected_credential_blob( + access_token=request.access_token, + ), ) except psycopg.errors.UniqueViolation as exc: raise GmailAccountAlreadyExistsError( @@ -215,6 +273,11 @@ def ingest_gmail_message_record( f"task workspace {request.task_workspace_id} was not found" ) + access_token = resolve_gmail_access_token( + store, + gmail_account_id=request.gmail_account_id, + ) + store.lock_task_artifacts(workspace["id"]) relative_path = build_gmail_message_artifact_relative_path( provider_account_id=account["provider_account_id"], @@ -230,7 +293,7 @@ def ingest_gmail_message_record( ) raw_bytes = fetch_gmail_message_raw_bytes( - access_token=account["access_token"], + access_token=access_token, provider_message_id=request.provider_message_id, ) diff --git a/apps/api/src/alicebot_api/main.py b/apps/api/src/alicebot_api/main.py index c5a224b..9fbe271 100644 --- a/apps/api/src/alicebot_api/main.py +++ b/apps/api/src/alicebot_api/main.py @@ -140,6 +140,8 @@ ) from alicebot_api.gmail import ( GmailAccountAlreadyExistsError, + GmailCredentialInvalidError, + GmailCredentialNotFoundError, GmailAccountNotFoundError, GmailMessageFetchError, GmailMessageNotFoundError, @@ -1373,6 +1375,8 @@ def ingest_gmail_message( return JSONResponse(status_code=404, content={"detail": str(exc)}) except GmailMessageUnsupportedError as exc: return JSONResponse(status_code=400, content={"detail": str(exc)}) + except (GmailCredentialNotFoundError, GmailCredentialInvalidError) as exc: + return JSONResponse(status_code=409, content={"detail": str(exc)}) except TaskArtifactValidationError as exc: return JSONResponse(status_code=400, content={"detail": str(exc)}) except GmailMessageFetchError as exc: diff --git a/apps/api/src/alicebot_api/store.py b/apps/api/src/alicebot_api/store.py index 4098cda..4822a44 100644 --- a/apps/api/src/alicebot_api/store.py 
+++ b/apps/api/src/alicebot_api/store.py @@ -252,7 +252,15 @@ class GmailAccountRow(TypedDict): email_address: str display_name: str | None scope: str - access_token: str + created_at: datetime + updated_at: datetime + + +class ProtectedGmailCredentialRow(TypedDict): + gmail_account_id: UUID + user_id: UUID + auth_kind: str + credential_blob: JsonObject created_at: datetime updated_at: datetime @@ -1488,7 +1496,6 @@ class LabelCountRow(TypedDict): email_address, display_name, scope, - access_token, created_at, updated_at ) @@ -1498,7 +1505,6 @@ class LabelCountRow(TypedDict): %s, %s, %s, - %s, clock_timestamp(), clock_timestamp() ) @@ -1509,7 +1515,32 @@ class LabelCountRow(TypedDict): email_address, display_name, scope, - access_token, + created_at, + updated_at + """ + +INSERT_GMAIL_ACCOUNT_CREDENTIAL_SQL = """ + INSERT INTO gmail_account_credentials ( + gmail_account_id, + user_id, + auth_kind, + credential_blob, + created_at, + updated_at + ) + VALUES ( + %s, + app.current_user_id(), + %s, + %s, + clock_timestamp(), + clock_timestamp() + ) + RETURNING + gmail_account_id, + user_id, + auth_kind, + credential_blob, created_at, updated_at """ @@ -1522,7 +1553,6 @@ class LabelCountRow(TypedDict): email_address, display_name, scope, - access_token, created_at, updated_at FROM gmail_accounts @@ -1537,7 +1567,6 @@ class LabelCountRow(TypedDict): email_address, display_name, scope, - access_token, created_at, updated_at FROM gmail_accounts @@ -1546,6 +1575,18 @@ class LabelCountRow(TypedDict): LIMIT 1 """ +GET_GMAIL_ACCOUNT_CREDENTIAL_SQL = """ + SELECT + gmail_account_id, + user_id, + auth_kind, + credential_blob, + created_at, + updated_at + FROM gmail_account_credentials + WHERE gmail_account_id = %s + """ + LIST_GMAIL_ACCOUNTS_SQL = """ SELECT id, @@ -1554,7 +1595,6 @@ class LabelCountRow(TypedDict): email_address, display_name, scope, - access_token, created_at, updated_at FROM gmail_accounts @@ -3086,17 +3126,35 @@ def create_gmail_account( email_address: str, 
display_name: str | None, scope: str, - access_token: str, ) -> GmailAccountRow: return self._fetch_one( "create_gmail_account", INSERT_GMAIL_ACCOUNT_SQL, - (provider_account_id, email_address, display_name, scope, access_token), + (provider_account_id, email_address, display_name, scope), + ) + + def create_gmail_account_credential( + self, + *, + gmail_account_id: UUID, + auth_kind: str, + credential_blob: JsonObject, + ) -> ProtectedGmailCredentialRow: + return self._fetch_one( + "create_gmail_account_credential", + INSERT_GMAIL_ACCOUNT_CREDENTIAL_SQL, + (gmail_account_id, auth_kind, Jsonb(credential_blob)), ) def get_gmail_account_optional(self, gmail_account_id: UUID) -> GmailAccountRow | None: return self._fetch_optional_one(GET_GMAIL_ACCOUNT_SQL, (gmail_account_id,)) + def get_gmail_account_credential_optional( + self, + gmail_account_id: UUID, + ) -> ProtectedGmailCredentialRow | None: + return self._fetch_optional_one(GET_GMAIL_ACCOUNT_CREDENTIAL_SQL, (gmail_account_id,)) + def get_gmail_account_by_provider_account_id_optional( self, provider_account_id: str, diff --git a/tests/integration/test_gmail_accounts_api.py b/tests/integration/test_gmail_accounts_api.py index d9aa1df..334cabd 100644 --- a/tests/integration/test_gmail_accounts_api.py +++ b/tests/integration/test_gmail_accounts_api.py @@ -7,6 +7,7 @@ from uuid import UUID, uuid4 import anyio +import psycopg import apps.api.src.alicebot_api.main as main_module from apps.api.src.alicebot_api.config import Settings @@ -240,6 +241,39 @@ def test_gmail_account_endpoints_connect_list_detail_and_isolate( assert isolated_detail_payload == { "detail": f"gmail account {create_payload['account']['id']} was not found" } + assert '"access_token":' not in json.dumps(create_payload) + assert '"access_token":' not in json.dumps(list_payload) + assert '"access_token":' not in json.dumps(detail_payload) + + with psycopg.connect(migrated_database_urls["admin"]) as conn: + with conn.cursor() as cur: + cur.execute( + 
""" + SELECT column_name + FROM information_schema.columns + WHERE table_schema = 'public' + AND table_name = 'gmail_accounts' + ORDER BY ordinal_position + """ + ) + gmail_account_columns = {row[0] for row in cur.fetchall()} + assert "access_token" not in gmail_account_columns + cur.execute( + """ + SELECT + auth_kind, + credential_blob ->> 'credential_kind', + credential_blob ->> 'access_token' + FROM gmail_account_credentials + WHERE gmail_account_id = %s + """, + (UUID(create_payload["account"]["id"]),), + ) + assert cur.fetchone() == ( + "oauth_access_token", + "gmail_oauth_access_token_v1", + "token-for-acct-owner-001", + ) def test_gmail_message_ingestion_endpoint_persists_artifact_and_chunks( @@ -379,6 +413,71 @@ def test_gmail_message_ingestion_endpoint_rejects_cross_user_workspace_access( } +def test_gmail_message_ingestion_endpoint_rejects_missing_protected_credentials_without_side_effects( + migrated_database_urls, + monkeypatch, + tmp_path, +) -> None: + owner = seed_task(migrated_database_urls["app"], email="owner@example.com") + workspace_root = tmp_path / "task-workspaces" + monkeypatch.setattr( + main_module, + "get_settings", + lambda: Settings( + database_url=migrated_database_urls["app"], + task_workspace_root=str(workspace_root), + ), + ) + + def fail_fetch(**_kwargs): + raise AssertionError("fetch_gmail_message_raw_bytes should not be called") + + monkeypatch.setattr(gmail_module, "fetch_gmail_message_raw_bytes", fail_fetch) + + _, account_payload = _connect_gmail_account( + user_id=owner["user_id"], + provider_account_id="acct-owner-001", + email_address="owner@gmail.example", + ) + _, workspace_payload = invoke_request( + "POST", + f"/v0/tasks/{owner['task_id']}/workspace", + payload={"user_id": str(owner["user_id"])}, + ) + + with psycopg.connect(migrated_database_urls["admin"]) as conn: + with conn.cursor() as cur: + cur.execute( + "DELETE FROM gmail_account_credentials WHERE gmail_account_id = %s", + 
(UUID(account_payload["account"]["id"]),), + ) + conn.commit() + + ingest_status, ingest_payload = invoke_request( + "POST", + f"/v0/gmail-accounts/{account_payload['account']['id']}/messages/msg-001/ingest", + payload={ + "user_id": str(owner["user_id"]), + "task_workspace_id": workspace_payload["workspace"]["id"], + }, + ) + + assert ingest_status == 409 + assert ingest_payload == { + "detail": ( + f"gmail account {account_payload['account']['id']} is missing protected credentials" + ) + } + artifact_file = ( + Path(workspace_payload["workspace"]["local_path"]) / "gmail" / "acct-owner-001" / "msg-001.eml" + ) + assert not artifact_file.exists() + + with user_connection(migrated_database_urls["app"], owner["user_id"]) as conn: + store = ContinuityStore(conn) + assert store.list_task_artifacts_for_task(owner["task_id"]) == [] + + def test_gmail_message_ingestion_endpoint_rejects_sanitized_path_collisions_without_overwrite( migrated_database_urls, monkeypatch, diff --git a/tests/integration/test_migrations.py b/tests/integration/test_migrations.py index e8e7011..7e4817d 100644 --- a/tests/integration/test_migrations.py +++ b/tests/integration/test_migrations.py @@ -248,6 +248,105 @@ def test_tool_execution_task_step_linkage_migration_backfills_existing_rows(data assert cur.fetchone() == ("NO",) +def test_gmail_account_credentials_migration_round_trip_preserves_tokens(database_urls): + config = make_alembic_config(database_urls["admin"]) + user_id = "00000000-0000-0000-0000-000000000101" + gmail_account_id = "00000000-0000-0000-0000-000000000102" + + command.upgrade(config, "20260316_0026") + + with psycopg.connect(database_urls["admin"]) as conn: + with conn.cursor() as cur: + cur.execute( + """ + INSERT INTO users (id, email, display_name) + VALUES (%s, 'gmail-migration@example.com', 'Gmail Migration User') + """, + (user_id,), + ) + cur.execute( + """ + INSERT INTO gmail_accounts ( + id, + user_id, + provider_account_id, + email_address, + display_name, + scope, + 
access_token + ) + VALUES ( + %s, + %s, + 'acct-migration-001', + 'owner@gmail.example', + 'Owner', + 'https://www.googleapis.com/auth/gmail.readonly', + 'token-before-hardening' + ) + """, + (gmail_account_id, user_id), + ) + conn.commit() + + command.upgrade(config, "20260316_0027") + + with psycopg.connect(database_urls["admin"]) as conn: + with conn.cursor() as cur: + cur.execute( + """ + SELECT column_name + FROM information_schema.columns + WHERE table_schema = 'public' + AND table_name = 'gmail_accounts' + AND column_name = 'access_token' + """ + ) + assert cur.fetchone() is None + cur.execute( + """ + SELECT + auth_kind, + credential_blob ->> 'credential_kind', + credential_blob ->> 'access_token' + FROM gmail_account_credentials + WHERE gmail_account_id = %s + """, + (gmail_account_id,), + ) + assert cur.fetchone() == ( + "oauth_access_token", + "gmail_oauth_access_token_v1", + "token-before-hardening", + ) + + command.downgrade(config, "20260316_0026") + + with psycopg.connect(database_urls["admin"]) as conn: + with conn.cursor() as cur: + cur.execute( + """ + SELECT column_name + FROM information_schema.columns + WHERE table_schema = 'public' + AND table_name = 'gmail_accounts' + AND column_name = 'access_token' + """ + ) + assert cur.fetchone() == ("access_token",) + cur.execute( + """ + SELECT access_token + FROM gmail_accounts + WHERE id = %s + """, + (gmail_account_id,), + ) + assert cur.fetchone() == ("token-before-hardening",) + cur.execute("SELECT to_regclass('public.gmail_account_credentials')") + assert cur.fetchone() == (None,) + + def test_migrations_upgrade_and_downgrade(database_urls): config = make_alembic_config(database_urls["admin"]) diff --git a/tests/unit/test_20260316_0027_gmail_account_credentials.py b/tests/unit/test_20260316_0027_gmail_account_credentials.py new file mode 100644 index 0000000..ef80e31 --- /dev/null +++ b/tests/unit/test_20260316_0027_gmail_account_credentials.py @@ -0,0 +1,52 @@ +from __future__ import annotations 
+ +import importlib + + +MODULE_NAME = "apps.api.alembic.versions.20260316_0027_gmail_account_credentials" + + +def load_migration_module(): + return importlib.import_module(MODULE_NAME) + + +def test_upgrade_executes_expected_statements_in_order(monkeypatch) -> None: + module = load_migration_module() + executed: list[str] = [] + + monkeypatch.setattr(module.op, "execute", executed.append) + + module.upgrade() + + assert executed == [ + module._UPGRADE_SCHEMA_STATEMENT, + module._UPGRADE_BACKFILL_STATEMENT, + *module._UPGRADE_DROP_PLAINTEXT_STATEMENTS, + *module._UPGRADE_GRANT_STATEMENTS, + "ALTER TABLE gmail_account_credentials ENABLE ROW LEVEL SECURITY", + "ALTER TABLE gmail_account_credentials FORCE ROW LEVEL SECURITY", + module._UPGRADE_POLICY_STATEMENT, + ] + + +def test_downgrade_executes_expected_statements_in_order(monkeypatch) -> None: + module = load_migration_module() + executed: list[str] = [] + + monkeypatch.setattr(module.op, "execute", executed.append) + + module.downgrade() + + assert executed == [ + *module._DOWNGRADE_ADD_PLAINTEXT_STATEMENTS, + module._DOWNGRADE_BACKFILL_STATEMENT, + *module._DOWNGRADE_RESTORE_CONSTRAINT_STATEMENTS, + ] + + +def test_gmail_account_credential_privileges_allow_only_expected_runtime_writes() -> None: + module = load_migration_module() + + assert module._UPGRADE_GRANT_STATEMENTS == ( + "GRANT SELECT, INSERT ON gmail_account_credentials TO alicebot_app", + ) diff --git a/tests/unit/test_gmail.py b/tests/unit/test_gmail.py index d602b5c..b84c945 100644 --- a/tests/unit/test_gmail.py +++ b/tests/unit/test_gmail.py @@ -7,16 +7,25 @@ import pytest from alicebot_api.artifacts import TaskArtifactAlreadyExistsError -from alicebot_api.contracts import GMAIL_READONLY_SCOPE, GmailAccountConnectInput, GmailMessageIngestInput +from alicebot_api.contracts import ( + GMAIL_PROTECTED_CREDENTIAL_KIND, + GMAIL_READONLY_SCOPE, + GmailAccountConnectInput, + GmailMessageIngestInput, +) from alicebot_api.gmail import ( 
GmailAccountAlreadyExistsError, GmailAccountNotFoundError, + GmailCredentialInvalidError, + GmailCredentialNotFoundError, GmailMessageUnsupportedError, build_gmail_message_artifact_relative_path, + build_gmail_protected_credential_blob, create_gmail_account_record, get_gmail_account_record, ingest_gmail_message_record, list_gmail_account_records, + resolve_gmail_access_token, ) from alicebot_api.workspaces import TaskWorkspaceNotFoundError @@ -41,6 +50,7 @@ class GmailStoreStub: def __init__(self) -> None: self.base_time = datetime(2026, 3, 16, 10, 0, tzinfo=UTC) self.gmail_accounts: list[dict[str, object]] = [] + self.gmail_account_credentials: dict[UUID, dict[str, object]] = {} self.task_workspaces: list[dict[str, object]] = [] self.task_artifacts: list[dict[str, object]] = [] self.operations: list[tuple[str, object]] = [] @@ -52,7 +62,6 @@ def create_gmail_account( email_address: str, display_name: str | None, scope: str, - access_token: str, ) -> dict[str, object]: row = { "id": uuid4(), @@ -61,19 +70,48 @@ def create_gmail_account( "email_address": email_address, "display_name": display_name, "scope": scope, - "access_token": access_token, "created_at": self.base_time + timedelta(minutes=len(self.gmail_accounts)), "updated_at": self.base_time + timedelta(minutes=len(self.gmail_accounts)), } self.gmail_accounts.append(row) return row + def create_gmail_account_credential( + self, + *, + gmail_account_id: UUID, + auth_kind: str, + credential_blob: dict[str, object], + ) -> dict[str, object]: + row = { + "gmail_account_id": gmail_account_id, + "user_id": next( + account["user_id"] + for account in self.gmail_accounts + if account["id"] == gmail_account_id + ), + "auth_kind": auth_kind, + "credential_blob": credential_blob, + "created_at": self.base_time + timedelta(minutes=len(self.gmail_account_credentials)), + "updated_at": self.base_time + timedelta(minutes=len(self.gmail_account_credentials)), + } + self.gmail_account_credentials[gmail_account_id] = row + 
         self.operations.append(("create_gmail_account_credential", gmail_account_id))
+        return row
+
     def get_gmail_account_optional(self, gmail_account_id: UUID) -> dict[str, object] | None:
         return next(
             (row for row in self.gmail_accounts if row["id"] == gmail_account_id),
             None,
         )
 
+    def get_gmail_account_credential_optional(
+        self,
+        gmail_account_id: UUID,
+    ) -> dict[str, object] | None:
+        self.operations.append(("get_gmail_account_credential_optional", gmail_account_id))
+        return self.gmail_account_credentials.get(gmail_account_id)
+
     def get_gmail_account_by_provider_account_id_optional(
         self,
         provider_account_id: str,
@@ -192,6 +230,45 @@ def test_create_list_and_get_gmail_account_records_are_deterministic() -> None:
         user_id=user_id,
         gmail_account_id=UUID(second["account"]["id"]),
     ) == {"account": second["account"]}
+    assert "access_token" not in first["account"]
+    assert "access_token" not in second["account"]
+
+
+def test_create_gmail_account_record_persists_protected_credential_and_hides_secret() -> None:
+    store = GmailStoreStub()
+    user_id = uuid4()
+
+    response = create_gmail_account_record(
+        store,
+        user_id=user_id,
+        request=GmailAccountConnectInput(
+            provider_account_id="acct-001",
+            email_address="owner@example.com",
+            display_name="Owner",
+            scope=GMAIL_READONLY_SCOPE,
+            access_token="token-1",
+        ),
+    )
+
+    account_id = UUID(response["account"]["id"])
+    assert response == {
+        "account": {
+            "id": str(account_id),
+            "provider": "gmail",
+            "auth_kind": "oauth_access_token",
+            "provider_account_id": "acct-001",
+            "email_address": "owner@example.com",
+            "display_name": "Owner",
+            "scope": GMAIL_READONLY_SCOPE,
+            "created_at": response["account"]["created_at"],
+            "updated_at": response["account"]["updated_at"],
+        }
+    }
+    assert store.gmail_account_credentials[account_id]["credential_blob"] == {
+        "credential_kind": GMAIL_PROTECTED_CREDENTIAL_KIND,
+        "access_token": "token-1",
+    }
+    assert store.operations == [("create_gmail_account_credential", account_id)]
 
 
 def test_create_gmail_account_record_rejects_duplicate_provider_account_id() -> None:
@@ -222,6 +299,63 @@ def test_get_gmail_account_record_raises_when_account_is_missing() -> None:
     )
 
+
+def test_resolve_gmail_access_token_reads_protected_credential() -> None:
+    store = GmailStoreStub()
+    account = create_gmail_account_record(
+        store,
+        user_id=uuid4(),
+        request=GmailAccountConnectInput(
+            provider_account_id="acct-001",
+            email_address="owner@example.com",
+            display_name="Owner",
+            scope=GMAIL_READONLY_SCOPE,
+            access_token="token-1",
+        ),
+    )["account"]
+
+    assert resolve_gmail_access_token(
+        store,
+        gmail_account_id=UUID(account["id"]),
+    ) == "token-1"
+
+
+def test_resolve_gmail_access_token_rejects_missing_and_invalid_protected_credentials() -> None:
+    store = GmailStoreStub()
+    account = create_gmail_account_record(
+        store,
+        user_id=uuid4(),
+        request=GmailAccountConnectInput(
+            provider_account_id="acct-001",
+            email_address="owner@example.com",
+            display_name="Owner",
+            scope=GMAIL_READONLY_SCOPE,
+            access_token="token-1",
+        ),
+    )["account"]
+    account_id = UUID(account["id"])
+
+    store.gmail_account_credentials.pop(account_id)
+    with pytest.raises(
+        GmailCredentialNotFoundError,
+        match=f"gmail account {account_id} is missing protected credentials",
+    ):
+        resolve_gmail_access_token(store, gmail_account_id=account_id)
+
+    store.gmail_account_credentials[account_id] = {
+        "gmail_account_id": account_id,
+        "user_id": uuid4(),
+        "auth_kind": "oauth_access_token",
+        "credential_blob": {"credential_kind": GMAIL_PROTECTED_CREDENTIAL_KIND},
+        "created_at": store.base_time,
+        "updated_at": store.base_time,
+    }
+    with pytest.raises(
+        GmailCredentialInvalidError,
+        match=f"gmail account {account_id} has invalid protected credentials",
+    ):
+        resolve_gmail_access_token(store, gmail_account_id=account_id)
+
+
 def test_ingest_gmail_message_record_writes_rfc822_artifact_and_reuses_artifact_seam(
     monkeypatch,
     tmp_path,
@@ -340,6 +474,10 @@ def fake_ingest(_store,
*, user_id: UUID, request):
     }
     assert calls["register_user_id"] == user_id
     assert calls["ingest_user_id"] == user_id
+    assert store.operations[:2] == [
+        ("create_gmail_account_credential", UUID(account["id"])),
+        ("get_gmail_account_credential_optional", UUID(account["id"])),
+    ]
 
 
 def test_ingest_gmail_message_record_rejects_unsupported_message(monkeypatch, tmp_path) -> None:
@@ -437,7 +575,8 @@ def fail_fetch(**_kwargs):
         )
 
     assert existing_file.read_bytes() == b"original"
-    assert store.operations[:2] == [
+    assert store.operations[-3:] == [
+        ("get_gmail_account_credential_optional", UUID(account["id"])),
         ("lock_task_artifacts", workspace_id),
         ("get_task_artifact_by_workspace_relative_path_optional", workspace_id),
     ]
@@ -473,3 +612,105 @@ def test_ingest_gmail_message_record_requires_visible_workspace(monkeypatch) ->
                 provider_message_id="msg-001",
             ),
         )
+
+
+def test_ingest_gmail_message_record_rejects_missing_protected_credentials_before_artifact_work(
+    monkeypatch,
+    tmp_path,
+) -> None:
+    store = GmailStoreStub()
+    user_id = uuid4()
+    workspace_id = uuid4()
+    workspace_path = (tmp_path / "workspace").resolve()
+    store.create_task_workspace(
+        task_workspace_id=workspace_id,
+        local_path=str(workspace_path),
+    )
+    account = create_gmail_account_record(
+        store,
+        user_id=user_id,
+        request=GmailAccountConnectInput(
+            provider_account_id="acct-001",
+            email_address="owner@example.com",
+            display_name="Owner",
+            scope=GMAIL_READONLY_SCOPE,
+            access_token="token-1",
+        ),
+    )["account"]
+    account_id = UUID(account["id"])
+    store.gmail_account_credentials.pop(account_id)
+
+    def fail_fetch(**_kwargs):
+        raise AssertionError("fetch_gmail_message_raw_bytes should not be called")
+
+    monkeypatch.setattr("alicebot_api.gmail.fetch_gmail_message_raw_bytes", fail_fetch)
+
+    with pytest.raises(
+        GmailCredentialNotFoundError,
+        match=f"gmail account {account_id} is missing protected credentials",
+    ):
+        ingest_gmail_message_record(
+            store,
+            user_id=user_id,
+
             request=GmailMessageIngestInput(
+                gmail_account_id=account_id,
+                task_workspace_id=workspace_id,
+                provider_message_id="msg-001",
+            ),
+        )
+
+    assert store.task_artifacts == []
+    assert not workspace_path.exists()
+    assert ("lock_task_artifacts", workspace_id) not in store.operations
+
+
+def test_ingest_gmail_message_record_rejects_invalid_protected_credentials_before_artifact_work(
+    monkeypatch,
+    tmp_path,
+) -> None:
+    store = GmailStoreStub()
+    user_id = uuid4()
+    workspace_id = uuid4()
+    workspace_path = (tmp_path / "workspace").resolve()
+    store.create_task_workspace(
+        task_workspace_id=workspace_id,
+        local_path=str(workspace_path),
+    )
+    account = create_gmail_account_record(
+        store,
+        user_id=user_id,
+        request=GmailAccountConnectInput(
+            provider_account_id="acct-001",
+            email_address="owner@example.com",
+            display_name="Owner",
+            scope=GMAIL_READONLY_SCOPE,
+            access_token="token-1",
+        ),
+    )["account"]
+    account_id = UUID(account["id"])
+    store.gmail_account_credentials[account_id]["credential_blob"] = build_gmail_protected_credential_blob(
+        access_token="",
+    )
+
+    def fail_fetch(**_kwargs):
+        raise AssertionError("fetch_gmail_message_raw_bytes should not be called")
+
+    monkeypatch.setattr("alicebot_api.gmail.fetch_gmail_message_raw_bytes", fail_fetch)
+
+    with pytest.raises(
+        GmailCredentialInvalidError,
+        match=f"gmail account {account_id} has invalid protected credentials",
+    ):
+        ingest_gmail_message_record(
+            store,
+            user_id=user_id,
+            request=GmailMessageIngestInput(
+                gmail_account_id=account_id,
+                task_workspace_id=workspace_id,
+                provider_message_id="msg-001",
+            ),
+        )
+
+    assert store.task_artifacts == []
+    assert not workspace_path.exists()
+    assert ("lock_task_artifacts", workspace_id) not in store.operations
diff --git a/tests/unit/test_gmail_main.py b/tests/unit/test_gmail_main.py
index b711e65..18fc375 100644
--- a/tests/unit/test_gmail_main.py
+++ b/tests/unit/test_gmail_main.py
@@ -9,6 +9,8 @@
 from alicebot_api.gmail import (
     GmailAccountAlreadyExistsError,
     GmailAccountNotFoundError,
+    GmailCredentialInvalidError,
+    GmailCredentialNotFoundError,
     GmailMessageFetchError,
     GmailMessageNotFoundError,
     GmailMessageUnsupportedError,
@@ -174,6 +176,44 @@ def fake_unsupported(*_args, **_kwargs):
         "detail": "gmail message msg-001 is not a supported RFC822 email"
     }
 
+    def fake_missing_credentials(*_args, **_kwargs):
+        raise GmailCredentialNotFoundError(
+            f"gmail account {gmail_account_id} is missing protected credentials"
+        )
+
+    monkeypatch.setattr(main_module, "ingest_gmail_message_record", fake_missing_credentials)
+    response = main_module.ingest_gmail_message(
+        gmail_account_id,
+        "msg-001",
+        main_module.IngestGmailMessageRequest(
+            user_id=user_id,
+            task_workspace_id=task_workspace_id,
+        ),
+    )
+    assert response.status_code == 409
+    assert json.loads(response.body) == {
+        "detail": f"gmail account {gmail_account_id} is missing protected credentials"
+    }
+
+    def fake_invalid_credentials(*_args, **_kwargs):
+        raise GmailCredentialInvalidError(
+            f"gmail account {gmail_account_id} has invalid protected credentials"
+        )
+
+    monkeypatch.setattr(main_module, "ingest_gmail_message_record", fake_invalid_credentials)
+    response = main_module.ingest_gmail_message(
+        gmail_account_id,
+        "msg-001",
+        main_module.IngestGmailMessageRequest(
+            user_id=user_id,
+            task_workspace_id=task_workspace_id,
+        ),
+    )
+    assert response.status_code == 409
+    assert json.loads(response.body) == {
+        "detail": f"gmail account {gmail_account_id} has invalid protected credentials"
+    }
+
     def fake_fetch_error(*_args, **_kwargs):
         raise GmailMessageFetchError("gmail message msg-001 could not be fetched")