diff --git a/.ai/active/SPRINT_PACKET.md b/.ai/active/SPRINT_PACKET.md
index 445936f..5f56f34 100644
--- a/.ai/active/SPRINT_PACKET.md
+++ b/.ai/active/SPRINT_PACKET.md
@@ -2,122 +2,115 @@
 ## Sprint Title
-Sprint 5A: Task Workspace Records and Provisioning
+Sprint 5B: Project Truth Compaction 01
 ## Sprint Type
-feature
+refactor
 ## Sprint Reason
-Milestone 5 should start at the workspace boundary, not at document ingestion or connectors. The repo now has the task and execution substrate needed to add one deterministic, user-scoped task workspace seam without expanding product scope.
+The live project-truth files are now materially stale and redundant relative to the accepted repo state through Sprint 5A. `ROADMAP.md` and `.ai/handoff/CURRENT_STATE.md` still describe the project as pre-Milestone-5 and pre-step-linked approval/execution/workspace work, which will degrade planning and review quality if not compacted now.
 ## Sprint Intent
-Begin Milestone 5 by adding user-scoped task workspace records plus deterministic local workspace provisioning, so later artifact handling, document ingestion, and read-only connectors have a governed workspace boundary to build on.
+Compact and synchronize the live project-truth files so Control Tower, builders, and reviewers operate from a smaller, current, non-redundant source-of-truth set without changing product scope or runtime behavior.
 ## Git Instructions
-- Branch Name: `codex/sprint-5a-task-workspaces`
+- Branch Name: `codex/refactor-project-truth-compaction-01`
 - Base Branch: `main`
-- PR Strategy: one sprint branch, one PR, no stacked PRs unless Control Tower explicitly opens a follow-up sprint on top of this branch
-- Merge Policy: squash merge only after reviewer `PASS`; if review fails, repair on the same branch until pass or explicit abandonment
+- PR Strategy: one sprint branch, one PR, no stacked PRs unless Control Tower explicitly opens a follow-up sprint
+- Merge Policy: squash merge only after reviewer `PASS` and explicit Control Tower merge approval
-## Why This Sprint
+## Why This Sprint Matters
-- Sprint 4S is implemented and passed: approvals and executions now both use explicit task-step linkage, so the Milestone 4 lifecycle substrate is in place.
-- The roadmap says workspace and artifact boundaries should land before document-heavy or connector-heavy flows rely on them.
-- The narrowest safe Milestone 5 entry slice is workspace provisioning only, not artifact indexing, document ingestion, or connectors.
-- This keeps sequencing boring and maintainable by establishing the workspace boundary first.
+- `ROADMAP.md` and `.ai/handoff/CURRENT_STATE.md` are behind the accepted repo state.
+- `ARCHITECTURE.md` now reflects Sprint 5A, so the other live truth artifacts need to catch up and shed stale milestone text.
+- A narrow compaction sprint is safer than letting outdated truth leak into future planning or review work.
+- This restores clean project truth without changing product scope.
 ## In Scope
-- Add schema and migration support for:
-  - `task_workspaces`
-- Define typed contracts for:
-  - workspace create responses
-  - workspace list responses
-  - workspace detail responses
-- Implement a minimal workspace seam that:
-  - provisions one deterministic local workspace path for a visible task
-  - persists one user-scoped workspace record linked to that task
-  - validates the workspace path is rooted under one configured workspace base directory
-  - prevents duplicate active workspace creation for the same task
-  - exposes deterministic list and detail reads
-- Implement the minimal API or service paths needed for:
-  - creating a workspace for a task
-  - listing workspaces
-  - reading one workspace by id
-- Add unit and integration tests for:
-  - workspace creation
-  - deterministic path generation
-  - duplicate-create rejection for the same task
-  - per-user isolation
-  - stable response shape
+- compact and synchronize `.ai/handoff/CURRENT_STATE.md`
+- compact and synchronize `ROADMAP.md`
+- prune `RULES.md` only if it contains stale or duplicate guidance after truth sync
+- slim `README.md` only if it duplicates active planning truth instead of onboarding
+- archive stale planning/history docs into `docs/archive/` when they are no longer appropriate as live context
+- update internal references so canonical files point to the right archive locations
+- update `ARCHITECTURE.md` only if a stale duplicate or outdated boundary statement remains after the Sprint 5A truth sync
 ## Out of Scope
-- No artifact inventory or artifact metadata table yet.
-- No document ingestion.
-- No chunking, embeddings, or document retrieval.
-- No Gmail or Calendar connector scope.
-- No runner-style orchestration.
-- No new proxy handlers or broader side-effect expansion.
+- new product features
+- source code changes unrelated to doc-link or archive-link integrity
+- UI improvements
+- backend refactors
+- new architecture decisions unless a current truth file is factually inaccurate
+- changing roadmap priorities beyond removing stale historical clutter
+- artifact, document, connector, or runner implementation work
-## Required Deliverables
+## Files / Modules In Scope
-- Migration for `task_workspaces`.
-- Stable workspace create/list/detail contracts.
-- Minimal deterministic task-workspace provisioning and persistence path.
-- Unit and integration coverage for provisioning, path safety, duplicate protection, and isolation.
-- Updated `BUILD_REPORT.md` with exact verification results and explicit deferred scope.
+- `.ai/handoff/CURRENT_STATE.md`
+- `ROADMAP.md`
+- `RULES.md` only if needed for stale/duplicate guidance cleanup
+- `README.md` only if needed for onboarding/truth separation
+- `docs/archive/**`
+- `ARCHITECTURE.md` only if duplicate or stale sections must be cleaned after truth sync
+
+## Constraints
+
+- do not delete information unless it is safely archived
+- preserve historical traceability
+- do not change product scope
+- do not change runtime behavior
+- keep canonical files concise, current, and durable
+- prefer archive over deletion
+- use `PRODUCT_BRIEF.md`, `ARCHITECTURE.md`, accepted build/review reports, and the implemented repo state as the truth basis
+
+## Relevant Rules
+
+- active sprint packet is the top scope boundary for implementation work
+- never represent planned architecture as implemented behavior
+- roadmap should be future-facing once historical milestone state has been distilled elsewhere
+- rules should contain only durable reusable guidance
+- when live context becomes noisy, reduce and archive instead of letting stale state accumulate
+
+## Design Source of Truth
+
+- `DESIGN_SYSTEM.md` if it is later introduced
+- otherwise N/A for this sprint
+
+## Architecture Source of Truth
+
+- `ARCHITECTURE.md`
 ## Acceptance Criteria
-- A client can provision one user-scoped workspace for a visible task.
-- Every workspace record stores a deterministic local path under the configured workspace root.
-- Duplicate active workspace creation for the same task is rejected deterministically.
-- Workspace list and detail reads are deterministic and user-scoped.
-- `./.venv/bin/python -m pytest tests/unit` passes.
-- `./.venv/bin/python -m pytest tests/integration` passes.
-- No artifact indexing, document ingestion, connector, runner, handler-expansion, or broader side-effect scope enters the sprint.
-
-## Implementation Constraints
-
-- Keep the workspace seam narrow and boring.
-- Provision only local workspace boundaries; do not invent remote storage abstractions in this sprint.
-- Keep workspace paths deterministic, explicit, and rooted under one configured base directory.
-- Reuse existing task ownership and isolation seams rather than creating a parallel authorization path.
-- Do not add artifact scanning, file sync, or document parsing in the same sprint.
-
-## Suggested Work Breakdown
-
-1. Add `task_workspaces` schema and migration.
-2. Define workspace create/list/detail contracts.
-3. Implement deterministic workspace path generation rooted under the configured base directory.
-4. Implement workspace create, list, and detail behavior with duplicate protection.
-5. Add unit and integration tests.
-6. Update `BUILD_REPORT.md` with executed verification.
-
-## Build Report Requirements
-
-`BUILD_REPORT.md` must include:
-- the exact workspace schema and contract changes introduced
-- the configured workspace root and path-generation rule used
-- exact commands run
-- unit and integration test results
-- one example workspace create response
-- one example workspace detail response
-- what remains intentionally deferred to later milestones
-
-## Review Focus
-
-`REVIEW_REPORT.md` should verify:
-- the sprint stayed limited to task workspace records and provisioning
-- workspace paths are deterministic, rooted safely, and user-scoped
-- duplicate protection, ordering, and isolation are test-backed
-- no hidden artifact indexing, document ingestion, connector, runner, handler-expansion, or broader side-effect scope entered the sprint
-
-## Exit Condition
-
-This sprint is complete when the repo can provision deterministic user-scoped task workspace records under a configured local workspace root, expose stable workspace reads, and verify the full path with Postgres-backed tests, while still deferring artifact handling, document ingestion, and connector work.
+- `.ai/handoff/CURRENT_STATE.md` is concise, current, and non-redundant through Sprint 5A.
+- `ROADMAP.md` no longer presents stale pre-Sprint-5 state and is future-facing from the current repo position.
+- `RULES.md` contains only durable rules and no stale scope-era leftovers.
+- Any stale planning/history material moved out of live context is archived under `docs/archive/`.
+- All archive links and references resolve correctly.
+- No product behavior, scope, or runtime code was changed.
+- Control Tower can plan from a smaller, cleaner active context set after this sprint.
+
+## Required Tests
+
+- manual review of canonical files for duplication, staleness, and truth alignment
+- link/path sanity check for moved archive files
+- confirm no runtime or schema behavior changed
+- run docs/path validation only if any existing tooling references moved files
+
+## Docs To Update
+
+- `.ai/handoff/CURRENT_STATE.md`
+- `ROADMAP.md`
+- `RULES.md` if needed
+- `README.md` if needed
+- `ARCHITECTURE.md` only if stale duplication remains
+
+## Definition of Done
+
+The live project-truth files are smaller, cleaner, and aligned to the accepted repo state through Sprint 5A; stale planning/history material is preserved in archive; and the next sprint can be planned from a trustworthy active context set.
diff --git a/.ai/handoff/CURRENT_STATE.md b/.ai/handoff/CURRENT_STATE.md
index be3a187..a220a6e 100644
--- a/.ai/handoff/CURRENT_STATE.md
+++ b/.ai/handoff/CURRENT_STATE.md
@@ -1,53 +1,46 @@
 # Current State
-## What Exists Today
+## Canonical Truth
-- Canonical project docs now describe the shipped repo state through Sprint 4O.
-- `apps/api` implements the accepted backend seams for continuity, tracing, context compilation, governed memory, memory review, embeddings, semantic retrieval, entities, policies, tools, approvals, approved proxy execution, execution budgets, execution review, tasks, task steps, and explicit manual continuation lineage.
-- The live schema now includes continuity tables, trace tables, memory tables, embedding tables, entity tables, governance tables, plus `tasks` and `task_steps`.
-- `apps/web` and `workers` remain starter scaffolds only; no workspace UI, runner, or background-job orchestration is shipped.
+- The accepted repo state is current through Sprint 5A.
+- Use [PRODUCT_BRIEF.md](/Users/samirusani/Desktop/Codex/AliceBot/PRODUCT_BRIEF.md) for product scope, [ARCHITECTURE.md](/Users/samirusani/Desktop/Codex/AliceBot/ARCHITECTURE.md) for implemented technical boundaries, [ROADMAP.md](/Users/samirusani/Desktop/Codex/AliceBot/ROADMAP.md) for forward planning, and [RULES.md](/Users/samirusani/Desktop/Codex/AliceBot/RULES.md) for durable operating rules.
+- Historical build and review reports have been moved under [docs/archive/sprints](/Users/samirusani/Desktop/Codex/AliceBot/docs/archive/sprints).
-## Stable / Trusted Areas
+## Implemented Repo Slice
-- Immutable event log and persisted trace model with per-user isolation.
-- Deterministic context compilation and deterministic prompt assembly over durable sources.
-- Governed memory admission, narrow deterministic explicit-preference extraction, explicit embedding storage, semantic retrieval, and deterministic hybrid memory merge during compile.
-- Deterministic policy evaluation, tool allowlist evaluation, tool routing, approval persistence, approval resolution, approved-only `proxy.echo` execution, durable execution review, and execution-budget enforcement.
-- Durable task and task-step reads, deterministic task-step sequencing, explicit task-step transitions, and explicit manual continuation with lineage validated against the parent step outcome.
-- Sprint 4O review verification:
-  - `./.venv/bin/python -m pytest tests/unit` -> `284 passed`
-  - `./.venv/bin/python -m pytest tests/integration` -> `95 passed`
+- `apps/api` is the only shipped product surface.
It implements continuity, tracing, deterministic context compilation, governed memory admission and review, embeddings, semantic retrieval, entities, policy and tool governance, approval persistence and resolution, approved-only `proxy.echo` execution, execution budgets, task/task-step lifecycle reads and mutations, explicit manual continuation lineage, explicit task-step linkage for approval and execution synchronization, and deterministic rooted local task-workspace provisioning.
+- The live schema includes continuity, trace, memory, embedding, entity, governance, `tasks`, `task_steps`, and `task_workspaces` tables with row-level security on user-owned data.
+- `apps/web` and `workers` remain starter scaffolds only.
-## Incomplete / At-Risk Areas
+## Current Boundaries
-- Auth beyond DB user context is still unimplemented.
-- Memory extraction and retrieval quality remain major ship-gating risks.
-- Document ingestion, scoped task workspaces, artifact handling, and read-only connectors have not started in code.
-- The current multi-step boundary is still narrow: approval-resolution and execution-synchronization helpers continue to target `task_steps.sequence_no = 1`, even though manual continuation is now implemented for later steps.
+- Task workspaces are implemented only as deterministic rooted local directories plus durable `task_workspaces` records.
+- The shipped multi-step task path is still explicit and narrow: later steps are appended manually with lineage, while approval and execution synchronization use explicit linked `task_step_id` references.
+- The only execution handler in the repo is the in-process no-external-I/O `proxy.echo` path.
-## Current Milestone Position
+## Not Implemented
-- The repo has completed the implementation planned through Milestone 4.
-- Milestone 5 has not started in shipped code.
-- The project is at a truth-sync checkpoint before Milestone 5 entry.
+- Artifact storage or indexing beyond the local workspace boundary.
+- Document ingestion, chunking, or document retrieval.
+- Read-only Gmail or Calendar connectors.
+- Runner-style orchestration or automatic multi-step progression.
+- Auth beyond the current database user-context model.
-## Latest State Summary
+## Active Risks
-- Local runtime assets exist for Docker Compose, Postgres bootstrap, API startup, migrations, and backend tests.
-- `POST /v0/approvals/requests` now creates one durable task plus one initial task step for each routed governed request, with task and task-step lifecycle traces.
-- `GET /v0/tasks`, `GET /v0/tasks/{task_id}`, `GET /v0/tasks/{task_id}/steps`, and `GET /v0/task-steps/{task_step_id}` expose durable task/task-step review reads with deterministic ordering.
-- `POST /v0/tasks/{task_id}/steps` now appends exactly one manual continuation step when the latest step is appendable and explicit lineage points to that latest visible parent step.
-- `POST /v0/task-steps/{task_step_id}/transition` now advances only the latest visible step through the explicit status graph and keeps the parent task status synchronized.
-- Task-step lineage is trace-visible through `task.step.continuation.request`, `task.step.continuation.lineage`, and `task.step.continuation.summary` events.
+- Memory extraction and retrieval quality remain the main product risk.
+- Auth is still incomplete beyond database user context.
+- Workspace provisioning is intentionally narrow and local; broader artifact and document flows still need their own accepted seams.
-## Critical Constraints
+## Latest Accepted Verification
-- Do not treat planned workspace, connector, runner, or broader side-effect work as implemented.
-- Do not bypass approval boundaries for consequential actions.
-- Do not replace compiled durable context with raw transcript stuffing.
-- Appended task steps must carry explicit lineage; do not infer provenance heuristically from task history.
-- Keep the current multi-step boundary explicit until the first-step lifecycle helpers are removed or constrained.
+- Sprint 5A review status: `PASS`.
+- Accepted verification on March 13, 2026:
+  - `./.venv/bin/python -m pytest tests/unit` -> `315 passed`
+  - `./.venv/bin/python -m pytest tests/integration` -> `99 passed`
-## Immediate Next Move
+## Planning Guardrails
-- Take the smallest follow-up sprint that removes or explicitly constrains the remaining `task_steps.sequence_no = 1` approval/execution synchronization assumptions before any runner, workspace, or connector work begins.
+- Plan from the implemented Sprint 5A repo state, not from older milestone narratives.
+- Do not describe Milestone 5 document, artifact, connector, or runner work as shipped.
+- Keep live truth files compact; archive historical detail instead of re-expanding the active context set.
diff --git a/BUILD_REPORT.md b/BUILD_REPORT.md
index c525cd9..1589baf 100644
--- a/BUILD_REPORT.md
+++ b/BUILD_REPORT.md
@@ -2,180 +2,47 @@
 ## sprint objective
-Implement Sprint 5A: Task Workspace Records and Provisioning by adding user-scoped `task_workspaces`, deterministic local workspace provisioning under one configured root, duplicate-active protection per task, and stable workspace create/list/detail reads.
+Compact and synchronize the live project-truth files so the active docs reflect the accepted repo state through Sprint 5A, reduce redundancy, and move stale sprint-history material out of the live context set.
 ## completed work
-- Added workspace schema and migration:
-  - new migration `apps/api/alembic/versions/20260313_0022_task_workspaces.py`
-  - new table `task_workspaces` with `id`, `user_id`, `task_id`, `status`, `local_path`, `created_at`, and `updated_at`
-  - user/task foreign key `(task_id, user_id) -> tasks(id, user_id)`
-  - partial unique index enforcing one active workspace per task and user
-  - RLS policy plus runtime grants limited to `SELECT, INSERT`
-- Added workspace configuration and deterministic pathing:
-  - new setting `TASK_WORKSPACE_ROOT`
-  - default workspace root: `/tmp/alicebot/task-workspaces`
-  - path-generation rule: `//`
-  - workspace provisioning validates the resolved path stays rooted under the resolved workspace root before creating the directory
-- Added typed contracts and service behavior:
-  - `TaskWorkspaceStatus`
-  - `TaskWorkspaceCreateInput`
-  - `TaskWorkspaceRecord`
-  - `TaskWorkspaceCreateResponse`
-  - `TaskWorkspaceListResponse`
-  - `TaskWorkspaceDetailResponse`
-  - new workspace service in `apps/api/src/alicebot_api/workspaces.py`
-  - duplicate active workspace creation for the same visible task now raises a deterministic conflict
-- Added minimal API paths:
-  - `POST /v0/tasks/{task_id}/workspace`
-  - `GET /v0/task-workspaces`
-  - `GET /v0/task-workspaces/{task_workspace_id}`
-- Added coverage for:
-  - deterministic path generation
-  - rooted path safety validation
-  - workspace creation
-  - duplicate-create rejection
-  - per-user isolation
-  - stable response shape
-  - migration upgrade/downgrade expectations including the new table, RLS, and privileges
-
-## exact workspace schema and contract changes introduced
-
-- Schema:
-  - `task_workspaces.id uuid PRIMARY KEY DEFAULT gen_random_uuid()`
-  - `task_workspaces.user_id uuid NOT NULL REFERENCES users(id) ON DELETE CASCADE`
-  - `task_workspaces.task_id uuid NOT NULL`
-  - `task_workspaces.status text NOT NULL CHECK (status IN ('active'))`
-  - `task_workspaces.local_path text
NOT NULL CHECK (length(local_path) > 0)`
-  - `task_workspaces.created_at timestamptz NOT NULL DEFAULT now()`
-  - `task_workspaces.updated_at timestamptz NOT NULL DEFAULT now()`
-  - `CONSTRAINT task_workspaces_task_user_fk FOREIGN KEY (task_id, user_id) REFERENCES tasks(id, user_id) ON DELETE CASCADE`
-  - `CREATE UNIQUE INDEX task_workspaces_active_task_idx ON task_workspaces (user_id, task_id) WHERE status = 'active'`
-- Store layer:
-  - `TaskWorkspaceRow`
-  - `ContinuityStore.lock_task_workspaces(...)`
-  - `ContinuityStore.create_task_workspace(...)`
-  - `ContinuityStore.get_task_workspace_optional(...)`
-  - `ContinuityStore.get_active_task_workspace_for_task_optional(...)`
-  - `ContinuityStore.list_task_workspaces(...)`
-- Contracts:
-  - `TaskWorkspaceStatus = Literal["active"]`
-  - `TaskWorkspaceCreateInput.task_id`
-  - `TaskWorkspaceCreateInput.status`
-  - `TaskWorkspaceRecord.id`
-  - `TaskWorkspaceRecord.task_id`
-  - `TaskWorkspaceRecord.status`
-  - `TaskWorkspaceRecord.local_path`
-  - `TaskWorkspaceRecord.created_at`
-  - `TaskWorkspaceRecord.updated_at`
-  - `TaskWorkspaceCreateResponse.workspace`
-  - `TaskWorkspaceListResponse.items`
-  - `TaskWorkspaceListResponse.summary`
-  - `TaskWorkspaceDetailResponse.workspace`
-
-## configured workspace root and path-generation rule used
-
-- Default configured workspace root: `/tmp/alicebot/task-workspaces`
-- Test override root: per-test temp directory via `Settings(task_workspace_root=...)`
-- Deterministic path rule: `resolved_root / str(user_id) / str(task_id)`
-- Safety rule: the resolved workspace path must remain under the resolved configured root or provisioning fails before persistence
+- Rewrote `.ai/handoff/CURRENT_STATE.md` into a compact current-state snapshot with canonical truth pointers, implemented boundaries, non-implemented boundaries, active risks, and accepted Sprint 5A verification.
+- Rewrote `ROADMAP.md` so it is future-facing from the current repo position instead of repeating milestone-history detail.
+- Pruned `RULES.md` down to durable reusable scope, safety, architecture, data, and testing rules.
+- Slimmed `README.md` to onboarding and current-slice orientation, removing stale sprint-by-sprint implementation narration.
+- Archived the prior Sprint 5A build and review reports under `docs/archive/sprints/`.
+- Removed `REVIEW_REPORT.md` from the repo root and updated live references to point at the archive location.
+- Left `ARCHITECTURE.md` unchanged because it was already aligned to the accepted Sprint 5A state.
 ## incomplete work
-- None inside Sprint 5A scope.
+- None within Sprint 5B scope.
 ## files changed
-- `apps/api/alembic/versions/20260313_0022_task_workspaces.py`
-- `apps/api/src/alicebot_api/config.py`
-- `apps/api/src/alicebot_api/contracts.py`
-- `apps/api/src/alicebot_api/main.py`
-- `apps/api/src/alicebot_api/store.py`
-- `apps/api/src/alicebot_api/workspaces.py`
-- `tests/integration/test_migrations.py`
-- `tests/integration/test_task_workspaces_api.py`
-- `tests/unit/test_20260313_0022_task_workspaces.py`
-- `tests/unit/test_config.py`
-- `tests/unit/test_main.py`
-- `tests/unit/test_task_workspace_store.py`
-- `tests/unit/test_workspaces.py`
-- `tests/unit/test_workspaces_main.py`
+- `.ai/handoff/CURRENT_STATE.md`
+- `ROADMAP.md`
+- `RULES.md`
+- `README.md`
+- `docs/archive/sprints/2026-03-13-sprint-5a-build-report.md`
+- `docs/archive/sprints/2026-03-13-sprint-5a-review-report.md`
+- `REVIEW_REPORT.md` (removed from live root)
 - `BUILD_REPORT.md`
-## exact commands run
-
-- `./.venv/bin/python -m pytest tests/unit/test_workspaces.py tests/unit/test_workspaces_main.py tests/unit/test_task_workspace_store.py tests/unit/test_20260313_0022_task_workspaces.py tests/unit/test_config.py tests/unit/test_main.py`
-- `./.venv/bin/python -m pytest tests/integration/test_task_workspaces_api.py
tests/integration/test_migrations.py`
-  - initial sandbox run failed because sandboxed localhost Postgres access was blocked
-- `./.venv/bin/python -m pytest tests/unit`
-- `./.venv/bin/python -m pytest tests/integration`
-
 ## tests run
-- `./.venv/bin/python -m pytest tests/unit/test_workspaces.py tests/unit/test_workspaces_main.py tests/unit/test_task_workspace_store.py tests/unit/test_20260313_0022_task_workspaces.py tests/unit/test_config.py tests/unit/test_main.py`
-  - passed: `56 passed in 0.50s`
-- `./.venv/bin/python -m pytest tests/integration/test_task_workspaces_api.py tests/integration/test_migrations.py`
-  - sandboxed run failed before test execution could start against Postgres: `3 errors in 0.21s`
-- `./.venv/bin/python -m pytest tests/unit`
-  - passed: `315 passed in 0.57s`
-- `./.venv/bin/python -m pytest tests/integration`
-  - passed outside the sandbox: `99 passed in 28.56s`
-
-## unit and integration test results
-
-- Unit suite:
-  - green
-  - covers config loading, migration statement order, store queries, workspace service behavior, rooted path safety, duplicate rejection, route registration, and endpoint error mapping
-- Integration suite:
-  - green
-  - covers migration upgrade/downgrade expectations, workspace API provisioning, duplicate rejection, deterministic list/detail responses, and per-user isolation against Postgres
-
-## one example workspace create response
-
-```json
-{
-  "workspace": {
-    "id": "11111111-1111-1111-1111-111111111111",
-    "task_id": "22222222-2222-2222-2222-222222222222",
-    "status": "active",
-    "local_path": "/tmp/alicebot/task-workspaces/33333333-3333-3333-3333-333333333333/22222222-2222-2222-2222-222222222222",
-    "created_at": "2026-03-13T10:00:00+00:00",
-    "updated_at": "2026-03-13T10:00:00+00:00"
-  }
-}
-```
-
-## one example workspace detail response
-
-```json
-{
-  "workspace": {
-    "id": "11111111-1111-1111-1111-111111111111",
-    "task_id": "22222222-2222-2222-2222-222222222222",
-    "status": "active",
-
"local_path": "/tmp/alicebot/task-workspaces/33333333-3333-3333-3333-333333333333/22222222-2222-2222-2222-222222222222",
-    "created_at": "2026-03-13T10:00:00+00:00",
-    "updated_at": "2026-03-13T10:00:00+00:00"
-  }
-}
-```
+- Manual review of `ARCHITECTURE.md`, `PRODUCT_BRIEF.md`, the live truth files, and the archived Sprint 5A reports for truth alignment and duplication.
+- `rg -n "REVIEW_REPORT|docs/archive/sprints|CURRENT_STATE|ROADMAP|RULES" .`
+- `find docs/archive -maxdepth 3 -type f | sort`
+- `git diff --check`
+- `git diff --stat -- .ai/handoff/CURRENT_STATE.md ROADMAP.md RULES.md README.md docs/archive/sprints BUILD_REPORT.md REVIEW_REPORT.md`
+- `test -f docs/archive/sprints/2026-03-13-sprint-5a-build-report.md && test -f docs/archive/sprints/2026-03-13-sprint-5a-review-report.md && test ! -f REVIEW_REPORT.md && echo ok`
 ## blockers/issues
-- No implementation blocker remains.
-- Verification note:
-  - Postgres-backed integration tests required unsandboxed access to `localhost:5432`; the initial sandboxed focused integration run failed with connection-permission errors before being rerun successfully outside the sandbox.
-
-## what remains intentionally deferred to later milestones
-
-- Artifact inventory and artifact metadata tables
-- Document ingestion
-- Chunking, embeddings, or document retrieval tied to workspaces
-- Gmail or Calendar connector scope
-- Runner-style orchestration
-- New proxy handlers or broader side-effect expansion
-- Any remote storage abstraction beyond the local deterministic workspace boundary added here
+- No implementation blockers.
+- No runtime or schema tests were run because this sprint was intentionally docs-only and the sprint packet required manual truth review plus path/link sanity checks, not behavior changes.
 ## recommended next step
-Build the next workspace-dependent milestone slice on top of this boundary without widening the seam: artifact or document work should consume `task_workspaces` records and the configured rooted local path instead of inventing a parallel storage contract.
+Plan the next sprint from the compact live truth set now in `PRODUCT_BRIEF.md`, `ARCHITECTURE.md`, `ROADMAP.md`, `RULES.md`, and `.ai/handoff/CURRENT_STATE.md`; pull historical detail from `docs/archive/sprints/` only when needed.
diff --git a/README.md b/README.md
index a16d490..d294e1b 100644
--- a/README.md
+++ b/README.md
@@ -1,73 +1,41 @@
 # AliceBot
-AliceBot is a private, permissioned personal AI operating system. The repository now includes the runnable foundation slice plus the first tracing/context-compilation seam, the first governed memory/admissions-and-embeddings slice, the first deterministic response-generation seam, the first governance routing seam for non-executing tool requests, the first durable approval-request persistence seam for `approval_required` routing outcomes, the explicit approval-resolution seam, the first minimal approved-only proxy-execution seam, the first durable execution-review seam over that proxy path, the narrow execution-budget lifecycle seam over approved proxy execution, and the first deterministic task-workspace provisioning seam: local infrastructure, an API scaffold, migration tooling, continuity primitives, persisted traces, a deterministic continuity-only compiler, explicit memory admission, a narrow deterministic explicit-preference extraction path, explicit embedding-config and memory-embedding storage paths, a direct semantic memory retrieval primitive, deterministic hybrid compile-path memory merge, a no-tools model invocation path over deterministically assembled prompts, deterministic policy and tool-governance seams, a narrow no-side-effect proxy handler path, durable `tool_executions` records, durable `execution_budgets` records,
durable `task_workspaces` records, execution-budget create/list/detail reads, budget deactivate/supersede lifecycle operations, active-only budget enforcement, budget-blocked execution persistence, task-workspace create/list/detail reads, and backend verification coverage.
+AliceBot is a private, permissioned personal AI operating system. The current repo contains the accepted backend slice through Sprint 5A plus local developer tooling.
-## Status
+## Current Implemented Slice
-- Local Docker Compose infrastructure is defined for Postgres with `pgvector`, Redis, and MinIO.
-- `apps/api` contains FastAPI health, compile, response-generation, memory-admission, explicit-preference extraction, semantic-memory-retrieval, policy, tool-registry, tool-allowlist, tool-routing, approval-request, approval-resolution, proxy-execution, execution-budget, execution-review, task, and task-workspace endpoints, configuration loading, Alembic migrations, continuity storage primitives, the Sprint 2A trace/compiler path, the Sprint 3A memory-admission path, the Sprint 3I deterministic extraction path, the Sprint 3K embedding substrate, the Sprint 3L semantic retrieval primitive, the Sprint 3M compile-path semantic retrieval adoption, the Sprint 3N deterministic hybrid memory merge, the Sprint 4A deterministic prompt-assembly and no-tools response path, the Sprint 4D deterministic non-executing tool-routing seam, the Sprint 4E durable approval-request persistence seam, the Sprint 4F approval-resolution seam, the Sprint 4G minimal approved-only proxy-execution seam, the Sprint 4H durable execution-review seam, the Sprint 4I execution-budget guard seam, the Sprint 4J execution-budget lifecycle seam, the Sprint 4K time-windowed execution-budget seam, the Sprint 4S explicit execution-to-task-step linkage seam, and the Sprint 5A task-workspace provisioning seam.
-- `apps/web` and `workers` contain minimal starter scaffolds for later milestone work.
-- The active sprint is documented in [.ai/active/SPRINT_PACKET.md](/Users/samirusani/Desktop/Codex/AliceBot/.ai/active/SPRINT_PACKET.md). +- `apps/api` is the shipped surface. It includes continuity storage, tracing, deterministic context compilation, governed memory admission and review, embeddings, semantic retrieval, entities, policy and tool governance, approval persistence and resolution, approved-only `proxy.echo` execution, execution budgets, tasks, task steps, explicit manual continuation lineage, step-linked approval/execution synchronization, and deterministic rooted local task-workspace provisioning. +- `apps/web` and `workers` are starter scaffolds only. +- Task workspaces are currently local rooted directories plus durable records. Artifact indexing, document ingestion, connectors, and runner orchestration are not shipped. ## Quick Start 1. Create a local env file: `cp .env.example .env` -2. Start required infrastructure with one command: `docker compose up -d` -3. Create a project virtualenv and install Python dependencies: `python3 -m venv .venv && ./.venv/bin/python -m pip install -e '.[dev]'` -4. Run database migrations: `./scripts/migrate.sh` -5. Start the API locally: `./scripts/api_dev.sh` +2. Start infrastructure: `docker compose up -d` +3. Create a virtualenv and install dependencies: `python3 -m venv .venv && ./.venv/bin/python -m pip install -e '.[dev]'` +4. Apply migrations: `./scripts/migrate.sh` +5. Start the API: `./scripts/api_dev.sh` -The health endpoint is exposed at [http://127.0.0.1:8000/healthz](http://127.0.0.1:8000/healthz). -The minimal context-compilation API path is `POST /v0/context/compile`. -The minimal response-generation API path is `POST /v0/responses`. -The minimal memory-admission API path is `POST /v0/memories/admit`. -The explicit-preference extraction API path is `POST /v0/memories/extract-explicit-preferences`. -The minimal non-executing tool-routing API path is `POST /v0/tools/route`. 
-The minimal approval API paths are `POST /v0/approvals/requests`, `GET /v0/approvals`, `GET /v0/approvals/{approval_id}`, `POST /v0/approvals/{approval_id}/approve`, `POST /v0/approvals/{approval_id}/reject`, and `POST /v0/approvals/{approval_id}/execute`.
-The execution-budget API paths are `POST /v0/execution-budgets`, `GET /v0/execution-budgets`, `GET /v0/execution-budgets/{execution_budget_id}`, `POST /v0/execution-budgets/{execution_budget_id}/deactivate`, and `POST /v0/execution-budgets/{execution_budget_id}/supersede`.
-The execution-review API paths are `GET /v0/tool-executions` and `GET /v0/tool-executions/{execution_id}`.
-The task-workspace API paths are `POST /v0/tasks/{task_id}/workspace`, `GET /v0/task-workspaces`, and `GET /v0/task-workspaces/{task_workspace_id}`.
-The helper scripts load the repo-root `.env` automatically and prefer `.venv/bin/python` when that virtualenv exists, falling back to `python3` otherwise. The default migration/admin URL targets the same local `alicebot` database as the app runtime.
-`/healthz` currently performs a live Postgres check only. Redis and MinIO are reported as configured endpoints with `not_checked` status.
-`TASK_WORKSPACE_ROOT` controls the single rooted base directory used for deterministic local task-workspace provisioning. By default it is `/tmp/alicebot/task-workspaces`, and each workspace path is created as `<root>/<user_id>/<task_id>`.
-The current backend path has been verified in a local developer environment with `docker compose up -d`, `./scripts/migrate.sh`, `./.venv/bin/python -m pytest tests/unit tests/integration`, a live `GET /healthz`, and the Postgres-backed `POST /v0/context/compile`, `POST /v0/responses`, `POST /v0/memories/admit`, `POST /v0/memories/extract-explicit-preferences`, `POST /v0/memories/semantic-retrieval`, `POST /v0/tools/allowlist/evaluate`, `POST /v0/tools/route`, `POST /v0/approvals/requests`, `POST /v0/approvals/{approval_id}/execute`, `POST /v0/execution-budgets`, `GET /v0/execution-budgets`, `POST /v0/execution-budgets/{execution_budget_id}/deactivate`, `POST /v0/execution-budgets/{execution_budget_id}/supersede`, `GET /v0/tool-executions`, and `GET /v0/tool-executions/{execution_id}` integration paths, including compile requests that explicitly enable the hybrid memory merge, response requests that persist assistant events and response traces, deterministic non-executing tool-routing requests that persist `tool.route.*` traces, approval-request persistence requests that persist `approval.request.*` traces plus durable approval rows only for `approval_required` outcomes, approved proxy execution that persists `tool.proxy.execute.*` traces plus durable `tool_executions` rows for approved execution attempts, deterministic budget-management requests over durable `execution_budgets` rows, lifecycle requests that persist `execution_budget.lifecycle.*` traces and change budget status deterministically, budget-prechecked proxy execution that emits `tool.proxy.execute.budget` trace events against active budgets only, and execution-review reads over those durable records including budget-blocked attempts. 
+Useful checks: -## Repo Structure +- API health: [http://127.0.0.1:8000/healthz](http://127.0.0.1:8000/healthz) +- Full backend tests: `./.venv/bin/python -m pytest tests/unit tests/integration` +- Web shell: `pnpm --dir apps/web dev` -- [PRODUCT_BRIEF.md](/Users/samirusani/Desktop/Codex/AliceBot/PRODUCT_BRIEF.md): permanent product truth. -- [ARCHITECTURE.md](/Users/samirusani/Desktop/Codex/AliceBot/ARCHITECTURE.md): permanent technical truth. -- [ROADMAP.md](/Users/samirusani/Desktop/Codex/AliceBot/ROADMAP.md): milestone sequence and delivery risks. -- [RULES.md](/Users/samirusani/Desktop/Codex/AliceBot/RULES.md): durable engineering and scope rules. -- [.ai/handoff/CURRENT_STATE.md](/Users/samirusani/Desktop/Codex/AliceBot/.ai/handoff/CURRENT_STATE.md): fresh-thread recovery snapshot. -- [.ai/active/SPRINT_PACKET.md](/Users/samirusani/Desktop/Codex/AliceBot/.ai/active/SPRINT_PACKET.md): current builder sprint. -- `docker-compose.yml`: local Postgres, Redis, and MinIO stack. -- `infra/postgres/init/`: Postgres bootstrap SQL, including the non-superuser app role. -- `apps/api/`: FastAPI app, config, continuity store, and Alembic migrations. -- `apps/web/`: minimal Next.js shell for later dashboard work. -- `workers/`: placeholder Python worker package for future background jobs. -- `tests/`: unit and Postgres-backed integration tests for the foundation slice. -- `scripts/`: local development and migration entrypoints. - -## Essential Commands +## Repo Map -- `docker compose up -d`: start Postgres, Redis, and MinIO on `127.0.0.1`. -- `./scripts/dev_up.sh`: start local infrastructure, wait for Postgres and role bootstrap readiness, and apply Alembic migrations. -- `./scripts/migrate.sh`: apply Alembic migrations with the admin database URL from `.env` or the built-in defaults. -- `./scripts/api_dev.sh`: run the FastAPI service with auto-reload. -- `./.venv/bin/python -m pytest tests/unit tests/integration`: run backend tests from the project virtualenv. 
-- `pnpm --dir apps/web dev`: start the web shell after frontend dependencies are installed. +- [PRODUCT_BRIEF.md](/Users/samirusani/Desktop/Codex/AliceBot/PRODUCT_BRIEF.md): stable product scope and ship gates. +- [ARCHITECTURE.md](/Users/samirusani/Desktop/Codex/AliceBot/ARCHITECTURE.md): implemented technical boundaries and planned-later boundaries. +- [ROADMAP.md](/Users/samirusani/Desktop/Codex/AliceBot/ROADMAP.md): forward-looking milestone direction from the current repo position. +- [RULES.md](/Users/samirusani/Desktop/Codex/AliceBot/RULES.md): durable engineering and scope rules. +- [.ai/handoff/CURRENT_STATE.md](/Users/samirusani/Desktop/Codex/AliceBot/.ai/handoff/CURRENT_STATE.md): compact current-state recovery snapshot. +- [.ai/active/SPRINT_PACKET.md](/Users/samirusani/Desktop/Codex/AliceBot/.ai/active/SPRINT_PACKET.md): active builder scope. +- [docs/archive/sprints](/Users/samirusani/Desktop/Codex/AliceBot/docs/archive/sprints): archived sprint build and review history. ## Environment Notes -- Postgres is the system of record and the live schema now includes continuity tables, trace tables, policy-governance tables including `approvals`, `tool_executions`, and `execution_budgets`, task lifecycle tables including `tasks`, `task_steps`, and `task_workspaces`, memory tables, entity tables, and the embedding substrate tables `embedding_configs` and `memory_embeddings`. -- Sprint 2A adds persisted `traces` and `trace_events` plus a deterministic continuity-only context compiler over existing durable continuity records. -- Sprint 3A adds governed `memories` and append-only `memory_revisions` plus an explicit `NOOP`-first admission path over cited source events. -- The app and migration defaults both target the local `alicebot` database to keep quick-start behavior deterministic. -- `TASK_WORKSPACE_ROOT` defaults to `/tmp/alicebot/task-workspaces` and defines the only allowed root for deterministic local task-workspace provisioning. 
-- Local service ports are bound to `127.0.0.1` by default to avoid exposing fixed development credentials on non-loopback interfaces. -- Redis is reserved for future queue, lock, and cache work; no retrieval or orchestration features are enabled in this sprint. -- MinIO provides the local S3-compatible endpoint for future document and artifact storage. -- Continuity tables enforce row-level security from the start and `events` are append-only by application contract plus database trigger, with concurrent appends serialized per thread. -- Trace tables follow the same per-user isolation model, with append-only `trace_events` for compiler explainability. -- Memory admission remains explicit and evidence-backed, automatic extraction is currently limited to a narrow deterministic explicit-preference path over stored user messages, and the repo now includes explicit versioned embedding-config storage, direct memory-embedding persistence, a direct semantic retrieval API over active durable memories, compile-path hybrid memory merge into one `context_pack["memories"]` section with `memory_summary.hybrid_retrieval` metadata, one deterministic no-tools response path that assembles prompts from durable compiled context and persists assistant replies plus response traces, one deterministic approval-request persistence path over `approval_required` tool-routing outcomes, explicit approval resolution, one minimal approved-only proxy execution path through the no-side-effect `proxy.echo` handler, durable execution-review records plus list/detail reads for approved execution attempts, one narrow deterministic execution-budget seam that can activate, deactivate, supersede, and enforce both lifetime and rolling-window limits using durable `tool_executions` history while keeping blocked attempts reviewable, and one narrow deterministic task-workspace seam that provisions rooted local workspace directories and persists durable `task_workspaces` rows. 
Broader extraction, reranking, external-connector tool execution, artifact indexing, document ingestion, orchestration, and review UI remain deferred. -- The runtime database role is limited to `SELECT`/`INSERT` on continuity and trace tables, `SELECT`/`INSERT` on `memory_revisions`, `memory_review_labels`, `embedding_configs`, `entities`, and `entity_edges`, plus `SELECT`/`INSERT`/`UPDATE` on `consents`, `memories`, `memory_embeddings`, and `execution_budgets`. +- Postgres is the system of record. +- Local Docker Compose includes Postgres with `pgvector`, Redis, and MinIO. +- The helper scripts source the repo-root `.env` and prefer `.venv/bin/python` when present. +- `TASK_WORKSPACE_ROOT` defaults to `/tmp/alicebot/task-workspaces` and is the only allowed root for task-workspace provisioning. +- `/healthz` performs a live Postgres check; Redis and MinIO are reported as configured but not live-checked. diff --git a/REVIEW_REPORT.md b/REVIEW_REPORT.md index 0fc8efe..ce0e041 100644 --- a/REVIEW_REPORT.md +++ b/REVIEW_REPORT.md @@ -6,19 +6,14 @@ PASS ## criteria met -- The sprint stayed inside the Sprint 5A boundary. I found no artifact indexing, document ingestion, connector work, runner orchestration, new proxy handlers, or broader side-effect expansion. -- `apps/api/alembic/versions/20260313_0022_task_workspaces.py` adds the required `task_workspaces` schema with user ownership, task linkage through `(task_id, user_id)`, row-level security, and a partial unique index enforcing one active workspace per task and user. -- The workspace seam in `apps/api/src/alicebot_api/workspaces.py` is narrow and deterministic: it resolves one configured root, builds the path as `resolved_root / user_id / task_id`, rejects rooted-path escapes before provisioning, and persists a single active workspace row. 
-- Stable create/list/detail contracts and the minimal API surface are present for the required endpoints: - - `POST /v0/tasks/{task_id}/workspace` - - `GET /v0/task-workspaces` - - `GET /v0/task-workspaces/{task_workspace_id}` -- Duplicate active workspace creation is rejected deterministically through the advisory lock plus active-workspace lookup, with the database unique index providing backstop enforcement. -- User isolation, deterministic ordering, and stable response shape are test-backed in both unit and Postgres-backed integration coverage, including `tests/integration/test_task_workspaces_api.py`. -- `BUILD_REPORT.md` accurately describes the schema change, contract change, rooted path rule, exact commands, sample responses, and deferred scope. -- Independent verification passed: - - `./.venv/bin/python -m pytest tests/unit` -> `315 passed in 0.62s` - - `./.venv/bin/python -m pytest tests/integration` -> `99 passed in 28.66s` +- `.ai/handoff/CURRENT_STATE.md` is materially smaller, current through Sprint 5A, and now points readers to the canonical live truth files instead of repeating stale milestone-state detail. +- `ROADMAP.md` is future-facing from the shipped Sprint 5A position and no longer presents the repo as pre-Milestone-5. +- `RULES.md` was pruned to durable reusable guidance only; stale sprint-era scope narration was removed. +- `README.md` is now focused on onboarding, current slice orientation, and canonical file pointers instead of duplicating active planning truth. +- Stale sprint-history material was archived under `docs/archive/sprints/`, and the archived Sprint 5A build/review reports were preserved intact. +- Archive references resolve correctly in the live docs, and the expected archive files exist on disk. +- No product behavior, schema, or runtime code changed in the tracked diff; the modified tracked files are docs only. 
+- The live truth set is smaller and cleaner, and it now aligns with the implemented Sprint 5A state already described in `ARCHITECTURE.md`. ## criteria missed @@ -26,27 +21,27 @@ PASS ## quality issues -- Non-blocking: `create_task_workspace_record()` provisions the directory before the insert is durably committed and uses `mkdir(..., exist_ok=True)`. If the insert or transaction commit fails after directory creation, the code can leave behind an orphaned directory that a later successful create would silently reuse. +- None blocking. +- Process note: the prior root `REVIEW_REPORT.md` was correctly archived and removed from the live context set; this file is the new current review artifact for Sprint 5B. ## regression risks -- Runtime regression risk is low because both acceptance suites passed and the workspace behavior is covered at service, route, migration, and integration boundaries. -- Operational note: Postgres-backed integration tests require unsandboxed localhost access. The sandboxed run fails with `Operation not permitted` against `localhost:5432`, which matches the note in `BUILD_REPORT.md`. -- The main residual behavior risk is filesystem/database drift if provisioning fails after directory creation. +- Low risk. The tracked diff is docs-only, and `git diff --name-only -- '*.py' '*.ts' '*.tsx' '*.js' '*.jsx' '*.sql' '*.yaml' '*.yml' '*.toml' '*.json' '*.sh' 'Dockerfile*'` returned no runtime-file changes. +- The main residual risk is future doc drift if later sprints re-expand live truth files instead of continuing to archive stale history. ## docs issues -- None. `README.md`, `ARCHITECTURE.md`, and `.env.example` all reflect the Sprint 5A workspace seam and deferred scope accurately. +- None. The live docs are internally consistent with the accepted Sprint 5A architecture and with the archived sprint history. ## should anything be added to RULES.md? -- No. 
The current rules already cover sprint scope control, doc accuracy, and schema/test expectations for this slice. +- No. The revised rules already capture the durable guidance this sprint was meant to preserve: truth accuracy, archive-over-delete, scope control, and testing expectations. ## should anything update ARCHITECTURE.md? -- No further update is needed for Sprint 5A. +- No. `ARCHITECTURE.md` already matches the accepted Sprint 5A implemented boundary, and this sprint appropriately treated it as the architecture source of truth. ## recommended next action -- Accept Sprint 5A. -- In the next workspace-dependent sprint, tighten provisioning hygiene so filesystem creation cannot drift from durable row persistence on failure. +- Accept Sprint 5B and plan the next sprint from `PRODUCT_BRIEF.md`, `ARCHITECTURE.md`, `ROADMAP.md`, `RULES.md`, and `.ai/handoff/CURRENT_STATE.md`. +- Keep future sprint build/review artifacts under `docs/archive/sprints/` so the live context set stays compact. diff --git a/ROADMAP.md b/ROADMAP.md index ed7c2ba..a963227 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -1,97 +1,39 @@ # Roadmap -## Current State +## Current Position -- The repo has shipped the implementation slices originally planned as Milestones 1 through 4. -- Sprint 4O added the latest accepted backend seam: durable `tasks` and `task_steps` with explicit manual continuation lineage and deterministic task-step transitions. -- The project is no longer at Foundation. The current repo state is a post-Milestone-4 checkpoint, and this sprint is synchronizing project-truth docs before Milestone 5 work begins. -- No task runner, workspace/artifact layer, document ingestion, read-only connector, or broader side-effect surface has landed yet. +- The accepted repo state is current through Sprint 5A. 
+- The backend foundation through governance, execution review, task/task-step lifecycle, explicit manual continuation, step-linked approval/execution synchronization, and deterministic rooted task-workspace provisioning is already shipped. +- This roadmap is future-facing from that position; milestone history lives in archived sprint reports, not here. -## Completed Milestones +## Next Delivery Focus -### Milestone 1: Foundation +### Finish Milestone 5 On Top Of The Shipped Workspace Boundary -- Repo scaffold, local Docker Compose infra, FastAPI app shell, config loading, migration tooling, and backend test harness. -- Postgres continuity primitives: `users`, `threads`, `sessions`, and append-only `events`. -- Row-level-security foundation and concurrent event sequencing hardening. +- Add artifact records and artifact-handling rules that reuse `task_workspaces` instead of inventing a parallel storage seam. +- Add document ingestion and retrieval only after the artifact/workspace boundary is explicit and reviewable. +- Add read-only Gmail and Calendar connectors only after document and workspace boundaries remain deterministic under the current governance model. -Status on March 13, 2026: -- Complete. +### Preserve Current Governance And Task Guarantees -### Milestone 2: Context Compiler and Tracing +- Keep approvals, execution budgets, task/task-step state, and trace visibility deterministic as new Milestone 5 work lands. +- Do not widen the current no-external-I/O proxy surface or introduce new consequential side effects without an explicit sprint opening that scope. -- Deterministic context compilation over durable continuity records. -- Persisted `traces` and append-only `trace_events`. -- Trace-visible inclusion and exclusion reasoning for compiled context. +## After Milestone 5 -Status on March 13, 2026: -- Complete. - -### Milestone 3: Memory and Retrieval - -- Governed memory admission with append-only revisions. 
-- Narrow deterministic explicit-preference extraction from stored user events. -- Memory review labels, review queue reads, and evaluation summary reads. -- Explicit entities and temporal entity edges backed by cited memories. -- Versioned embedding configs, durable memory embeddings, direct semantic retrieval, and deterministic hybrid compile-path memory merge. - -Status on March 13, 2026: -- Complete. - -### Milestone 4: Governance and Safe Action - -- Deterministic response generation over compiled context. -- User-scoped consents, policies, policy evaluation, tool registry, allowlist evaluation, and tool routing. -- Durable approval requests and explicit approval resolution. -- Approved-only proxy execution through the in-process `proxy.echo` handler. -- Durable execution review, execution-budget enforcement, lifecycle mutations, and optional rolling-window limits. -- Durable `tasks` and `task_steps`, deterministic task-step reads, explicit task-step transitions, and explicit manual continuation with lineage. - -Status on March 13, 2026: -- Complete through Sprint 4O. - -## Current Milestone Position - -- The repo is at the boundary after Milestone 4. -- Milestone 5 has not started in shipped code yet. -- The immediate work is documentation synchronization and narrow lifecycle-boundary hardening so Milestone 5 planning and review start from truthful artifacts. - -## Next Milestones - -### Immediate Next Narrow Boundary - -- Preserve the current manual-continuation seam as the only shipped multi-step task path. -- Remove or explicitly constrain the remaining approval/execution helpers that still synchronize against `task_steps.sequence_no = 1` before starting runner-style orchestration or workspace-heavy task flows. - -### Milestone 5: Documents, Workspaces, and Read-Only Connectors - -- Add document ingestion and chunk retrieval. -- Add scoped task workspaces and artifact handling. -- Add read-only Gmail and Calendar sync. 
-- Keep connector scope read-only and approval-aware. - -### Sequencing After Milestone 5 - -- Generalize task lifecycle handling beyond the current manual continuation seam. -- Introduce runner-style orchestration only after the first-step lifecycle assumption is removed. -- Expand tool execution breadth only after the governance and task seams stay deterministic under multi-step flows. +- Revisit broader task orchestration only after the current explicit task-step seams remain stable under workspace, artifact, and document flows. +- Expand tool execution breadth only after governance, review, and budget controls still hold under the wider task surface. +- Address production-facing auth and deployment hardening as the product approaches broader real-world use. ## Dependencies -- Truth artifacts must stay synchronized before milestone planning and review work can be trusted. -- The current first-step lifecycle assumption must be resolved before broader runner or workspace work can safely depend on `tasks` / `task_steps`. -- Scoped workspace and artifact boundaries should land before document-heavy or connector-heavy flows rely on them. -- Connector scope should remain deferred until the core memory, governance, and task seams stay stable under the shipped workload. - -## Blockers and Risks - -- Memory extraction and retrieval quality remain the biggest product risk. -- Auth beyond DB user context is still unimplemented. -- The remaining first-step approval/execution synchronization helpers are a forward-compatibility risk for broader multi-step orchestration. -- Workspace or connector work could create hidden scope drift if it starts before the current task-lifecycle boundary is hardened. +- Live truth docs must stay synchronized with accepted repo state so sprint planning does not start from stale assumptions. +- Artifact and document work should build on the existing rooted local workspace contract. +- Connector work should remain read-only and approval-aware. 
+- Runner-style orchestration should stay deferred until the repo no longer depends on narrow current-step assumptions for safety and explainability. -## Recently Completed +## Ongoing Risks -- Durable approval, execution review, and execution-budget seams over the approved proxy path. -- Durable `tasks` and `task_steps` with deterministic reads and status transitions. -- Explicit task-step lineage and manual continuation, including adversarial validation for cross-task, cross-user, and parent-step mismatch cases. +- Memory extraction and retrieval quality remain the largest product risk. +- Auth beyond database user context is still missing. +- Milestone 5 can drift if artifact, document, connector, and orchestration work are mixed into one sprint instead of landing as narrow seams. diff --git a/RULES.md b/RULES.md index f6ac44b..6c02e05 100644 --- a/RULES.md +++ b/RULES.md @@ -1,52 +1,37 @@ # Rules -## Product / Scope Rules +## Truth And Scope + +- The active sprint packet is the top scope boundary for implementation work. +- Never describe planned behavior as already implemented. +- Keep canonical truth files concise, current, and durable. +- Archive stale planning or history material instead of deleting it when traceability still matters. +- Do not widen product scope without an explicit roadmap or sprint change. + +## Product And Safety -- The active sprint packet is the top priority scope boundary for implementation work and overrides broader roadmap intent when they conflict. -- Never represent planned architecture as implemented behavior in docs, handoffs, or build reports. - Never execute a consequential external action without explicit user approval. -- Always treat explainability as a product feature, not an internal debugging aid. +- Treat explainability as a product feature, not an internal debugging aid. - Treat the repeat magnesium reorder as the v1 ship-gate scenario. 
-- Never expand v1 scope with proactive automation, write-capable connectors, voice, or browser automation without an explicit roadmap change. -- Do not start runner, workspace/artifact, document-ingestion, or connector work unless the active sprint explicitly opens that boundary. +- Do not add proactive automation, write-capable connectors, voice, or browser automation without an explicit roadmap change. -## Architecture Rules +## Architecture And Data -- Treat the immutable event store as ground truth; memories, tasks, and summaries are derived or governed views over durable records. +- Treat the immutable event store as ground truth; downstream memories, tasks, and summaries are derived or governed views. - Always compile context per invocation from durable sources. -- Keep prompt prefixes, tool schemas, and serialized context ordering deterministic. -- Treat Postgres as the v1 system of record unless measured constraints justify a platform split. -- Appended task steps must carry explicit lineage to a prior visible task step. Do not relink approvals or executions heuristically from broader task history. -- Manual continuation is the current multi-step boundary. Until the older first-step lifecycle helpers are removed or constrained, do not describe broader automatic multi-step orchestration as implemented. - -## Coding Rules - -- Always build against typed contracts and migration-backed schemas first. -- Never mutate tool schemas mid-session; enforce access through policy and proxy layers. -- Keep changes small, module-scoped, and test-backed. -- Stop long-running tasks with a clear progress summary when budgets or circuit breakers trip. -- Sprint-scoped docs must clearly separate what exists now from what is only planned later. - -## Data / Schema Rules - -- Enforce row-level security on every user-owned table from the start. -- Default memory admission to `NOOP`; promote only evidence-backed changes. 
-- Always keep memory revision history for non-`NOOP` changes. -- Task-step lineage references must stay inside the current user scope and must validate against the intended parent step and its recorded outcome. +- Keep prompt assembly, tool schemas, and serialized context ordering deterministic. +- Treat Postgres as the v1 system of record unless measured constraints justify a change. +- Task-step lineage and execution linkage must stay explicit; do not reconstruct them heuristically from broader task history. +- Enforce row-level security on every user-owned table. +- Default memory admission to `NOOP`; promote only evidence-backed changes and preserve revision history for non-`NOOP` updates. - Apply domain and sensitivity filters before semantic retrieval. -## Deployment / Ops Rules - -- Keep v1 operations simple: one modular monolith, one primary database, one cache, one object store. -- Never store secrets in source control, committed config, or logs. -- Any repo-advertised bootstrap script that starts dependencies and then runs dependent commands must wait for service readiness before proceeding. -- When external side effects are introduced, route them through approval-aware tool execution paths. -- Backups and object versioning are required before production use. - -## Testing Rules +## Delivery And Testing +- Build against typed contracts and migration-backed schemas first. +- Keep changes small, module-scoped, and test-backed. +- Never bypass policy, approval, or proxy boundaries to introduce side effects. - Schema changes are not complete without forward and rollback coverage. - Every module needs unit tests and at least one integration boundary test. -- Approval boundaries, RLS isolation, and audit logging require adversarial tests. -- Lineage changes require adversarial tests for cross-task, cross-user, and parent-step mismatch cases. -- Memory quality and retrieval quality need labeled evaluations before release claims. 
+- Approval boundaries, row-level security, audit logging, and lineage changes require adversarial tests.
+- Do not make memory-quality or retrieval-quality release claims without labeled evaluation evidence.
diff --git a/docs/archive/sprints/2026-03-13-sprint-5a-build-report.md b/docs/archive/sprints/2026-03-13-sprint-5a-build-report.md
new file mode 100644
index 0000000..c525cd9
--- /dev/null
+++ b/docs/archive/sprints/2026-03-13-sprint-5a-build-report.md
@@ -0,0 +1,181 @@
+# BUILD_REPORT
+
+## sprint objective
+
+Implement Sprint 5A: Task Workspace Records and Provisioning by adding user-scoped `task_workspaces`, deterministic local workspace provisioning under one configured root, duplicate-active protection per task, and stable workspace create/list/detail reads.
+
+## completed work
+
+- Added workspace schema and migration:
+  - new migration `apps/api/alembic/versions/20260313_0022_task_workspaces.py`
+  - new table `task_workspaces` with `id`, `user_id`, `task_id`, `status`, `local_path`, `created_at`, and `updated_at`
+  - user/task foreign key `(task_id, user_id) -> tasks(id, user_id)`
+  - partial unique index enforcing one active workspace per task and user
+  - RLS policy plus runtime grants limited to `SELECT, INSERT`
+- Added workspace configuration and deterministic pathing:
+  - new setting `TASK_WORKSPACE_ROOT`
+  - default workspace root: `/tmp/alicebot/task-workspaces`
+  - path-generation rule: `<root>/<user_id>/<task_id>`
+  - workspace provisioning validates the resolved path stays rooted under the resolved workspace root before creating the directory
+- Added typed contracts and service behavior:
+  - `TaskWorkspaceStatus`
+  - `TaskWorkspaceCreateInput`
+  - `TaskWorkspaceRecord`
+  - `TaskWorkspaceCreateResponse`
+  - `TaskWorkspaceListResponse`
+  - `TaskWorkspaceDetailResponse`
+  - new workspace service in `apps/api/src/alicebot_api/workspaces.py`
+  - duplicate active workspace creation for the same visible task now raises a deterministic conflict
+- Added minimal API paths:
+ - `POST /v0/tasks/{task_id}/workspace` + - `GET /v0/task-workspaces` + - `GET /v0/task-workspaces/{task_workspace_id}` +- Added coverage for: + - deterministic path generation + - rooted path safety validation + - workspace creation + - duplicate-create rejection + - per-user isolation + - stable response shape + - migration upgrade/downgrade expectations including the new table, RLS, and privileges + +## exact workspace schema and contract changes introduced + +- Schema: + - `task_workspaces.id uuid PRIMARY KEY DEFAULT gen_random_uuid()` + - `task_workspaces.user_id uuid NOT NULL REFERENCES users(id) ON DELETE CASCADE` + - `task_workspaces.task_id uuid NOT NULL` + - `task_workspaces.status text NOT NULL CHECK (status IN ('active'))` + - `task_workspaces.local_path text NOT NULL CHECK (length(local_path) > 0)` + - `task_workspaces.created_at timestamptz NOT NULL DEFAULT now()` + - `task_workspaces.updated_at timestamptz NOT NULL DEFAULT now()` + - `CONSTRAINT task_workspaces_task_user_fk FOREIGN KEY (task_id, user_id) REFERENCES tasks(id, user_id) ON DELETE CASCADE` + - `CREATE UNIQUE INDEX task_workspaces_active_task_idx ON task_workspaces (user_id, task_id) WHERE status = 'active'` +- Store layer: + - `TaskWorkspaceRow` + - `ContinuityStore.lock_task_workspaces(...)` + - `ContinuityStore.create_task_workspace(...)` + - `ContinuityStore.get_task_workspace_optional(...)` + - `ContinuityStore.get_active_task_workspace_for_task_optional(...)` + - `ContinuityStore.list_task_workspaces(...)` +- Contracts: + - `TaskWorkspaceStatus = Literal["active"]` + - `TaskWorkspaceCreateInput.task_id` + - `TaskWorkspaceCreateInput.status` + - `TaskWorkspaceRecord.id` + - `TaskWorkspaceRecord.task_id` + - `TaskWorkspaceRecord.status` + - `TaskWorkspaceRecord.local_path` + - `TaskWorkspaceRecord.created_at` + - `TaskWorkspaceRecord.updated_at` + - `TaskWorkspaceCreateResponse.workspace` + - `TaskWorkspaceListResponse.items` + - `TaskWorkspaceListResponse.summary` + - 
`TaskWorkspaceDetailResponse.workspace` + +## configured workspace root and path-generation rule used + +- Default configured workspace root: `/tmp/alicebot/task-workspaces` +- Test override root: per-test temp directory via `Settings(task_workspace_root=...)` +- Deterministic path rule: `resolved_root / str(user_id) / str(task_id)` +- Safety rule: the resolved workspace path must remain under the resolved configured root or provisioning fails before persistence + +## incomplete work + +- None inside Sprint 5A scope. + +## files changed + +- `apps/api/alembic/versions/20260313_0022_task_workspaces.py` +- `apps/api/src/alicebot_api/config.py` +- `apps/api/src/alicebot_api/contracts.py` +- `apps/api/src/alicebot_api/main.py` +- `apps/api/src/alicebot_api/store.py` +- `apps/api/src/alicebot_api/workspaces.py` +- `tests/integration/test_migrations.py` +- `tests/integration/test_task_workspaces_api.py` +- `tests/unit/test_20260313_0022_task_workspaces.py` +- `tests/unit/test_config.py` +- `tests/unit/test_main.py` +- `tests/unit/test_task_workspace_store.py` +- `tests/unit/test_workspaces.py` +- `tests/unit/test_workspaces_main.py` +- `BUILD_REPORT.md` + +## exact commands run + +- `./.venv/bin/python -m pytest tests/unit/test_workspaces.py tests/unit/test_workspaces_main.py tests/unit/test_task_workspace_store.py tests/unit/test_20260313_0022_task_workspaces.py tests/unit/test_config.py tests/unit/test_main.py` +- `./.venv/bin/python -m pytest tests/integration/test_task_workspaces_api.py tests/integration/test_migrations.py` + - initial sandbox run failed because sandboxed localhost Postgres access was blocked +- `./.venv/bin/python -m pytest tests/unit` +- `./.venv/bin/python -m pytest tests/integration` + +## tests run + +- `./.venv/bin/python -m pytest tests/unit/test_workspaces.py tests/unit/test_workspaces_main.py tests/unit/test_task_workspace_store.py tests/unit/test_20260313_0022_task_workspaces.py tests/unit/test_config.py tests/unit/test_main.py` + - passed: 
`56 passed in 0.50s` +- `./.venv/bin/python -m pytest tests/integration/test_task_workspaces_api.py tests/integration/test_migrations.py` + - sandboxed run failed before test execution could start against Postgres: `3 errors in 0.21s` +- `./.venv/bin/python -m pytest tests/unit` + - passed: `315 passed in 0.57s` +- `./.venv/bin/python -m pytest tests/integration` + - passed outside the sandbox: `99 passed in 28.56s` + +## unit and integration test results + +- Unit suite: + - green + - covers config loading, migration statement order, store queries, workspace service behavior, rooted path safety, duplicate rejection, route registration, and endpoint error mapping +- Integration suite: + - green + - covers migration upgrade/downgrade expectations, workspace API provisioning, duplicate rejection, deterministic list/detail responses, and per-user isolation against Postgres + +## one example workspace create response + +```json +{ + "workspace": { + "id": "11111111-1111-1111-1111-111111111111", + "task_id": "22222222-2222-2222-2222-222222222222", + "status": "active", + "local_path": "/tmp/alicebot/task-workspaces/33333333-3333-3333-3333-333333333333/22222222-2222-2222-2222-222222222222", + "created_at": "2026-03-13T10:00:00+00:00", + "updated_at": "2026-03-13T10:00:00+00:00" + } +} +``` + +## one example workspace detail response + +```json +{ + "workspace": { + "id": "11111111-1111-1111-1111-111111111111", + "task_id": "22222222-2222-2222-2222-222222222222", + "status": "active", + "local_path": "/tmp/alicebot/task-workspaces/33333333-3333-3333-3333-333333333333/22222222-2222-2222-2222-222222222222", + "created_at": "2026-03-13T10:00:00+00:00", + "updated_at": "2026-03-13T10:00:00+00:00" + } +} +``` + +## blockers/issues + +- No implementation blocker remains. 
+- Verification note: + - Postgres-backed integration tests required unsandboxed access to `localhost:5432`; the initial sandboxed focused integration run failed with connection-permission errors before being rerun successfully outside the sandbox. + +## what remains intentionally deferred to later milestones + +- Artifact inventory and artifact metadata tables +- Document ingestion +- Chunking, embeddings, or document retrieval tied to workspaces +- Gmail or Calendar connector scope +- Runner-style orchestration +- New proxy handlers or broader side-effect expansion +- Any remote storage abstraction beyond the local deterministic workspace boundary added here + +## recommended next step + +Build the next workspace-dependent milestone slice on top of this boundary without widening the seam: artifact or document work should consume `task_workspaces` records and the configured rooted local path instead of inventing a parallel storage contract. diff --git a/docs/archive/sprints/2026-03-13-sprint-5a-review-report.md b/docs/archive/sprints/2026-03-13-sprint-5a-review-report.md new file mode 100644 index 0000000..0fc8efe --- /dev/null +++ b/docs/archive/sprints/2026-03-13-sprint-5a-review-report.md @@ -0,0 +1,52 @@ +# REVIEW_REPORT + +## verdict + +PASS + +## criteria met + +- The sprint stayed inside the Sprint 5A boundary. I found no artifact indexing, document ingestion, connector work, runner orchestration, new proxy handlers, or broader side-effect expansion. +- `apps/api/alembic/versions/20260313_0022_task_workspaces.py` adds the required `task_workspaces` schema with user ownership, task linkage through `(task_id, user_id)`, row-level security, and a partial unique index enforcing one active workspace per task and user. 
+- The workspace seam in `apps/api/src/alicebot_api/workspaces.py` is narrow and deterministic: it resolves one configured root, builds the path as `resolved_root / user_id / task_id`, rejects rooted-path escapes before provisioning, and persists a single active workspace row. +- Stable create/list/detail contracts and the minimal API surface are present for the required endpoints: + - `POST /v0/tasks/{task_id}/workspace` + - `GET /v0/task-workspaces` + - `GET /v0/task-workspaces/{task_workspace_id}` +- Duplicate active workspace creation is rejected deterministically through the advisory lock plus active-workspace lookup, with the database unique index providing backstop enforcement. +- User isolation, deterministic ordering, and stable response shape are test-backed in both unit and Postgres-backed integration coverage, including `tests/integration/test_task_workspaces_api.py`. +- `BUILD_REPORT.md` accurately describes the schema change, contract change, rooted path rule, exact commands, sample responses, and deferred scope. +- Independent verification passed: + - `./.venv/bin/python -m pytest tests/unit` -> `315 passed in 0.62s` + - `./.venv/bin/python -m pytest tests/integration` -> `99 passed in 28.66s` + +## criteria missed + +- None. + +## quality issues + +- Non-blocking: `create_task_workspace_record()` provisions the directory before the insert is durably committed and uses `mkdir(..., exist_ok=True)`. If the insert or transaction commit fails after directory creation, the code can leave behind an orphaned directory that a later successful create would silently reuse. + +## regression risks + +- Runtime regression risk is low because both acceptance suites passed and the workspace behavior is covered at service, route, migration, and integration boundaries. +- Operational note: Postgres-backed integration tests require unsandboxed localhost access. 
The sandboxed run fails with `Operation not permitted` against `localhost:5432`, which matches the note in `BUILD_REPORT.md`. +- The main residual behavior risk is filesystem/database drift if provisioning fails after directory creation. + +## docs issues + +- None. `README.md`, `ARCHITECTURE.md`, and `.env.example` all reflect the Sprint 5A workspace seam and deferred scope accurately. + +## should anything be added to RULES.md? + +- No. The current rules already cover sprint scope control, doc accuracy, and schema/test expectations for this slice. + +## should anything update ARCHITECTURE.md? + +- No further update is needed for Sprint 5A. + +## recommended next action + +- Accept Sprint 5A. +- In the next workspace-dependent sprint, tighten provisioning hygiene so filesystem creation cannot drift from durable row persistence on failure.
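For readers of these archived reports, the deterministic path rule and rooted-path safety check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `apps/api/src/alicebot_api/workspaces.py`: the names `WorkspacePathError`, `ensure_rooted`, and `build_workspace_path` are hypothetical, and the sketch only computes and validates the path without creating a directory or persisting a row.

```python
from pathlib import Path
from uuid import UUID


class WorkspacePathError(ValueError):
    """Hypothetical error: the candidate path escapes the configured root."""


def ensure_rooted(resolved_root: Path, candidate: Path) -> Path:
    # Resolve the candidate so ".." segments and symlinks are collapsed
    # before comparing it against the already-resolved root.
    resolved_candidate = candidate.resolve()
    if not resolved_candidate.is_relative_to(resolved_root):  # Python 3.9+
        raise WorkspacePathError(
            f"{resolved_candidate} is not under {resolved_root}"
        )
    return resolved_candidate


def build_workspace_path(root: Path, user_id: UUID, task_id: UUID) -> Path:
    # Deterministic rule from the build report:
    # resolved_root / str(user_id) / str(task_id)
    resolved_root = root.resolve()
    candidate = resolved_root / str(user_id) / str(task_id)
    # Validate before any directory creation or persistence would happen.
    return ensure_rooted(resolved_root, candidate)
```

Because the path is a pure function of the root, user, and task, repeated calls with the same inputs yield the same location. This sketch deliberately leaves out directory creation, so it does not address the filesystem/database drift noted in the review.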