Phase 3 SP3 — Incremental download + global dedup (refcount/GC)#17
Merged
Conversation
Brainstormed spec for Phase 3 sub-project 3. Unified diff+dedup: an existing storage_objects row for (tenant,storage,sha) => inherit via executor S3 copy_object / os.link (no download). refcount on complete, DELETE /tasks (terminal-only) deref, leader-gated GC of refcount=0 rows. Stacks on SP2 (diff before plan_task_sources). Aggressive deferrals (quota/LRU evict, physical byte GC, BLAKE3) keep it one spec->plan cycle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
12 bite-sized TDD tasks across 5 milestones (M1 schema/models/config/lint, M2 storage_objects svc, M3 diff/dedup+wiring, M4 executor inherit+DELETE+GC, M5 e2e+docs+PR). Complete code; self-reviewed. SP1/SP2 lessons baked in (drop_all->create_all+teardown fixtures, models in __init__.py, inherit in lint set first, real CI gate list, inherit double-count idempotency test). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 opus reviewers: 1 BLOCKER + 5 IMPORTANT, all fixed before impl: - BLOCKER: fully-inherited task stranded in paused_external -> T6 moves pending-subs early-return ABOVE plan_task_sources' pause gates - StorageConfig had no backend_type (local hardlink dead) -> T1 adds it, T8 threads StorageBackend.backend_type via poll - runner anchor wrong -> T8 uses _execute_subtask/self._s - storage.gc audit dropped -> T10 _gc_loop write_audit - create_task never threaded upgrade_from_revision -> T1 ctor wire+test - inherit-then-fail refcount leak -> T7 deref + re-queue to pending spec banner §7 records all 6 as authoritative rulings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…StorageConfig.backend_type
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test fixture seeds StorageBackend(id=1) — DownloadTask.storage_id FKs storage_backends; the plan's fixture omitted it (same correction Task 3 applied to the storage_objects fixture). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… inherit plan_task_sources now loads pending-only subs at the top and returns early when none remain, so a fully-inherited task never trips the pinned/no-sha/no-speed pause gates (banner 7a). Dropped an unused SourceManifest import the plan's test carried. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…erit complete_subtask records a storage_object on success (no-op when an inherit ref already exists) and, on a failed inherit copy, derefs + re-queues the file as pending (banner 7f). claim_one_subtask now also claims `inherit` subs for the lightweight copy. Deviation: test_inherit_complete_does_not_double_count seeds the sub as `assigned` not `inherit` — claim_one_subtask transitions inherit->assigned before the executor calls complete_subtask, and complete_subtask rejects non-assigned rows (scheduler.py:123). The idempotency path (record_ref_only already added the ref => record_object no-ops) is still fully exercised. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nner dispatch + poll backend_type Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eref) Two plan-defect fixes required to make the route work: - @router.delete needs response_model=None: the `-> None` return annotation makes FastAPI infer NoneType as a response model, tripping its "204 must not have a response body" assertion. - DownloadTask.subtasks needed cascade "delete" + passive_deletes=True: the relationship previously had only "save-update, merge", so session.delete(task) with subtasks loaded UPDATE-d children task_id to NULL (NOT NULL violation) instead of deferring to the FK's existing ON DELETE CASCADE. Plain "delete" (not "delete-orphan") carries no orphan-on-lazy-load risk; matches the relationship's documented intent. Test fixture also gets the now-standard intermediate await s.flush() after Project/User/StorageBackend before DownloadTask (FK ordering). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eardown Two plan gaps surfaced only by the full-suite M4 milestone E2E: - test_alembic EXPECTED_TABLES didn't include the SP3 tables the migration correctly creates (storage_objects, subtask_object_refs). - test_delete_task's app_client fixture seeded fixed-PK rows with no teardown drop_all, leaking into test_executors/test_hf_proxy (which errored in the full suite, passed isolated — the recurring session-DB cross-module collision class). Full suite now 379 passed, 0 errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixture seeds StorageBackend(id=1) (DownloadTask.storage_id FK) — same correction Tasks 5/9 applied; the plan's fixture omitted it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final-review HIGH: claim_one_subtask gated EVERY candidate (incl. `inherit`) on size + 200MB <= free_bytes. An inherit subtask materializes via a server-side S3 copy_object / local hardlink and writes zero executor local bytes, so SP3's headline scenario — a 1-file change in a huge repo producing N-1 inherit subtasks — would silently never be claimed on any realistically-sized executor (task hangs in `downloading`, no error). The 380-test suite missed it because fixtures seed ample disk + tiny files. Added a regression test proving an inherit sub bypasses the gate while a same-size real download stays disk-gated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
upgrade_from_revision+ unified diff/dedup: a file whose HF sha already has astorage_objectsrow for (tenant,storage,sha) →inheritsubtask (executor S3 copy_object / os.link, no download); else normal SP2 download. Unified with cross-task dedup (content-sha lookup, not just the named prior revision).storage_objectsrefcount (INVARIANT 14 = UNIQUE(tenant,storage,sha)) +subtask_object_refs;record_objecton success is inherit-idempotent (ref-exists ⇒ no double-count). Inherit-copy failure self-heals (deref + re-queue as pending, banner 7f).DELETE /api/v1/tasks/{id}(terminal-only → 409, tenant-scoped, RBAC) derefs; leader-gated GC loop reclaims refcount=0 rows past grace (auditedstorage.gc). Phase 3 sub-project 3 of 4 (SP1 Phase 3 SP1 — Multi-tenancy (OIDC + RBAC + tenant scoping + quota) #15, SP2 Phase 3 SP2 — Multi-source (SourceDriver + NameResolver + LPT/chunk routing) #16 merged).claim_one_subtaskdisk pre-flight would silently strand inherit subtasks (server-side copy needs zero local disk) — now bypassed for inherit, with a regression test.Test plan
uv run pytestgreen: 381 passed incl.tests/e2e/test_incremental.py(≥90% inherit, roadmap §3.5)"inherit"in VALID_SUBTASK_STATUS) / openapi (spectral 0 errors + swagger-cli) / no-direct-status-write greenSpec: docs/superpowers/specs/2026-05-19-phase-3-sp3-incremental-download-design.md
Plan: docs/superpowers/plans/2026-05-19-phase-3-sp3-incremental-download.md
🤖 Generated with Claude Code