Phase 3 SP2 — Multi-source (SourceDriver + NameResolver + LPT/chunk routing)#16

Merged
l17728 merged 23 commits into main from feat/phase-3-sp2-multi-source on May 18, 2026

@l17728 l17728 commented May 18, 2026

Summary

  • SourceDriver Protocol + HF/hf_mirror/ModelScope drivers + sources.yaml registry + 3-tier NameResolver (resolver-rules.yaml).
  • Leader-gated scheduling loop: controller-side speed probe → optimal-combo → LPT file→source + chunk-split (≥100MB, ≥2 covering sources) → persist source_id/subtask_chunks. HF sha256 authority (INV 11/12/13): no-sha/HF-absent files pinned to HF (or paused_external unless trust_non_hf_sha256); non-HF mismatch → 24h (source,repo,filename) blacklist. source_strategy/source_blacklist enforced.
  • Generalized /api/v1/source-proxy (per-source credentials held controller-side, INV 2; a strict superset of /hf-proxy: source_id=None → huggingface for back-compat); executor stream_source; minimal leader-gated rebalance of blacklisted sources' pending chunks.
  • Additive migration (3 tables + task/subtask source cols). Phase 3 sub-project 2 of 4 (SP1 merged as #15, "Phase 3 SP1: Multi-tenancy (OIDC + RBAC + tenant scoping + quota)").
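
For readers unfamiliar with the routing step, here is a minimal sketch of LPT (longest processing time first) file→source assignment and a chunk split. It is an illustration under stated assumptions, not the planner code in this PR: `FileSpec`, `lpt_assign`, `split_chunks`, the speed-proportional split, and the reading of "100MB" as 100 MiB are all hypothetical. Only the two eligibility conditions (file ≥100MB, ≥2 covering sources) come from the summary above.

```python
from dataclasses import dataclass

# Assumption: the ">=100MB" threshold from the summary, taken as 100 MiB.
CHUNK_SPLIT_MIN_BYTES = 100 * 1024 * 1024


@dataclass(frozen=True)
class FileSpec:
    name: str
    size: int                 # bytes
    sources: tuple[str, ...]  # source ids that host ("cover") this file


def lpt_assign(files, speeds):
    """Greedy LPT: largest file first, each to the source that finishes it earliest.

    speeds maps source id -> bytes/sec. Sources with no positive speed are skipped.
    Returns {file name: source id}; files with no usable source are left out.
    """
    finish = {s: 0.0 for s in speeds}  # projected busy time per source, in seconds
    plan = {}
    for f in sorted(files, key=lambda f: f.size, reverse=True):
        candidates = [s for s in f.sources if speeds.get(s, 0) > 0]
        if not candidates:
            continue  # the caller decides what to do (for example, pause)
        best = min(candidates, key=lambda s: (finish[s] + f.size / speeds[s], s))
        finish[best] += f.size / speeds[best]
        plan[f.name] = best
    return plan


def split_chunks(f, speeds):
    """Split one large file into contiguous byte ranges, one per usable source.

    Returns [(source id, start, end)] with end exclusive, or [] if the file is
    too small or fewer than two usable sources cover it.
    """
    usable = sorted(s for s in f.sources if speeds.get(s, 0) > 0)
    if f.size < CHUNK_SPLIT_MIN_BYTES or len(usable) < 2:
        return []
    total = sum(speeds[s] for s in usable)
    chunks, start = [], 0
    for i, s in enumerate(usable):
        # The last chunk absorbs rounding so the ranges cover the file exactly.
        last = i == len(usable) - 1
        end = f.size if last else start + int(f.size * speeds[s] / total)
        chunks.append((s, start, end))
        start = end
    return chunks
```

A faster source receives a proportionally larger byte range in this sketch; the real planner may weight or align chunks differently.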

Process

  • 2-reviewer pre-execution plan review → 3 BLOCKER + IMPORTANTs fixed before any code (INV-12 no-sha pin, get_settings NameError, conftest fake, zero-div, probing/strategy/blacklist scope).
  • Implementer-only per-task; controller milestone E2E + CI gates per milestone (caught + fixed: source-proxy 409 back-compat, e2e mock-seam, session-DB ordering collisions).
  • Final opus whole-impl review: MERGE-READY, no CRITICAL; 1 HIGH (chunk-multi-source Range↔row alignment) accepted as a documented, sha-safe known limitation (spec banner §7) — deferred to a follow-up.

Test plan

  • uv run pytest → 352 passed, 1 deselected (incl. E2E-002 tests/e2e/test_multi_source.py)
  • invariant_lint (tools/lint_invariants.py + test_lint_invariants.py + lint_no_direct_status_write.py) green
  • OpenAPI: spectral --fail-severity=error 0 errors + swagger-cli validate valid
  • alembic upgrade head clean; uv.lock updated (pyyaml promoted)
  • CI green on all 12 jobs

Spec: docs/superpowers/specs/2026-05-19-phase-3-sp2-multi-source-design.md
Plan: docs/superpowers/plans/2026-05-19-phase-3-sp2-multi-source.md

🤖 Generated with Claude Code

l17728 and others added 23 commits May 19, 2026 00:48
…d-test + LPT/chunk routing)

Brainstormed spec for Phase 3 sub-project 2. Greenfield source layer:
SourceDriver Protocol + HF/hf_mirror/ModelScope drivers + NameResolver
+ scheduling-phase speed-test/LPT/chunk routing + generalized source-proxy
+ HF sha256 authority (INV 11/12/13) + minimal leader-gated rebalance.
Aggressive deferrals (wisemodel/opencsg/plugin, Phase-B LP optimizer,
incremental=SP3, CLI=SP4) keep it a single spec->plan cycle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
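
The sha256-authority rule in this commit message (a mismatch on a non-HF source leads to a 24h blacklist keyed on source, repo, and filename) could look roughly like the sketch below. The class and function names are hypothetical and the in-memory dict stands in for the `source_blacklist` table; only the 24h TTL, the key shape, and the non-HF condition are taken from the PR text.

```python
import hashlib
from datetime import datetime, timedelta, timezone

BLACKLIST_TTL = timedelta(hours=24)  # the 24h window from the PR summary
HF = "huggingface"                   # Hugging Face is the sha256 authority


class SourceBlacklist:
    """In-memory stand-in for a persisted blacklist (hypothetical)."""

    def __init__(self):
        self._until = {}  # (source, repo, filename) -> expiry datetime

    def add(self, source, repo, filename, now):
        self._until[(source, repo, filename)] = now + BLACKLIST_TTL

    def is_blocked(self, source, repo, filename, now):
        until = self._until.get((source, repo, filename))
        return until is not None and now < until


def verify_download(data, expected_sha256, source, repo, filename, blacklist, now):
    """Return True if data matches the HF-provided digest.

    On a mismatch from a non-HF source, blacklist that exact
    (source, repo, filename) triple for BLACKLIST_TTL.
    """
    actual = hashlib.sha256(data).hexdigest()
    if actual == expected_sha256:
        return True
    if source != HF:
        blacklist.add(source, repo, filename, now)
    return False
```

Keying on the full triple means one corrupted file on a mirror does not take the whole mirror out of rotation for other files.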
17 bite-sized TDD tasks across 5 milestones (M1 source layer, M2 schema,
M3 planner, M4 proxy+executor+lifespan, M5 e2e+docs+PR). Complete code,
no placeholders; self-reviewed for spec coverage / type consistency.
Embeds SP1 lessons (models in __init__.py, drop_all->create_all fixtures,
make_app_with_state seeding + test_lifespan_state, real CI gate list).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 opus reviewers found 3 BLOCKER + IMPORTANTs, all fixed before impl:
- INVARIANT-12 hole: no-sha (non-LFS) files routed to non-HF unverified
  -> planner now pins expected_sha256=None files to huggingface (6a)
- executor Range <-> source-chunk mismatch + no chunked rehash
  -> chunk-aligned download contract (6b); offset-order sha is the rehash
- get_settings NameError in lifespan loops -> use _gs() (A3)
- conftest fake missing stream_source + downloader.py HF-only path
  -> add stream_source to fake, switch both downloader files (A1/A2)
- zero/all-zero speed div -> filter candidates>0, pause if empty (6c)
- probing unimplemented -> add controller-side probe_source_speed (6d)
- source_strategy/blacklist not enforced -> _strategy_filter (6e)
- 5xx/health blacklist + per-executor probe deferred to v2.1 (6d/6f)
- per-tenant HF token: matches existing hf_proxy.py, out of scope (6g)
- Task 15 reuse locked parent; pyyaml already-locked note; test fake fix

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
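
Two of the fixes listed above (6a, pinning files with `expected_sha256=None` to huggingface, and 6c, filtering out non-positive speeds before dividing) can be illustrated with a short sketch. This is one possible reading, not the planner's actual code: the function names, the empty-list and `None` pause signals, and the exact handling of `trust_non_hf_sha256` are assumptions.

```python
HF = "huggingface"


def candidate_sources(expected_sha256, covering, trust_non_hf_sha256=False):
    """Return the sources a file may be routed to; [] signals "pause".

    Assumed rule: a file with no HF-provided sha256 cannot be verified on a
    non-HF source, so it is pinned to huggingface when HF hosts it. If HF
    does not host it, it is only routable when the operator opted in.
    """
    if expected_sha256 is None:
        if HF in covering:
            return [HF]
        return list(covering) if trust_non_hf_sha256 else []
    return list(covering)


def speed_weights(candidates, speeds):
    """Normalize speeds over candidates that have a positive sample.

    Returns None when nothing is left, so the caller can pause the task
    instead of dividing by a zero total.
    """
    live = {s: speeds[s] for s in candidates if speeds.get(s, 0) > 0}
    if not live:
        return None
    total = sum(live.values())
    return {s: v / total for s, v in live.items()}
```

Filtering before summing is what removes the zero-division path: an all-zero probe result yields `None` rather than a division by zero.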
SP2 migration adds subtask_chunks/source_speed_samples/source_blacklist;
the hardcoded full-schema assertion must include them (same as SP1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tion)

- source_proxy: source_id=None defaults to huggingface driver, so
  /source-proxy is a strict superset of /hf-proxy (legacy/non-scheduled
  subtasks still work; spec §1.1). Fixes test_executor_e2e 409.
- test_executor_e2e: patch the new source_proxy._make_source_client seam
  (executor now routes via /source-proxy after Task 13).
- clean-slate (drop_all->create_all) test_tasks/tenant_scope/subtasks
  bootstraps + add teardown to test_source_proxy app_client — resolves
  session-DB cross-module Tenant(id=1) collisions (SP1-validated fix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
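
The back-compat behavior described in this commit (a missing `source_id` falls back to the huggingface driver, so legacy subtasks keep working) amounts to a small lookup rule. The sketch below is illustrative only: `resolve_driver`, `UnknownSourceError`, and the dict-based registry are hypothetical names, not the `source_proxy` module's API.

```python
DEFAULT_SOURCE_ID = "huggingface"


class UnknownSourceError(LookupError):
    """Raised when a subtask names a source that is not registered."""


def resolve_driver(source_id, registry):
    """Pick the driver for a subtask.

    source_id=None (legacy or non-scheduled subtasks) maps to the default
    huggingface driver instead of being rejected.
    """
    key = DEFAULT_SOURCE_ID if source_id is None else source_id
    try:
        return registry[key]
    except KeyError:
        raise UnknownSourceError(key) from None
```

With this rule, every request that /hf-proxy could serve is also valid against /source-proxy, which is what "strict superset" means here.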
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…er 7)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@l17728 l17728 merged commit 454ac41 into main May 18, 2026
12 checks passed
@l17728 l17728 deleted the feat/phase-3-sp2-multi-source branch May 18, 2026 18:25