Phase 2 W3b — HF reverse proxy (SEC-02 / INVARIANT 2) #13
Merged
Controller-side GET /api/v1/hf-proxy/subtask/{id} streams HF files with the
tenant token injected server-side; executors stop calling HF directly and
lose ExecutorSettings.hf_token/hf_endpoint. Zero schema changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
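A minimal sketch of what "tenant token injected server-side" could look like on the controller, together with the response-header allowlist mentioned in the Summary. The function names and the five header names are assumptions for illustration; the PR says the allowlist has five entries but does not list them. The only external fact relied on is that HF accepts `Authorization: Bearer <token>`.

```python
from typing import Mapping

# Hypothetical allowlist: the PR states that five response headers pass
# through the proxy but does not name them, so this set is an assumption.
ALLOWED_RESPONSE_HEADERS = frozenset(
    {"content-type", "content-length", "content-range", "accept-ranges", "etag"}
)


def build_upstream_headers(
    client_headers: Mapping[str, str], tenant_token: str
) -> dict[str, str]:
    """Build the headers sent to HF.

    The Authorization value always comes from the server-side tenant token.
    Nothing credential-like is copied from the executor's request.
    """
    upstream = {"Authorization": f"Bearer {tenant_token}"}
    for name, value in client_headers.items():
        # Forward only Range so partial downloads and size probes keep working.
        if name.lower() == "range":
            upstream["Range"] = value
    return upstream


def filter_response_headers(hf_headers: Mapping[str, str]) -> dict[str, str]:
    """Keep only allowlisted HF response headers (case-insensitive match)."""
    return {
        name: value
        for name, value in hf_headers.items()
        if name.lower() in ALLOWED_RESPONSE_HEADERS
    }
```

Building the upstream headers from scratch, rather than copying the executor's headers and deleting the dangerous ones, keeps the behaviour fail-closed: a header the code has never heard of is dropped by default.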
3 milestones: controller proxy endpoint, executor rewiring, integration (lint + e2e + OpenAPI + docs + PR). TDD bite-sized steps, complete code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…est (W3b) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… (W3b) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s transport once (W3b)
…W3b) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lback (W3b) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… HF fields (W3b) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ook (W3b) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reflects the corrections applied across the 9 implementation tasks: test heartbeat-before-poll, proxy client cleanup + 503 mapping, stream_hf return annotation, fake client X-Assignment-Token, the _resolve_size RuntimeError test, and the lint full-text-scan note. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
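As an illustration of the `_resolve_size` range probe and its RuntimeError branch, here is a hedged sketch of how a total file size can be derived from the response to a `Range: bytes=0-0` GET. The function name `resolve_size` and its signature are hypothetical, not the PR's code. Unlike the behaviour described in the follow-ups, this sketch only trusts `Content-Length` when the status is 200.

```python
import re
from typing import Mapping

# Content-Range for a satisfied byte range looks like "bytes 0-0/12345".
# An unknown total ("bytes 0-0/*") deliberately does not match.
_CONTENT_RANGE = re.compile(r"^bytes\s+\d+-\d+/(\d+)$")


def resolve_size(status_code: int, headers: Mapping[str, str]) -> int:
    """Return the total object size given the reply to a bytes=0-0 probe."""
    lowered = {name.lower(): value.strip() for name, value in headers.items()}

    # Preferred path: the server honoured the range and reports the total.
    if status_code == 206:
        match = _CONTENT_RANGE.match(lowered.get("content-range", ""))
        if match:
            return int(match.group(1))

    # Fallback: the server ignored Range and answered with the full body,
    # in which case Content-Length is the total size.
    content_length = lowered.get("content-length", "")
    if status_code == 200 and content_length.isdigit():
        return int(content_length)

    # Fail closed: no guessing when neither header is usable.
    raise RuntimeError(
        f"cannot resolve size: status={status_code}, "
        f"content-range={lowered.get('content-range')!r}"
    )
```

A GET with a one-byte range is used instead of HEAD because the proxy is GET-only.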
Summary
- `GET /api/v1/hf-proxy/subtask/{id}`: mTLS+JWT auth, a fail-closed verification chain (subtask exists → confused-deputy guard → assignment_token fence → epoch fence), HF URL reconstruction from server-side rows, tenant-token injection, streaming passthrough with a 5-header allowlist. Per-request httpx client closed on every path (success / HF transport error → 503 / other exception).
- `ControllerClient.stream_hf` (async-context-manager) is the only HF path; both downloaders (`HfS3StreamDownloader`, `DirectOffsetDownloader`) fetch through it; `DirectOffsetDownloader._resolve_size` became a `bytes=0-0` range probe (proxy is GET-only).
- `Assignment` gained `assignment_token`.
- `ExecutorSettings.hf_token` / `hf_endpoint` deleted; `_io.make_http_client` removed.
- New `check_no_hf_token_in_executor` invariant lint locks INVARIANT 2 for the executor package (with self-tests).
- `Settings.hf_proxy_timeout_seconds`.

Spec: `docs/superpowers/specs/2026-05-14-phase-2-w3b-hf-reverse-proxy-design.md`
Plan: `docs/superpowers/plans/2026-05-14-phase-2-w3b-hf-reverse-proxy.md`

Test plan
- `uv run pytest tests/api/test_hf_proxy.py`: 9 proxy cases (streaming, token injection, URL reconstruction, Range, 429, 401, 404, 403 NOT_YOUR_SUBTASK, 409 STALE_ASSIGNMENT / EPOCH_MISMATCH)
- `uv run pytest tests/executor/`: `stream_hf` headers, both downloaders rewired, `_resolve_size` range probe + fallback + RuntimeError branch, runner threads `assignment_token`
- `uv run pytest tests/e2e/test_executor_e2e.py`: full executor→controller-proxy→HF(mock)→S3 path
- `uv run pytest tests/tools/test_lint_no_hf_token.py` + `uv run python tools/lint_invariants.py`: the new lint passes on the production tree
- `uv run pytest -q`: full suite green (235 passed, 1 deselected)

Known minor follow-ups (non-blocking, from final review)
- `make_fake_controller_client` test double does not replicate the proxy's 5-header allowlist (latent: no current downloader reads a non-allowlisted header).
- `_resolve_size` `Content-Length` fallback trusts the value without asserting `status_code == 200` (HF is well-behaved here; downstream sha256 gate is fail-closed).

🤖 Generated with Claude Code
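For readers skimming the Summary, the fail-closed verification chain (subtask exists → confused-deputy guard → assignment_token fence → epoch fence) can be sketched as an ordered series of checks that each raise before any HF request is made. Everything here is illustrative: `SubtaskRow`, `ProxyDenied`, `verify_subtask_access`, and the `SUBTASK_NOT_FOUND` code are hypothetical names, and the pairing of each 409 code with a specific fence is an assumption based on the test plan's wording.

```python
import hmac
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SubtaskRow:
    """Hypothetical server-side view of a subtask assignment."""
    executor_id: str
    assignment_token: str
    epoch: int


class ProxyDenied(Exception):
    """Raised when a check fails; carries the HTTP status and error code."""

    def __init__(self, status: int, code: str) -> None:
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code


def verify_subtask_access(
    rows: Mapping[str, SubtaskRow],
    subtask_id: str,
    caller_id: str,
    assignment_token: str,
    epoch: int,
) -> SubtaskRow:
    """Run the checks in order; any failure stops the request."""
    row = rows.get(subtask_id)
    if row is None:
        raise ProxyDenied(404, "SUBTASK_NOT_FOUND")  # hypothetical code name
    # Confused-deputy guard: the authenticated caller must own the subtask.
    if row.executor_id != caller_id:
        raise ProxyDenied(403, "NOT_YOUR_SUBTASK")
    # Assignment fence: constant-time comparison of the presented token.
    if not hmac.compare_digest(
        row.assignment_token.encode(), assignment_token.encode()
    ):
        raise ProxyDenied(409, "STALE_ASSIGNMENT")
    # Epoch fence: reject callers holding an outdated epoch.
    if row.epoch != epoch:
        raise ProxyDenied(409, "EPOCH_MISMATCH")
    return row
```

Returning the row only after every check passes means the caller cannot reach the URL reconstruction or token injection steps with a partially verified request.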