test(infra) Stern Manuals scraper-pipeline integration tests #50

Merged: jkeeley2073 merged 1 commit into main from Dev-SternManualsScraperTestInfra on May 3, 2026

Conversation

@jkeeley2073
Contributor

Summary

Backfill of the PR #41 family-wide test-infra template against Stern's ManualsScraper. First non-Game-yielding scraper in the family backfill — manuals are documents, not games, so the tests assert .Link yield order with full provenance only, no .Game items, and no .Link.GameSlug parent-lineage (the scraper does not derive game slugs from filenames).

5 tests using the shared fakes (FakePolitenessGate + QueueingHttpMessageHandler from PR #41):

  1. ScrapeAsync_HappyPath_YieldsLinksInPageOrderWithProvenance: three same-host PDFs in page order, plus a non-PDF page link and an off-host PDF that exercise both filter branches; asserts full provenance (SourceType.ManualsPage, DiscoveryUrl, both inner and outer DiscoveryContext) and the gate-vs-wire URL equality invariant.
  2. ScrapeAsync_OneAnchorWithMissingHref_DoesNotAbortRun: adapted from the CGC template's per-page-500 test. ManualsScraper has no per-link HTTP fetch (the entire scrape is one page → many .Links parsed inline), so the per-item failure mode is at the anchor-parsing layer. The fixture mixes empty-href, whitespace-href, off-host, and non-PDF rejection cases between two valid PDFs; asserts the loop continues past every filter branch.
  3. ScrapeAsync_DiscoveryFailure_AbortsThisSourceOnly: /manuals/ returns a 500 and the scraper yields nothing without throwing; politeness invariants still hold (the 500 is reported back to the gate and the lease is disposed).
  4. ScrapeAsync_PolitenessExceptionFromGate_PropagatesUp: gate.ThrowOnAcquire throws PolitenessException; asserts propagation AND zero requests / zero reports (the throw fires before any HTTP call).
  5. ScrapeAsync_GateThrowsOnReport_BubblesUp: the symmetric case, asserting that an exception from gate.ThrowOnReport propagates.
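
The filter branches that tests 1 and 2 exercise can be illustrated with a minimal sketch. This is not ManualsScraper's actual code, which this PR does not show; `ManualLinkFilterSketch` and `KeepSameHostPdfs` are hypothetical names, and the rules (skip empty or whitespace hrefs, skip off-host URLs, skip non-PDF paths, preserve page order) are inferred from the test descriptions above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; the real filter lives inside ManualsScraper.
public static class ManualLinkFilterSketch
{
    // Returns absolute URLs for same-host PDF links, in the order they appear on the page.
    public static List<Uri> KeepSameHostPdfs(Uri pageUrl, IEnumerable<string> hrefs)
    {
        var kept = new List<Uri>();
        foreach (var href in hrefs)
        {
            // Empty or whitespace-only href: skip the anchor and keep looping.
            if (string.IsNullOrWhiteSpace(href)) continue;

            // Resolve relative hrefs against the page URL; skip anything unparseable.
            if (!Uri.TryCreate(pageUrl, href.Trim(), out var absolute)) continue;

            // Off-host link: skip.
            if (!string.Equals(absolute.Host, pageUrl.Host, StringComparison.OrdinalIgnoreCase)) continue;

            // Non-PDF link: skip.
            if (!absolute.AbsolutePath.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase)) continue;

            kept.Add(absolute);
        }
        return kept;
    }
}
```

Under these assumptions, a fixture that places rejected anchors between two valid PDFs makes the "loop continues" assertion load-bearing: if any branch aborted instead of skipping, the second PDF would never be yielded.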

Tests: 430 → 435 (+5). Build clean, zero warnings.

Adaptations from the CGC template

ManualsScraper differs from CgcGamePageScraper in three ways that shape the test surface:

  • Single yield kind: .Link only. The happy-path test removes all .Game assertions.
  • Single HTTP fetch: only /manuals/ is fetched; there are no per-link HTTP requests. Politeness assertions check Single(handler.Requests) instead of handler.Requests.Count == gate.Acquired.Count.
  • GameSlug is not derived from filenames: ManualsScraper leaves DiscoveredLink.GameSlug null. The happy-path test pins this explicitly with a comment noting that cross-linking happens later (or, per the project's known-gap list, doesn't happen at all yet).
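
The single-fetch invariant can be sketched with a recording handler. The real QueueingHttpMessageHandler from PR #41 is not shown on this page, so `QueueingHandlerSketch`, its `Enqueue` method, and `SingleFetchSketch` below are hypothetical stand-ins that assume the fake dequeues canned responses and records each request URL.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the shared fake; the real class may differ in shape.
public sealed class QueueingHandlerSketch : HttpMessageHandler
{
    private readonly Queue<HttpResponseMessage> _responses = new Queue<HttpResponseMessage>();

    // Every URL that actually reached the "wire", in order.
    public List<Uri> Requests { get; } = new List<Uri>();

    public void Enqueue(HttpStatusCode status, string body = "") =>
        _responses.Enqueue(new HttpResponseMessage(status) { Content = new StringContent(body) });

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        if (request.RequestUri is { } uri) Requests.Add(uri);
        return Task.FromResult(_responses.Dequeue());
    }
}

public static class SingleFetchSketch
{
    // Performs the one page fetch and returns how many requests the handler saw plus the status.
    public static async Task<(int requestCount, HttpStatusCode status)> FetchManualsPageAsync(
        QueueingHandlerSketch handler, Uri pageUrl)
    {
        using var client = new HttpClient(handler, disposeHandler: false);
        using var response = await client.GetAsync(pageUrl);
        return (handler.Requests.Count, response.StatusCode);
    }
}
```

With one page and no per-link fetches, a count equality against the gate adds little; asserting exactly one recorded request is the stricter check, which is presumably why the tests use Single(handler.Requests).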

The class-level <remarks> block documents these adaptations so a future reviewer can verify the divergence is intentional.

No bugs found in ManualsScraper.

Pre-push self-audit

Step 0 — Local review: Deferred to the human reviewer at merge time per the task brief.

Step 1 — 7-item mechanical checklist:

  1. Every option field is read — N/A (no new option fields). PASS.
  2. Sibling-diff for drift vs CGC template — adaptations enumerated above and documented in <remarks>. PASS.
  3. No bare catch {} — no catch blocks in test file. PASS.
  4. CLI / orchestrator wiring is end-to-end — no new ISourceScraper added; existing SourceAliasContractTests already pin ManualsScraper/manuals. PASS.
  5. Tests assert behavior, not just structure — happy-path includes filter-rejection anchors so the filter assertions are load-bearing; failure-isolation test mixes four rejection reasons between two valid PDFs; politeness-throw tests assert zero requests/reports to prove the throw fires before HTTP. PASS.
  6. Build is zero-warning. PASS.
  7. Identity check: Jim Keeley <94459922+jkeeley2073@users.noreply.github.com>. PASS.
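
Checklist item 5 relies on the ordering "acquire, then fetch, then report". A minimal synchronous sketch of why zero requests and zero reports prove the throw fired before HTTP is given below. All type and member names are hypothetical; the real FakePolitenessGate and PolitenessException signatures are not shown in this PR, and the real scraper is asynchronous.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception type standing in for PolitenessException.
public sealed class PolitenessExceptionSketch : Exception
{
    public PolitenessExceptionSketch(string message) : base(message) { }
}

// Hypothetical fake gate: records acquisitions and reports, and can be told to throw on acquire.
public sealed class FakeGateSketch
{
    public bool ThrowOnAcquire { get; set; }
    public List<string> Acquired { get; } = new List<string>();
    public List<int> Reported { get; } = new List<int>();

    public void Acquire(string url)
    {
        if (ThrowOnAcquire) throw new PolitenessExceptionSketch("acquire refused: " + url);
        Acquired.Add(url);
    }

    public void Report(int statusCode) => Reported.Add(statusCode);
}

public static class GateOrderingSketch
{
    // sentRequests stands in for the HTTP handler's request log.
    public static void Scrape(FakeGateSketch gate, List<string> sentRequests, string url)
    {
        gate.Acquire(url);      // a throw here leaves sentRequests and Reported untouched
        sentRequests.Add(url);  // stand-in for the single HTTP call
        gate.Report(200);       // stand-in for reporting the response status
    }
}
```

If the fetch were issued before the acquire, the request log would be non-empty when the exception surfaced, so the zero-count assertions would fail rather than pass vacuously.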

Test plan

  • dotnet build — 0 warnings, 0 errors
  • dotnet test --filter ManualsScraperTests — 5/5 pass
  • dotnet test — 435/435 pass (was 430)

jkeeley2073 added the claude-code (Generated with Claude Code) label on May 3, 2026
Backfill of the PR #41 template across the family. 5 tests covering
happy-path with full provenance + gate-vs-wire URL equality, per-link
extraction failure isolation, discovery failure, PolitenessException on
both Acquire and Report. Single-yield-link scraper (manuals are
documents, not games) so the tests assert .Link yield order only with
no .Game items.

Pre-push audit: 7-item mechanical (all pass). /local-review deferred to
human reviewer at merge time.
jkeeley2073 force-pushed the Dev-SternManualsScraperTestInfra branch from 8c9ae35 to 8686cdd on May 3, 2026 12:07
jkeeley2073 merged commit 8850949 into main on May 3, 2026
5 checks passed