test(infra) Stern Manuals scraper-pipeline integration tests #50
Merged
Backfill of the PR #41 template across the family. 5 tests covering happy-path with full provenance + gate-vs-wire URL equality, per-link extraction failure isolation, discovery failure, PolitenessException on both Acquire and Report. Single-yield-link scraper (manuals are documents, not games) so the tests assert .Link yield order only with no .Game items. Pre-push audit: 7-item mechanical (all pass). /local-review deferred to human reviewer at merge time.
Summary
Backfill of the PR #41 family-wide test-infra template against Stern's ManualsScraper. First non-Game-yielding scraper in the family backfill: manuals are documents, not games, so the tests assert .Link yield order with full provenance only, no .Game items, and no .Link.GameSlug parent-lineage (the scraper does not derive game slugs from filenames).

5 tests using the shared fakes (FakePolitenessGate + QueueingHttpMessageHandler from PR #41):

- ScrapeAsync_HappyPath_YieldsLinksInPageOrderWithProvenance: three same-host PDFs in page order, plus a non-PDF page link and an off-host PDF that exercise both filter branches; asserts full provenance (SourceType.ManualsPage, DiscoveryUrl, both inner and outer DiscoveryContext) and the gate-vs-wire URL equality invariant.
- ScrapeAsync_OneAnchorWithMissingHref_DoesNotAbortRun: adapted from the CGC template's per-page-500 test. ManualsScraper has no per-link HTTP fetch (the entire scrape is one page → many .Links parsed inline), so the per-item failure mode is at the anchor-parsing layer. Fixture mixes empty-href, whitespace-href, off-host, and non-PDF rejection cases between two valid PDFs; asserts the loop continues past every filter branch.
- ScrapeAsync_DiscoveryFailure_AbortsThisSourceOnly: /manuals/ 500s, scraper yields nothing without throwing; politeness invariants still hold (the 500 is reported back, lease disposed).
- ScrapeAsync_PolitenessExceptionFromGate_PropagatesUp: gate.ThrowOnAcquire throws PolitenessException; asserts propagation AND zero requests / zero reports (the throw fires before any HTTP call).
- ScrapeAsync_GateThrowsOnReport_BubblesUp: symmetric: gate.ThrowOnReport propagation.

Tests: 430 → 435 (+5). Build clean, zero warnings.
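The politeness invariants named in the tests above (one request on the wire, the gate leasing the same URL that is fetched, the status reported back, the lease disposed) can be sketched roughly as follows. This is a minimal illustration, not the project's code: GateSketch, HandlerSketch, PolitenessDemo, FetchAsync, and the example.invalid URL are hypothetical stand-ins, and the real FakePolitenessGate and QueueingHttpMessageHandler from PR #41 may expose a different surface.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for FakePolitenessGate: records every URL it is asked
// to lease and every status code reported back to it.
public sealed class GateSketch
{
    public List<Uri> Acquired { get; } = new();
    public List<(Uri Url, HttpStatusCode Status)> Reported { get; } = new();
    public int DisposedLeases { get; private set; }

    public IDisposable Acquire(Uri url)
    {
        Acquired.Add(url);
        return new Lease(this);
    }

    public void Report(Uri url, HttpStatusCode status) => Reported.Add((url, status));

    private sealed class Lease : IDisposable
    {
        private readonly GateSketch _gate;
        public Lease(GateSketch gate) => _gate = gate;
        public void Dispose() => _gate.DisposedLeases++;
    }
}

// Hypothetical stand-in for QueueingHttpMessageHandler: replays queued status
// codes in order and records the URL of each request it sees.
public sealed class HandlerSketch : HttpMessageHandler
{
    private readonly Queue<HttpStatusCode> _statuses = new();
    public List<Uri> Requests { get; } = new();

    public void Enqueue(HttpStatusCode status) => _statuses.Enqueue(status);

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        Requests.Add(request.RequestUri ?? throw new InvalidOperationException("request has no URI"));
        return Task.FromResult(new HttpResponseMessage(_statuses.Dequeue()));
    }
}

public static class PolitenessDemo
{
    // One polite fetch (assumed shape): lease, request, report, then dispose the lease.
    public static async Task<HttpStatusCode> FetchAsync(GateSketch gate, HttpClient http, Uri url)
    {
        using var lease = gate.Acquire(url);
        using var response = await http.GetAsync(url);
        gate.Report(url, response.StatusCode);
        return response.StatusCode;
    }

    // Returns true when every invariant holds for a single-page scrape.
    public static async Task<bool> InvariantsHoldAsync(HttpStatusCode status)
    {
        var gate = new GateSketch();
        var handler = new HandlerSketch();
        handler.Enqueue(status);
        using var http = new HttpClient(handler);
        var url = new Uri("https://example.invalid/manuals/");

        await FetchAsync(gate, http, url);

        return handler.Requests.Count == 1                 // exactly one request on the wire
            && gate.Acquired.Count == 1
            && gate.Acquired[0] == handler.Requests[0]     // gate-vs-wire URL equality
            && gate.Reported.Count == 1
            && gate.Reported[0].Status == status           // status reported back, even a 500
            && gate.DisposedLeases == 1;                   // lease disposed
    }

    public static async Task Main()
    {
        Console.WriteLine(await InvariantsHoldAsync(HttpStatusCode.OK));
        Console.WriteLine(await InvariantsHoldAsync(HttpStatusCode.InternalServerError));
    }
}
```

The same check is run for a 200 and a 500 because the discovery-failure test expects the politeness bookkeeping to hold even when the page fetch fails.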
Adaptations from the CGC template
ManualsScraper differs from CgcGamePageScraper in three ways that shape the test surface:

- Yields .Link only. The happy-path test removes all .Game assertions.
- Only /manuals/ is fetched; there are no per-link HTTP requests. Politeness assertions check Single(handler.Requests) instead of handler.Requests.Count == gate.Acquired.Count.
- GameSlug is not derived from filenames: ManualsScraper leaves DiscoveredLink.GameSlug null. The happy-path test pins this explicitly with a comment noting that cross-linking happens later (or, per the project's known-gap list, doesn't happen at all yet).

The class-level <remarks> block documents these adaptations so a future reviewer can verify the divergence is intentional.

No bugs found in ManualsScraper.

Pre-push self-audit
Step 0 — Local review: Deferred to the human reviewer at merge time per the task brief.
Step 1 — 7-item mechanical checklist:
- <remarks>. PASS.
- catch {}: no catch blocks in test file. PASS.
- ISourceScraper added; existing SourceAliasContractTests already pin ManualsScraper/manuals. PASS.
- Jim Keeley <94459922+jkeeley2073@users.noreply.github.com>. PASS.

Test plan
- dotnet build: 0 warnings, 0 errors
- dotnet test --filter ManualsScraperTests: 5/5 pass
- dotnet test: 435/435 pass (was 430)