test(infra) JJP scraper-pipeline integration tests #47

Merged
jkeeley2073 merged 1 commit into main from Dev-JjpScraperTestInfra on May 3, 2026

@jkeeley2073
Contributor

Summary

Backfill of the PR #41 family-wide scraper-pipeline test-infra template against JjpProductScraper. JJP is a single-yield scraper (.Game only) on a Shopify storefront. Discovery is the canonical Shopify pattern: fetch /collections/{slug}/products.json for the pinball-machine handle set, walk the sitemap index for product sub-sitemaps, filter the listed product URLs against the handle set, then fetch each filtered product page and yield. The five tests use the shared FakePolitenessGate + QueueingHttpMessageHandler from PR #41 to pin behaviors the unit-test surface couldn't reach: yield order, provenance-field propagation onto ScrapedItem, SourceType.JjpProductPage and "JJP Product Page" discovery context, gate-vs-wire URL byte-equality, per-page failure isolation, discovery failure aborting only this source, and PolitenessException propagation on both Acquire and Report paths.
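
The filtering step of that discovery flow can be sketched as below. This is a minimal illustration under stated assumptions, not the JjpProductScraper implementation: the class DiscoverySketch, its two methods, the shop.example host, and the handles are all hypothetical. It assumes only the standard Shopify products.json shape (a "products" array whose entries carry a "handle") and the standard sitemap namespace.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Xml.Linq;

// Hypothetical fixtures: host and handles are invented for illustration.
var collectionJson = @"{""products"":[{""handle"":""game-a""},{""handle"":""game-b""}]}";
var sitemapXml = @"<urlset xmlns=""http://www.sitemaps.org/schemas/sitemap/0.9"">
  <url><loc>https://shop.example/products/game-b</loc></url>
  <url><loc>https://shop.example/products/t-shirt</loc></url>
  <url><loc>https://shop.example/products/game-a</loc></url>
</urlset>";

var handles = DiscoverySketch.ReadHandles(collectionJson);
var productUrls = DiscoverySketch.FilterProductUrls(sitemapXml, handles);

// Sitemap order is preserved; the non-pinball product is dropped.
foreach (var url in productUrls)
    Console.WriteLine(url);
// https://shop.example/products/game-b
// https://shop.example/products/game-a

static class DiscoverySketch
{
    // Collect the product handles listed in a collection's products.json payload.
    public static HashSet<string> ReadHandles(string json)
    {
        using var doc = JsonDocument.Parse(json);
        return doc.RootElement.GetProperty("products")
            .EnumerateArray()
            .Select(p => p.GetProperty("handle").GetString() ?? "")
            .ToHashSet(StringComparer.OrdinalIgnoreCase);
    }

    // Keep only sitemap URLs whose last path segment is a known handle.
    public static List<string> FilterProductUrls(string sitemapXml, ISet<string> handles)
    {
        XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
        return XDocument.Parse(sitemapXml)
            .Descendants(ns + "loc")
            .Select(loc => loc.Value.Trim())
            .Where(url => handles.Contains(new Uri(url).Segments.Last().Trim('/')))
            .ToList();
    }
}
```

Because the filter walks the sitemap rather than the handle set, the resulting order is the sitemap's, which is the order the happy-path test asserts on.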

Touches only the new test file plus a single CHANGELOG bullet inserted directly after the family-wide test-infra entry. No production code changed; no JjpProductScraper bugs found in the test path. (Note: the two pre-existing JJP drift items surfaced by /local-review of PR #43, the ExtractSlug null-guard and NormalizeAvailability visibility, are intentionally out of scope for this test-infra PR and tracked separately.)

Test Plan

  • dotnet build -nologo -v:m → 0 warnings, 0 errors
  • dotnet test --filter "FullyQualifiedName~JjpProductScraperTests" → 5/5 pass (878 ms)
  • dotnet test (full suite) → 435/435 pass (was 430 on main; +5 = the new tests)

Tests added (all in tests/PinballWizard.Scraper.Tests/Scraping/Jjp/JjpProductScraperTests.cs):

  • ScrapeAsync_HappyPath_YieldsGamesInListOrderWithProvenance: 3 fixtured products surface from the collection JSON + sitemap walk; assert .Game yield order matches sitemap order, SourceType.JjpProductPage, "JJP Product Page" context, DiscoveryUrl per item, Source.ScrapedFrom = product URL, Source.ScrapedAt non-default, DiscoveredOn contains "jjp_products", every wire URL byte-equals every gate-acquired and gate-reported URL.
  • ScrapeAsync_PerPageFetchFailure_DoesNotAbortRun: middle product page returns 500; the other two still yield; the 500 is reported to the gate.
  • ScrapeAsync_DiscoveryFailure_AbortsThisSourceOnly: /collections/pinball-machines-for-sale/products.json returns 500; scraper yields nothing without throwing.
  • ScrapeAsync_PolitenessExceptionFromGate_PropagatesUp: gate.ThrowOnAcquire set; assert PolitenessException propagates and zero wire requests fired.
  • ScrapeAsync_GateThrowsOnReport_BubblesUp: gate.ThrowOnReport set; assert PolitenessException propagates from the first response report (collection JSON fetch).
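
For readers without PR #41 at hand, the mechanics behind the per-page-failure and Acquire-propagation tests can be sketched roughly as follows. Everything here is a hypothetical stand-in: SketchQueueHandler, SketchGate, and PipelineSketch are not the shared FakePolitenessGate / QueueingHttpMessageHandler (their real APIs are not shown in this PR), InvalidOperationException stands in for PolitenessException, and the loop is a simplified fetch loop rather than JjpProductScraper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

var urls = new[]
{
    "https://shop.example/products/a",
    "https://shop.example/products/b",
    "https://shop.example/products/c",
};

// Scenario 1: the middle page returns 500; the other two still yield.
var handler = new SketchQueueHandler();
handler.Enqueue(HttpStatusCode.OK, "page-a");
handler.Enqueue(HttpStatusCode.InternalServerError, "");
handler.Enqueue(HttpStatusCode.OK, "page-c");
var gate = new SketchGate();

var yielded = await PipelineSketch.RunAsync(handler, gate, urls);
Console.WriteLine(string.Join(",", yielded));                                    // page-a,page-c
Console.WriteLine(gate.Reported.Contains(HttpStatusCode.InternalServerError));   // True
Console.WriteLine(handler.WireUrls.SequenceEqual(gate.Acquired));                // True

// Scenario 2: the gate refuses on Acquire; the exception propagates
// and no request reaches the wire.
var blockedHandler = new SketchQueueHandler();
var blockedGate = new SketchGate { ThrowOnAcquire = true };
try
{
    await PipelineSketch.RunAsync(blockedHandler, blockedGate, urls);
}
catch (InvalidOperationException)
{
    Console.WriteLine(blockedHandler.WireUrls.Count);                            // 0
}

// Hypothetical handler: replays queued responses and records each wire URL.
sealed class SketchQueueHandler : HttpMessageHandler
{
    private readonly Queue<HttpResponseMessage> _responses = new Queue<HttpResponseMessage>();
    public List<string> WireUrls { get; } = new List<string>();

    public void Enqueue(HttpStatusCode status, string body) =>
        _responses.Enqueue(new HttpResponseMessage(status) { Content = new StringContent(body) });

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        WireUrls.Add(request.RequestUri?.AbsoluteUri ?? "");
        return Task.FromResult(_responses.Dequeue());
    }
}

// Hypothetical gate: records acquired URLs and reported statuses.
sealed class SketchGate
{
    public bool ThrowOnAcquire { get; set; }
    public List<string> Acquired { get; } = new List<string>();
    public List<HttpStatusCode> Reported { get; } = new List<HttpStatusCode>();

    public void Acquire(string url)
    {
        if (ThrowOnAcquire) throw new InvalidOperationException("gate refused");
        Acquired.Add(url);
    }

    public void Report(HttpStatusCode status) => Reported.Add(status);
}

static class PipelineSketch
{
    // Simplified fetch loop: acquire, fetch, report, and skip failed pages.
    public static async Task<List<string>> RunAsync(
        SketchQueueHandler handler, SketchGate gate, IEnumerable<string> urls)
    {
        using var http = new HttpClient(handler);
        var results = new List<string>();
        foreach (var url in urls)
        {
            gate.Acquire(url);
            using var response = await http.GetAsync(url);
            gate.Report(response.StatusCode);
            if (!response.IsSuccessStatusCode)
                continue; // isolate the failure to this page
            results.Add(await response.Content.ReadAsStringAsync());
        }
        return results;
    }
}
```

Recording URLs on both the gate and the handler is what makes the gate-vs-wire equality assertion possible: any rewriting between Acquire and the actual request would show up as a sequence mismatch.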

Out of Scope

  • The two pre-existing JJP drift items from the /local-review of PR #43 (refactor(multimorphic) adopt shared JsonLdProductParser): ExtractSlug null-guard and NormalizeAvailability visibility. Tracked separately.
  • Family backfill for the other six scrapers (PB / BoF / Multimorphic / AP / Spooky / Stern.Manuals) — running in parallel worktrees.
  • Any production-code changes; this PR is test-only.

Checklist

  • CI is green (build + test + coverage + CodeQL + sanitization)
  • PR title follows the Conventional Commits format above
  • If this is a new architectural decision, an ADR has been added under docs/adr/ — N/A, test-only
  • If user-visible behavior changes, README.md and/or docs/ are updated in the same PR — N/A, no behavior change
  • If a memory in ~/.claude/projects/c--projects-PinballWizard/memory/ is now stale, it has been updated or removed in the same PR — N/A
  • No TODO / FIXME / commented-out code committed
  • No new entries in <NoWarn> without a comment explaining why and the removal criterion

Pre-push self-audit (additive PRs)

Step 0 — /local-review (qualitative)

  • Ran /local-review and addressed every 🔴 finding before push
  • Local review outcome: deferred to human reviewer at merge time per the parallel-backfill task spec — six sibling backfill PRs are queued and the reviewer will run /local-review once across the family-merge sequence to catch any cross-PR drift.

Step 1 — Mechanical checklist

  • Every new *Options property has at least one real getter call in src/ — N/A, no production code added
  • Sibling-diffed against the closest existing implementation (CgcGamePageScraperTests); drift is justified — JJP-specific deltas: discovery is collection JSON + sitemap walk vs. menu HTML; single-yield (.Game only) vs. mixed yields; provenance asserts DiscoveryUrl equals product page URL because there is no separate index discovery URL on the per-yield path
  • No bare catch { } — none added (only Assert.ThrowsAsync<PolitenessException> for the propagation tests)
  • New ISourceScraper? — N/A, test-only PR
  • Tests assert behavior, not just structure — every test name maps to a fixture that triggers the named condition (a 500 on the middle page, a 500 on the discovery endpoint, gate.ThrowOnAcquire / ThrowOnReport)
  • Build is zero-warning — confirmed via dotnet build -nologo -v:m
  • git log -1 --format='%an <%ae>' shows personal noreply, not work email — verified: Jim Keeley <94459922+jkeeley2073@users.noreply.github.com>

@jkeeley2073 added the claude-code (Generated with Claude Code) label on May 3, 2026
Backfill of the PR #41 template across the family. 5 tests covering
happy-path with full provenance + gate-vs-wire URL equality, per-page
failure isolation, discovery failure, PolitenessException on both Acquire
and Report. Single-yield (.Game only) Shopify storefront via
/collections/{slug}/products.json.

Pre-push audit: 7-item mechanical (all pass). /local-review deferred to
human reviewer at merge time.
@jkeeley2073 force-pushed the Dev-JjpScraperTestInfra branch from 4eca975 to e71dd58 on May 3, 2026 11:57
@jkeeley2073 merged commit eb7bc28 into main on May 3, 2026
5 checks passed