
feat: migrate scrapegraph-js from github pin to npm v2.1.0#17

Open
Vikrant-Khedkar wants to merge 3 commits into main from feat/migrate-scrapegraph-js-npm

Conversation

@Vikrant-Khedkar

Summary

  • Replaces the broken GitHub commit pin (ScrapeGraphAI/scrapegraph-js#096c110) with the published npm package ^2.1.0
  • The old commit had no dist/ and used the wrong base URL (api.scrapegraphai.com/api/v2 → 403), making the CLI broken out of the box
  • Updates all Api* type imports to match the renamed exports in v2.1.0 (ApiScrapeRequest → ScrapeRequest, ApiHistoryEntry → HistoryEntry, etc.)
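For a rename surface like this, a temporary shim of deprecated aliases can keep downstream imports compiling during migration. A minimal sketch — the field shapes below are illustrative placeholders, not the real v2.1.0 types:

```typescript
// Hypothetical migration shim: alias the pre-2.1.0 Api* names to the
// renamed exports so old imports keep compiling. Shapes are invented.
interface ScrapeRequest {
  url: string;
  formats: string[];
}
interface HistoryEntry {
  requestParentId: string | null;
  elapsedMs: number;
  createdAt: string;
}

/** @deprecated renamed to ScrapeRequest in v2.1.0 */
type ApiScrapeRequest = ScrapeRequest;
/** @deprecated renamed to HistoryEntry in v2.1.0 */
type ApiHistoryEntry = HistoryEntry;

// Old and new names refer to the same structural type:
const req: ApiScrapeRequest = { url: "https://example.com", formats: ["markdown"] };
const sameReq: ScrapeRequest = req;
console.log(sameReq.url); // https://example.com
```

This PR instead updates the imports directly, which is the cleaner end state; a shim is only worth it when the rename spans many files or releases.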

Test plan

  • credits — 200, correct balance returned
  • validate — {"status":"ok"}
  • scrape — markdown returned for example.com
  • search — results returned
  • extract — structured JSON returned
  • history — full history returned
  • tsc --noEmit — zero errors

🤖 Generated with Claude Code

Switches dependency from a broken github commit (no dist, wrong base URL)
to the published npm package which ships with dist and uses the correct
v2-api.scrapegraphai.com endpoint. Updates Api* type names to match the
renamed exports in v2.1.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Member

@VinciGit00 left a comment


CI/CD is broken, fix it

- Extract shared buildBaseFormat helper (src/lib/formats.ts) to dedup
  format-building logic previously duplicated across crawl/monitor/scrape.
  Uses exhaustive switch — no `as FormatConfig` cast needed.
- Replace pervasive `params as Record<string, unknown>` mut-cast pattern
  with conditional spreads and properly-typed requests.
- Add parseJsonArg / parseIntArg helpers (src/lib/parse.ts) so invalid
  --schema / --headers / --num-results / --max-pages etc. produce
  friendly CLI errors instead of raw SyntaxError or silent NaN.
- Fix cast-then-validate ordering for --service in history.ts.
- Type log.error as `: never` so control-flow narrowing works after it.

No behavioural change to the golden path; tsc and biome clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
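The parse helpers and the `never`-typed error sink described above might look roughly like this — a sketch under assumed names (`fail`, `parseIntArg`, `parseJsonArg`); the real src/lib/parse.ts may differ:

```typescript
// Typing the error sink's return as `never` lets tsc narrow control
// flow after a call: code following fail() is known to be unreachable.
function fail(message: string): never {
  throw new Error(message);
}

// Friendly numeric-arg parser: reject non-numbers with a CLI-style
// message instead of letting Number() produce a silent NaN.
function parseIntArg(flag: string, raw: string): number {
  const n = Number(raw);
  if (!Number.isInteger(n)) {
    fail(`${flag}: expected a number, got "${raw}"`);
  }
  return n;
}

// Friendly JSON-arg parser: wrap the raw SyntaxError in a message
// that names the offending flag.
function parseJsonArg(flag: string, raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch (e) {
    fail(`${flag}: invalid JSON (${(e as Error).message})`);
  }
}

console.log(parseIntArg("--num-results", "2")); // 2
```

Because `fail` returns `never`, the `catch` branch in `parseJsonArg` needs no trailing `return` for the function to typecheck.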
@VinciGit00
Member

Test results — commit ef5a1a8

Ran against the live API (v2.1.0 SDK). All 20 test cases pass; golden paths unchanged and error paths now emit friendly messages instead of raw SyntaxError / silent NaN.

Golden paths

| # | Command | Result |
|---|---------|--------|
| 1 | `validate` | 200 `{status: ok}` |
| 2 | `credits` | balance + job quotas returned |
| 3 | `scrape -f markdown,html` | both payloads returned |
| 4 | `scrape -f links` | leaf FormatConfig variant (no extras) |
| 5 | `scrape -f summary` | leaf variant |
| 6 | `scrape -f json -p … --schema …` | structured JSON |
| 7 | `scrape --html-mode reader` | mode propagated (history confirms `"mode":"reader"`) |
| 8 | `extract --schema …` | typed JSON back |
| 9 | `search --num-results 2 --country us` | results returned |
| 10 | `search --time-range past_week` | filter applied (history confirms `"timeRange":"past_week"`) |
| 11 | `history scrape` | service filter works |
| 12 | `crawl --max-pages 2 --max-depth 1` | completed; params echoed back correctly |
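The "leaf FormatConfig variant" rows exercise the shared buildBaseFormat helper from this branch. A minimal sketch with an invented FormatConfig union — the real variant shapes live in the SDK, so treat everything below as assumed:

```typescript
// Illustrative FormatConfig union; variant shapes are assumptions.
type FormatConfig =
  | { type: "markdown" }
  | { type: "html" }
  | { type: "links" }
  | { type: "summary" }
  | { type: "json"; prompt: string };

// A switch that handles every format name and rejects the rest lets
// the CLI string be validated into the union without an
// `as FormatConfig` cast; each case returns a properly-typed variant.
function buildBaseFormat(name: string, prompt?: string): FormatConfig {
  switch (name) {
    case "markdown": return { type: "markdown" };
    case "html":     return { type: "html" };
    case "links":    return { type: "links" };
    case "summary":  return { type: "summary" };
    case "json":
      if (!prompt) throw new Error("--prompt is required when format includes json");
      return { type: "json", prompt };
    default:
      throw new Error(`Unknown format: ${name}. Valid: markdown, html, links, summary, json`);
  }
}

console.log(buildBaseFormat("links").type); // links
```

Cases 17 and 18 in the error-path table below correspond to the two `throw` branches here.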

Error paths (all new friendly messages)

| # | Input | Output |
|---|-------|--------|
| 13 | `search --num-results abc` | `--num-results: expected a number, got "abc"` |
| 14 | `extract --schema "not json"` | `--schema: invalid JSON (JSON Parse error: …)` |
| 15 | `crawl --max-pages xyz` | `--max-pages: expected a number, got "xyz"` |
| 16 | `crawl --include-patterns 'not-valid-json'` | `--include-patterns: invalid JSON (…)` |
| 17 | `scrape -f bogusformat` | `Unknown format: bogusformat. Valid: …` |
| 18 | `scrape -f json` (no `--prompt`) | `--prompt is required when format includes json` |
| 19 | `monitor update` (no `--id`) | `--id is required for update` |
| 20 | `monitor bogus` | `Unknown action: bogus. Valid: …` |

Not exercised

monitor create/pause/resume/delete/get/activity — free plan has monitor.limit = 1 so no live monitor was created. The code typechecks cleanly and all error branches (missing --url, missing --id, unknown action) pass.

Build

  • tsc --noEmit — zero errors
  • biome check — clean

No longer needed — scrapegraph-js is now on npm and ships with dist/.
The workaround was added in 82b8137 for the old github commit pin.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Vikrant-Khedkar
Author

> CI/CD is broken, fix it

Fixing it. I thought it was broken before as well.

@Vikrant-Khedkar
Author

I fixed the CI/CD, please check now @VinciGit00

@VinciGit00
Member

Re-test results — all green ✅

Re-ran the full test plan on feat/migrate-scrapegraph-js-npm @ 636774d with bun install + bun run src/cli.ts <cmd> on Node 22 / Bun 1.3.9. scrapegraph-js@2.1.0 resolved cleanly from npm (no github pin, no in-place build).

| Check | Result |
|-------|--------|
| `tsc --noEmit` | 0 errors |
| `credits` | 200 · remaining: 407, plan Free Plan |
| `validate` | `{"status":"ok","uptime":93321}` |
| `scrape https://example.com` | markdown returned (id af038d53…), 68ms |
| `search "what is scrapegraphai" --num-results 2` | 2 results (id 770f1e81…), 1.3s |
| `extract https://example.com -p "Extract the page title and main heading"` | `{"page_title":"Example Domain","main_heading":"Example Domain"}` (id 5a996fe5…), 688ms |
| `history --page-size 3 --json` | all 3 prior requests returned, correct renamed fields (requestParentId, elapsedMs, createdAt) |

Credits delta: 407 → 397 (10 credits consumed across the run), confirming each request was actually billed server-side (not cached / stubbed).

The type rename surface (ApiScrapeRequest → ScrapeRequest, ApiHistoryEntry → HistoryEntry, etc.) compiles cleanly and the runtime shapes match the renamed types — history in particular returns the new camelCase fields without any mapping glue. No 403s, no base-URL issues — the published v2.1.0 package is wired correctly.

LGTM from my side. 🚀
