
Commit 7e342ad

VinciGit00 and claude committed

docs: update CLI and MCP docs for v2 PR changes

Align CLI docs with just-scrape PR #13 (v2 migration):

- search: add --location-geo-code, --time-range, --format, --nationality flags
- scrape: add links, images, summary, json formats, multi-format, --html-mode, --scrolls, --prompt/--schema
- crawl: add --format flag
- credits: fix jq path to camelCase remainingCredits
- ai-agent-skill: update CLAUDE.md snippet with new formats and flags

Align MCP docs with scrapegraph-mcp PR #16 (v2 migration):

- Update available tools from 8 v1 tools to 16 v2 tools
- Add crawl_stop, crawl_resume, credits, sgai_history, monitor_* tools
- Remove sitemap, agentic_scrapper references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 125938c commit 7e342ad

4 files changed

Lines changed: 63 additions & 18 deletions


services/cli/ai-agent-skill.mdx

Lines changed: 4 additions & 1 deletion
@@ -81,10 +81,13 @@ Available commands (always use --json flag):
 - `just-scrape search <query> --json` — search the web and extract data
 - `just-scrape markdownify <url> --json` — convert a page to markdown
 - `just-scrape crawl <url> --json` — crawl multiple pages
-- `just-scrape scrape <url> --json` — get page content (markdown, html, screenshot, branding)
+- `just-scrape scrape <url> --json` — get page content (markdown, html, screenshot, branding, links, images, summary, json)
+- `just-scrape credits --json` — check credit balance
 
 Use --schema to enforce a JSON schema on the output.
 Use --mode direct+stealth or --mode js+stealth for sites with anti-bot protection.
+Use -f to pick scrape format(s), e.g. -f markdown,links,images for multi-format.
+Use --location-geo-code and --time-range with search for geo/time filtering.
 ```
 
 ### Example prompts for Claude Code

services/cli/commands.mdx

Lines changed: 17 additions & 4 deletions
@@ -25,6 +25,10 @@ just-scrape search <query>
 just-scrape search <query> -p <prompt> # extraction prompt for results
 just-scrape search <query> --num-results <n> # sources to scrape (1-20, default 3)
 just-scrape search <query> --schema <json>
+just-scrape search <query> --location-geo-code <code> # geo-target (e.g. 'us', 'de', 'jp-tk')
+just-scrape search <query> --time-range <range> # past_hour | past_24_hours | past_week | past_month | past_year
+just-scrape search <query> --format <markdown|html> # result format (default markdown)
+just-scrape search <query> --nationality <iso> # 2-letter ISO nationality code
 just-scrape search <query> --headers <json>
 ```
@@ -40,13 +44,21 @@ just-scrape markdownify <url> --headers <json>
 
 ## scrape
 
-Scrape content from a URL in various formats. [Full docs →](/api-reference/scrape)
+Scrape content from a URL in one or more formats. Supports **8 formats**: `markdown`, `html`, `screenshot`, `branding`, `links`, `images`, `summary`, `json`. [Full docs →](/api-reference/scrape)
 
 ```bash
 just-scrape scrape <url> # markdown (default)
 just-scrape scrape <url> -f html # raw HTML
-just-scrape scrape <url> -f screenshot # screenshot
-just-scrape scrape <url> -f branding # extract branding info
+just-scrape scrape <url> -f screenshot # page screenshot
+just-scrape scrape <url> -f branding # branding info (logos, colors, fonts)
+just-scrape scrape <url> -f links # all links on the page
+just-scrape scrape <url> -f images # all images on the page
+just-scrape scrape <url> -f summary # AI-generated page summary
+just-scrape scrape <url> -f json -p <prompt> # structured JSON via prompt
+just-scrape scrape <url> -f json -p <prompt> --schema <json> # JSON with enforced schema
+just-scrape scrape <url> -f markdown,links,images # multi-format (comma-separated)
+just-scrape scrape <url> --html-mode reader # normal (default), reader, or prune
+just-scrape scrape <url> --scrolls <n> # infinite scroll (0-100)
 just-scrape scrape <url> -m direct+stealth # anti-bot bypass
 just-scrape scrape <url> --country <iso> # geo-targeting
 ```
@@ -61,6 +73,7 @@ just-scrape crawl <url> --max-pages <n> # max pages (default 50
 just-scrape crawl <url> --max-depth <n> # crawl depth (default 2)
 just-scrape crawl <url> --max-links-per-page <n> # max links per page (default 10)
 just-scrape crawl <url> --allow-external # allow external domains
+just-scrape crawl <url> -f html # page format (default markdown)
 just-scrape crawl <url> -m direct+stealth # anti-bot bypass
 ```
@@ -83,7 +96,7 @@ Check your credit balance.
 
 ```bash
 just-scrape credits
-just-scrape credits --json | jq '.remaining_credits'
+just-scrape credits --json | jq '.remainingCredits'
 ```
 
 ## Global flags
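The credits hunk changes the jq path because v2 responses use camelCase keys where v1 used snake_case. A stand-alone sketch of the same extraction, using a hypothetical payload (real CLI output may carry more fields) and python3 in place of jq so it runs without extra dependencies:

```shell
# Hypothetical v2 credits payload; v2 keys are camelCase ("remainingCredits"),
# where v1 used snake_case ("remaining_credits").
resp='{"remainingCredits": 420, "totalCreditsUsed": 80}'

# Equivalent of: just-scrape credits --json | jq '.remainingCredits'
remaining=$(printf '%s' "$resp" | python3 -c 'import sys, json; print(json.load(sys.stdin)["remainingCredits"])')
echo "remaining=$remaining"
```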

services/cli/examples.mdx

Lines changed: 17 additions & 2 deletions
@@ -29,6 +29,10 @@ just-scrape extract https://app.example.com/dashboard \
 just-scrape search "What are the best Python web frameworks in 2025?" \
   --num-results 10
 
+# Recent news only, scoped to Germany
+just-scrape search "EU AI act latest news" \
+  --time-range past_week --location-geo-code de
+
 # Structured output with schema
 just-scrape search "Top 5 cloud providers pricing" \
   --schema '{"type":"object","properties":{"providers":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"free_tier":{"type":"string"}}}}}}'
@@ -58,15 +62,22 @@ just-scrape markdownify https://protected.example.com -m js+stealth
 # Get markdown (default format)
 just-scrape scrape https://example.com
 
-# Get raw HTML
-just-scrape scrape https://example.com -f html
+# Get raw HTML with reader-mode extraction
+just-scrape scrape https://blog.example.com -f html --html-mode reader
 
 # Take a screenshot
 just-scrape scrape https://example.com -f screenshot
 
 # Extract branding info (logos, colors, fonts)
 just-scrape scrape https://example.com -f branding
 
+# Multi-format: markdown + links + images in a single call
+just-scrape scrape https://example.com -f markdown,links,images
+
+# Structured JSON output with a prompt
+just-scrape scrape https://store.example.com \
+  -f json -p "Extract product name and price"
+
 # Geo-targeted + anti-bot bypass
 just-scrape scrape https://store.example.com \
   -m direct+stealth --country DE
@@ -79,6 +90,10 @@ just-scrape scrape https://store.example.com \
 just-scrape crawl https://docs.example.com \
   --max-pages 20 --max-depth 3
 
+# Crawl and get HTML instead of markdown
+just-scrape crawl https://example.com \
+  --max-pages 50 -f html
+
 # Allow external links
 just-scrape crawl https://example.com \
   --max-pages 50 --allow-external
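Long inline --schema strings like the one in the search example are easy to mis-quote in the shell. One way to keep them readable is to build the JSON in a small script first (a sketch; the `just-scrape` invocation is shown only as a comment):

```shell
# Build the --schema JSON with python3 to avoid shell-quoting mistakes,
# then pass it to the CLI, e.g.:
#   just-scrape search "Top 5 cloud providers pricing" --schema "$schema"
schema=$(python3 - <<'PY'
import json
print(json.dumps({
    "type": "object",
    "properties": {
        "providers": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "free_tier": {"type": "string"},
                },
            },
        },
    },
}))
PY
)
echo "$schema"
```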

services/mcp-server/introduction.mdx

Lines changed: 25 additions & 11 deletions
@@ -22,8 +22,8 @@ The Model Context Protocol (MCP) is a standardized way for AI assistants to acce
 ## Key Features
 
 <CardGroup cols={2}>
-  <Card title="8 Powerful Tools" icon="tools">
-    Access markdown conversion, AI extraction, search, crawling, sitemap discovery, and agentic workflows
+  <Card title="16 Powerful Tools" icon="tools">
+    Scrape, extract, search, crawl, monitor scheduled jobs, and manage your account
   </Card>
   <Card title="Remote & Local" icon="server">
     Use the hosted HTTP endpoint or run locally via Python
@@ -38,16 +38,30 @@ The Model Context Protocol (MCP) is a standardized way for AI assistants to acce
 
 ## Available Tools
 
-The MCP server exposes 8 enterprise-ready tools:
+The MCP server exposes the following tools via API v2:
+
+| Tool | Description |
+|---|---|
+| **markdownify** | Convert webpages to clean markdown (POST /scrape) |
+| **scrape** | Fetch page content in multiple formats: markdown, html, screenshot, branding (POST /scrape) |
+| **smartscraper** | AI-powered structured extraction from a URL (POST /extract) |
+| **searchscraper** | Search the web and extract structured results (POST /search) |
+| **smartcrawler_initiate** | Start async multi-page crawl in markdown or html mode (POST /crawl) |
+| **smartcrawler_fetch_results** | Poll crawl results (GET /crawl/:id) |
+| **crawl_stop** | Stop a running crawl job (POST /crawl/:id/stop) |
+| **crawl_resume** | Resume a stopped crawl job (POST /crawl/:id/resume) |
+| **credits** | Check your credit balance (GET /credits) |
+| **sgai_history** | Browse request history with pagination (GET /history) |
+| **monitor_create** | Create a scheduled extraction job (POST /monitor) |
+| **monitor_list** | List all monitors (GET /monitor) |
+| **monitor_get** | Get monitor details (GET /monitor/:id) |
+| **monitor_pause** | Pause a running monitor (POST /monitor/:id/pause) |
+| **monitor_resume** | Resume a paused monitor (POST /monitor/:id/resume) |
+| **monitor_delete** | Delete a monitor (DELETE /monitor/:id) |
 
-1. **markdownify** - Convert webpages to clean markdown
-2. **smartscraper** - AI-powered extraction with optional infinite scrolls
-3. **searchscraper** - Search the web and extract structured results
-4. **scrape** - Fetch raw HTML with optional JavaScript rendering
-5. **sitemap** - Discover a site's URLs and structure
-6. **smartcrawler_initiate** - Start async multi-page crawls
-7. **smartcrawler_fetch_results** - Poll crawl results
-8. **agentic_scrapper** - Multi-step workflows with session persistence
+<Note>
+Removed from v1: `sitemap`, `agentic_scrapper`, `markdownify_status`, `smartscraper_status` (no v2 API equivalents).
+</Note>
 
 ## Quick Start
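The crawl tools in this file follow an initiate-then-poll pattern: smartcrawler_initiate starts the job and smartcrawler_fetch_results is polled until the job leaves its running state (with crawl_stop/crawl_resume for control). A minimal sketch of that polling loop, with a stub function standing in for the real GET /crawl/:id call:

```shell
# fetch_status is a stub in place of smartcrawler_fetch_results / GET /crawl/:id;
# here it pretends the job completes on the third poll.
attempt=0
fetch_status() {
    if [ "$attempt" -ge 3 ]; then echo "completed"; else echo "running"; fi
}

status="running"
while [ "$status" = "running" ]; do
    attempt=$((attempt + 1))
    # sleep 2   # a real client should back off between polls
    status=$(fetch_status)
done
echo "job finished after $attempt polls with status=$status"
```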