
Commit 7e342ad

VinciGit00 and claude committed

docs: update CLI and MCP docs for v2 PR changes

Align CLI docs with just-scrape PR #13 (v2 migration):

- search: add --location-geo-code, --time-range, --format, --nationality flags
- scrape: add links, images, summary, json formats, multi-format, --html-mode, --scrolls, --prompt/--schema
- crawl: add --format flag
- credits: fix jq path to camelCase remainingCredits
- ai-agent-skill: update CLAUDE.md snippet with new formats and flags

Align MCP docs with scrapegraph-mcp PR #16 (v2 migration):

- Update available tools from 8 v1 tools to 16 v2 tools
- Add crawl_stop, crawl_resume, credits, sgai_history, monitor_* tools
- Remove sitemap, agentic_scrapper references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 125938c commit 7e342ad

4 files changed

Lines changed: 63 additions & 18 deletions


services/cli/ai-agent-skill.mdx

Lines changed: 4 additions & 1 deletion
@@ -81,10 +81,13 @@ Available commands (always use --json flag):
 - `just-scrape search <query> --json` — search the web and extract data
 - `just-scrape markdownify <url> --json` — convert a page to markdown
 - `just-scrape crawl <url> --json` — crawl multiple pages
-- `just-scrape scrape <url> --json` — get page content (markdown, html, screenshot, branding)
+- `just-scrape scrape <url> --json` — get page content (markdown, html, screenshot, branding, links, images, summary, json)
+- `just-scrape credits --json` — check credit balance
 
 Use --schema to enforce a JSON schema on the output.
 Use --mode direct+stealth or --mode js+stealth for sites with anti-bot protection.
+Use -f to pick scrape format(s), e.g. -f markdown,links,images for multi-format.
+Use --location-geo-code and --time-range with search for geo/time filtering.
 ```
 
 ### Example prompts for Claude Code

services/cli/commands.mdx

Lines changed: 17 additions & 4 deletions
@@ -25,6 +25,10 @@ just-scrape search <query>
 just-scrape search <query> -p <prompt> # extraction prompt for results
 just-scrape search <query> --num-results <n> # sources to scrape (1-20, default 3)
 just-scrape search <query> --schema <json>
+just-scrape search <query> --location-geo-code <code> # geo-target (e.g. 'us', 'de', 'jp-tk')
+just-scrape search <query> --time-range <range> # past_hour | past_24_hours | past_week | past_month | past_year
+just-scrape search <query> --format <markdown|html> # result format (default markdown)
+just-scrape search <query> --nationality <iso> # 2-letter ISO nationality code
 just-scrape search <query> --headers <json>
 ```
@@ -40,13 +44,21 @@ just-scrape markdownify <url> --headers <json>
 
 ## scrape
 
-Scrape content from a URL in various formats. [Full docs →](/api-reference/scrape)
+Scrape content from a URL in one or more formats. Supports **8 formats**: `markdown`, `html`, `screenshot`, `branding`, `links`, `images`, `summary`, `json`. [Full docs →](/api-reference/scrape)
 
 ```bash
 just-scrape scrape <url> # markdown (default)
 just-scrape scrape <url> -f html # raw HTML
-just-scrape scrape <url> -f screenshot # screenshot
-just-scrape scrape <url> -f branding # extract branding info
+just-scrape scrape <url> -f screenshot # page screenshot
+just-scrape scrape <url> -f branding # branding info (logos, colors, fonts)
+just-scrape scrape <url> -f links # all links on the page
+just-scrape scrape <url> -f images # all images on the page
+just-scrape scrape <url> -f summary # AI-generated page summary
+just-scrape scrape <url> -f json -p <prompt> # structured JSON via prompt
+just-scrape scrape <url> -f json -p <prompt> --schema <json> # JSON with enforced schema
+just-scrape scrape <url> -f markdown,links,images # multi-format (comma-separated)
+just-scrape scrape <url> --html-mode reader # normal (default), reader, or prune
+just-scrape scrape <url> --scrolls <n> # infinite scroll (0-100)
 just-scrape scrape <url> -m direct+stealth # anti-bot bypass
 just-scrape scrape <url> --country <iso> # geo-targeting
 ```
@@ -61,6 +73,7 @@ just-scrape crawl <url> --max-pages <n> # max pages (default 50
 just-scrape crawl <url> --max-depth <n> # crawl depth (default 2)
 just-scrape crawl <url> --max-links-per-page <n> # max links per page (default 10)
 just-scrape crawl <url> --allow-external # allow external domains
+just-scrape crawl <url> -f html # page format (default markdown)
 just-scrape crawl <url> -m direct+stealth # anti-bot bypass
 ```
@@ -83,7 +96,7 @@ Check your credit balance.
 
 ```bash
 just-scrape credits
-just-scrape credits --json | jq '.remaining_credits'
+just-scrape credits --json | jq '.remainingCredits'
 ```
 
 ## Global flags
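The credits hunk changes the jq path because v2 responses use camelCase keys where v1 used snake_case. A stand-alone sketch of the same extraction, using a hypothetical payload (real CLI output may carry more fields) and python3 in place of jq so it runs without extra dependencies:

```shell
# Hypothetical v2 credits payload; v2 keys are camelCase ("remainingCredits"),
# where v1 used snake_case ("remaining_credits").
resp='{"remainingCredits": 420, "totalCreditsUsed": 80}'

# Equivalent of: just-scrape credits --json | jq '.remainingCredits'
remaining=$(printf '%s' "$resp" | python3 -c 'import sys, json; print(json.load(sys.stdin)["remainingCredits"])')
echo "remaining=$remaining"
```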

services/cli/examples.mdx

Lines changed: 17 additions & 2 deletions
@@ -29,6 +29,10 @@ just-scrape extract https://app.example.com/dashboard \
 just-scrape search "What are the best Python web frameworks in 2025?" \
   --num-results 10
 
+# Recent news only, scoped to Germany
+just-scrape search "EU AI act latest news" \
+  --time-range past_week --location-geo-code de
+
 # Structured output with schema
 just-scrape search "Top 5 cloud providers pricing" \
   --schema '{"type":"object","properties":{"providers":{"type":"array","items":{"type":"object","properties":{"name":{"type":"string"},"free_tier":{"type":"string"}}}}}}'
@@ -58,15 +62,22 @@ just-scrape markdownify https://protected.example.com -m js+stealth
 # Get markdown (default format)
 just-scrape scrape https://example.com
 
-# Get raw HTML
-just-scrape scrape https://example.com -f html
+# Get raw HTML with reader-mode extraction
+just-scrape scrape https://blog.example.com -f html --html-mode reader
 
 # Take a screenshot
 just-scrape scrape https://example.com -f screenshot
 
 # Extract branding info (logos, colors, fonts)
 just-scrape scrape https://example.com -f branding
 
+# Multi-format: markdown + links + images in a single call
+just-scrape scrape https://example.com -f markdown,links,images
+
+# Structured JSON output with a prompt
+just-scrape scrape https://store.example.com \
+  -f json -p "Extract product name and price"
+
 # Geo-targeted + anti-bot bypass
 just-scrape scrape https://store.example.com \
   -m direct+stealth --country DE
@@ -79,6 +90,10 @@ just-scrape scrape https://store.example.com \
 just-scrape crawl https://docs.example.com \
   --max-pages 20 --max-depth 3
 
+# Crawl and get HTML instead of markdown
+just-scrape crawl https://example.com \
+  --max-pages 50 -f html
+
 # Allow external links
 just-scrape crawl https://example.com \
   --max-pages 50 --allow-external
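Long inline --schema strings like the one in the search example are easy to mis-quote in the shell. One way to keep them readable is to build the JSON in a small script first (a sketch; the `just-scrape` invocation is shown only as a comment):

```shell
# Build the --schema JSON with python3 to avoid shell-quoting mistakes,
# then pass it to the CLI, e.g.:
#   just-scrape search "Top 5 cloud providers pricing" --schema "$schema"
schema=$(python3 - <<'PY'
import json
print(json.dumps({
    "type": "object",
    "properties": {
        "providers": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "free_tier": {"type": "string"},
                },
            },
        },
    },
}))
PY
)
echo "$schema"
```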

services/mcp-server/introduction.mdx

Lines changed: 25 additions & 11 deletions
@@ -22,8 +22,8 @@ The Model Context Protocol (MCP) is a standardized way for AI assistants to acce
 ## Key Features
 
 <CardGroup cols={2}>
-  <Card title="8 Powerful Tools" icon="tools">
-    Access markdown conversion, AI extraction, search, crawling, sitemap discovery, and agentic workflows
+  <Card title="16 Powerful Tools" icon="tools">
+    Scrape, extract, search, crawl, monitor scheduled jobs, and manage your account
   </Card>
   <Card title="Remote & Local" icon="server">
     Use the hosted HTTP endpoint or run locally via Python
@@ -38,16 +38,30 @@ The Model Context Protocol (MCP) is a standardized way for AI assistants to acce
 
 ## Available Tools
 
-The MCP server exposes 8 enterprise-ready tools:
+The MCP server exposes the following tools via API v2:
+
+| Tool | Description |
+|---|---|
+| **markdownify** | Convert webpages to clean markdown (POST /scrape) |
+| **scrape** | Fetch page content in multiple formats: markdown, html, screenshot, branding (POST /scrape) |
+| **smartscraper** | AI-powered structured extraction from a URL (POST /extract) |
+| **searchscraper** | Search the web and extract structured results (POST /search) |
+| **smartcrawler_initiate** | Start async multi-page crawl in markdown or html mode (POST /crawl) |
+| **smartcrawler_fetch_results** | Poll crawl results (GET /crawl/:id) |
+| **crawl_stop** | Stop a running crawl job (POST /crawl/:id/stop) |
+| **crawl_resume** | Resume a stopped crawl job (POST /crawl/:id/resume) |
+| **credits** | Check your credit balance (GET /credits) |
+| **sgai_history** | Browse request history with pagination (GET /history) |
+| **monitor_create** | Create a scheduled extraction job (POST /monitor) |
+| **monitor_list** | List all monitors (GET /monitor) |
+| **monitor_get** | Get monitor details (GET /monitor/:id) |
+| **monitor_pause** | Pause a running monitor (POST /monitor/:id/pause) |
+| **monitor_resume** | Resume a paused monitor (POST /monitor/:id/resume) |
+| **monitor_delete** | Delete a monitor (DELETE /monitor/:id) |
 
-1. **markdownify** - Convert webpages to clean markdown
-2. **smartscraper** - AI-powered extraction with optional infinite scrolls
-3. **searchscraper** - Search the web and extract structured results
-4. **scrape** - Fetch raw HTML with optional JavaScript rendering
-5. **sitemap** - Discover a site's URLs and structure
-6. **smartcrawler_initiate** - Start async multi-page crawls
-7. **smartcrawler_fetch_results** - Poll crawl results
-8. **agentic_scrapper** - Multi-step workflows with session persistence
+<Note>
+Removed from v1: `sitemap`, `agentic_scrapper`, `markdownify_status`, `smartscraper_status` (no v2 API equivalents).
+</Note>
 
 ## Quick Start
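The crawl tools in this file follow an initiate-then-poll pattern: smartcrawler_initiate starts the job and smartcrawler_fetch_results is polled until the job leaves its running state (with crawl_stop/crawl_resume for control). A minimal sketch of that polling loop, with a stub function standing in for the real GET /crawl/:id call:

```shell
# fetch_status is a stub in place of smartcrawler_fetch_results / GET /crawl/:id;
# here it pretends the job completes on the third poll.
attempt=0
fetch_status() {
    if [ "$attempt" -ge 3 ]; then echo "completed"; else echo "running"; fi
}

status="running"
while [ "$status" = "running" ]; do
    attempt=$((attempt + 1))
    # sleep 2   # a real client should back off between polls
    status=$(fetch_status)
done
echo "job finished after $attempt polls with status=$status"
```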