🔥 The Web Data API for AI - Turn entire websites into LLM-ready markdown or structured data
Updated Dec 7, 2025 - TypeScript
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
Broken link checker that crawls websites and validates links. Find broken links, dead links, and invalid URLs in websites, documentation, and local files. Perfect for SEO audits and CI/CD.
Run a high-fidelity browser-based web archiving crawler in a single Docker container
➖ Stripped down, stable version of firecrawl optimized for self-hosting and ease of contribution. Billing logic and AI features are completely removed. Crawl and convert any website into LLM-ready markdown.
Model Context Protocol (MCP) Server for Graphlit Platform
Lightweight scraper for Google News
Web Scraper and Crawler for LLM Apps and AI Workflows with NoCode / LowCode. Plug and play with your own logic and customize it flexibly and scalably on BuildShip.
🚀 OFFICIAL STARTER TEMPLATE FOR BOTASAURUS SCRAPING FRAMEWORK 🤖
Let's build with devtool-parsed CSS!
ptt-crawler is a web crawler module designed to scrape data from Ptt.
Web crawling & scraping framework for Node.js on top of headless Chrome browser
Official TypeScript/JavaScript SDK for the Supadata API.
A simple TypeScript framework for declaratively composing bots with Puppeteer
Crawler written in TypeScript using ES6 generators.
🕸️ Froxy – A chill open-source web indexing engine built with Go, Node.js, and Next.js. Crawls, analyzes, and serves structured web data with TF-IDF magic and Supabase as the brain.
Awesome boilerplate for writing browser automations using Playwright, with debugging and tests ready to go.
Spring Boot + Keycloak Backend / Angular Web App