GitHub - ShineXmRedT14/AsyncCrawler: Advanced Python crawler for modern JavaScript-based websites. Designed to extract data from dynamically loaded pages where classic HTML parsing is not enough.

This is an async web crawler.
|____What this project can do:
|____1. It parses URLs and extracts the hrefs found on them as absolute links.
|____2. It can render JavaScript sites (React, Vue, Angular) when the response text does not contain any links.
|____3. It can imitate human behavior, which helps against anti-bot protection on sites.
|____4. All URLs found by the crawler are saved to domains.bd (SQLite).
|____Dependencies:
|____All dependencies are listed in requirements.txt.
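The link-extraction and storage steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the names `LinkParser`, `extract_links`, `needs_rendering`, `save_links` and the `urls` table are all hypothetical, and the real crawler presumably uses an async HTTP client and a headless browser for the rendering step, neither of which is shown here.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects href values from <a> tags as absolute http(s) URLs."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative hrefs against the page URL, drop #fragments.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                is_web = absolute.startswith(("http://", "https://"))
                if is_web and absolute not in self.links:
                    self.links.append(absolute)


def extract_links(base_url: str, html: str) -> list[str]:
    """Return the unique absolute links found in `html`, in document order."""
    parser = LinkParser(base_url)
    parser.feed(html)
    return parser.links


def needs_rendering(links: list[str]) -> bool:
    """Assumed heuristic: no links in the raw HTML suggests a JS-rendered page."""
    return not links


def save_links(conn: sqlite3.Connection, links: list[str]) -> int:
    """Insert URLs, skipping duplicates, and return the total number stored."""
    conn.execute("CREATE TABLE IF NOT EXISTS urls (url TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO urls (url) VALUES (?)", [(u,) for u in links]
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM urls").fetchone()[0]
```

A caller would fetch a page, pass its body to `extract_links`, fall back to a JavaScript renderer when `needs_rendering` returns `True`, and then hand the result to `save_links` with a connection such as `sqlite3.connect("domains.bd")`. Using the URL as the primary key is one simple way to keep repeated crawls from storing the same link twice.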

How to start this web crawler:
|____1. Install all dependencies.
|____2. Run the main.py file.

Planned features:
|____1. Running the crawler from the command line (cmd).
|____2. Some optimization.
|____3. An improved terminal GUI (the current terminal GUI is poor).

Author: ShineXmRedT14
License: MIT

Repository files (14 commits): AsyncCrawler, .gitignore, LICENSE, README.md
