This Python project lets users upload their CV in PDF format and matches it against job postings scraped from job boards such as Welcome to the Jungle or Indeed. The system extracts key details from the CV and the job descriptions, then calculates a matching score to help users find the most relevant job opportunities.
- Scrape job listings from multiple platforms
- Bypass JavaScript-based blocking with Selenium automation
- Upload and extract CV content for analysis
- Store multiple CVs and job offers in a SQL database
- Use NLP models for text processing
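As a rough illustration of the scraping features above, the sketch below loads a page in headless Chrome through Selenium, parses the rendered HTML with BeautifulSoup, and drops postings that appear on more than one platform. The `JobPosting` class, the function names, and the `a.job-link` CSS selector are hypothetical and not taken from this project; each job board uses its own markup, so the selector would need adapting.

```python
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass(frozen=True)
class JobPosting:
    """Hypothetical container for one scraped posting."""
    title: str
    url: str


def fetch_rendered_html(url: str) -> str:
    """Load a page in headless Chrome so JavaScript-rendered content is present."""
    # Third-party imports are kept local so the pure helpers work without them.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always release the browser, even on errors


def parse_postings(html: str, base_url: str,
                   link_selector: str = "a.job-link") -> list[JobPosting]:
    """Extract postings from HTML; the default selector is an assumption."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    postings = []
    for link in soup.select(link_selector):
        href = link.get("href")
        if href:
            # Resolve relative links against the page they came from.
            postings.append(JobPosting(link.get_text(strip=True), urljoin(base_url, href)))
    return postings


def unique_postings(postings: list[JobPosting]) -> list[JobPosting]:
    """Keep the first posting seen for each URL, preserving order."""
    seen = set()
    unique = []
    for posting in postings:
        if posting.url not in seen:
            seen.add(posting.url)
            unique.append(posting)
    return unique
```

A caller would chain the three helpers, for example `unique_postings(parse_postings(fetch_rendered_html(url), url))`, once per job board.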
Note
This project is currently in its beta phase, and the final version is still under development. See Upcoming Features for planned future releases.
- Python - Core programming language
- Flake8 - For linting
- Selenium - For browser automation to bypass restrictions
- BeautifulSoup - For parsing HTML and extracting data
- PostgreSQL - For database storage
- pdfplumber - For extracting text from PDF CV files
- spaCy - For text processing and tokenization
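To show how the last two tools in this list might fit together, here is a minimal sketch that reads a CV with pdfplumber and lemmatizes it with spaCy. The function names are hypothetical and this is not necessarily how the project structures its own extraction step; it assumes the `fr_core_news_md` pipeline is installed.

```python
from pathlib import Path
from typing import Iterable, Optional


def join_page_texts(page_texts: Iterable[Optional[str]]) -> str:
    """Join per-page text, skipping pages where nothing was extracted."""
    return "\n".join(text.strip() for text in page_texts if text and text.strip())


def extract_cv_text(pdf_path: Path) -> str:
    """Return the text of every page of a PDF CV as one string."""
    import pdfplumber  # third-party, imported lazily

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None or an empty string for image-only pages.
        return join_page_texts(page.extract_text() for page in pdf.pages)


def lemmatize(text: str, pipeline: str = "fr_core_news_md") -> list[str]:
    """Return lowercase lemmas, ignoring stop words, digits, and punctuation."""
    import spacy  # third-party, imported lazily

    nlp = spacy.load(pipeline)
    return [tok.lemma_.lower() for tok in nlp(text) if tok.is_alpha and not tok.is_stop]
```

Loading a pipeline is slow, so a real application would likely call `spacy.load` once and reuse the resulting object rather than reloading it for each CV.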
- Clone the repository:
git clone https://github.com/githubstevemas/scrape-and-hire.git
- Install the required dependencies. Navigate to the project directory and run:
pip install -r requirements.txt
- Install a trained pipeline for spaCy:
python -m spacy download fr_core_news_md
(more trained pipelines are listed in the spaCy models directory)
Once the project is configured, run the following command in the project folder to start the server:
python main.py
- Set the NLP pipeline:
Once a trained pipeline is installed, go to the Settings menu and set the downloaded pipeline as the current one. You can install several pipelines and choose which one to use for processing.
- OSError: [E050] Can't find model 'fr_core_news_md': the trained pipeline for spaCy is not installed. Run
python -m spacy download fr_core_news_md
to download it.
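One way to surface this error with a clearer message is to check for the pipeline before loading it. The sketch below relies on the fact that downloaded spaCy pipelines are installed as regular Python packages; the function names are hypothetical, and the check does not cover pipelines loaded from a directory path.

```python
import importlib.util

DEFAULT_PIPELINE = "fr_core_news_md"


def is_pipeline_installed(name: str) -> bool:
    """Return True if a package with this name can be imported."""
    return importlib.util.find_spec(name) is not None


def load_pipeline(name: str = DEFAULT_PIPELINE):
    """Load a spaCy pipeline, or fail with the command that installs it."""
    if not is_pipeline_installed(name):
        raise RuntimeError(
            f"spaCy pipeline '{name}' is not installed. "
            f"Run: python -m spacy download {name}"
        )
    import spacy  # third-party, imported only once the pipeline is known to exist

    return spacy.load(name)
```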
- Option to choose which sites to scrape
- Option to choose the Selenium driver
- CV analysis (identify skills, experience, and qualifications from the extracted text)
- Extract relevant information from job descriptions in the database
- Compare the user's CV with the job descriptions in the database to evaluate the match
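The comparison step is still planned, and this README does not specify how the score will be computed. Purely as an illustration, the sketch below scores a CV against job descriptions with cosine similarity over word counts, using only the standard library. The function names and the choice of cosine similarity are assumptions; the final implementation may well rely on spaCy vectors or a different measure.

```python
import math
import re
from collections import Counter


def words(text: str) -> list[str]:
    """Split text into lowercase alphabetic words (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def match_score(cv_text: str, job_text: str) -> float:
    """Cosine similarity between word-count vectors, from 0.0 to 1.0."""
    cv_counts = Counter(words(cv_text))
    job_counts = Counter(words(job_text))
    shared = cv_counts.keys() & job_counts.keys()
    dot = sum(cv_counts[word] * job_counts[word] for word in shared)
    norm = (math.sqrt(sum(c * c for c in cv_counts.values()))
            * math.sqrt(sum(c * c for c in job_counts.values())))
    return dot / norm if norm else 0.0  # empty text scores 0.0


def rank_jobs(cv_text: str, jobs: dict[str, str]) -> list[tuple[str, float]]:
    """Return (job title, score) pairs sorted from best to worst match."""
    scores = {title: match_score(cv_text, desc) for title, desc in jobs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Raw word counts treat every word equally, so common words can inflate the score; removing stop words or lemmatizing with spaCy first would likely give more meaningful rankings.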
Feel free to email me with any questions, comments, or suggestions.