# coBoarding

AI-Powered Job Application Automation for SME Tech Companies in Europe
coBoarding is a comprehensive "speed hiring" platform that connects tech talent with small and medium-sized companies across Europe through intelligent automation and real-time communication. Upload your CV, get matched with companies, and start working within 24 hours.
## Table of Contents

- Features
- Tech Stack
- Quick Start
- Docker Setup
- AI/ML Components
- Security & Compliance
- Contributing
- License
## Features

### For Candidates

- **Intelligent CV Processing** - AI extracts and structures your experience using local LLM models (Mistral, LLaVA)
- **Smart Job Matching** - Get matched with relevant positions based on skills and preferences
- **Real-time Communication** - Chat directly with employers through integrated messaging
- **Automated Applications** - Forms filled automatically using your CV data
- **24-Hour Response SLA** - Employers commit to responding within 24 hours

### For Employers

- **Instant Notifications** - Multi-channel alerts for new candidates
- **Technical Validation** - AI-generated questions validate candidate skills
- **Smart Matching** - Candidates ranked by relevance and fit
- **Integration Ready** - Works with existing HR tools
- **GDPR Compliant** - Built for European privacy regulations
### CV Processor

The CV Processor is a core component that parses CVs/resumes and extracts structured information using AI models.

- **Multi-format Support**: Parses PDF, DOCX, and plain-text CVs
- **AI-Powered Extraction**: Uses Mistral and LLaVA models for accurate information extraction
- **Structured Output**: Returns standardized JSON with candidate information
- **Test Mode**: Built-in testing support with mock responses
- **Python 3.12+**: Optimized for the latest Python version
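The structured output is easiest to see by example. The schema below is purely illustrative - the field names are assumptions based on what the processor extracts (name, contact info, skills, experience, education), not its actual contract:

```python
# Hypothetical example of the structured JSON a parsed CV might produce.
# Field names are illustrative, not the processor's actual schema.
import json

example_result = {
    "name": "Jane Doe",
    "contact": {"email": "jane@example.com", "phone": "+48 123 456 789"},
    "skills": ["Python", "FastAPI", "PostgreSQL"],
    "experience": [
        {"title": "Backend Developer", "company": "Acme", "years": 3},
    ],
    "education": [{"degree": "BSc Computer Science", "year": 2019}],
}

# The result is JSON-serialisable, so it can be stored or sent over the API:
print(json.dumps(example_result, indent=2))
```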
#### Installation

```bash
# Install required dependencies
pip install -r requirements.txt

# Install development dependencies
pip install -r requirements-dev.txt
```

#### Usage

```python
import asyncio

from app.core.cv_processor import CVProcessor


async def process_cv(file_path):
    # Initialize the processor (test_mode=True for development)
    processor = CVProcessor(test_mode=True)

    # Process a CV file
    with open(file_path, "rb") as f:
        result = await processor.process_cv(f)
    return result


# Example usage
if __name__ == "__main__":
    result = asyncio.run(process_cv("path/to/cv.pdf"))
    print(result)
```

#### Testing

Run the test suite with:

```bash
# Run all tests
pytest tests/

# Run with coverage report
pytest --cov=app tests/

# Generate HTML coverage report
coverage html
```

#### Python 3.12 Support

The project is fully compatible with Python 3.12 and takes advantage of its features:
- Pattern Matching: Used for cleaner control flow
- Type Hints: Comprehensive type annotations for better code clarity
- Async/Await: Modern asynchronous programming patterns
- Performance: Optimized for Python 3.12's improved performance
- Type parameter syntax (PEP 695)
- Improved error messages
- Faster exception handling
- Enhanced asyncio performance
## Tech Stack

### Backend

- **Python**: 3.12+
- **Web Framework**: FastAPI
- **Database**: PostgreSQL (production), SQLite (development)
- **ORM**: SQLAlchemy 2.0
- **Migrations**: Alembic
- **Async Support**: asyncio, aiohttp

### AI/ML

- **LLM**: Ollama (Mistral, LLaVA)
- **NLP**: spaCy, Transformers
- **CV Processing**: Custom CV processor with multi-model extraction
  - Extracts: name, contact info, skills, experience, education
  - Supports: PDF, DOCX, plain text
  - Handles: multiple languages and varied CV layouts
- **Computer Vision**: OpenCV, Tesseract OCR
- **Text Processing**: Regex patterns, custom text cleaning

### Frontend

- **Web Interface**: Streamlit
- **Dashboard**: React (future)
- **Styling**: Tailwind CSS

### DevOps & Tooling

- **Containerization**: Docker, Docker Compose
- **CI/CD**: GitHub Actions
- **Testing**: pytest, pytest-asyncio, pytest-cov
- **Code Quality**: black, isort, flake8, mypy
- **Linting/Formatting**: pre-commit hooks
- **Package Management**: pip, setuptools
- **Environment Management**: venv, pyenv
- **Documentation**: MkDocs, mkdocstrings
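The async-first pieces of this stack (FastAPI, asyncio, SQLAlchemy 2.0) suggest a pipeline where CV extraction and job lookups run concurrently. The stdlib-only sketch below illustrates that idea; all function names and data shapes here are hypothetical stand-ins, not the project's actual API:

```python
import asyncio


async def extract_cv_data(cv_text: str) -> dict:
    # Stand-in for the LLM extraction step (the real app would call Ollama).
    await asyncio.sleep(0)  # simulate I/O
    return {"skills": ["python", "fastapi"], "raw": cv_text}


async def fetch_open_positions() -> list[dict]:
    # Stand-in for a database query (SQLAlchemy in the real app).
    await asyncio.sleep(0)
    return [
        {"title": "Backend Dev", "requires": {"python"}},
        {"title": "Frontend Dev", "requires": {"react"}},
    ]


async def match_candidate(cv_text: str) -> list[str]:
    # Run extraction and the positions query concurrently.
    cv, positions = await asyncio.gather(
        extract_cv_data(cv_text), fetch_open_positions()
    )
    skills = set(cv["skills"])
    return [p["title"] for p in positions if p["requires"] <= skills]


matches = asyncio.run(match_candidate("sample CV text"))
print(matches)  # ['Backend Dev']
```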
## Quick Start

### Prerequisites

1. **Python 3.12 or higher**

   ```bash
   # On Ubuntu/Debian
   sudo apt update && sudo apt install -y python3.12 python3.12-venv python3-pip

   # On macOS (using Homebrew)
   brew install python@3.12
   ```

2. **Ollama** (for local LLM processing)

   ```bash
   # Install Ollama
   curl -fsSL https://ollama.com/install.sh | sh

   # Start Ollama service
   ollama serve &

   # Pull required models
   ollama pull mistral
   ollama pull llava
   ```

### Installation

1. **Clone the repository**

   ```bash
   git clone https://github.com/yourusername/coboarding.git
   cd coboarding/chat
   ```

2. **Set up a virtual environment**

   ```bash
   python3.12 -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies**

   ```bash
   # Install base requirements
   pip install -r requirements.txt

   # Install development dependencies
   pip install -r requirements-dev.txt
   ```

4. **Set up environment variables**

   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

### Running the Application

```bash
# Start the development server
make run

# Or with auto-reload
make dev
```

### Database Management

```bash
# Initialize the database
make db-init

# Create a new migration
make db-migrate msg="Your migration message"

# Apply migrations
make db-upgrade
```

### Testing

Run the test suite:

```bash
# Run all tests
make test

# Run with coverage report
make test-cov

# Run specific test file
pytest tests/test_file.py -v

# Run specific test case
pytest tests/test_file.py::test_function -v
```

To generate and view an HTML coverage report:

```bash
# Generate coverage report
make test-cov

# Open HTML coverage report
open htmlcov/index.html      # On macOS
xdg-open htmlcov/index.html  # On Linux
```

### Code Quality

```bash
# Run linters
make lint

# Format code
make format

# Run type checking
make typecheck

# Run all checks
make check-all
```

## Development Setup

1. **Install system dependencies**

   ```bash
   # Install Tesseract OCR and other system dependencies
   sudo apt update && sudo apt install -y \
       tesseract-ocr \
       tesseract-ocr-eng \
       poppler-utils \
       libmagic1 \
       python3-dev \
       build-essential \
       libpq-dev
   ```

2. **Clone the repository**

   ```bash
   git clone https://github.com/yourusername/coboarding.git
   cd coboarding/chat
   ```

3. **Set up the Python environment**

   ```bash
   # Create and activate virtual environment
   python3.12 -m venv venv
   source venv/bin/activate  # On Windows: .\venv\Scripts\activate

   # Upgrade pip and install the package in development mode with dev dependencies
   pip install --upgrade pip
   pip install -e .[dev]
   ```

4. **Set up environment variables**

   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

5. **Initialize the database**

   ```bash
   make db-init
   ```

6. **Start the application**

   ```bash
   make run
   ```

The application will be available at http://localhost:8501.

For development with auto-reload:

```bash
make dev
```
## Testing

The project includes comprehensive tests to ensure code quality and reliability.

```bash
# Run all tests
make test

# Run tests with coverage report
make test-cov

# Run only unit tests
make test-unit

# Run only integration tests
make test-integration

# Run tests in parallel
pytest -n auto
```

To generate an HTML coverage report:

```bash
make test-cov
# Open htmlcov/index.html in your browser
```

### Code Quality

```bash
# Format code with black and isort
make format

# Check code style and quality
make lint

# Run static type checking
make typecheck

# Run all checks (lint, typecheck, test)
make check-all
```

### Database Migrations

```bash
# Initialize database
make db-init

# Create new migration
make db-migrate

# Apply migrations
make db-upgrade

# Revert migrations
make db-downgrade
```

### Pre-commit Hooks

Pre-commit hooks are configured to automatically format and check your code before each commit:

```bash
# Install pre-commit hooks
pre-commit install

# Run pre-commit on all files
pre-commit run --all-files
```

## AI/ML Components

The CV processing pipeline uses multiple models for robust information extraction:
- Mistral 7B - For structured data extraction from text
- LLaVA - For visual understanding of CV layouts (images/PDFs)
- spaCy - For NER and basic text processing
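A minimal sketch of how the Mistral extraction step could drive Ollama over HTTP. The `/api/generate` endpoint and request fields follow Ollama's standard REST API; the prompt wording and function names are assumptions, not this project's actual code:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint


def build_extraction_prompt(cv_text: str) -> str:
    """Build a prompt asking the model for structured JSON output."""
    return (
        "Extract name, contact info, skills, experience and education "
        "from the CV below. Respond with JSON only.\n\n" + cv_text
    )


def extract_with_mistral(cv_text: str) -> str:
    """Send the prompt to a local Mistral model via Ollama (requires `ollama serve`)."""
    payload = json.dumps({
        "model": "mistral",
        "prompt": build_extraction_prompt(cv_text),
        "stream": False,  # return one complete response instead of a token stream
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```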
### Managing Ollama Models

```bash
# List available models
ollama list

# Pull a model
ollama pull mistral
ollama pull llava

# Remove a model
ollama rm mistral
```

## Security & Compliance

### Environment Variables

Sensitive configuration should be set via environment variables in the `.env` file:
```bash
# Database
DATABASE_URL=postgresql://user:password@localhost:5432/coboarding

# Ollama
OLLAMA_BASE_URL=http://localhost:11434

# Security
SECRET_KEY=your-secret-key-here
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=1440  # 24 hours

# CORS (comma-separated list of origins)
CORS_ORIGINS=http://localhost:8501,http://localhost:3000
```

### Best Practices

- Never commit sensitive data to version control
- Use environment variables for configuration
- Keep dependencies updated
- Run security scans regularly
- Follow the principle of least privilege for database access
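Configuration like the above can be read once at startup. The stdlib-only helper below is a sketch: the variable names match the `.env` example, but the defaults and the function itself are illustrative, not the app's actual settings module:

```python
import os


def load_settings() -> dict:
    """Read app settings from environment variables, with safe development defaults."""
    return {
        "database_url": os.environ.get("DATABASE_URL", "sqlite:///dev.db"),
        "ollama_base_url": os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434"),
        "secret_key": os.environ["SECRET_KEY"],  # no default: fail fast if missing
        "token_expire_minutes": int(
            os.environ.get("ACCESS_TOKEN_EXPIRE_MINUTES", "1440")
        ),
        # Split the comma-separated origins list, dropping empty entries.
        "cors_origins": [o for o in os.environ.get("CORS_ORIGINS", "").split(",") if o],
    }
```

Leaving `SECRET_KEY` without a default means a misconfigured deployment crashes at startup rather than running with a guessable key.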
### Security Measures

- All data is encrypted at rest and in transit
- Regular security audits and penetration testing
- Role-based access control (RBAC)
- Secure credential management using environment variables
### GDPR Compliance

- Right-to-be-forgotten implementation
- Data portability
- Consent management
- Data processing agreements
- Regular data retention policy enforcement
### Security Scanning

- Dependency vulnerability scanning
- Static code analysis in CI/CD pipeline
- Secrets scanning
- Regular dependency updates
## Contributing

We welcome contributions from the community! Here's how you can help:
1. **Report Bugs**
   - Check existing issues to avoid duplicates
   - Provide detailed reproduction steps
   - Include error logs and screenshots if applicable

2. **Suggest Enhancements**
   - Open an issue to discuss your idea
   - Check for existing feature requests
   - Be specific about the use case

3. **Submit Code Changes**

   ```bash
   # Fork the repository, then:
   git clone https://github.com/yourusername/coboarding.git
   cd coboarding/chat

   # Set up development environment
   make setup

   # Create a feature branch
   git checkout -b feature/amazing-feature

   # Make your changes, then run tests and checks
   make check-all

   # Commit and push
   git commit -m "Add amazing feature"
   git push origin feature/amazing-feature
   ```

4. **Code Review Process**
   - All changes require code review
   - At least one approval is required for merging
   - The CI/CD pipeline must pass
   - Code coverage should not decrease significantly

5. **Code Style**
   - Follow PEP 8 guidelines
   - Use type hints for all new code
   - Keep functions small and focused
   - Write docstrings for public functions and classes
   - Add tests for new functionality

6. **Documentation**
   - Update relevant documentation
   - Add examples for new features
   - Keep the README up to date
### Development Workflow

1. Create an issue describing the bug or feature
2. Assign the issue to yourself
3. Create a feature branch from `main`
4. Make your changes with atomic commits
5. Push your changes and create a pull request
6. Address any review comments
7. Once approved, squash and merge
### Commit Message Format

```
<type>(<scope>): <subject>

[optional body]

[optional footer]
```
Types:
- `feat`: A new feature
- `fix`: A bug fix
- `docs`: Documentation changes
- `style`: Code style changes (formatting, etc.)
- `refactor`: A code change that neither fixes a bug nor adds a feature
- `test`: Adding missing tests or correcting existing tests
- `chore`: Changes to the build process or auxiliary tools
Example:
```
feat(cv): add support for DOCX files

- Added docx2txt for DOCX parsing
- Updated CV processor to handle DOCX format
- Added tests for DOCX processing

Closes #123
```
### Pull Request Checklist

1. Create a new branch for your feature/bugfix
2. Write tests for your changes
3. Implement your changes
4. Ensure all tests pass
5. Update documentation if needed
6. Submit a pull request
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments

- Ollama for providing easy-to-use local LLMs
- FastAPI for the awesome async web framework
- Streamlit for the interactive web interface
- All the amazing open-source libraries that made this project possible
## Docker Setup

1. **Build and start services**

   ```bash
   docker-compose up -d --build
   ```

2. **View logs**

   ```bash
   docker-compose logs -f
   ```

3. **Run tests in Docker**

   ```bash
   docker-compose exec web pytest
   ```
### Quick Contribution Steps

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/your-feature`
3. Commit your changes: `git commit -am 'Add some feature'`
4. Push to the branch: `git push origin feature/your-feature`
5. Open a pull request
## Contact

For questions or feedback, please open an issue or contact your-email@example.com.