Skip to content

coboarding/chat

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

92 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸš€ coBoarding - Speed Hiring Platform

Python Version Code style: black Imports: isort License: MIT Tests Coverage Documentation Status

AI-Powered Job Application Automation for SME Tech Companies in Europe

coBoarding is a comprehensive "speed hiring" platform that connects tech talent with small and medium-sized companies across Europe through intelligent automation and real-time communication. Upload your CV, get matched with companies, and start working within 24 hours.

πŸ“‹ Table of Contents

πŸš€ Features

For Job Seekers

  • πŸ“„ Intelligent CV Processing - AI extracts and structures your experience using local LLM models (Mistral, LLaVA)
  • 🎯 Smart Job Matching - Get matched with relevant positions based on skills and preferences
  • πŸ’¬ Real-time Communication - Chat directly with employers through integrated messaging
  • πŸ€– Automated Applications - Forms filled automatically using your CV data
  • ⚑ 24-Hour Response SLA - Employers commit to responding within 24 hours

For Employers

  • πŸ“’ Instant Notifications - Multi-channel alerts for new candidates
  • πŸ” Technical Validation - AI-generated questions validate candidate skills
  • πŸ“Š Smart Matching - Candidates ranked by relevance and fit
  • πŸ’Ό Integration Ready - Works with existing HR tools
  • πŸ‡ͺπŸ‡Ί GDPR Compliant - Built for European privacy regulations

πŸ› οΈ CV Processor

The CV Processor is a core component that handles CV/Resume parsing and information extraction using AI models.

Key Features

  • Multi-format Support: Parses PDF, DOCX, and plain text CVs
  • AI-Powered Extraction: Uses Mistral and LLaVA models for accurate information extraction
  • Structured Output: Returns standardized JSON with candidate information
  • Test Mode: Built-in testing support with mock responses
  • Python 3.12+: Optimized for the latest Python version

Installation

# Install required dependencies
pip install -r requirements.txt

# Install development dependencies
pip install -r requirements-dev.txt

Basic Usage

from app.core.cv_processor import CVProcessor
import asyncio

async def process_cv(file_path):
    # Initialize the processor (test_mode=True for development)
    processor = CVProcessor(test_mode=True)
    
    # Process a CV file
    with open(file_path, 'rb') as f:
        result = await processor.process_cv(f)
    
    return result

# Example usage
if __name__ == "__main__":
    result = asyncio.run(process_cv("path/to/cv.pdf"))
    print(result)

Testing

Run the test suite with:

# Run all tests
pytest tests/

# Run with coverage report
pytest --cov=app tests/

# Generate HTML coverage report
coverage html

🐍 Python 3.12 Compatibility

The project is fully compatible with Python 3.12 and takes advantage of its features:

  • Pattern Matching: Used for cleaner control flow
  • Type Hints: Comprehensive type annotations for better code clarity
  • Async/Await: Modern asynchronous programming patterns
  • Performance: Optimized for Python 3.12's improved performance

Required Python 3.12+ Features

  • Type variable defaults (PEP 695)
  • Improved error messages
  • Faster exception handling
  • Enhanced asyncio performance

For Employers

  • πŸ“’ Instant Notifications - Multi-channel alerts for new candidates
  • πŸ” Technical Validation - AI-generated questions validate candidate skills
  • πŸ“Š Smart Matching - Candidates ranked by relevance and fit
  • πŸ’Ό Integration Ready - Works with existing HR tools
  • πŸ‡ͺπŸ‡Ί GDPR Compliant - Built for European privacy regulations

πŸ› οΈ Tech Stack

Core Technologies

  • Python: 3.12+
  • Web Framework: FastAPI
  • Database: PostgreSQL (production), SQLite (development)
  • ORM: SQLAlchemy 2.0
  • Migrations: Alembic
  • Async Support: asyncio, aiohttp
  • Testing: pytest, pytest-asyncio, pytest-cov
  • Code Quality: black, isort, flake8, mypy

AI/ML Components

  • LLM: Ollama (Mistral, LLaVA)
  • NLP: spaCy, Transformers
  • CV Processing: Custom CV processor with multi-model extraction
    • Extracts: Name, contact info, skills, experience, education
    • Supports: PDF, DOCX, plain text
    • Handles: Multiple languages, various CV formats
  • Computer Vision: OpenCV, Tesseract OCR
  • Text Processing: Regex patterns, custom text cleaning

Development Tools

  • Package Management: pip, setuptools
  • Environment Management: venv, pyenv
  • Documentation: MkDocs, mkdocstrings
  • Linting/Formatting: pre-commit hooks
  • CI/CD: GitHub Actions
  • Text Processing: Regex patterns, custom text cleaning

Frontend

  • Web Interface: Streamlit
  • Dashboard: React (future)
  • Styling: Tailwind CSS

Infrastructure

  • Containerization: Docker, Docker Compose
  • CI/CD: GitHub Actions
  • Testing: pytest, pytest-cov, pytest-asyncio
  • Code Quality: black, isort, flake8, mypy

Development Tools

  • Package Management: pip, setuptools
  • Environment Management: venv, pyenv
  • Documentation: MkDocs, mkdocstrings
  • Code Quality: black, isort, flake8, mypy
  • Testing: pytest, pytest-cov, pytest-asyncio
  • Linting: Pre-commit hooks for code quality

πŸš€ Quick Start

Prerequisites

  • Python 3.12 or higher

    # On Ubuntu/Debian
    sudo apt update && sudo apt install -y python3.12 python3.12-venv python3-pip
    
    # On macOS (using Homebrew)
    brew install python@3.12
  • Ollama (for local LLM processing)

    # Install Ollama
    curl -fsSL https://ollama.com/install.sh | sh
    
    # Start Ollama service
    ollama serve &
    
    # Pull required models
    ollama pull mistral
    ollama pull llava

Installation

  1. Clone the repository

    git clone https://github.com/yourusername/coboarding.git
    cd coboarding/chat
  2. Set up virtual environment

    # Create and activate virtual environment
    python3.12 -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    # Install base requirements
    pip install -r requirements.txt
    
    # Install development dependencies
    pip install -r requirements-dev.txt
  4. Set up environment variables

    cp .env.example .env
    # Edit .env with your configuration

Development

Running the Application

# Start the development server
make run

# Or with auto-reload
make dev

Database Setup

# Initialize the database
make db-init

# Create a new migration
make db-migrate msg="Your migration message"

# Apply migrations
make db-upgrade

Testing

Run the test suite:

# Run all tests
make test

# Run with coverage report
make test-cov

# Run specific test file
pytest tests/test_file.py -v

# Run specific test case
pytest tests/test_file.py::test_function -v

Test Coverage

# Generate coverage report
make test-cov

# Open HTML coverage report
open htmlcov/index.html  # On macOS
xdg-open htmlcov/index.html  # On Linux

Code Quality

# Run linters
make lint

# Format code
make format

# Run type checking
make typecheck

# Run all checks
make check-all
  • System Dependencies
    # Install Tesseract OCR and other system dependencies
    sudo apt update && sudo apt install -y \
      tesseract-ocr \
      tesseract-ocr-eng \
      poppler-utils \
      libmagic1 \
      python3-dev \
      build-essential \
      libpq-dev

Installation

  1. Clone the repository

    git clone https://github.com/yourusername/coboarding.git
    cd coboarding/chat
  2. Set up Python environment

    # Create and activate virtual environment
    python3.12 -m venv venv
    source venv/bin/activate  # On Windows: .\venv\Scripts\activate
    
    # Upgrade pip and install dependencies
    pip install --upgrade pip
    pip install -e .[dev]  # Install package in development mode with dev dependencies
  3. Set up environment variables

    cp .env.example .env
    # Edit .env with your configuration
  4. Initialize the database

    make db-init
  5. Start the application

    make run

    The application will be available at http://localhost:8501

    For development with auto-reload:

    make dev

πŸ§ͺ Testing

The project includes comprehensive tests to ensure code quality and reliability.

Running Tests

# Run all tests
make test

# Run tests with coverage report
make test-cov

# Run only unit tests
make test-unit

# Run only integration tests
make test-integration

# Run tests in parallel
pytest -n auto

Test Coverage

To generate an HTML coverage report:

make test-cov
# Open htmlcov/index.html in your browser

πŸ—οΈ Development

Code Style & Quality

# Format code with black and isort
make format

# Check code style and quality
make lint

# Run static type checking
make typecheck

# Run all checks (lint, typecheck, test)
make check-all

Database Management

# Initialize database
make db-init

# Create new migration
make db-migrate

# Apply migrations
make db-upgrade

# Revert migrations
make db-downgrade

Pre-commit Hooks

Pre-commit hooks are configured to automatically format and check your code before each commit:

# Install pre-commit hooks
pre-commit install

# Run pre-commit on all files
pre-commit run --all-files

πŸ€– AI/ML Components

The CV processing pipeline uses multiple models for robust information extraction:

  1. Mistral 7B - For structured data extraction from text
  2. LLaVA - For visual understanding of CV layouts (images/PDFs)
  3. spaCy - For NER and basic text processing

Model Management

# List available models
ollama list

# Pull a model
ollama pull mistral
ollama pull llava

# Remove a model
ollama rm mistral

πŸ”’ Security & Compliance

Environment Variables

Sensitive configuration should be set via environment variables in the .env file:

# Database
DATABASE_URL=postgresql://user:password@localhost:5432/coboarding

# Ollama
OLLAMA_BASE_URL=http://localhost:11434

# Security
SECRET_KEY=your-secret-key-here
ALGORITHM=HS256
ACCESS_TOKEN_EXPIRE_MINUTES=1440  # 24 hours

# CORS (comma-separated list of origins)
CORS_ORIGINS=http://localhost:8501,http://localhost:3000

Security Best Practices

  1. Never commit sensitive data to version control
  2. Use environment variables for configuration
  3. Keep dependencies updated
  4. Run security scans regularly
  5. Follow the principle of least privilege for database access

Data Protection

  • All data is encrypted at rest and in transit
  • Regular security audits and penetration testing
  • Role-based access control (RBAC)
  • Secure credential management using environment variables

GDPR Compliance

  • Right to be forgotten implementation
  • Data portability
  • Consent management
  • Data processing agreements
  • Regular data retention policy enforcement

Secure Development

  • Dependency vulnerability scanning
  • Static code analysis in CI/CD pipeline
  • Secrets scanning
  • Regular dependency updates

🀝 Contributing

We welcome contributions from the community! Here's how you can help:

  1. Report Bugs

    • Check existing issues to avoid duplicates
    • Provide detailed reproduction steps
    • Include error logs and screenshots if applicable
  2. Suggest Enhancements

    • Open an issue to discuss your idea
    • Check for existing feature requests
    • Be specific about the use case
  3. Submit Code Changes

    # Fork the repository
    git clone https://github.com/yourusername/coboarding.git
    cd coboarding/chat
    
    # Set up development environment
    make setup
    
    # Create a feature branch
    git checkout -b feature/amazing-feature
    
    # Make your changes
    # Run tests and checks
    make check-all
    
    # Commit and push
    git commit -m "Add amazing feature"
    git push origin feature/amazing-feature
  4. Code Review Process

    • All changes require code review
    • At least one approval required for merging
    • CI/CD pipeline must pass
    • Code coverage should not decrease significantly
  5. Code Style

    • Follow PEP 8 guidelines
    • Use type hints for all new code
    • Keep functions small and focused
    • Write docstrings for public functions and classes
    • Add tests for new functionality
  6. Documentation

    • Update relevant documentation
    • Add examples for new features
    • Keep the README up to date

Development Workflow

  1. Create an issue describing the bug or feature
  2. Assign the issue to yourself
  3. Create a feature branch from main
  4. Make your changes with atomic commits
  5. Push your changes and create a pull request
  6. Address any review comments
  7. Once approved, squash and merge

Commit Message Guidelines

<type>(<scope>): <subject>

[optional body]

[optional footer]

Types:

  • feat: A new feature
  • fix: A bug fix
  • docs: Documentation changes
  • style: Code style changes (formatting, etc.)
  • refactor: Code change that neither fixes a bug nor adds a feature
  • test: Adding missing tests or correcting existing tests
  • chore: Changes to the build process or auxiliary tools

Example:

feat(cv): add support for DOCX files

- Added docx2txt for DOCX parsing
- Updated CV processor to handle DOCX format
- Added tests for DOCX processing

Closes #123

Development Workflow

  1. Create a new branch for your feature/bugfix
  2. Write tests for your changes
  3. Implement your changes
  4. Ensure all tests pass
  5. Update documentation if needed
  6. Submit a pull request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Ollama for providing easy-to-use local LLMs
  • FastAPI for the awesome async web framework
  • Streamlit for the interactive web interface
  • All the amazing open-source libraries that made this project possible

🐳 Docker Setup

  1. Build and start services

    docker-compose up -d --build
  2. View logs

    docker-compose logs -f
  3. Run tests in Docker

    docker-compose exec web pytest

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Commit your changes: git commit -am 'Add some feature'
  4. Push to the branch: git push origin feature/your-feature
  5. Open a pull request

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ“§ Contact

For questions or feedback, please open an issue or contact your-email@example.com

About

coBoarding - Speed Hiring Platform AI-Powered Job Application Automation for SME Tech Companies in Europe

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published