A 3 am Python scraper for UC Chile course data. It extracts OFG course codes, fetches their details from the catalog, and determines semester availability.
python main.py
Outputs results/course_data.json with structured course information.
Edit const.py:
UPDATE_COURSE_CODES = False # Re-scrape course codes from web
SAVE_TO_FILE = True # Save results to files
REMOVE_DUPLICATES = True # Filter duplicate codes
Workflow: Set UPDATE_COURSE_CODES = True for the first run or a new semester, False for updates.
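As a rough illustration of what the REMOVE_DUPLICATES flag might gate, here is a minimal sketch of order-preserving deduplication. The function name `remove_duplicates` and the course codes are placeholders made up for this example; the actual logic in the repository may differ.

```python
def remove_duplicates(codes):
    """Return codes with duplicates dropped, keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(codes))


REMOVE_DUPLICATES = True  # mirrors the flag in const.py

# Placeholder codes, not real course codes
codes = ["AAA000", "BBB111", "AAA000"]
if REMOVE_DUPLICATES:
    codes = remove_duplicates(codes)
print(codes)
```

Keeping the first-seen order means the output stays stable between runs as long as the scraped pages list courses in the same order.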
- ofg_scraper.py: extracts course codes from OFG pages
- catalogue_scraper.py: gets course details from the catalog
- parity_scraper.py: determines semester availability from historical data
- info_for_course_codes.py: pipeline orchestrator
JSON with course code, credits, area, name, and semester availability ("odd"/"even"/"both").
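To show how the output could be consumed, the sketch below filters records by semester availability. The key names (`code`, `credits`, `area`, `name`, `semester`) and the sample records are assumptions for illustration only; check results/course_data.json for the real field names.

```python
import json

# Hypothetical records; the real key names in results/course_data.json may differ.
sample = """[
  {"code": "AAA000", "credits": 10, "area": "Example area",
   "name": "Example course", "semester": "odd"},
  {"code": "BBB111", "credits": 10, "area": "Example area",
   "name": "Another course", "semester": "both"}
]"""


def offered_in(courses, semester):
    """Return the courses available in an "odd" or "even" semester."""
    return [c for c in courses if c["semester"] in (semester, "both")]


courses = json.loads(sample)
for course in offered_in(courses, "even"):
    print(course["code"], course["name"])
```

To run this against the real file, replace `json.loads(sample)` with `json.load` on an opened results/course_data.json and adjust the keys as needed.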
- Availability is a heuristic based on historical offerings in 2024-2 and 2025-1. It should be updated once data for 2025-2 is available.
- Changes to the scraped websites will break the scrapers.
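The availability heuristic described above can be sketched roughly as follows. This is not the code in parity_scraper.py; it assumes terms are written as "YEAR-SEMESTER" (for example "2024-2"), that semester 1 counts as "odd" and semester 2 as "even", and the function name `semester_availability` is hypothetical.

```python
def semester_availability(offered_terms):
    """Classify a course as "odd", "even", or "both" from terms like "2024-2".

    Returns None when there is no history to go on.
    """
    parities = set()
    for term in offered_terms:
        # Assumed format: "<year>-<semester number>"
        semester = int(term.split("-")[1])
        parities.add("odd" if semester % 2 else "even")
    if parities == {"odd", "even"}:
        return "both"
    return parities.pop() if parities else None


print(semester_availability(["2024-2", "2025-1"]))
print(semester_availability(["2024-2"]))
```

With only two terms of history, a course offered once is labelled by that single term, which is why the result should be treated as a guess until more terms are included.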
python -m venv venv
source venv/bin/activate # Linux/Mac; on Windows: venv\Scripts\activate
pip install -r requirements.txt