An end-to-end machine learning system that predicts a student’s final grade (G3) using academic history and behavioral features, with built-in explainability via SHAP and deployment through a FastAPI service.
This project focuses on correct ML engineering practices, not just model accuracy.
Predict a student’s final grade based on demographic, academic, and behavioral data, while ensuring:
- Proper feature handling
- Model interpretability
- Reproducible training
- Deployable inference
The goal is to demonstrate understanding of the full ML lifecycle, from data ingestion to explanation.
- Source: UCI Student Performance Dataset
- File used: `student-mat.csv`
- Delimiter: Semicolon (`;`)
- Target variable: `G3` (final grade, range 0–20)
The dataset is semicolon-separated, not comma-separated.
Parsing it with a comma delimiter usually raises no error; each row collapses into a single column, which silently corrupts the schema.
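
The delimiter pitfall can be guarded against with an explicit schema check at load time. The following is a minimal sketch using only the standard library, not the project's actual loader: the `load_rows` helper and the inline `SAMPLE` rows are hypothetical stand-ins for `student-mat.csv`, which has many more columns. If the project loads the file with pandas, passing `sep=";"` to `pandas.read_csv` plays the same role as the `delimiter` argument here.

```python
import csv
import io

# Hypothetical two-row stand-in for student-mat.csv (the real file has more columns).
SAMPLE = "school;sex;age;G1;G2;G3\nGP;F;18;5;6;6\nGP;M;17;14;15;15\n"


def load_rows(text, delimiter=";", target="G3"):
    """Parse delimited text and fail loudly if the schema looks wrong."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    header = reader.fieldnames or []
    if target not in header:
        # A wrong delimiter shows up here as a single fused column name.
        raise ValueError(f"column {target!r} not found; parsed header was {header}")
    rows = list(reader)
    for row in rows:
        grade = int(row[target])
        if not 0 <= grade <= 20:
            raise ValueError(f"{target} out of range 0-20: {grade}")
    return rows


rows = load_rows(SAMPLE)

# Parsing the same text with the default comma delimiter raises nothing on its own:
# the whole header line comes back as one field.
wrong_header = next(csv.reader(io.StringIO(SAMPLE)))
```

Checking for the target column immediately after parsing turns the silent failure into an explicit error before any preprocessing or training code runs.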
Project Structure:

    student-performance-predictor/
    │
    ├── api/
    │   ├── main.py
    │   ├── schemas.py
    │   └── routes/
    │       └── explain.py
    │
    ├── src/
    │   ├── preprocessing.py
    │   ├── train_regularized.py
    │   └── train_ablation.py
    │
    ├── data/
    │   └── raw/
    │
    ├── models/
    ├── results.md
    ├── requirements.txt
    └── README.md