A machine learning project comparing multiple classification models for predicting heart disease presence and severity using the UCI Heart Disease Dataset. Built as part of MSU's CSE 404 (Machine Learning) course.
This project implements and benchmarks five different model types — Neural Networks, SVM, Decision Tree, Random Forest, and XGBoost — on a 5-class classification task (no disease + 4 severity levels) as well as a binary presence/absence task.
Python, PyTorch, scikit-learn, XGBoost, Pandas, Matplotlib, Seaborn
Install dependencies:

```bash
pip install torch scikit-learn xgboost pandas matplotlib seaborn ucimlrepo imbalanced-learn
```

Run any model directly:

```bash
python model_with_advanced_stats.py
python svm_model_changed.py
python random_forest.py
python decision_tree_model.py
```

For XGBoost, open and run xgboost.ipynb in Jupyter Notebook. The dataset is fetched automatically from the UCI ML Repository — no manual download needed.
- Input layer: 13 features
- Hidden layers:
- Layer 1: 128 neurons, ReLU activation
- Layer 2: 64 neurons, ReLU activation
- Output layer: 5 neurons (no activation applied)
- Training (Epoch 250):
- Accuracy: 61.3%
- Average Loss: 0.9996
- Validation:
- Accuracy: 36.7%
- Average Loss: 1.3301
- Test:
- Accuracy: 47.5%
- Average Loss: 1.2531
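The architecture above can be sketched in PyTorch (the class and layer names are illustrative; the actual implementation in the project scripts may differ):

```python
import torch
import torch.nn as nn

class ThreeLayerNetwork(nn.Module):
    """13 -> 128 -> 64 -> 5 MLP; outputs raw logits for use with CrossEntropyLoss."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(13, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 5),   # no activation: CrossEntropyLoss expects logits
        )

    def forward(self, x):
        return self.net(x)

model = ThreeLayerNetwork()
logits = model(torch.randn(8, 13))  # batch of 8 samples with 13 features each
print(logits.shape)                 # torch.Size([8, 5])
```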
- Input layer: 13 features
- Hidden layers:
- Layer 1: 128 neurons, ReLU activation
- Layer 2: 64 neurons, ReLU activation
- Layer 3: 32 neurons, ReLU activation
- Output layer: 5 neurons (no activation applied)
- Training (Epoch 250):
- Accuracy: 63.7%
- Average Loss: 0.9793
- Validation:
- Accuracy: 43.3%
- Average Loss: 1.3079
- Test:
- Accuracy: 50.8%
- Average Loss: 1.2819
- Input layer: 13 features
- Hidden layers:
- Layer 1: 512 neurons, ReLU activation
- Layer 2: 128 neurons, ReLU activation
- Output layer: 5 neurons (no activation applied)
- Training (Epoch 250):
- Accuracy: 73.1%
- Average Loss: 0.8119
- Validation:
- Accuracy: 43.3%
- Average Loss: 1.4602
- Test:
- Accuracy: 52.5%
- Average Loss: 1.2090
- Input layer: 13 features
- Hidden layers:
- Layer 1: 512 neurons, ReLU activation
- Layer 2: 256 neurons, ReLU activation
- Layer 3: 128 neurons, ReLU activation
- Output layer: 5 neurons (no activation applied)
- Training (Epoch 250):
- Accuracy: 66.5%
- Average Loss: 0.8162
- Validation:
- Accuracy: 40.0%
- Average Loss: 1.4989
- Test:
- Accuracy: 57.4%
- Average Loss: 1.1832
- Input layer: 13 features
- Hidden layers:
- Layer 1: 128 neurons, ReLU activation
- Layer 2: 64 neurons, ReLU activation
- Output layer: 5 neurons, Softmax activation
- Training (Epoch 250):
- Accuracy: 55.7%
- Average Loss: 1.3482
- Validation:
- Accuracy: 46.7%
- Average Loss: 1.4382
- Test:
- Accuracy: 52.5%
- Average Loss: 1.3802
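One caveat with this variant, assuming it is trained with `nn.CrossEntropyLoss` like the others: that loss already applies log-softmax internally, so an explicit `Softmax` output layer applies softmax twice and compresses the logits, which would be consistent with the noticeably higher training loss above. A minimal illustration:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, -1.0, 0.5, 0.0, -2.0]])  # raw network outputs
target = torch.tensor([0])

# Intended usage: pass raw logits directly to CrossEntropyLoss
loss_raw = criterion(logits, target)
# With a Softmax output layer, softmax is effectively applied twice
loss_double = criterion(torch.softmax(logits, dim=1), target)

print(loss_raw.item(), loss_double.item())  # the double-softmax loss is larger
```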
- Regular Accuracy: 66.0%
- Binary Accuracy: 83.0%
- Precision (Multi-Class): 60.5%
- Recall (Multi-Class): 66.0%
- F1 Score (Multi-Class): 61.2%
- Precision (Binary): 90.7%
- Recall (Binary): 70.1%
- F1 Score (Binary): 79.1%
- Average Loss: 0.8250
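The binary metrics above presumably collapse the four severity levels into a single "disease present" class. A sketch of that label collapse on toy data (the threshold and labels are assumptions, not project data):

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Toy labels: 0 = no disease, 1-4 = increasing severity (illustrative only)
y_true = np.array([0, 0, 1, 2, 3, 4, 0, 2])
y_pred = np.array([0, 1, 1, 2, 2, 4, 0, 0])

# Collapse to presence/absence: any severity > 0 counts as "disease present"
y_true_bin = (y_true > 0).astype(int)
y_pred_bin = (y_pred > 0).astype(int)

print(accuracy_score(y_true_bin, y_pred_bin))  # 0.75 on this toy batch
```

Note how predictions that confuse severity levels (e.g. predicting 2 for a true 3) still count as correct in the binary view, which is why binary accuracy runs well above multi-class accuracy throughout these results.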
- Regular Accuracy: 43.3%
- Binary Accuracy: 83.3%
- Precision: 38.2%
- Recall: 30.0%
- F1 Score: 22.5%
- ROC AUC: 86.8%
- Average Loss: 1.2962
- Regular Accuracy: 59.0%
- Binary Accuracy: 85.2%
- Precision (Multi-Class): 43.2%
- Recall (Multi-Class): 34.0%
- F1 Score (Multi-Class): 24.9%
- Precision (Binary): 79.2%
- Recall (Binary): 82.6%
- F1 Score (Binary): 80.9%
- ROC AUC: 85%
- Average Loss: 0.9168
The FourLayerExpandedNetwork achieved the best test accuracy (57.4%) and the lowest test loss (1.1832), generalizing better to the test data than the other architectures. The advanced-stats model, however, highlights the value of the binary accuracy and ROC AUC metrics: it identifies disease-present cases reliably while retaining reasonable multi-class performance.
Model Performance
- Overall Test Accuracy: 54.95%

Confusion Matrix

| Predicted / Actual | Class 0 | Class 1 | Class 2 | Class 3 | Class 4 |
|---|---|---|---|---|---|
| Class 0 | 45 | 1 | 1 | 1 | 0 |
| Class 1 | 10 | 2 | 3 | 1 | 1 |
| Class 2 | 4 | 2 | 2 | 4 | 0 |
| Class 3 | 1 | 6 | 2 | 1 | 0 |
| Class 4 | 1 | 1 | 1 | 1 | 0 |
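As a sanity check, the overall accuracy can be recomputed from this confusion matrix: correct predictions lie on the diagonal (50 of 91 test samples):

```python
import numpy as np

# Confusion matrix from the table above (rows/columns ordered class 0..4)
cm = np.array([
    [45,  1, 1, 1, 0],
    [10,  2, 3, 1, 1],
    [ 4,  2, 2, 4, 0],
    [ 1,  6, 2, 1, 0],
    [ 1,  1, 1, 1, 0],
])

accuracy = np.trace(cm) / cm.sum()  # 50 correct out of 91
print(round(accuracy * 100, 2))     # 54.95
```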
Model Performance

Test Classification Report:

```
              precision    recall  f1-score   support

           0       0.79      0.75      0.77        36
           1       0.08      0.07      0.07        15
           2       0.00      0.00      0.00         4
           3       0.00      0.00      0.00         5
           4       0.00      0.00      0.00         1

    accuracy                           0.46        61
   macro avg       0.17      0.16      0.17        61
weighted avg       0.49      0.46      0.47        61
```

Test Confusion Matrix:

```
[[27  9  0  0  0]
 [ 5  1  4  5  0]
 [ 2  1  0  1  0]
 [ 0  1  2  0  2]
 [ 0  1  0  0  0]]
```
- C: 10
- Gamma: 0.1
- Kernel: rbf
- Best cross-validation score: 0.87
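A sketch of how parameters like these could be found with scikit-learn's `GridSearchCV` (the grid values, synthetic data, and variable names are assumptions; only the winning combination is reported above):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the preprocessed 13-feature, 5-class heart-disease data
X, y = make_classification(n_samples=300, n_features=13, n_informative=8,
                           n_classes=5, random_state=0)

param_grid = {
    "C": [0.1, 1, 10],
    "gamma": [0.01, 0.1, 1],
    "kernel": ["rbf"],
}
search = GridSearchCV(SVC(), param_grid, cv=5)  # 5-fold cross-validation per combo
search.fit(X, y)

print(search.best_params_)              # best C/gamma/kernel combination
print(round(search.best_score_, 2))     # mean CV accuracy of that combination
```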
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 | 0.88 | 0.94 | 0.91 | 31 |
| 1 | 0.86 | 0.78 | 0.82 | 23 |
| 2 | 0.88 | 0.88 | 0.88 | 16 |
| 3 | 0.75 | 0.79 | 0.77 | 19 |
| 4 | 0.96 | 0.93 | 0.94 | 27 |
- Overall Accuracy: 0.87
- Macro Average:
- Precision: 0.86
- Recall: 0.86
- F1-Score: 0.86
- Weighted Average:
- Precision: 0.87
- Recall: 0.87
- F1-Score: 0.87
| | Predicted 0 | Predicted 1 | Predicted 2 | Predicted 3 | Predicted 4 |
|---|---|---|---|---|---|
| Actual 0 | 29 | 1 | 0 | 1 | 0 |
| Actual 1 | 2 | 18 | 1 | 2 | 0 |
| Actual 2 | 1 | 0 | 14 | 1 | 0 |
| Actual 3 | 1 | 1 | 1 | 15 | 1 |
| Actual 4 | 0 | 1 | 0 | 1 | 25 |
This demonstrates the performance of the Support Vector Machine (SVM) model using the best parameters obtained from hyperparameter tuning.
- 'colsample_bytree': 1.0
- 'learning_rate': 0.2
- 'max_depth': 5
- 'n_estimators': 200
- 'subsample': 0.8
- Best cross-validation score: 0.94
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 | 0.85 | 0.81 | 0.83 | 27 |
| 1 | 0.96 | 0.90 | 0.93 | 29 |
| 2 | 0.79 | 0.79 | 0.79 | 24 |
| 3 | 0.73 | 0.76 | 0.74 | 21 |
| 4 | 0.93 | 1.00 | 0.96 | 27 |
- Overall Accuracy: 0.86
- Macro Average:
- Precision: 0.85
- Recall: 0.85
- F1-Score: 0.85
- Weighted Average:
- Precision: 0.86
- Recall: 0.86
- F1-Score: 0.86
| | Predicted 0 | Predicted 1 | Predicted 2 | Predicted 3 | Predicted 4 |
|---|---|---|---|---|---|
| Actual 0 | 22 | 1 | 2 | 2 | 0 |
| Actual 1 | 2 | 26 | 0 | 1 | 0 |
| Actual 2 | 2 | 0 | 19 | 3 | 0 |
| Actual 3 | 0 | 0 | 3 | 16 | 2 |
| Actual 4 | 0 | 0 | 0 | 0 | 27 |