Machine Learning Assignment of group 63, of class 06.
Group members:
- Carolina Gonçalves - (up202108781@up.pt)
- Bianca Oliveira - (up202000139@up.pt)
- Marco Costa - (up202108821@up.pt)
To run this project, you need the dependencies listed in "requirements.txt" installed. You can install them by running:
make requirements
The folder "data/raw" is also expected to contain the following datasets:
- awards_players.csv
- coaches.csv
- players_teams.csv
- players.csv
- series_post.csv
- teams_post.csv
- teams.csv
And under the folder "data/raw/Season_11":
- coaches.csv
- players_teams.csv
- teams.csv
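The expected layout above can be checked before running anything. The following is a minimal sketch, not part of the project: the file names are copied from the lists above, while the helper name `find_missing_files` and the idea of running it from the project root are assumptions made for illustration.

```python
from pathlib import Path

# File names copied from the lists above; the check itself is a hypothetical helper.
EXPECTED_FILES = {
    "data/raw": [
        "awards_players.csv",
        "coaches.csv",
        "players_teams.csv",
        "players.csv",
        "series_post.csv",
        "teams_post.csv",
        "teams.csv",
    ],
    "data/raw/Season_11": [
        "coaches.csv",
        "players_teams.csv",
        "teams.csv",
    ],
}


def find_missing_files(project_root="."):
    """Return the expected input files that are absent, as relative paths."""
    root = Path(project_root)
    missing = []
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing


if __name__ == "__main__":
    missing = find_missing_files()
    if missing:
        print("Missing input files:")
        for entry in missing:
            print(f"  {entry}")
    else:
        print("All expected input files are present.")
```

Run from the project root, it lists any dataset that still needs to be placed in "data/raw" or "data/raw/Season_11".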
The project was delivered with all outputs already generated. To re-run it from scratch, first run:
make clean
To run the project you can simply run:
make all
Or run each step separately, in the following order:
make preparation
This is where the data is cleaned, as well as where we treat outliers. The prepared data will be kept in the directory "data/clean".
make analyze
This is where we generate the graphs that we used to analyze the data. The produced documents will be kept in the directory "docs/data_analyze".
make process
This is where we process the data, by renaming columns, calculating necessary parameters and merging some datasets. The processed data will be kept in the directory "data/processed".
make datasets
This is where we create the final dataset and apply PCA. The produced datasets will be kept in the directory "data/datasets".
make models
This is where we train the models. The produced models will be kept in the directory "models".
make models_analyze
This is where we analyze the models' performance and generate our predictions for the competition. The produced documents will be kept in the directory "docs/analysis". Note that this step produced different graphs on different machines. We could not identify the cause or fix the issue, so the results you obtain may differ from the ones in the report.
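One possible cause of such machine-to-machine differences, which was not confirmed for this project, is unseeded randomness in data splitting or model training. The sketch below shows how the random number generators could be fixed at the start of a script. The helper name `seed_everything` and the default seed are hypothetical, and NumPy is only seeded if it happens to be installed.

```python
import os
import random


def seed_everything(seed=42):
    """Seed the random number generators a training script commonly relies on."""
    # Python's built-in generator.
    random.seed(seed)
    # Hash randomization is fixed at interpreter start-up, so this setting
    # only affects subprocesses launched after this call.
    os.environ["PYTHONHASHSEED"] = str(seed)
    # NumPy's global generator, seeded only if NumPy is available.
    try:
        import numpy as np
    except ImportError:
        return
    np.random.seed(seed)


if __name__ == "__main__":
    seed_everything(42)
    print(random.random())  # same value on every run with the same seed
```

If the models are built with scikit-learn (an assumption), passing an explicit `random_state` to estimators and splitting functions serves the same purpose. Differences caused by library versions or hardware would not be removed by seeding.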