To run the notebooks in this repository, install the following libraries:
- pandas 0.24.2
- NumPy 1.17.4
- fastai 1.0.59
- autogluon 0.0.3
- seaborn 0.9.0
- scikit-learn 0.21.2
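Because the notebooks were written against these specific versions, it can help to confirm what is installed before running them. The following is a minimal sketch, not part of the repository: the helper names `compare_versions` and `installed_version` are made up for illustration, and the fallback import assumes the `importlib_metadata` backport is available on Python versions older than 3.8.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Assumption: the importlib_metadata backport is installed on older interpreters.
    from importlib_metadata import PackageNotFoundError, version

# Pinned versions copied from the list above.
REQUIRED = {
    "pandas": "0.24.2",
    "numpy": "1.17.4",
    "fastai": "1.0.59",
    "autogluon": "0.0.3",
    "seaborn": "0.9.0",
    "scikit-learn": "0.21.2",
}


def installed_version(name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def compare_versions(required, lookup):
    """Return {name: (wanted, found)} for every package whose version differs."""
    mismatches = {}
    for name, wanted in required.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in compare_versions(REQUIRED, installed_version).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean a notebook will fail, only that the environment differs from the one listed here.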
The purpose of this project is to show:
- Basic data preparation techniques prior to data exploration
- Commonly used but powerful chart types for gaining insights
- Results from exploring prominent features pertaining to logistics transit time
- Basic code for training and predicting transit time for shipments
The repository contains the following files:
- TT_EDA.ipynb: This notebook contains the data exploration steps and results.
- TransitTime_Fastai.ipynb: This notebook contains the steps for training and predicting with the fastai library. A key highlight is the ability of the fastai library to create embeddings for categorical features.
- TransitTime_AG.ipynb: This notebook demonstrates how the AutoGluon library speeds up training and prediction experiments with popular machine learning algorithms.
- ML_Logistics_Simple_Guide.pdf: This document summarizes the approach and results shown in the Python notebooks. It can be read as a standalone starter guide to data exploration and machine learning in the logistics domain.
- TransitTime_EDA.pdf: This document summarizes the main insights from the exploratory analysis of the dataset.
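The categorical embeddings mentioned for the fastai notebook can be illustrated with a small concept sketch. This is not the fastai API and not the notebook's code: it only shows the underlying idea of mapping each category to an integer code and looking up a dense vector for it. The `carriers` values and the embedding width of 3 are invented for illustration, and the weights are random here, whereas in a real model they would be learned during training.

```python
import numpy as np

# Hypothetical categorical column; the values are made up for illustration.
carriers = ["A", "B", "A", "C", "B"]

# Step 1: map each distinct category to an integer code.
vocab = {cat: idx for idx, cat in enumerate(sorted(set(carriers)))}
codes = np.array([vocab[c] for c in carriers])

# Step 2: keep one row of weights per category.
# The width (3) is an arbitrary choice for this sketch.
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), 3))

# Step 3: the lookup replaces each code with its dense vector.
embedded = embedding_matrix[codes]
print(embedded.shape)  # (5, 3)
```

Rows with the same category share the same vector, so the model can learn that certain categories behave alike instead of treating each one as an unrelated one-hot column.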
Thanks to the Python open-source community for creating the valuable libraries used in this project.
This project uses a normalized dataset of truckload shipments.
This project is released under the Apache license.