This repository showcases my progress in Data Science and Machine Learning.
I had my first encounter with Data Science while carrying out research for my MSc project; you can read about it here. I would like to document my journey and the projects I have enjoyed working on.
This project is a prerequisite to the classification of Nigerian rice dishes. The rice data was scraped from Google Images and contained a lot of noise (non-rice images). The goal is to clean the dataset by using a pre-trained deep convolutional neural network and the Food-5K dataset to first separate food images from non-food images.
Image Classification, Transfer learning, Data Cleaning, Image augmentation, Active Learning
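The cleaning step described above can be pictured as a triage over the classifier's scores. The following is a minimal sketch, not the project's actual code: the function name `triage_images`, the thresholds `keep_at` and `drop_at`, and the example file names are all hypothetical, and the food probabilities are assumed to come from a food/non-food classifier such as the one described.

```python
from typing import Dict, List, Tuple


def triage_images(
    food_probs: Dict[str, float],
    keep_at: float = 0.9,   # assumed threshold: confident "food"
    drop_at: float = 0.1,   # assumed threshold: confident "non-food"
) -> Tuple[List[str], List[str], List[str]]:
    """Split images into keep / review / drop lists by predicted food probability."""
    keep: List[str] = []
    review: List[str] = []
    drop: List[str] = []
    # Sort by file name so the output order is deterministic.
    for name, prob in sorted(food_probs.items()):
        if prob >= keep_at:
            keep.append(name)
        elif prob <= drop_at:
            drop.append(name)
        else:
            # Uncertain predictions go to a human, as in an active-learning loop.
            review.append(name)
    return keep, review, drop


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    scores = {"jollof_01.jpg": 0.98, "banner_ad.jpg": 0.03, "plate_blur.jpg": 0.55}
    keep, review, drop = triage_images(scores)
    print("keep:", keep)
    print("review:", review)
    print("drop:", drop)
```

Routing the uncertain middle band to manual labelling is one way the "Active Learning" skill listed above might come into play; the labelled images could then be used to refine the classifier.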
This project involves identifying different dog breeds via image classification. It also takes in a human image and returns the dog breed doppelgänger.
In the wake of the popular BBC #SexForGrades documentary, students of different universities tweeted about similar occurrences at their universities and organizations. This project aims to identify these universities and organizations from a slew of tweets with the #SexForGrades tag and to create a map that makes it easy to see which organizations need to reform and address these violations.
Entity Extraction and Recognition, Data Visualization, Twitter Data Scraping.
With the pandemic still going on and affecting many aspects of our lives, people have adjusted and adapted to our new realities in various ways. This project aims to highlight and explore the effect of COVID-19 on our New Year resolutions.
Sentiment Analysis, Data Visualization, Twitter Data Scraping, App development.
Kaggle competition to predict the average monthly spend of a bike shop's customers based on demographic and psychographic information. It involved exploring the data via extensive univariate and multivariate analysis to detect important demographic features. This insight was used to create new features, and various regression models were then applied to predict the average monthly spend. Result: Achieved an RMSE of 3.19. Placement: 2nd out of 5 teams.
EDA, Data Visualization, Data Cleaning and Preprocessing, Model Building and Experimentation, Hyperparameter Tuning, Model Selection. Models used: Linear Regression, RandomForest, GradientBoosting Regressor, XGBoost.
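For readers unfamiliar with the metric reported above, RMSE (root mean squared error) is the square root of the average squared difference between predicted and actual values, so it is expressed in the same units as the target (here, monthly spend). A minimal sketch in plain Python follows; the function name `rmse` and the sample numbers are illustrative and are not taken from the competition data.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Made-up monthly spend values, for illustration only.
    actual_spend = [50.0, 72.0, 64.0]
    predicted_spend = [53.0, 70.0, 60.0]
    print(f"RMSE: {rmse(actual_spend, predicted_spend):.2f}")
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones, which is worth keeping in mind when comparing regressors.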
Kaggle competition to predict the likelihood of a customer purchasing an item based on their demographic data and past spending habits. It involved exploring the data and preprocessing certain features with high predictive power. Various models were applied and compared, and the best model was selected based on its stability and generalization. Result: Achieved an accuracy of 0.76. Placement: 2nd out of 32 competitors.
EDA, Data Visualization, Data Cleaning and Preprocessing, Model Building and Experimentation, Hyperparameter Tuning, Model Stacking, Model Selection. Models used: Logistic Regression, Decision Tree, RandomForest, AdaBoost Classifier, StackingCVClassifier, Voting Classifier, GradientBoosting Classifier, XGBoost, CatBoost.
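The description above does not say how stability and generalization were measured. One common approach, shown here purely as an assumption, is to compare per-fold cross-validation scores and prefer a model whose scores are both high and consistent. The function name `pick_stable_model`, the `penalty` parameter, and the fold scores below are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def pick_stable_model(fold_scores: Dict[str, List[float]], penalty: float = 1.0) -> str:
    """Return the model name with the best mean score minus a spread penalty.

    fold_scores maps a model name to its per-fold validation scores.
    A larger penalty favours models whose scores vary less across folds.
    """
    if not fold_scores:
        raise ValueError("fold_scores must contain at least one model")

    def adjusted(name: str) -> float:
        scores = fold_scores[name]
        return mean(scores) - penalty * pstdev(scores)

    return max(fold_scores, key=adjusted)


if __name__ == "__main__":
    # Hypothetical per-fold accuracies, for illustration only.
    candidates = {
        "model_a": [0.90, 0.60, 0.84],  # higher mean, large spread across folds
        "model_b": [0.75, 0.76, 0.74],  # slightly lower mean, very consistent
    }
    print("Selected:", pick_stable_model(candidates))
```

With `penalty=0` the choice reduces to the highest mean score; raising the penalty trades a little average accuracy for consistency across folds.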
The Lagos Marathon is a popular event that takes place every year in Lagos. This project explores the data from 2017 to 2019 and tells a story about the participating countries, genders, and overall stats.
Data Visualization, Data Cleaning, Data Storytelling.