An image classification application used to identify a dog breed.
Replit Access
Binder Access: the Binder instance is not available as of 02/16/2025.
All instructions for running the application can be found in the User Guide.
This application has some sample images available. Here's what you need to do if you want to use your own images:
- Take a picture of the dog.
- Crop the photo (optional, but may help with accuracy).
Take a picture of the dog you'd like to analyze. The clearer the photo, the more likely the algorithm is to identify the breed correctly.
Here's an example of a good photo:
[Image: A clear photo of the dog showing distinct features.]
Here's an example of a bad photo:
[Image: The dog is obstructed by objects in the photo.]
While cropping the photo is optional, it will help increase the accuracy of the photo analysis.
Here's an example of an uncropped photo:
[Image: An uncropped photo of a dog and excessive background space.]
Here's an example of a cropped photo:
[Image: A cropped photo focused on the dog, reduced background space.]
Currently, the application only accepts the following image formats:
- '.jpg'
- '.jpeg'
- '.png'
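As a rough illustration of the format restriction above, the sketch below checks a filename against the three listed extensions before upload. The helper name `is_accepted_image` and the constant `ACCEPTED_EXTENSIONS` are hypothetical, and the case-insensitive comparison is an assumption; the application itself may validate files differently.

```python
from pathlib import Path

# The three formats listed above. The constant and helper names are
# hypothetical and do not come from the application's code.
ACCEPTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_accepted_image(filename: str) -> bool:
    """Return True if the file extension is one of the accepted formats.

    Assumption: the comparison ignores case, so 'dog.JPG' is accepted.
    """
    return Path(filename).suffix.lower() in ACCEPTED_EXTENSIONS


print(is_accepted_image("rottweiler.png"))
print(is_accepted_image("rottweiler.gif"))
```

Checking the extension only filters on the filename; it does not confirm that the file contents are a valid image.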
Currently, the algorithm only supports three breeds:
- Rottweilers
- Golden Retrievers
- Chihuahuas
If you're having problems uploading an image using the remote Notebook application, refer to the User Guide.
Here are the machine learning model statistics. They are regenerated each time the model is trained, so the posted results always match the current model.
Confusion matrices tell us how the model performed during the testing phase.
- The x-axis identifies what dog breed the model predicted given an image.
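- The y-axis identifies the actual breed shown in the image. The number in each square is the count of test images with that combination of actual and predicted breed.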
Reading from the top-left square diagonally to the bottom-right square, the matrix identifies how many of the predictions the model made were correct. The higher the value, the better the model accuracy.
- Ex: The model correctly predicted a picture of a chihuahua as a chihuahua.
The squares outside of the diagonal identify the number of times the model incorrectly predicted a breed type. The lower the value, the higher the model accuracy.
- Ex: Out of n chihuahua images, the model incorrectly classified k of them as a rottweiler or a golden retriever.
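To make the reading of the diagonal concrete, here is a minimal sketch that builds a confusion matrix by hand, assuming rows hold the actual breed and columns hold the predicted breed. The labels are made up for illustration and are not results from this project's model; the function name `confusion_matrix` is local to this sketch.

```python
BREEDS = ["chihuahua", "golden_retriever", "rottweiler"]


def confusion_matrix(actual, predicted, labels):
    """Count predictions: rows are actual breeds, columns are predicted breeds."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Made-up labels for illustration only; not output from the real model.
actual = ["chihuahua", "chihuahua", "chihuahua",
          "rottweiler", "rottweiler", "golden_retriever"]
predicted = ["chihuahua", "rottweiler", "chihuahua",
             "rottweiler", "rottweiler", "chihuahua"]

matrix = confusion_matrix(actual, predicted, BREEDS)

# Diagonal entries are correct predictions; everything else is a mistake.
correct = sum(matrix[i][i] for i in range(len(BREEDS)))
accuracy = correct / len(actual)

for breed, row in zip(BREEDS, matrix):
    print(f"{breed:>16}: {row}")
print(f"correct: {correct} of {len(actual)}")
```

In this toy data, the first row shows that one of the three chihuahua images was misclassified as a rottweiler, which is the kind of off-diagonal count described above.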
This is the confusion matrix of the model when trained using a Random Forest Classifier:
[Image: A confusion matrix of the model trained using Random Forest.]
This is the confusion matrix of the model when trained using a Support Vector Classifier:
[Image: A confusion matrix of the model trained using Support Vector Classifier.]
The image data used to train the model is based on the Stanford Dogs Dataset. Model training relied on numerous pictures of dogs in the following categories:
- Chihuahua
- Rottweiler
- Golden Retriever
The number of images per breed is similar. However, the number of sample images available was very low, and the dogs within each breed varied greatly (e.g., short-haired and long-haired chihuahuas, puppies and adults, different fur colors).
[Image: A pie chart of the image distribution used by the model during training.]
The model was given no hand-crafted features during the training and testing phases of development. Instead, it classifies dog breeds based on the pixel density of the images, learning patterns in the pictures that lead it to a conclusion about a dog's breed. Decision trees are created during training and let the model branch to different conclusions when analyzing the pixel density of an image.
During testing, the random forest classifier (RFC) often outperformed the support vector classifier (SVC), which led to the decision to use RFC for the developed model.
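The source does not show the training code, so the following is only a toy sketch of the general idea: each tree branches on pixel values, and the forest takes a majority vote across trees. The trees here are single hand-written threshold tests with invented pixel indices, thresholds, and labels; the real model's trees are learned from data and are much deeper.

```python
from collections import Counter

# Each toy "tree" is one threshold test on one pixel of a flattened
# grayscale image. All values below are invented for illustration.
TOY_TREES = [
    {"pixel": 0, "threshold": 128, "below": "chihuahua", "above": "rottweiler"},
    {"pixel": 2, "threshold": 100, "below": "rottweiler", "above": "golden_retriever"},
    {"pixel": 3, "threshold": 20, "below": "rottweiler", "above": "chihuahua"},
]


def flatten(image):
    """Turn a 2D grid of pixel values into a flat feature list."""
    return [pixel for row in image for pixel in row]


def tree_predict(tree, pixels):
    """Follow the single branch of a toy tree and return its breed label."""
    if pixels[tree["pixel"]] <= tree["threshold"]:
        return tree["below"]
    return tree["above"]


def forest_predict(trees, image):
    """Return the breed that receives the most votes across all trees."""
    pixels = flatten(image)
    votes = Counter(tree_predict(tree, pixels) for tree in trees)
    return votes.most_common(1)[0][0]


image = [[200, 10], [40, 30]]  # a made-up 2x2 grayscale image
print(forest_predict(TOY_TREES, image))
```

Two of the three toy trees vote for the same breed, so that breed wins even though the third tree disagrees. This averaging over many trees is the usual reason a forest is more stable than a single decision tree.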
This is an example decision tree used in the final version of the machine learning model:
[Image: A decision tree chosen at random for viewing the model's decisions when assessing an image.]






