Study for automatic discarding of empty images. Training and testing workflow of PARDINUS architecture.

SIMIDAT/PARDINUS-EmptyImagesFiltering

Weakly supervised discarding of photo-trapping empty images based on autoencoders

This repository is the official implementation of the paper “Weakly supervised discarding of photo-trapping empty images based on autoencoders”. PARDINUS, built on the foundation of weakly-supervised learning, is the proposed tool to automatically detect empty or blank images.

Install

Install the dependencies listed in requirements.txt by running the following command in a clean Anaconda environment (Python 3.9 is recommended):

pip install -r requirements.txt

Code organization and usage

The code is organized into two main folders: Train and Test. The Test folder contains the scripts to test or evaluate pretrained models on your own images. The Train folder contains the scripts for training your own models.

Preprocessing

Images must be 256 pixels high and 384 pixels wide, in RGB format. If your images need to be resized, you can use the script resizeImages.py.
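As an illustration of this preprocessing step, here is a minimal sketch of a resize to 384x256 RGB. It is not the content of resizeImages.py, which is not shown here. It assumes the third-party Pillow library is installed, and the names resize_image, resize_folder and the "*.jpg" pattern are hypothetical.

```python
from pathlib import Path

from PIL import Image  # third-party: pip install Pillow

TARGET_SIZE = (384, 256)  # (width, height) required by the README


def resize_image(source, destination, size=TARGET_SIZE):
    """Load an image, force RGB mode, resize it, and save the result."""
    with Image.open(source) as image:
        resized = image.convert("RGB").resize(size)
    resized.save(destination)
    return destination


def resize_folder(input_dir, output_dir, pattern="*.jpg"):
    """Resize every file matching `pattern`; originals are left untouched."""
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    return [
        resize_image(path, output / path.name)
        for path in sorted(Path(input_dir).glob(pattern))
    ]
```

Note that a plain resize does not preserve the aspect ratio: images that are not already 3:2 will be stretched. Whether that is acceptable depends on your camera-trap data.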

Test or inference

There are three scripts that you should execute to test trained models on your own images: clustering.py, autoencoders.py and randomForest.py, in this order.

python clustering.py
python autoencoders.py
python randomForest.py
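Because the order matters, the three commands above can also be chained from a small driver script that stops as soon as one stage fails. The following is only a sketch: the helper names build_commands and run_in_order are hypothetical, and it assumes the scripts are run from inside the Test folder.

```python
import subprocess
import sys
from pathlib import Path

# Order taken from the README.
TEST_SCRIPTS = ["clustering.py", "autoencoders.py", "randomForest.py"]


def build_commands(scripts, python=sys.executable):
    """Return one command (argument list) per script, preserving order."""
    return [[python, script] for script in scripts]


def run_in_order(scripts, workdir="."):
    """Run each script in turn; stop at the first one that fails."""
    for command in build_commands(scripts):
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later stages never run on incomplete outputs.
        subprocess.run(command, cwd=Path(workdir), check=True)


# Example (assumes a folder named "Test" next to this driver script):
# run_in_order(TEST_SCRIPTS, workdir="Test")
```

Using sys.executable keeps every stage in the same Python environment as the driver, which matters when the dependencies were installed into a dedicated Anaconda environment.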

The file config.py contains variables that need to be set for the scripts to work properly. When a variable points to a folder, you must create that folder yourself, with the name you set. For example, if you set IMAGE_FOLDER = "./MY_IMAGES/", you have to create a folder named MY_IMAGES in the root directory, where the images should be stored.

  • IMAGE_FOLDER: set where the clustered and equalized images for the RAEs will be stored. Default: "./Data/"

  • TEST_IMAGES: set where the original images, your own images, are stored. Default: "./Data/BBDDTest"

  • TRAINED_MODELS_ROUTE: set where the trained models are stored. There should be one k-means clustering model, one random forest model and one RAE model for each cluster of images. Default: "./TrainedModels/"

  • ERROR_FILES_ROUTE: set where the error files are stored. Default: "./ErrorFiles/"
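Since the folders named in config.py have to exist before the scripts run, they can be created up front. The sketch below copies the default values listed above into a dictionary for illustration; in the repository they are plain variables in config.py, and the helper ensure_folders is hypothetical.

```python
import os

# Default values as listed in the README; in the repository they live in config.py.
DEFAULT_FOLDERS = {
    "IMAGE_FOLDER": "./Data/",
    "TEST_IMAGES": "./Data/BBDDTest",
    "TRAINED_MODELS_ROUTE": "./TrainedModels/",
    "ERROR_FILES_ROUTE": "./ErrorFiles/",
}


def ensure_folders(folders, root="."):
    """Create every configured folder under `root` if it is missing."""
    created = []
    for name, relative_path in folders.items():
        path = os.path.normpath(os.path.join(root, relative_path))
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created
```

Calling ensure_folders(DEFAULT_FOLDERS) from the repository root creates any missing folders and leaves existing ones, and their contents, untouched.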

Depending on the training settings or the features of your images, you may also want to change other parameters in config.py, such as the number of clusters to create.

The last script, randomForest.py, will create a new file, Results.csv, in the root directory.

RobustAutoencoder.py defines the architecture of the RAE models that are used to predict the label of the images.

Results output

Running the randomForest.py script produces the results of PARDINUS on the test images. The results are stored in a file called Results.csv. Each row of this file contains the name of an image and the label assigned to it. Label 0 means that PARDINUS has classified the image as empty, while label 1 means that there are animals in the image. The image name and the assigned label are separated by a semicolon (";"), in semicolon-delimited CSV format.
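The results file can be read with Python's standard csv module. This is a minimal sketch under two assumptions that the description above does not confirm: that labels are written exactly as "0" and "1", and that any other row (for example a header, if there is one) can be skipped. The function names read_results and summarize are hypothetical.

```python
import csv
from collections import Counter

LABEL_NAMES = {"0": "empty", "1": "animal"}  # meaning taken from the README


def read_results(path="Results.csv"):
    """Return a list of (image_name, label_name) tuples from a ';'-separated file."""
    rows = []
    with open(path, newline="") as handle:
        for record in csv.reader(handle, delimiter=";"):
            if len(record) < 2:
                continue  # skip blank or malformed lines
            name, label = record[0].strip(), record[1].strip()
            if label not in LABEL_NAMES:
                continue  # e.g. a header row, if the file has one
            rows.append((name, LABEL_NAMES[label]))
    return rows


def summarize(rows):
    """Count how many images received each label."""
    return Counter(label for _, label in rows)
```

For example, summarize(read_results("Results.csv")) gives a quick count of how many images were discarded as empty and how many were kept.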

Train new models using your own images

There are six scripts that you should execute to train new models on your own images: clustering.py, applyClustering.py, autoencoders.py, applyAutoencoders.py, balanceErrorFile.py and randomForest.py, in this order.

python clustering.py
python applyClustering.py
python autoencoders.py
python applyAutoencoders.py
python balanceErrorFile.py
python randomForest.py

The file config.py contains variables that need to be set for the scripts to work properly.

  • IMAGE_FOLDER: set where the clustered and equalized images for the training of the RAEs will be stored. Default: "./Data/"

  • EMPTY_DATA: set where the empty images are stored. Default: IMAGE_FOLDER + "BBDDTrain/Empty"

  • ANIMAL_DATA: set where the non-empty images are stored. Default: IMAGE_FOLDER + "BBDDTrain/Animal"

  • TRAINED_MODELS_ROUTE: set where the trained models are stored. There should be one k-means clustering model, one random forest model and one RAE model for each cluster of images. Default: "./TrainedModels/"

  • ERROR_FILES_ROUTE: set where the error files are stored. Default: "./ErrorFiles/"

  • ANIMAL_PROPORTION: set the proportion of animal images versus empty images (e.g., 24 means that 24% of all images are non-empty)
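To make the meaning of ANIMAL_PROPORTION concrete, the arithmetic behind the percentage can be written out as below. This only illustrates the definition given above; how balanceErrorFile.py actually uses the value is not shown here, and the function split_counts is hypothetical.

```python
def split_counts(total_images, animal_proportion):
    """Split a total into (animal, empty) counts for a percentage in [0, 100]."""
    if not 0 <= animal_proportion <= 100:
        raise ValueError("animal_proportion must be a percentage between 0 and 100")
    animal = round(total_images * animal_proportion / 100)
    return animal, total_images - animal


# With ANIMAL_PROPORTION = 24 and 1000 images in total:
print(split_counts(1000, 24))  # (240, 760)
```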

Depending on the training settings or the features of your images, you may also want to change other parameters in config.py, such as the image width or height, the number of clusters, or other training hyperparameters such as the number of epochs or the batch size.

After running all the scripts, the trained models should be stored at TRAINED_MODELS_ROUTE.

Data Availability

The dataset used in this project is provided by WWF and is subject to restricted access.

Researchers interested in using this dataset must request permission directly from WWF. The original photographs and data can be accessed via Wildlife Insights, project “Seguimiento Lince WWF España”. The details about the dataset and its metadata can be found at this link. Note that this repository provides scripts and instructions to train and test models using your own datasets.
