This repository is the official implementation of the paper “Weakly supervised discarding of photo-trapping empty images based on autoencoders”. PARDINUS, built on weakly supervised learning, is the proposed tool for automatically detecting empty (blank) images.
Install the requirements from requirements.txt with the following command in a clean Anaconda environment (Python 3.9 is recommended):
pip install -r requirements.txt
The code is organized into two main folders: Train and Test. The Test folder contains the scripts to test or evaluate pretrained models on your own images. The Train folder contains the scripts for training your own models.
The images must be 256 pixels high and 384 pixels wide, in RGB format. If your images need to be resized, you can use the script resizeImages.py
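If you want to see what such a resize involves, the sketch below does the equivalent with Pillow (the actual behaviour of resizeImages.py may differ; the function name and folder handling here are illustrative):

```python
from pathlib import Path
from PIL import Image

TARGET_SIZE = (384, 256)  # (width, height) expected by PARDINUS

def resize_folder(src_dir: str, dst_dir: str) -> None:
    """Resize every image in src_dir to 384x256 RGB and save it in dst_dir."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).iterdir():
        if path.suffix.lower() in {".jpg", ".jpeg", ".png"}:
            img = Image.open(path).convert("RGB")
            img.resize(TARGET_SIZE).save(dst / path.name)
```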
There are three scripts that you should execute to test trained models on your own images: clustering.py, autoencoders.py and randomForest.py, in this order.
python clustering.py
python autoencoders.py
python randomForest.py
The file config.py contains variables that need to be set for the scripts to work properly. When indicating the path to a specific folder, you have to create the folder yourself, with the name you set. For example, if you set IMAGE_FOLDER = "./MY_IMAGES/", you would have to create a folder named "MY_IMAGES" in the root directory, where the images should be stored.
- IMAGE_FOLDER: set where the clustered and equalized images for the RAEs will be stored. Default: "./Data/"
- TEST_IMAGES: set where the original images, your own images, are stored. Default: "./Data/BBDDTest"
- TRAINED_MODELS_ROUTE: set where the trained models are stored. There should be one k-means clustering model, one random forest model and one RAE model for each cluster of images. Default: "./TrainedModels/"
- ERROR_FILES_ROUTE: set where the error files are stored. Default: "./ErrorFiles/"
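Putting these four variables together, the test-side part of config.py might look like the following sketch (the values are the defaults listed above; adjust them to your own folders):

```python
# config.py -- path variables used by the test scripts (defaults from this README)
IMAGE_FOLDER = "./Data/"                   # clustered and equalized images for the RAEs
TEST_IMAGES = "./Data/BBDDTest"            # your own original images
TRAINED_MODELS_ROUTE = "./TrainedModels/"  # k-means, RAE and random forest models
ERROR_FILES_ROUTE = "./ErrorFiles/"        # error files produced by the pipeline
```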
Depending on the training settings or your images' features, you may also want to change other parameters in config.py, such as the number of clusters to create.
The last script, randomForest.py, will create a new file, Results.csv, in the root directory.
RobustAutoencoder.py defines the architecture of the RAE models that are used to predict the label of the images.
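As a rough, library-free illustration of the idea behind using autoencoders for this task (not the actual architecture in RobustAutoencoder.py; all names below are hypothetical): an autoencoder trained mostly on empty scenes reconstructs empty backgrounds well, so a high reconstruction error suggests an animal is present.

```python
import numpy as np

def reconstruction_error(image: np.ndarray, reconstruction: np.ndarray) -> float:
    """Mean squared error between an image and its autoencoder reconstruction."""
    diff = image.astype(np.float64) - reconstruction.astype(np.float64)
    return float(np.mean(diff ** 2))

def predict_label(error: float, threshold: float) -> int:
    """0 = empty (low error), 1 = animal (high error)."""
    return int(error > threshold)
```

In PARDINUS the final decision is made by the random forest on top of the per-cluster errors rather than by a single fixed threshold as sketched here.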
The execution of the randomForest.py script will provide the results of PARDINUS on the test images. The file that stores the results is called Results.csv. For each image, this file stores one row with the image name and the assigned label. Label 0 means that PARDINUS has classified the image as empty, while label 1 means that there are animals within the image. The image name and the assigned label are separated by the symbol ";", following the CSV standard.
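The Results.csv file can be read back with Python's csv module using ";" as the delimiter. The helper below is illustrative, not part of the repository, and assumes the file has no header row:

```python
import csv
import io

def read_results(csv_text: str) -> dict:
    """Parse Results.csv content: one 'name;label' row per image."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=";")
    return {name: int(label) for name, label in reader}

# Example with the format described above:
sample = "IMG_0001.jpg;0\nIMG_0002.jpg;1\n"
labels = read_results(sample)
print(labels)  # {'IMG_0001.jpg': 0, 'IMG_0002.jpg': 1}
```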
There are six scripts that you should execute to train your own models: clustering.py, applyClustering.py, autoencoders.py, applyAutoencoders.py, balanceErrorFile.py and randomForest.py, in this order.
python clustering.py
python applyClustering.py
python autoencoders.py
python applyAutoencoders.py
python balanceErrorFile.py
python randomForest.py
The file config.py contains variables that need to be set for the scripts to work properly.
- IMAGE_FOLDER: set where the clustered and equalized images for the training of the RAEs will be stored. Default: "./Data/"
- EMPTY_DATA: set where the empty images are stored. Default: IMAGE_FOLDER + "BBDDTrain/Empty"
- ANIMAL_DATA: set where the non-empty images are stored. Default: IMAGE_FOLDER + "BBDDTrain/Animal"
- TRAINED_MODELS_ROUTE: set where the trained models are stored. There should be one k-means clustering model, one random forest model and one RAE model for each cluster of images. Default: "./TrainedModels/"
- ERROR_FILES_ROUTE: set where the error files are stored. Default: "./ErrorFiles/"
- ANIMAL_PROPORTION: set the proportion of animal (non-empty) images versus empty images (e.g., 24 means that 24% of all images are non-empty)
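Putting the variables together, the training-side part of config.py might look like the following sketch (the values are the defaults listed above; adjust them to your own folders and dataset):

```python
# config.py -- training-side variables (defaults from this README)
IMAGE_FOLDER = "./Data/"                          # clustered/equalized training images
EMPTY_DATA = IMAGE_FOLDER + "BBDDTrain/Empty"     # empty training images
ANIMAL_DATA = IMAGE_FOLDER + "BBDDTrain/Animal"   # non-empty training images
TRAINED_MODELS_ROUTE = "./TrainedModels/"         # output folder for all trained models
ERROR_FILES_ROUTE = "./ErrorFiles/"               # output folder for error files
ANIMAL_PROPORTION = 24                            # 24% of all images are non-empty
```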
Depending on the training settings or your images' features, you may also want to change other parameters in config.py, such as the image width or height, the number of clusters, or other training hyperparameters such as the number of epochs or the batch size.
After running all the scripts, the trained models should be stored at TRAINED_MODELS_ROUTE.
The dataset used in this project is provided by WWF and is subject to restricted access.
Researchers interested in using this dataset must request permission directly from WWF. The original photographs and data can be accessed via Wildlife Insights, project “Seguimiento Lince WWF España”. The details about the dataset and its metadata can be found at this link. Note that this repository provides scripts and instructions to train and test models using your own datasets.