Image-Segmentation-PyTorch

Supported networks

UNet: backbones MobileNetV2 (all alphas and expansions), ResNetV1 (all num_layers)
DeepLab3+: backbones ResNetV1 (num_layers=18,34,50,101), VGG16_bn
BiSeNet: backbones ResNetV1 (num_layers=18)
PSPNet: backbones ResNetV1 (num_layers=18,34,50,101)
ICNet: backbones ResNetV1 (num_layers=18,34,50,101)

To assess the architecture, memory usage, forward time (on either CPU or GPU), number of parameters, and number of FLOPs of a network, use this command:

python measure_model.py
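As a rough illustration of what parameter and FLOP counts mean for one layer, here is a minimal sketch that computes them by hand for a single 2D convolution. It is not taken from measure_model.py; the helper name conv2d_cost is hypothetical, and it counts multiply-accumulates (MACs), which some tools report directly as FLOPs and others double.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out, bias=True, groups=1):
    """Return (parameters, multiply-accumulates) for one 2D convolution.

    Hypothetical helper for illustration only; not part of this repository.
    """
    # Each output channel has one kernel of shape (c_in / groups, k, k).
    weights = (c_in // groups) * c_out * kernel * kernel
    params = weights + (c_out if bias else 0)
    # Every output pixel of every output channel applies one full kernel.
    macs = weights * h_out * w_out
    return params, macs


# Example: 3x3 convolution, 3 -> 32 channels, stride 2 on a 320x320 input,
# which gives a 160x160 output when padding is 1.
params, macs = conv2d_cost(3, 32, 3, 160, 160, bias=False)
print(params, macs)  # 864 22118400
```

Summing these per-layer numbers over a whole network gives totals of the kind a measurement script reports.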

Setup

This repository uses Python 3.6.x.
Clone the repository:

git clone --recursive https://github.com/YuantingMaSC/Image-Segmentation.git
cd Image-Segmentation
git submodule sync
git submodule update --init --recursive

To install the required packages, activate your virtual environment (here a virtualenvwrapper environment named humanseg) and use pip:

workon humanseg
pip install -r requirements.txt
pip install -e models/pytorch-image-models

Data preparation

Annotate the images with the Labelme tool, then convert the annotations with the following script:

./original_data> python labelme2voc.py ./imgs_cut ./labelme_dataset --labels ./labels.txt
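The --labels file maps class names to integer ids. The sketch below shows how such a file is typically parsed, assuming labelme2voc.py follows the convention of the example script shipped with Labelme (first line __ignore__ with id -1, second line _background_ with id 0). The function name and the "person" class are hypothetical and only serve the illustration.

```python
def read_class_names(lines):
    """Map each class name to its id, following the Labelme example convention.

    Assumption: line 0 is "__ignore__" (id -1) and line 1 is "_background_" (id 0).
    """
    names = [line.strip() for line in lines if line.strip()]
    class_name_to_id = {}
    for index, name in enumerate(names):
        class_id = index - 1  # ids start at -1 for the ignore label
        if class_id == -1 and name != "__ignore__":
            raise ValueError("first label must be __ignore__")
        if class_id == 0 and name != "_background_":
            raise ValueError("second label must be _background_")
        class_name_to_id[name] = class_id
    return class_name_to_id


# Hypothetical labels.txt content for a single foreground class.
example = ["__ignore__\n", "_background_\n", "person\n"]
print(read_class_names(example))
```

With a real file, pass the open file handle instead of the example list, for instance read_class_names(open("labels.txt")).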

Training

To train a network from scratch, for example DeepLab3+, use this command:

python train.py --config config/config_DeepLab.json --device 0

Here config/config_DeepLab.json is the configuration file, which contains the network, dataloader, optimizer, loss, metric, and visualization settings.
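Before launching a long training run, it can help to check that a configuration file has every expected section. The following is a minimal sketch only: the section names in REQUIRED_SECTIONS and the sample values are hypothetical placeholders, not the actual keys used by config/config_DeepLab.json, so adapt them to the real file.

```python
import json

# Hypothetical section names; the real configuration file may use other keys.
REQUIRED_SECTIONS = (
    "network", "dataloader", "optimizer", "losses", "metrics", "visualization",
)


def load_config(text):
    """Parse a JSON configuration and fail early if a section is missing."""
    config = json.loads(text)
    missing = [name for name in REQUIRED_SECTIONS if name not in config]
    if missing:
        raise ValueError("missing config sections: " + ", ".join(missing))
    return config


# Hypothetical configuration content used only for this illustration.
sample = json.dumps({
    "network": {"type": "DeepLab"},
    "dataloader": {"batch_size": 8},
    "optimizer": {"type": "SGD", "lr": 0.01},
    "losses": ["ce_loss"],
    "metrics": ["miou"],
    "visualization": {"tensorboard": True},
})
config = load_config(sample)
print(sorted(config))
```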

To resume training from a checkpoint, use this command:

python train.py --config config/config_DeepLab.json --device 0 --resume path_to_checkpoint/model_best.pth

To monitor training progress in TensorBoard, enable the visualization mode in the configuration file.

Images to video

Use imgs2video.py to convert a set of images into a video:

python imgs2video.py

Inference

There are two inference modes, one for video and one for a webcam:

python inference_video.py --watch --use_cuda --checkpoint path_to_checkpoint/model_best.pth
python inference_webcam.py --use_cuda --checkpoint path_to_checkpoint/model_best.pth

Benchmark

Networks are trained on a dataset that combines the two datasets mentioned above.
The model input size is set to 320.
The CPU and GPU times are the average inference time over 10 runs with batch size 1, measured after 10 warm-up runs.
The mIoU is measured on the testing subset (737 images) from the combined dataset.
Hardware configuration for benchmarking:

CPU: 12th Gen Intel(R) Core(TM) i9-12900K 3.20 GHz

GPU: NVIDIA GeForce RTX 3090
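The timing protocol used for this benchmark (warm-up runs followed by averaged measured runs) can be sketched in plain Python as follows. This is an illustrative helper, not the code in measure_model.py; the name average_time and the dummy workload are assumptions. For GPU timing with PyTorch, the callable would also need to wait for queued kernels (for example with torch.cuda.synchronize()) so that the clock covers the actual computation.

```python
import time


def average_time(fn, warmup=10, runs=10):
    """Return the mean wall-clock time of fn() in seconds.

    Warm-up calls are executed first and excluded from the measurement.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def dummy_forward():
    # Stand-in workload; replace with a model forward pass on one input.
    return sum(i * i for i in range(10000))


seconds = average_time(dummy_forward)
print("average time: %.3f ms" % (seconds * 1000.0))
```

Warm-up runs matter because the first calls often include one-time costs such as memory allocation and library initialization, which would otherwise inflate the average.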
