PySemSeg is a library for training deep learning models for semantic segmentation in PyTorch. The goal of the library is to provide implementations of state-of-the-art (SOTA) segmentation models, with versions pretrained on popular datasets, as well as an easy-to-use training loop for new models and datasets. Most semantic segmentation datasets with fine-grained annotations are small, so transfer learning is crucial for success and is a core capability of the library. PySemSeg can use visdom or tensorboardX to visualize training summaries.
Using pip:
pip install git+https://github.com/petko-nikolov/pysemseg
Models:
- FCN [paper] - FCN32, FCN16, FCN8 with pre-trained VGG16
- UNet [paper]
- Tiramisu (FC DenseNets) [paper] - FC DenseNet 56, FC DenseNet 67, FC DenseNet 103 with efficient checkpointing
- DeepLab V3 [paper] - Multi-grid, ASPP and BatchNorm fine-tuning with pre-trained ResNet backbones
- DeepLab V3+ [paper]
- RefineNet [paper] - [Upcoming ...]
- PSPNet [paper] - [Upcoming ...]
Datasets:
- Pascal VOC
- CamVid
- Cityscapes [Upcoming ...]
- ADE20K [Upcoming ...]
The following example command trains a VGGFCN8 model on the Pascal VOC 2012 dataset. In addition to the dataset and the model, a transformer class must be passed (PascalVOCTransform in this case): a callable that implements all input image and mask augmentations and tensor transforms. Run pysemseg-train -h for a full list of options.
pysemseg-train \
--model VGGFCN8 \
--model-dir ~/models/vgg8_pascal_model/ \
--dataset PascalVOCSegmentation \
--data-dir ~/datasets/PascalVOC/ \
--batch-size 4 \
--test-batch-size 1 \
--epochs 40 \
--lr 0.001 \
--optimizer SGD \
--optimizer-args '{"weight_decay": 0.0005, "momentum": 0.9}' \
--transformer PascalVOCTransform \
--lr-scheduler PolyLR \
--lr-scheduler-args '{"max_epochs": 40, "gamma": 0.8}'
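The command above selects a PolyLR scheduler with max_epochs and gamma arguments. As a rough illustration of what a polynomial schedule does, the sketch below implements the commonly used rule lr = base_lr * (1 - epoch / max_epochs) ** gamma in plain Python. That PySemSeg's PolyLR follows exactly this formula is an assumption, and the function name poly_lr is hypothetical.

```python
def poly_lr(base_lr: float, epoch: int, max_epochs: int, gamma: float) -> float:
    """Polynomial decay: base_lr * (1 - epoch / max_epochs) ** gamma.

    Assumed formula for illustration; not taken from the PySemSeg source.
    """
    if not 0 <= epoch <= max_epochs:
        raise ValueError("epoch must lie in [0, max_epochs]")
    return base_lr * (1.0 - epoch / max_epochs) ** gamma


# Values from the example command: --lr 0.001, max_epochs 40, gamma 0.8.
schedule = [poly_lr(0.001, epoch, 40, 0.8) for epoch in range(41)]
```

Under this rule the learning rate starts at the value given by --lr and decays smoothly to zero at max_epochs, which is why max_epochs is set to match --epochs in the example.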
Alternatively, pass the options in a YAML config file:
pysemseg-train --config config.yaml
model: VGGFCN32
model-dir: models/vgg8_pascal_model/
dataset: PascalVOCSegmentation
data-dir: datasets/PascalVOC/
batch-size: 4
test-batch-size: 1
epochs: 40
lr: 0.001
optimizer: SGD
optimizer-args:
  weight_decay: 0.0005
  momentum: 0.9
transformer: PascalVOCTransform
no-cuda: true
lr-scheduler: PolyLR
lr-scheduler-args:
  max_epochs: 40
  gamma: 0.8
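The transformer passed to pysemseg-train is described above as a callable that implements image and mask augmentations. The sketch below shows the general idea of such a paired transform, using plain nested lists for a grayscale image so that it runs without dependencies. The class name PairedFlipTransform, its constructor arguments, and the (image, mask) call signature are hypothetical; the interface PySemSeg actually expects may differ.

```python
import random


class PairedFlipTransform:
    """Hypothetical transformer: flips image and mask together, then scales the image."""

    def __init__(self, flip_prob=0.5, rng=None):
        self.flip_prob = flip_prob
        self.rng = rng or random.Random()

    def __call__(self, image, mask):
        # Use a single random draw for both inputs so pixels and labels stay aligned.
        if self.rng.random() < self.flip_prob:
            image = [row[::-1] for row in image]
            mask = [row[::-1] for row in mask]
        # Scale 8-bit intensities to [0, 1]; class ids in the mask are left untouched.
        image = [[pixel / 255.0 for pixel in row] for row in image]
        return image, mask


transform = PairedFlipTransform(flip_prob=1.0)  # always flip, for a deterministic demo
image, mask = transform([[0, 255], [51, 102]], [[0, 1], [2, 3]])
```

The important design point is that geometric augmentations are applied identically to the image and the mask, while intensity changes such as normalization touch only the image.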
To use a checkpoint for inference, call load_model with the checkpoint path, the model class, and the transformer class used during training.
import torch  # needed for torch.argmax below
import torch.nn.functional as F
from pysemseg.transforms import CV2ImageLoader
from pysemseg.utils import load_model
from pysemseg.models import VGGFCN32
from pysemseg.datasets import PascalVOCTransform
model = load_model(
    './checkpoint_path',
    VGGFCN32,
    PascalVOCTransform
)
image = CV2ImageLoader()('./image_path')
logits = model(image)
probabilities = F.softmax(logits, dim=1)
predictions = torch.argmax(logits, dim=1)
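The last two lines of the inference example derive both class probabilities and hard predictions from the same logits. The sketch below reproduces that step for a single pixel in plain Python, to show why taking the argmax of the logits gives the same class as taking the argmax of the softmax probabilities. The helper functions and the logit values are made up for illustration and are not part of PySemSeg.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


pixel_logits = [1.2, -0.3, 3.1]  # made-up scores for three classes at one pixel
pixel_probs = softmax(pixel_logits)
predicted_class = argmax(pixel_logits)
```

Because softmax preserves the ordering of the scores, the probabilities are only needed when confidence values matter; the predicted class can be read directly from the logits.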