A convolutional encoder-decoder network with skip connections, inspired by the U-Net, SegNet and Fully Convolutional Network (FCN) models. The encoder is a pre-trained VGG16 network with Batch Normalization; the decoder network was designed by me. The model was trained from scratch with PyTorch.

TomekGniazdowski/VOC-Semantic-Segmentation-With-Custom-Model

VOC Semantic Segmentation With Custom Model

Model

The model was inspired by the U-Net, SegNet and Fully Convolutional Network models and is a convolutional encoder-decoder network with skip connections. I used a pre-trained VGG16 network with Batch Normalization as the encoder. The decoder network was designed by me. The model was trained from scratch with PyTorch.

Data

I trained the model on the train split of the 2012 VOCSegmentation dataset (1464 samples). For validation (725 samples) and testing (725 samples), I randomly split the 2012 VOCSegmentation validation set into two halves.
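A random split like the one described above could be done as follows. This is a sketch only: the function name `split_in_half` and the fixed seed are assumptions, and the actual splitting code in this repository may differ.

```python
import random


def split_in_half(sample_ids, seed=0):
    """Shuffle the ids reproducibly and cut them into two disjoint halves."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # local RNG, so global random state is untouched
    mid = len(ids) // 2
    return ids[:mid], ids[mid:]  # (validation ids, test ids)
```

Fixing the seed keeps the validation and test subsets identical between runs, which matters when results from different training runs are compared.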

Augmentation

For spatial augmentations I used horizontal flip, rotation, translation and scaling. For color augmentations I used Gaussian blur, ColorJitter and brightness and contrast changes. Below are examples from the training and validation sets.

Example images from the training dataset

train ds example

Example images from the validation dataset

val ds example
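In segmentation, spatial augmentations have to be applied identically to the image and its mask, while color augmentations should touch only the image. The sketch below shows this for a horizontal flip and a simple brightness change. The function name `augment_pair` and the default parameter values are assumptions, not the settings used in this project.

```python
import random

import torch


def augment_pair(image, mask, flip_p=0.5, brightness=0.2):
    """image: float tensor (C, H, W) in [0, 1]; mask: integer tensor (H, W)."""
    # Spatial augmentation: flip image and mask together along the width axis.
    if random.random() < flip_p:
        image = torch.flip(image, dims=[-1])
        mask = torch.flip(mask, dims=[-1])
    # Color augmentation: change the image only, class ids in the mask stay as they are.
    factor = 1.0 + random.uniform(-brightness, brightness)
    image = (image * factor).clamp(0.0, 1.0)
    return image, mask
```

For rotation and scaling, the mask would usually be resampled with nearest-neighbour interpolation so that class ids are not blended into invalid values.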

Training

The training hyperparameters are listed below. Unfortunately, I did not have enough computational power to search for the best set of hyperparameters or to train the model for many hours.

| Hyperparameter | Value |
| --- | --- |
| Optimizer | Adam (lr=1e-4) |
| Scheduler | One Cycle LR |
| Epochs | 300 |
| Patience | 30 |
| L1 regularization coefficient | 1e-6 |
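The "Patience" entry is read here as early-stopping patience, which is an assumption, as is the choice of validation loss as the monitored quantity. Under those assumptions, a minimal early-stopping helper (the class name `EarlyStopping` is hypothetical) could look like this:

```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int = 30):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Calling `step` once per epoch and breaking out of the training loop when it returns `True` would explain how a run configured for 300 epochs can finish earlier.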

Moreover, because of class imbalance in the data, I used a weighted cross-entropy loss. The model was trained for 1 hour 49 minutes on an Nvidia GeForce RTX 3090 Ti.
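The pieces named above (Adam, One Cycle LR, L1 regularization, weighted cross-entropy) could be combined roughly as in the sketch below. Several details are assumptions and not taken from this repository: the inverse-frequency weighting scheme, using `lr` as the scheduler's `max_lr`, applying the L1 penalty to all trainable parameters, and ignoring the VOC void label 255 in the loss. All function names are hypothetical.

```python
import torch
import torch.nn as nn


def inverse_frequency_weights(pixel_counts: torch.Tensor) -> torch.Tensor:
    """Assumed weighting scheme: rare classes get proportionally larger weights."""
    freq = pixel_counts.float() / pixel_counts.sum()
    return 1.0 / (freq * len(pixel_counts))


def l1_penalty(model: nn.Module, coeff: float = 1e-6) -> torch.Tensor:
    """L1 regularization term: coeff times the sum of absolute parameter values."""
    return coeff * sum(p.abs().sum() for p in model.parameters() if p.requires_grad)


def build_training(model, class_weights, steps_per_epoch, epochs=300, lr=1e-4):
    # ignore_index=255 skips the void/boundary pixels of the VOC masks (assumption).
    criterion = nn.CrossEntropyLoss(weight=class_weights, ignore_index=255)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr=lr, epochs=epochs, steps_per_epoch=steps_per_epoch
    )
    return criterion, optimizer, scheduler


def train_step(model, images, masks, criterion, optimizer, scheduler) -> float:
    """One optimization step; OneCycleLR is stepped per batch, not per epoch."""
    optimizer.zero_grad()
    loss = criterion(model(images), masks) + l1_penalty(model)
    loss.backward()
    optimizer.step()
    scheduler.step()
    return loss.item()
```

Note that `OneCycleLR` expects exactly `epochs * steps_per_epoch` calls to `scheduler.step()` in total, so the scheduler has to be built with the real number of batches per epoch.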

Training curves

training curve

Test results

The model achieved a pixel-level accuracy of 87.95% on half of the VOC Segmentation validation set (which served as my test dataset), which is comparable to results reported in the literature.
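Pixel-level accuracy is the fraction of labelled pixels whose predicted class matches the ground truth. A minimal sketch is given below; the function name `pixel_accuracy` is hypothetical, and excluding the VOC void label 255 from the count is an assumption about how the reported number was computed.

```python
import torch


def pixel_accuracy(logits: torch.Tensor, target: torch.Tensor, ignore_index: int = 255) -> float:
    """logits: (N, C, H, W) class scores; target: (N, H, W) integer class ids."""
    pred = logits.argmax(dim=1)      # most likely class for every pixel
    valid = target != ignore_index   # leave out void/boundary pixels
    correct = (pred == target) & valid
    return correct.sum().item() / max(valid.sum().item(), 1)
```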

Five random images from the validation dataset, with predictions and labels.

test random

Ten best predicted images from the test dataset

test best

Five worst predicted images from the test dataset

test worst

Confusion matrix (test dataset)

test confusion matrix
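A pixel-level confusion matrix such as the one shown above can be accumulated with `torch.bincount`. The sketch below is one common way to do it and is not necessarily how the figure was produced; the function name, the 21-class default and the handling of the void label 255 are assumptions.

```python
import torch


def confusion_matrix(pred, target, num_classes=21, ignore_index=255):
    """Rows are true classes, columns are predicted classes; entries are pixel counts."""
    valid = target != ignore_index
    # Encode every (true, predicted) pair as a single index and count occurrences.
    idx = target[valid] * num_classes + pred[valid]
    counts = torch.bincount(idx, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)
```

The diagonal holds the correctly classified pixels, so the sum of the diagonal divided by the sum of all entries gives the pixel-level accuracy again.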
