Implementation of a Graph Convolutional Network to annotate Corel-5k images with the PyTorch library
(for more information see CNN_Image_Annotation_dataset)
I used a multi-label data augmentation method based on Wasserstein GAN, which is fully described in CNN_Image_Annotation_data_augmentation.
(more information can be found at CNN_Image_Annotation_convolutional_models)
The structures of CNN-GCN & GCN are shown in the images below:
Static Correlation (Adjacency) Matrix:
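As a rough illustration of how a static correlation matrix can be derived from training labels, the sketch below follows the general recipe of the ML-GCN paper cited at the end of this README: count label co-occurrences, turn them into conditional probabilities, binarize with a threshold, and re-weight so each label keeps part of its own signal. The function name `build_adjacency` and the values `tau=0.4` and `p=0.2` are assumptions for illustration, not necessarily what this repository uses.

```python
import numpy as np


def build_adjacency(labels, tau=0.4, p=0.2):
    """Build a re-weighted label correlation matrix.

    labels: (num_images, num_classes) binary matrix of training annotations.
    tau:    binarization threshold on P(label j | label i) (assumed value).
    p:      total weight given to a label's neighbours (assumed value).
    """
    labels = np.asarray(labels, dtype=np.float64)
    cooc = labels.T @ labels            # cooc[i, j] = images containing both i and j
    counts = np.diag(cooc).copy()       # counts[i] = images containing label i
    np.fill_diagonal(cooc, 0.0)

    # Conditional probability P(j | i); guard against labels that never occur.
    prob = cooc / np.maximum(counts, 1.0)[:, None]

    # Keep only sufficiently strong correlations.
    binary = (prob >= tau).astype(np.float64)

    # Neighbours share a total weight of p; the label itself keeps 1 - p.
    row_sum = binary.sum(axis=1, keepdims=True)
    adj = p * binary / np.maximum(row_sum, 1.0)
    adj += (1.0 - p) * np.eye(labels.shape[1])
    return adj


# Toy example: 4 images, 3 labels.
toy_labels = [[1, 1, 0], [1, 1, 0], [1, 0, 1], [0, 0, 1]]
toy_adj = build_adjacency(toy_labels)
```

Note that the resulting matrix is not symmetric, because P(j | i) and P(i | j) generally differ.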
Word Embedding:
Different word embeddings hardly affect the accuracy, which suggests that the improvements come mainly from the GCN rather than from the semantic meaning carried by the word embeddings.
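To make the role of the word embeddings concrete, here is a minimal sketch of a two-layer GCN that maps 300-d label embeddings to 2048-d per-label classifiers and applies them to image features. The class names, the hidden size of 1024, and the LeakyReLU slope are assumptions borrowed from common ML-GCN implementations; the actual model in this repository may differ.

```python
import torch
import torch.nn as nn


class GraphConvolution(nn.Module):
    """One graph convolution: adj @ (x @ W)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(in_features, out_features))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, x, adj):
        return adj @ (x @ self.weight)


class LabelGCN(nn.Module):
    """Turns label embeddings into classifiers and scores image features."""

    def __init__(self, embed_dim=300, hidden_dim=1024, feat_dim=2048):
        super().__init__()
        self.gc1 = GraphConvolution(embed_dim, hidden_dim)
        self.gc2 = GraphConvolution(hidden_dim, feat_dim)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, image_feat, word_emb, adj):
        # word_emb: (num_labels, embed_dim), adj: (num_labels, num_labels)
        x = self.act(self.gc1(word_emb, adj))
        classifiers = self.gc2(x, adj)          # (num_labels, feat_dim)
        # image_feat: (batch, feat_dim), e.g. globally pooled CNN features
        return image_feat @ classifiers.t()     # (batch, num_labels) logits


# Shape check with random stand-in tensors (10 hypothetical labels).
model = LabelGCN()
logits = model(torch.randn(4, 2048), torch.randn(10, 300), torch.eye(10))
```

Because the embeddings only serve as the GCN's input, swapping one embedding type for another changes the starting point but not the mechanism that propagates information between correlated labels.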
The image below shows the relations between labels after using the word embedding technique (t-sne: 300d -> 2d):
The image below shows the relations between labels after training by GCN (t-sne: 2048d -> 2d):
(check out CNN_Image_Annotation_evaluation_metrics for more information)
To train the model in Spyder IDE use the code below:
run main.py --loss-function {select loss function}
Please note that:
- Replace {select loss function} with BCELoss, FocalLoss, or AsymmetricLoss.
Using augmented data, you can train the model as follows:
run main.py --loss-function {select loss function} --augmentation
To evaluate the model in Spyder IDE use the code below:
run main.py --loss-function {select loss function} --evaluate
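For readers curious how the three flags above fit together, the following is a hypothetical sketch of a command-line parser that accepts them. It is not the parser from `main.py`; `build_parser` and the help texts are assumptions, and only the flag names and the three loss-function values come from this README.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(
        description="Train or evaluate the annotation model")
    parser.add_argument(
        "--loss-function", dest="loss_function", required=True,
        choices=["BCELoss", "FocalLoss", "AsymmetricLoss"],
        help="loss used for training or evaluation")
    parser.add_argument(
        "--augmentation", action="store_true",
        help="train on the augmented dataset")
    parser.add_argument(
        "--evaluate", action="store_true",
        help="evaluate a trained model instead of training")
    return parser


# Equivalent to: run main.py --loss-function AsymmetricLoss --evaluate
args = build_parser().parse_args(["--loss-function", "AsymmetricLoss", "--evaluate"])
print(args.loss_function, args.augmentation, args.evaluate)
```

With `choices` set, a misspelled loss name is rejected by the parser before any training starts.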
asymmetric loss (more information at asymmetric loss)
global-pooling | batch-size | number of training images | image-size | epoch time | 𝛾+ | 𝛾- | m (margin) |
---|---|---|---|---|---|---|---|
avg | 32 | 4500 | 448 × 448 | 135 s | 0 | 4 | 0.05 |
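The hyperparameters in the table (𝛾+ = 0, 𝛾- = 4, m = 0.05) plug into the asymmetric loss roughly as in the sketch below. This is a minimal version written for illustration, not the loss class used in this repository: the name `AsymmetricLossSketch`, the `eps` clamp, and the reduction (sum over labels, mean over the batch) are assumptions.

```python
import torch
import torch.nn as nn


class AsymmetricLossSketch(nn.Module):
    """Minimal asymmetric loss for multi-label classification (illustrative)."""

    def __init__(self, gamma_pos=0.0, gamma_neg=4.0, margin=0.05, eps=1e-8):
        super().__init__()
        self.gamma_pos = gamma_pos
        self.gamma_neg = gamma_neg
        self.margin = margin
        self.eps = eps

    def forward(self, logits, targets):
        p = torch.sigmoid(logits)
        # Probability shifting: negatives with p <= margin contribute nothing.
        p_neg = (p - self.margin).clamp(min=0)

        # Positive term: focal-style weight (1 - p)^gamma_pos.
        loss_pos = targets * (1 - p).pow(self.gamma_pos) \
            * torch.log(p.clamp(min=self.eps))
        # Negative term: weight p_neg^gamma_neg down-weights easy negatives.
        loss_neg = (1 - targets) * p_neg.pow(self.gamma_neg) \
            * torch.log((1 - p_neg).clamp(min=self.eps))

        # Assumed reduction: sum over labels, mean over the batch.
        return -(loss_pos + loss_neg).sum(dim=1).mean()


# Example call with random stand-in data (batch of 4 images, 6 labels).
criterion = AsymmetricLossSketch()
example_logits = torch.randn(4, 6)
example_targets = (torch.rand(4, 6) > 0.5).float()
example_loss = criterion(example_logits, example_targets)
```

Setting both 𝛾 values and the margin to 0 reduces this sketch to ordinary binary cross-entropy, which is a convenient sanity check.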
data | precision | recall | f1-score |
---|---|---|---|
testset per-image metrics | 0.594 | 0.670 | 0.630 |
testset per-class metrics | 0.453 | 0.495 | 0.473 |
data | N+ |
---|---|
testset | 175 |
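The per-class numbers and N+ can be computed along the lines of the sketch below. It assumes the definitions commonly used in the image annotation literature: precision and recall are averaged over labels, F1 is the harmonic mean of those two averages, and N+ is the number of labels with non-zero recall. The reported F1 values above are consistent with the harmonic mean of the reported precision and recall, but the exact evaluation code of this repository may differ (for example in which labels are included in the average). The function name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(y_true, y_pred):
    """Per-class precision, recall, F1 and N+.

    y_true, y_pred: lists of equal-length 0/1 rows, one row per image.
    """
    num_classes = len(y_true[0])
    precisions, recalls = [], []
    n_plus = 0
    for c in range(num_classes):
        tp = sum(t[c] and p[c] for t, p in zip(y_true, y_pred))
        predicted = sum(p[c] for p in y_pred)   # images where label c was predicted
        actual = sum(t[c] for t in y_true)      # images that really carry label c
        precisions.append(tp / predicted if predicted else 0.0)
        class_recall = tp / actual if actual else 0.0
        recalls.append(class_recall)
        if class_recall > 0:
            n_plus += 1                          # label recalled at least once

    precision = sum(precisions) / num_classes
    recall = sum(recalls) / num_classes
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1, n_plus


# Toy example: 3 images, 3 labels.
truth = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
preds = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
precision, recall, f1, n_plus = per_class_metrics(truth, preds)
```

In the toy example the third label is never predicted correctly, so it lowers both averages and is not counted in N+.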
Z.-M. Chen, X.-S. Wei, P. Wang, and Y. Guo.
"Multi-Label Image Recognition with Graph Convolutional Networks" (CVPR 2019)