
Will you release the pre-trained models on Caffe #10

Open
soeaver opened this issue Dec 12, 2016 · 12 comments

soeaver commented Dec 12, 2016

Will you release the pre-trained models on Caffe?

liuzhuang13 (Owner) commented Dec 12, 2016

Thanks for your interest! We'll work on it as soon as possible. Hopefully the model will come out this year.

@kaishijeng

Is the ImageNet Caffe model available yet?

Thanks,

@liuzhuang13 (Owner)

Sorry for the late response. Our pretrained DenseNet-121 model and prototxt in Caffe have just been released here: https://github.com/liuzhuang13/DenseNet#imagenet-and-pretrained-models

Thanks for your interest!
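
For anyone evaluating the released model, here is a minimal single-center-crop inference sketch in pycaffe. The file names, the BGR channel means, and the 256x256 resize are illustrative assumptions, not values confirmed in this thread; check the release page for the actual preprocessing.

```python
# Minimal center-crop inference sketch (pycaffe).
# NOTE: file names, channel means, and blob names below are assumptions.
import numpy as np
import caffe
import cv2

net = caffe.Net('densenet121_deploy.prototxt',   # hypothetical file name
                'densenet121.caffemodel',        # hypothetical file name
                caffe.TEST)

img = cv2.imread('example.jpg').astype(np.float32)   # OpenCV loads BGR, HxWxC
img = cv2.resize(img, (256, 256))
img -= np.array([104.0, 117.0, 123.0], dtype=np.float32)  # assumed BGR means

off = (256 - 224) // 2                               # center 224x224 crop
crop = img[off:off + 224, off:off + 224, :].transpose(2, 0, 1)  # -> CxHxW

net.blobs['data'].reshape(1, 3, 224, 224)
net.blobs['data'].data[0] = crop
out = net.forward()
print('top-1 class:', out['prob'][0].argmax())       # assumes output blob 'prob'
```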


shicai commented Feb 13, 2017

@liuzhuang13 I just tested the pretrained DenseNet-121 Caffe model from Zhiqiang Shen, and the top-1/5 accuracy using a single center crop is only 70.8%/90.3% (256xN) or 72.5%/91.3% (256x256). Is there something wrong? It should be 75.0%/92.3%, right?

liuzhuang13 (Owner) commented Feb 13, 2017

@shicai Thanks for using our models! Yes, we are aware of this issue. Your 72.5%/91.3% accuracy (256x256) is the same as what we get. The difference may come from a different data augmentation scheme and other implementation differences between fb.resnet.torch and Caffe, although we tried to keep the other parameters as consistent as possible. On CIFAR the training curves were also significantly different between Torch and Caffe. The Torch ResNet models provided by Facebook were also slightly more accurate than the original Caffe ResNet models.

We hope this difference in ImageNet accuracy won't matter much when fine-tuning on other tasks. We'll update the models if we get more accurate results.


shicai commented Feb 13, 2017

@liuzhuang13 Thank you for your quick reply. I just manually converted the Torch model into Caffe format. Due to some unknown reason, the accuracy is about 0.2~0.3% lower than the original Torch model. I will check it tomorrow.

@liuzhuang13 (Owner)

@shicai Thanks! Could you share your converted Caffe models, if convenient?


Tongcheng commented Feb 13, 2017

@liuzhuang13 @shicai I think there might be a difference in the EMA procedure of BN between Torch and Caffe. The default cudnn-torch BN has momentum = 0.1 for the EMA of the BN statistics; inside the cuDNN API this means new_global[Stat] = 0.1 * batch[Stat] + 0.9 * old_global[Stat], where Stat can be the mean or variance. Caffe does this differently: during training, Caffe's global[Stat] holds batch_Stat[t] + moving_average_fraction * batch_Stat[t-1] + moving_average_fraction^2 * batch_Stat[t-2] + ..., and at inference it divides global[Stat] by 1 + moving_average_fraction + moving_average_fraction^2 + ... to get the inference mean/variance. The default moving_average_fraction in Caffe is 0.999.
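
To make the two update rules above concrete, here is a small numpy sketch contrasting them; it is an illustration of the schemes as described in this comment, not the actual cuDNN or Caffe source.

```python
import numpy as np

def torch_style_ema(batch_stats, momentum=0.1):
    # Torch/cuDNN-style EMA, as described above:
    # new_global = momentum * batch + (1 - momentum) * old_global
    g = batch_stats[0]
    for b in batch_stats[1:]:
        g = momentum * b + (1.0 - momentum) * g
    return g

def caffe_style(batch_stats, moving_average_fraction=0.999):
    # Caffe-style accumulation, as described above: training keeps
    # batch[t] + maf * batch[t-1] + maf^2 * batch[t-2] + ...,
    # and inference divides by 1 + maf + maf^2 + ...
    acc, scale = 0.0, 0.0
    for b in batch_stats:          # oldest to newest
        acc = moving_average_fraction * acc + b
        scale = moving_average_fraction * scale + 1.0
    return acc / scale             # normalized stat used at inference

stats = np.random.rand(1000)       # e.g. per-batch means over training
print(torch_style_ema(stats), caffe_style(stats))
```

With stationary statistics and enough batches the two converge to similar values, but momentum 0.1 weights roughly the last ten batches while moving_average_fraction 0.999 averages over about a thousand, so on short runs or drifting statistics they can differ noticeably, which could contribute to accuracy gaps between converted models.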


shicai commented Feb 14, 2017

@Tongcheng Thanks for sharing.
@liuzhuang13 I made my repo public, see: https://github.com/shicai/DenseNet-Caffe
Everyone can download these caffemodels freely.
Hope it helps.

@liuzhuang13 (Owner)

@shicai Thanks for sharing, we have added the link to our readme page. Hope more people start using them :)


asfix commented Dec 14, 2017

Your repository contains no train/val prototxts. Also, please check the end of your files: they all end with a convolutional layer. Is this right?


FaezeMM commented Apr 29, 2019

I'm trying to fine-tune the DenseNet models (available here) on my dataset using the NVIDIA DIGITS system. I've already read all the issues and made some modifications to my custom network, but it gives me the following error:

conv2_1/x2/bn needs backward computation.
conv2_1/x1 needs backward computation.
relu2_1/x1 needs backward computation.
conv2_1/x1/scale needs backward computation.
conv2_1/x1/bn needs backward computation.
pool1_pool1_0_split needs backward computation.
pool1 needs backward computation.
relu1 needs backward computation.
conv1/scale needs backward computation.
conv1/bn needs backward computation.
conv1 needs backward computation.
label_val-data_1_split does not need backward computation.
val-data does not need backward computation.
This network produces output accuracy
This network produces output loss
Network initialization done.
Solver scaffolding done.
Finetuning from /home/ubuntu/models/DenseNet-Caffe/densenet201.caffemodel
Ignoring source layer input
Check failed: target_blobs.size() == source_layer.blobs_size() (5 vs. 3) Incompatible number of blobs for layer conv1/bn

Here is my network. I used the original prototxt and made some modifications, as below:

layer {
 name: "train-data"
 type: "Data"
 top: "data"
 top: "label"
 include {
   stage: "train"
 }
 transform_param {
   crop_size: 224
 }
 data_param {
   batch_size: 126
 }
}
layer {
 name: "val-data"
 type: "Data"
 top: "data"
 top: "label"
 include {
   stage: "val"
 }
 transform_param {
   crop_size: 224
 }
 data_param {
   batch_size: 64
 }
}
layer {
 name: "conv1"
 type: "Convolution"
 bottom: "data"
 top: "conv1"
 convolution_param {
   num_output: 64
   bias_term: false
   pad: 3
   kernel_size: 7
   stride: 2
 }
}
layer {
 name: "conv1/bn"
 type: "BatchNorm"
 bottom: "conv1"
 top: "conv1/bn"
 batch_norm_param {
   eps: 1e-5
 }
}
layer {
 name: "conv1/scale"
 type: "Scale"
 bottom: "conv1/bn"
 top: "conv1/bn"
 scale_param {
   bias_term: true
 }
}
layer {
 name: "relu1"
 type: "ReLU"
 bottom: "conv1/bn"
 top: "conv1/bn"
}
...
layer {
 name: "fc6new"
 type: "Convolution"
 bottom: "pool5"
 top: "fc6new"
 convolution_param {
   num_output: 35
   kernel_size: 1
 }
}

layer {
 name: "loss"
 type: "SoftmaxWithLoss"
 bottom: "fc6new"
 bottom: "label"
 top: "loss"
 exclude {
   stage: "deploy"
 }
}
layer {
 name: "accuracy"
 type: "Accuracy"
 bottom: "fc6new"
 bottom: "label"
 top: "accuracy"
 include {
   stage: "val"
 }
}
layer {
 name: "accuracy_train"
 type: "Accuracy"
 bottom: "fc6new"
 bottom: "label"
 top: "accuracy_train"
 include {
   stage: "train"
 }
 accuracy_param {
   top_k: 5
 }
}
layer {
 name: "softmax"
 type: "Softmax"
 bottom: "fc6new"
 top: "softmax"
 include {
   stage: "deploy"
 }
}
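
One way to see where the `5 vs 3` blob mismatch in the log above comes from is to count the parameter blobs attached to the BatchNorm layers on each side with pycaffe. BVLC Caffe's BatchNorm layer stores 3 blobs (mean, variance, and a moving-average scale factor), with scale/shift living in a separate Scale layer; a BatchNorm layer expecting 5 blobs suggests the DIGITS/NVCaffe build fuses the scale and bias into BatchNorm, so the layer definitions need to match the Caffe branch that produced the weights. A minimal diagnostic sketch, assuming a `deploy.prototxt` alongside the caffemodel path taken from the log:

```python
# Count the parameter blobs on the first BN/Scale layers of the pretrained model.
# 'deploy.prototxt' is a hypothetical path; the caffemodel path is from the log.
import caffe

net = caffe.Net('deploy.prototxt',
                '/home/ubuntu/models/DenseNet-Caffe/densenet201.caffemodel',
                caffe.TEST)

for name in ('conv1/bn', 'conv1/scale'):
    if name in net.params:
        print(name, 'has', len(net.params[name]), 'blobs:',
              [tuple(b.data.shape) for b in net.params[name]])
```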
