Will you release the pre-trained models on Caffe? #10
Thanks for your interest! We'll work on it as soon as possible. Hopefully the model will come out this year.
Is the ImageNet Caffe model available yet? Thanks!
Sorry for the late response. Our pretrained DenseNet-121 model and prototxt in Caffe have just been released here: https://github.com/liuzhuang13/DenseNet#imagenet-and-pretrained-models Thanks for your interest!
@liuzhuang13 I just tested the pretrained DenseNet-121 Caffe model from Zhiqiang Shen, and the top-1/top-5 accuracy using a single center crop is only 70.8%/90.3% (256xN) or 72.5%/91.3% (256x256). Is there something wrong? It should be 75.0%/92.3%, right?
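For readers unfamiliar with the two evaluation settings quoted above, here is a minimal sketch of the difference: "256xN" resizes the shorter image side to 256 while keeping the aspect ratio, whereas "256x256" resizes both sides to 256; a single 224x224 center crop is then taken in either case. The function names and the 224 crop size are my assumptions based on the standard ImageNet evaluation protocol, not code from this repository.

```python
def resize_shorter_side(w, h, target=256):
    """The 256xN setting: scale so the shorter side equals `target`,
    preserving aspect ratio. Returns the new (width, height)."""
    if w < h:
        return target, round(h * target / w)
    return round(w * target / h), target

def center_crop_box(w, h, crop=224):
    """Coordinates (left, top, right, bottom) of a centered
    crop x crop window inside a w x h image."""
    left = (w - crop) // 2
    top = (h - crop) // 2
    return (left, top, left + crop, top + crop)

# Example: a 400x300 image under the 256xN setting
print(resize_shorter_side(400, 300))   # -> (341, 256)
print(center_crop_box(341, 256))       # -> (58, 16, 282, 240)
```

In the 256x256 setting the resize step is simply fixed to (256, 256), which distorts non-square images; that distortion is one plausible source of the small accuracy gap between the two numbers reported above.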
@shicai Thanks for using our models! Yes, we are aware of this issue. Your 72.5%/91.3% accuracy (256x256) is the same as what we get. The difference may come from a different data augmentation scheme and other implementation differences between fb.resnet.torch and Caffe, even though we have tried to keep the other parameters as consistent as possible. On CIFAR the training curves were also significantly different between Torch and Caffe. The Torch ResNet models provided by Facebook were also slightly more accurate than the original Caffe ResNet models. We hope this difference in ImageNet accuracy won't matter much when fine-tuning on other tasks. We'll update the models if we get more accurate results.
@liuzhuang13 Thank you for your quick reply. I just manually converted the Torch model into Caffe format. Due to some unknown reason, the accuracy is about 0.2~0.3% lower than the original Torch model. I will check it tomorrow.
@shicai Thanks! Could you share your converted Caffe models, if convenient?
@liuzhuang13 @shicai I think there might be differences in the EMA procedure of BN between Torch and Caffe. The default cudnn-torch BN has a momentum parameter of 0.1 for the EMA of the BN statistics inside the cuDNN API; this means new_global[Stat] = 0.1 * batch[Stat] + 0.9 * old_global[Stat], where Stat can be Mean/Var. But Caffe does this differently: during training, Caffe's global[Stat] holds the value batch_Stat[0] + moving_average_fraction * batch_Stat[-1] + moving_average_fraction^2 * batch_Stat[-2] + ..., and at inference it divides global_Stat by 1 + moving_average_fraction + moving_average_fraction^2 + ... to get the inference Mean/Var. The default moving_average_fraction in Caffe is 0.999.
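The two update rules described above can be sketched as follows (a minimal sketch: the values 0.1 and 0.999 are the defaults quoted in the comment, and the function names are mine, not actual Torch or Caffe APIs):

```python
def torch_bn_update(old_global, batch_stat, momentum=0.1):
    """cuDNN/Torch-style EMA: each step blends the new batch statistic
    into the running statistic with weight `momentum`."""
    return momentum * batch_stat + (1 - momentum) * old_global

def caffe_bn_stat(batch_stats, moving_average_fraction=0.999):
    """Caffe-style accumulation: training keeps a geometrically weighted
    sum of all batch statistics (most recent first), and inference
    normalizes by the matching sum of weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for i, s in enumerate(reversed(batch_stats)):  # s[-1] is most recent
        weighted_sum += (moving_average_fraction ** i) * s
        weight_total += moving_average_fraction ** i
    return weighted_sum / weight_total

# With a constant batch statistic both schemes agree, but with
# drifting statistics Caffe's 0.999 fraction averages over far more
# history than Torch's momentum of 0.1, so the inference Mean/Var
# (and hence accuracy) can differ between converted models.
print(torch_bn_update(0.0, 1.0))          # -> 0.1
print(caffe_bn_stat([5.0, 5.0, 5.0]))     # -> 5.0
```

This effective-window difference is one concrete reason a weight-for-weight conversion between the two frameworks can still lose a fraction of a percent of accuracy.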
@Tongcheng Thanks for sharing.
@shicai Thanks for sharing; we have added the link on our README page. Hope more people start using them :)
Your repository contains no train/val prototxts. Also, please check the end of your files: they all end with a convolutional layer. Is this right?
I'm trying to fine-tune on my dataset using the DenseNet models (available here) and the NVIDIA DIGITS system. I've already read all the issues and made some modifications to my custom network, but it gives me the following error:
Here is my network; I used the original prototxt and made some modifications as below: