About UFF and full setting train #30

Open
evenrose opened this issue Jan 30, 2024 · 3 comments

evenrose commented Jan 30, 2024

Hi, thanks for your interesting paper. I got quite similar results with the double-lib version. However, after I trained UFF and added it for the full setting, the performance dropped by several percent compared with the double-lib version. I tried checkpoint-0, 1, and 2, and also regenerated the features for UFF, but none of that helped. Could you let me know which settings might be causing this performance drop?

Separately, I found feature extraction to be CPU-intensive: I allocated 16 cores to it and they stay at 100% utilization while the GPU is often idle waiting on the CPU. Is that expected? Thanks.
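For reference, a minimal sketch of the usual PyTorch levers when an extraction loop is CPU-bound, namely DataLoader worker processes and pinned memory; the dataset and backbone below are placeholders, not this repo's code:

import torch
from torch import nn
from torch.utils.data import DataLoader, Dataset

class DummyImages(Dataset):
    """Stand-in for the real patch dataset; the real CPU work (decoding,
    resizing, normalization) would happen inside __getitem__."""
    def __len__(self):
        return 256
    def __getitem__(self, idx):
        return torch.randn(3, 224, 224)

if __name__ == "__main__":  # guard required when num_workers > 0 spawns processes
    device = "cuda" if torch.cuda.is_available() else "cpu"
    backbone = nn.Conv2d(3, 8, 3).to(device).eval()  # stand-in feature extractor

    loader = DataLoader(
        DummyImages(),
        batch_size=32,
        num_workers=8,                  # parallel CPU preprocessing; tune to core count
        pin_memory=(device == "cuda"),  # page-locked buffers allow async H2D copies
        persistent_workers=True,        # keep workers alive between epochs
    )

    with torch.no_grad():
        for batch in loader:
            feats = backbone(batch.to(device, non_blocking=True))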


evenrose commented Jan 30, 2024

I tried your pre-trained UFF; it works and gives a clear performance improvement. Would you mind letting me know how you trained the UFF? Your paper describes it as: "For UFF, the χrgb, χpc are 2 two-layer MLPs with 4× hidden dimension as input feature; We use AdamW optimizer, set learning rate as 0.003 with cosine warm-up in 250 steps and batch size as 16, we report the best anomaly detection results under 750 UFF training steps."

This seems inconsistent with your current code, which trains for 3 epochs.
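For reference, a minimal sketch of the recipe that paper paragraph describes (two two-layer MLP heads with 4× hidden width, AdamW at lr 0.003, 250 warm-up steps into cosine decay, 750 steps total); the feature dimensions are assumptions, and this is not the repo's actual fusion_pretrain.py:

import math
import torch
from torch import nn

def make_head(dim: int) -> nn.Sequential:
    # two-layer MLP with hidden width = 4x the input feature dimension
    return nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

chi_rgb = make_head(768)   # RGB patch features (dimension assumed)
chi_pc = make_head(1152)   # point-cloud patch features (dimension assumed)

params = list(chi_rgb.parameters()) + list(chi_pc.parameters())
optimizer = torch.optim.AdamW(params, lr=0.003)

total_steps, warmup_steps = 750, 250

def lr_scale(step: int) -> float:
    # linear warm-up for the first 250 steps, then cosine decay to zero
    if step < warmup_steps:
        return step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_scale)
# per training step: loss.backward(); optimizer.step(); scheduler.step()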

Could this be because I only use one GPU? I tried it on one A100 80G with:

OMP_NUM_THREADS=1 python -m torch.distributed.run --nnodes=1 --nproc_per_node=1 fusion_pretrain.py \
    --accum_iter 16 \
    --lr 0.003 \
    --batch_size 16 \
    --data_path datasets/patch_lib \
    --output_dir checkpoints

Or do I need to change other parameters to reach the 750 training steps?
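For reference, a back-of-the-envelope sketch of how epochs map to optimizer steps under gradient accumulation, which is what the paper's "750 steps" has to be matched against; the sample count below is a placeholder, since the actual size of patch_lib isn't stated in this thread:

import math

def optimizer_steps(num_samples: int, batch_size: int, accum_iter: int,
                    num_gpus: int, epochs: int) -> int:
    # one optimizer step consumes batch_size * accum_iter samples per GPU
    effective_batch = batch_size * accum_iter * num_gpus
    return math.ceil(num_samples / effective_batch) * epochs

# e.g. with a hypothetical 10,000 extracted patch features:
print(optimizer_steps(10_000, batch_size=16, accum_iter=16, num_gpus=1, epochs=3))
# -> 120 steps; reaching ~750 steps would mean more epochs or less accumulation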

@hongchiMa

@evenrose Hello, what are your final training epochs and warm-up epochs? I'm using 6 training epochs and 3 warm-up epochs for UFF, but my results are significantly worse than with uff_pretrained.

@evenrose

> @evenrose Hello, what are your final training epochs and warm-up epochs? I'm using 6 training epochs and 3 warm-up epochs for UFF, but my results are significantly worse than with uff_pretrained.

It doesn't work. Something must be missing.
