FFT : cufft backend #2756
pfeatherstone
started this conversation in
Ideas
Replies: 3 comments 12 replies
-
Yeah, that would be neat. Does that timing include the transfer time to and from the GPU? Normally you want to organize the code to hide those transfer times.
7 replies
-
Do you plan on adding this to dlib? That would be awesome if we ever planned to revamp the DNN stuff into something maybe a bit more PyTorch-like.
3 replies
-
I suggest maybe adding a cuFFT backend implementation of dlib::fft. Maybe we give it another name, like dlib::cu::fft, so that applications can use both CPU and GPU. This won't be useful for small FFTs, but for sizes >= 1024x1024 it will definitely help. I did a quick test with an FFT of size 32x1024x1024. With MKL it took around 400ms (single-threaded). With cuFFT it took around 3ms. So this is a win.
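
To put rough numbers on the transfer-time question raised in the replies, here is a back-of-the-envelope sketch in C++. It is not a benchmark and does not call cuFFT. It assumes the 32x1024x1024 input is single-precision complex (8 bytes per element), which the post does not state, and it takes the host-to-device bandwidth as a parameter; the 12 GB/s used in `print_estimate` is a hypothetical placeholder, not a measured value. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Bytes occupied by a batch of 2D FFT inputs.
// Assumption: each element is a single-precision complex value (2 floats).
std::size_t fft_batch_bytes(std::size_t batch, std::size_t rows, std::size_t cols)
{
    return batch * rows * cols * 2 * sizeof(float);
}

// Milliseconds needed to move `bytes` one way over a link that sustains
// `gb_per_s` gigabytes (1e9 bytes) per second.
double transfer_ms(std::size_t bytes, double gb_per_s)
{
    assert(gb_per_s > 0.0);
    return static_cast<double>(bytes) / (gb_per_s * 1e9) * 1e3;
}

// CPU time divided by GPU time plus any transfer time that is not hidden.
double effective_speedup(double cpu_ms, double gpu_ms, double exposed_transfer_ms)
{
    return cpu_ms / (gpu_ms + exposed_transfer_ms);
}

// Prints the estimate for the sizes and timings quoted in the post.
void print_estimate()
{
    const std::size_t bytes = fft_batch_bytes(32, 1024, 1024);
    const double one_way = transfer_ms(bytes, 12.0);  // 12 GB/s is an assumed figure
    std::printf("input size:           %zu bytes\n", bytes);
    std::printf("one-way transfer:     %.1f ms\n", one_way);
    std::printf("speedup, no transfer: %.1fx\n", effective_speedup(400.0, 3.0, 0.0));
    std::printf("speedup, round trip:  %.1fx\n", effective_speedup(400.0, 3.0, 2.0 * one_way));
}
```

Under these assumptions the input alone is 256 MiB, so a single copy in either direction would take several times longer than the 3ms transform. That is why it matters whether the quoted timing includes the copies, and why overlapping transfers with compute (or keeping data resident on the GPU) would decide how much of the speedup an application actually sees.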