How to manage large 2D FFTs in CUDA

I have successfully written CUDA FFT code that performs a 2D convolution of an image, along with some other calculations.

How do I figure out the largest FFT I can run? A 2D R2C plan seems to take about 2x the image size, and the C2R plan another 2x the image size. That seems like a lot of overhead!

Also, most benchmarks and examples seem to cover relatively small FFTs. For large images I will quickly run out of memory. How is this usually handled? Can you run the FFT convolution on tiles of the image and combine the results, and expect the same answer as if you had performed the 2D FFT on the entire image?

Thank you for answering these questions.

+3
2 answers

CUFFT plans a different algorithm depending on the size of your image. If the transform does not fit in shared memory and the size is not a power of 2, CUFFT plans an out-of-place transform, whereas smaller images with a suitable size are handled more efficiently by the library.

If you are set on running the FFT over the whole image and need to find out what your GPU can handle, my best suggestion is to guess and check with different image sizes, since CUFFT planning is complicated.
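Before guessing and checking, a rough lower bound on the data buffers can help narrow the search. The sketch below is plain Python arithmetic, not a CUFFT call: the function name `r2c_buffer_bytes` is hypothetical, it assumes single-precision data, and it deliberately excludes the plan's work area, which depends on the algorithm CUFFT picks and has to be measured on the actual GPU.

```python
def r2c_buffer_bytes(height, width, real_bytes=4):
    """Bytes needed for the data buffers of one 2D R2C transform.

    Assumption: single-precision data by default (4 bytes per real).
    The plan's own work area is NOT included; it varies with the plan.
    """
    # Real input: one value per pixel.
    real_in = height * width * real_bytes
    # Complex output: the last dimension shrinks to width // 2 + 1
    # because of Hermitian symmetry; each element has two reals.
    complex_out = height * (width // 2 + 1) * 2 * real_bytes
    return real_in, complex_out


real_in, complex_out = r2c_buffer_bytes(4096, 4096)
print(real_in, complex_out)
```

The two buffers are nearly the same size, so an out-of-place R2C transform already needs about 2x the image before any work area is counted, which is consistent with the overhead described in the question.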

See documentation: http://developer.download.nvidia.com/compute/cuda/1_1/CUFFT_Library_1.1.pdf

+5

Splitting the image into tiles is perfectly reasonable. The size of the tile determines the frequency resolution you can achieve. You may also want to overlap the tiles.
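To illustrate why overlapping matters for convolution, here is a minimal overlap-add sketch. It uses NumPy on the CPU as a stand-in for the GPU FFT, and the function names and the tile size of 64 are assumptions made for the example. Each tile is zero-padded to tile size plus kernel size minus one, convolved on its own, and its result is added into the output so that the halos of neighbouring tiles overlap.

```python
import numpy as np


def fft_convolve_full(image, kernel):
    """Linear 2D convolution using one zero-padded FFT over the whole input."""
    out_shape = (image.shape[0] + kernel.shape[0] - 1,
                 image.shape[1] + kernel.shape[1] - 1)
    # rfft2 zero-pads both inputs to out_shape, avoiding circular wrap-around.
    spectrum = np.fft.rfft2(image, out_shape) * np.fft.rfft2(kernel, out_shape)
    return np.fft.irfft2(spectrum, out_shape)


def fft_convolve_tiled(image, kernel, tile=64):
    """Overlap-add: convolve each tile separately and sum the overlapping halos."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] + kh - 1, image.shape[1] + kw - 1))
    for r in range(0, image.shape[0], tile):
        for c in range(0, image.shape[1], tile):
            block = image[r:r + tile, c:c + tile]
            # Each partial result is (block + kernel - 1) in size, so it
            # spills into the area covered by the neighbouring tiles.
            part = fft_convolve_full(block, kernel)
            out[r:r + part.shape[0], c:c + part.shape[1]] += part
    return out


rng = np.random.default_rng(0)
image = rng.standard_normal((100, 130))
kernel = rng.standard_normal((7, 5))
full = fft_convolve_full(image, kernel)
tiled = fft_convolve_tiled(image, kernel, tile=64)
print(np.allclose(full, tiled))
```

In this scheme the largest FFT ever needed is the padded tile size rather than the padded image size, which is what bounds the memory per transform. Tiles that are simply transformed and stitched together without the padding and overlap would not reproduce the whole-image result near the tile borders.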

+1