FFTW benchmarking: single vs. double precision transforms.
Posted: 2013-05-16T03:37:56-07:00
I did some benchmarking with FFTW for another project. I got tired of messing with the graphs, but on my fairly standard Core2 Quad, switching from double-precision FFTW to single-precision FFTW gives a speed increase of around 20-40%.
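For concreteness, here is a minimal sketch of what the switch looks like in code (not taken from any particular project): FFTW ships the single-precision API under an `fftwf_` prefix in a separate library (`-lfftw3f`), and otherwise the calls mirror the double-precision ones.

```c
#include <fftw3.h>

int main(void)
{
    const int n = 1024;

    /* Double precision: fftw_ prefix, link with -lfftw3 */
    double       *in_d  = fftw_alloc_real(n);
    fftw_complex *out_d = fftw_alloc_complex(n / 2 + 1);
    for (int i = 0; i < n; i++) in_d[i] = i % 32;  /* dummy signal */
    fftw_plan p_d = fftw_plan_dft_r2c_1d(n, in_d, out_d, FFTW_ESTIMATE);
    fftw_execute(p_d);
    fftw_destroy_plan(p_d);
    fftw_free(in_d); fftw_free(out_d);

    /* Single precision: identical API under an fftwf_ prefix,
     * link with -lfftw3f instead */
    float         *in_f  = fftwf_alloc_real(n);
    fftwf_complex *out_f = fftwf_alloc_complex(n / 2 + 1);
    for (int i = 0; i < n; i++) in_f[i] = i % 32;  /* dummy signal */
    fftwf_plan p_f = fftwf_plan_dft_r2c_1d(n, in_f, out_f, FFTW_ESTIMATE);
    fftwf_execute(p_f);
    fftwf_destroy_plan(p_f);
    fftwf_free(in_f); fftwf_free(out_f);

    return 0;
}
```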
The main benefit of switching to single precision, however, is the reduced memory requirement: the DFT itself obviously takes half the space, and the transform can be performed directly on the image data if ImageMagick is compiled with HDRI, so the pixels are already floats. I mention this because I recently had to take the DFT of some very large images, and I couldn't do it with ImageMagick because it had to create a lot of unnecessary buffers to perform the transform. The loss of precision in moving from doubles to floats is likely to be negligible for image operations.
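To illustrate the "directly on the image data" point, a hedged sketch: `pixels`, `width`, and `height` are hypothetical stand-ins for one channel of an HDRI pixel buffer, not actual ImageMagick API. With float pixels, the real-to-complex plan can read the buffer as-is, with no intermediate double-precision copy.

```c
#include <fftw3.h>
#include <stddef.h>

/* Sketch: take the DFT of one channel of a float (HDRI) image without
 * first copying the pixels into a separate double-precision buffer.
 * `pixels`, `width`, and `height` are made-up names; in ImageMagick
 * they would come from the pixel cache of an HDRI build. */
void transform_channel(float *pixels, int width, int height)
{
    /* An r2c transform of height x width reals produces
     * height * (width/2 + 1) complex values */
    fftwf_complex *dft =
        fftwf_alloc_complex((size_t)height * (width / 2 + 1));

    /* The plan reads the float pixel data directly (row-major,
     * so rows = height come first) -- no double buffer needed */
    fftwf_plan p = fftwf_plan_dft_r2c_2d(height, width, pixels, dft,
                                         FFTW_ESTIMATE);
    fftwf_execute(p);

    fftwf_destroy_plan(p);
    fftwf_free(dft);
}
```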
Even better gains can be had by switching from FFTW_ESTIMATE to FFTW_MEASURE, but that has two serious drawbacks: planning destroys the input data, and the measurement itself takes a lot of time (roughly 2-3 times as long as actually executing the plan). Switching to single precision combined with FFTW_MEASURE would improve the speed by 20-150% depending on the size of the transform. There is a way around the drawbacks, though: FFTW can save these measured plans ("wisdom") and restore them the next time. Since a user is likely to perform the same-size transform many times during the lifetime of the program, taking advantage of this gets most of the benefits without much of the cost, and the saved plans only take a couple of kB. There is also the standardized concept of a system-wide wisdom file for FFTW, which should be used as well; loading it is a single function call.
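A sketch of what the plan caching could look like: the cache path is made up, but `fftwf_import_system_wisdom`, `fftwf_import_wisdom_from_filename`, and `fftwf_export_wisdom_to_filename` are FFTW's standard wisdom API (the last two need FFTW 3.3 or newer).

```c
#include <fftw3.h>

/* Sketch of plan caching with FFTW wisdom, single precision shown.
 * The cache path below is a hypothetical per-application location;
 * the system-wide wisdom file (typically /etc/fftw/wisdomf for the
 * single-precision library) costs one extra call. */
fftwf_plan make_cached_plan(int height, int width,
                            float *in, fftwf_complex *out)
{
    const char *cache = "/tmp/myapp_wisdomf";  /* made-up location */

    fftwf_import_system_wisdom();              /* system-wide wisdom */
    fftwf_import_wisdom_from_filename(cache);  /* previously saved plans */

    /* With matching wisdom this returns almost instantly; otherwise
     * FFTW_MEASURE runs its timing trials (and clobbers `in`). */
    fftwf_plan p = fftwf_plan_dft_r2c_2d(height, width, in, out,
                                         FFTW_MEASURE);

    fftwf_export_wisdom_to_filename(cache);    /* save for next time */
    return p;
}
```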
I might implement some or all of these changes in IM7 or IM6 at some point, if they aren't there already.