Hi,
With the latest version, ImageMagick-6.8.9-8, I compiled with OpenCL support and tested resize (convert src.jpg -filter box -resize 248x248 -bench 30 logo.jpg),
but the OpenCL build is slower than using CPU OpenMP alone.
I use CUDA 6.5.
>convert src.jpg -filter box -resize 248x248 -bench 30 logo.jpg
openmp
Performance[1]: 30i 78.947ips 1.000e 0.370u 0:00.380
Performance[2]: 30i 81.081ips 0.507e 0.380u 0:00.370
Performance[3]: 30i 73.171ips 0.481e 0.390u 0:00.410
Performance[4]: 30i 78.947ips 0.500e 0.380u 0:00.380
Performance[5]: 30i 75.000ips 0.487e 0.390u 0:00.400
Performance[6]: 30i 73.171ips 0.481e 0.400u 0:00.410
Performance[7]: 30i 81.081ips 0.507e 0.380u 0:00.370
Performance[8]: 30i 75.000ips 0.487e 0.370u 0:00.400
openmp + opencl
Performance[1]: 30i 7.143ips 1.000e 13.570u 0:04.200
Performance[2]: 30i 17.143ips 0.706e 1.750u 0:01.750
Performance[3]: 30i 17.045ips 0.705e 1.770u 0:01.760
Performance[4]: 30i 17.143ips 0.706e 1.750u 0:01.750
Performance[5]: 30i 17.045ips 0.705e 1.750u 0:01.760
Performance[6]: 30i 17.045ips 0.705e 1.760u 0:01.760
Performance[7]: 30i 17.143ips 0.706e 1.750u 0:01.750
Performance[8]: 30i 17.143ips 0.706e 1.750u 0:01.750
Has anybody run into the same thing?
using opencl is slower in resize
Re: using opencl is slower in resize
Download http://www.imagemagick.org/download/bet ... -9.tar.bz2. Let us know if the OpenCL performance is restored. Thanks.
Re: using opencl is slower in resize
Hi, I have tested version ImageMagick-6.8.9-9.
>convert src.jpg -filter box -resize 248x248 -bench 30 logo.jpg
openmp:
Performance[1]: 30i 115.385ips 1.000e 0.370u 0:00.260
Performance[2]: 30i 120.000ips 0.510e 0.360u 0:00.250
Performance[3]: 30i 115.385ips 0.500e 0.370u 0:00.260
Performance[4]: 30i 78.947ips 0.406e 0.370u 0:00.380
Performance[5]: 30i 120.000ips 0.510e 0.360u 0:00.250
Performance[6]: 30i 120.000ips 0.510e 0.370u 0:00.250
Performance[7]: 30i 120.000ips 0.510e 0.360u 0:00.250
Performance[8]: 30i 120.000ips 0.510e 0.360u 0:00.250
openmp + opencl:
Performance[1]: 30i 14.218ips 1.000e 1.910u 0:02.110
Performance[2]: 30i 16.760ips 0.541e 1.800u 0:01.790
Performance[3]: 30i 16.949ips 0.544e 1.770u 0:01.770
Performance[4]: 30i 16.949ips 0.544e 1.780u 0:01.770
Performance[5]: 30i 16.949ips 0.544e 1.760u 0:01.770
Performance[6]: 30i 16.854ips 0.542e 1.780u 0:01.780
Performance[7]: 30i 16.949ips 0.544e 1.760u 0:01.770
Performance[8]: 30i 16.949ips 0.544e 1.780u 0:01.770
Using OpenCL is still slower than OpenMP alone.
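To compare the two runs directly, the -bench lines can be turned into a time per iteration. The sketch below is not part of ImageMagick; parse_bench_line and slowdown are hypothetical helper names, and the field layout is inferred only from the output posted above (iteration count, iterations per second, and a trailing minutes:seconds elapsed time, where ips equals iterations divided by elapsed time). Reading the bracketed number as a thread count is an assumption.

```python
import re

# Matches lines such as:
#   Performance[2]: 30i 120.000ips 0.510e 0.360u 0:00.250
# Field meanings are inferred from the posted output, not from documentation.
LINE_RE = re.compile(
    r"Performance\[(\d+)\]:\s+(\d+)i\s+([\d.]+)ips\s+([\d.]+)e\s+"
    r"([\d.]+)u\s+(\d+):(\d+(?:\.\d+)?)"
)


def parse_bench_line(line):
    """Return a dict of benchmark fields, or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    iterations = int(m.group(2))
    # Elapsed time is printed as minutes:seconds.
    elapsed_s = int(m.group(6)) * 60 + float(m.group(7))
    return {
        "index": int(m.group(1)),  # assumed to be the thread count
        "iterations": iterations,
        "ips": float(m.group(3)),
        "elapsed_s": elapsed_s,
        "ms_per_iter": elapsed_s * 1000.0 / iterations,
    }


def slowdown(baseline_line, other_line):
    """How many times slower other_line is than baseline_line, per iteration."""
    base = parse_bench_line(baseline_line)
    other = parse_bench_line(other_line)
    return other["ms_per_iter"] / base["ms_per_iter"]


openmp = "Performance[8]: 30i 120.000ips 0.510e 0.360u 0:00.250"
opencl = "Performance[8]: 30i 16.949ips 0.544e 1.780u 0:01.770"
print(round(parse_bench_line(openmp)["ms_per_iter"], 1), "ms per resize (OpenMP)")
print(round(parse_bench_line(opencl)["ms_per_iter"], 1), "ms per resize (OpenMP + OpenCL)")
print(round(slowdown(openmp, opencl), 1), "x slower with OpenCL")
```

With the 6.8.9-9 numbers above, this works out to roughly 8 ms per resize with OpenMP alone and roughly 59 ms with OpenCL enabled, so the OpenCL path is about seven times slower on this image.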
Re: using opencl is slower in resize
By the way, does anybody know whether some special configuration is needed to make the OpenCL speed-up work? If not, could you post your own benchmarks using OpenCL?
- fmw42
Re: using opencl is slower in resize
As I understand it, OpenCL has a non-negligible overhead. So for a small image, it may be slower than running without OpenCL because of that overhead. I do not know what size your input image is.
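The overhead argument can be illustrated with a simple fixed-cost model: total time = fixed overhead + pixels / throughput. This is only a sketch of the reasoning. Every rate and overhead value below is a made-up placeholder, not a measurement of ImageMagick or of any GPU, and run_time and break_even_pixels are hypothetical helper names.

```python
def run_time(pixels, fixed_overhead_s, pixels_per_s):
    """Fixed-cost model: a constant setup cost plus time proportional to the pixel count."""
    return fixed_overhead_s + pixels / pixels_per_s


def break_even_pixels(cpu_rate, gpu_rate, gpu_overhead_s):
    """Pixel count at which the GPU path catches up with the CPU path.

    Returns None when the GPU is not faster per pixel, because it can
    then never make up for its fixed overhead.
    """
    if gpu_rate <= cpu_rate:
        return None
    return gpu_overhead_s / (1.0 / cpu_rate - 1.0 / gpu_rate)


# Hypothetical numbers, chosen only to show the shape of the trade-off.
CPU_RATE = 50e6        # pixels per second on the CPU path (assumed)
GPU_RATE = 500e6       # pixels per second on the GPU path (assumed)
GPU_OVERHEAD_S = 0.05  # per-call setup and transfer cost in seconds (assumed)

small = 549 * 412      # the input size reported later in this thread
print(run_time(small, 0.0, CPU_RATE), "s on CPU (model)")
print(run_time(small, GPU_OVERHEAD_S, GPU_RATE), "s on GPU (model)")
print(break_even_pixels(CPU_RATE, GPU_RATE, GPU_OVERHEAD_S), "pixels to break even (model)")
```

Under these placeholder values the small image is dominated by the fixed cost, and the GPU path only wins once the image has a few million pixels. The real break-even point depends on the hardware, the driver, and the filter, so it is worth repeating the benchmark with a much larger input.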
Re: using opencl is slower in resize
fmw42 wrote: As I understand it, OpenCL has a non-negligible overhead. So for small image, it may be slower than without due to the overhead. I do not know what size your input image is.
My src.jpg is 549 x 412 pixels, about 59 KB.