Push pixel cache a little further in Win32

Post by johnsfine »

This is the continuation of the thread from
viewtopic.php?p=37735#p37735

I have a workaround that is enough for the operation discussed in that thread. But if the problem were another 25% larger, this workaround wouldn't be enough.

1) Edit c:\boot.ini.
That file is normally protected and hidden, so first open Folder Options/View, check "Show hidden files and folders", and uncheck "Hide protected operating system files". Then go into the properties of c:\boot.ini to unprotect it further (clearing the read-only attribute). After that you can finally edit it.
Add /3GB to the line in that file that starts your OS.
See documentation at:
http://technet.microsoft.com/en-us/sysi ... 63892.aspx
The change won't take effect until the next reboot.

2) Change a header flag in each .exe file that must use over 2GB. One way is with Microsoft's EDITBIN program (which is included with several of their development tool packages and is also available as a free download). For example, I ran:

editbin /largeaddressaware \Tools\ImageMagick-6.4.2\32-Q8\convert.exe
Adjust that command for the name and location of the exe you want to flag. The change is stored in the .exe file itself, so you only need to do this once per .exe file.

With that done, the task discussed in that other thread runs in 120 seconds using 2.4GB of RAM (and no temporary disk file) on my system.

In ImageMagick 64-Q16 it took 99 seconds and 4.8GB of RAM, so there is still something strange and disappointing about the 32-Q8 performance. The extra hardware cache misses from touching 4.8GB (because of Q16) instead of 2.4GB ought to outweigh the other performance factors in this example, so the Q8 run should be the faster one. In both cases all three pixel cache copies of the image were in anonymous mapped memory, so (unlike before the workaround) the pixel cache shouldn't be the bottleneck.
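
The 2.4GB versus 4.8GB doubling is what you would expect if every pixel stores a fixed number of channels at 8 bits in Q8 and 16 bits in Q16. A rough sketch of that arithmetic follows. The 4 channels per pixel is an assumption about the build, and the three sets of dimensions are hypothetical placeholders, not the image from the other thread.

```python
def pixel_cache_bytes(width, height, quantum_bits, channels=4):
    """Bytes for one uncompressed pixel cache copy.

    Assumes `channels` quantum-sized values per pixel (4 is an assumption
    about the build, not something measured in this thread).
    """
    return width * height * channels * (quantum_bits // 8)


def total_cache_bytes(copies, quantum_bits):
    """Sum the cache size over several (width, height) copies."""
    return sum(pixel_cache_bytes(w, h, quantum_bits) for w, h in copies)


# Hypothetical dimensions for three copies; NOT the image from the thread.
copies = [(20000, 15000), (10000, 15000), (10000, 7500)]
q8 = total_cache_bytes(copies, 8)
q16 = total_cache_bytes(copies, 16)
print(q8, q16, q16 / q8)  # the ratio is 2.0 whatever the dimensions are
```

The point is only that the ratio is fixed by the quantum depth, so the Q16 run has to touch twice as many bytes for the same image.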

I don't really understand why all three copies must exist at the same time, but maybe there is a good reason.

If all three must exist at the same time and one has an access pattern that is a disaster when put in a temp file, I don't understand why the program doesn't prioritize the allocation of anonymous memory (when run without this workaround).

If files must be used, I don't understand why it only attempts static mapping (which partially works if you limit memory, but then solves nothing) rather than dynamic mapping, under which limiting memory would give decent (not great) performance.
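
To make "dynamic mapping" concrete: by that I mean mapping only a small window of the file at a time and moving the window as needed, instead of mapping the whole file once. Here is a minimal sketch of the idea using Python's mmap module. It is not how ImageMagick's pixel cache works, and the function name and default window size are made up for illustration.

```python
import mmap
import os


def read_range_windowed(path, start, length, window=mmap.ALLOCATIONGRANULARITY * 16):
    """Read `length` bytes at `start`, mapping at most `window` bytes at a time."""
    gran = mmap.ALLOCATIONGRANULARITY
    if window <= 0 or window % gran != 0:
        raise ValueError("window must be a positive multiple of the allocation granularity")
    file_size = os.path.getsize(path)
    end = min(start + length, file_size)
    out = bytearray()
    with open(path, "rb") as f:
        pos = start
        while pos < end:
            # Map offsets must be aligned, so round the position down.
            base = (pos // gran) * gran
            map_len = min(window, file_size - base)
            with mmap.mmap(f.fileno(), map_len, access=mmap.ACCESS_READ, offset=base) as view:
                take = min(end, base + map_len) - pos
                out += view[pos - base:pos - base + take]
                pos += take
            # The view is unmapped here, so address space use stays bounded.
    return bytes(out)
```

Only `window` bytes of address space are in use at any moment, which is the property that would matter inside a 2GB or 3GB process. The cost is a map/unmap call each time the window moves.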

Re: Push pixel cache a little further in Win32

Post by magick »

johnsfine wrote: "I don't really understand why all three copies must exist at the same time, but maybe there is a good reason."

It's the nature of this particular resize algorithm: the image is first scaled in the horizontal direction and then in the vertical direction. As previously mentioned, use -scale instead. It makes only one copy and reads the pixels sequentially, so performance is much better even if the pixels are cached to disk.
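
A toy sketch of the two-pass idea, to show where the buffers come from. It uses nearest-neighbour sampling on nested lists purely for brevity; the real resize code applies filter kernels and works through the pixel cache, and the function names here are invented for the example.

```python
def resize_rows(pixels, new_width):
    """Horizontal pass: resample every row to new_width (nearest neighbour)."""
    old_width = len(pixels[0])
    return [[row[x * old_width // new_width] for x in range(new_width)]
            for row in pixels]


def resize_columns(pixels, new_height):
    """Vertical pass: resample the rows themselves to new_height."""
    old_height = len(pixels)
    return [list(pixels[y * old_height // new_height]) for y in range(new_height)]


def two_pass_resize(source, new_width, new_height):
    """Return (intermediate, result); the caller still holds `source`."""
    intermediate = resize_rows(source, new_width)       # new_width x old_height
    result = resize_columns(intermediate, new_height)   # new_width x new_height
    return intermediate, result


source = [[r * 10 + c for c in range(4)] for r in range(4)]
intermediate, result = two_pass_resize(source, 2, 2)
print(result)  # [[0, 2], [20, 22]]
```

While the second pass runs, the source, the horizontally scaled intermediate, and the final result are all alive, which in this toy version corresponds to the three copies being asked about.
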
johnsfine wrote: "If all three must exist at the same time and one has an access pattern that is a disaster when put in a temp file, I don't understand why the program doesn't prioritize the allocation of anonymous memory (when run without this workaround).

"If files must be used, I don't understand why it only attempts static mapping (which partially works if you limit memory, but then solves nothing) rather than dynamic mapping, so limiting memory would then give decent (not great) performance."

ImageMagick is open source. Feel free to improve these algorithms and submit your patches here. We will get them into the ImageMagick distribution so everyone can benefit from your efforts. In the meantime, we modified the resize algorithm to scale with the random access pixel cache pattern first. In some cases this might be an advantage if the image is large and anonymous memory mapping is in limited supply.