Collating a collection of images into one PDF without holding them all in memory at once
Posted: 2018-10-15T17:15:50-07:00
One of the things about `magick`/`convert` that I find most useful is the ability to transform a collection of images into a single PDF. However, I've found that for large collections of images, the utility tends to hang indefinitely. I tested it under a debugger and confirmed that yes, it is attempting to load all of the image data into memory before writing it out, making it infeasible to collate arbitrarily large collections of images.
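For context, this is the kind of invocation I mean. ImageMagick does document `-limit` options that cap the pixel cache's RAM use so a large job spills to the on-disk cache instead of ballooning in memory; the file names and limit values below are just illustrative, and disk-backed caching only works around the problem rather than streaming the output:

```shell
# Collate a directory of page scans into one PDF (ImageMagick 7 syntax).
# -limit memory / -limit map cap the pixel cache's RAM use; past these,
# pixels go to the disk-backed cache instead. Names/values illustrative.
magick -limit memory 256MiB -limit map 512MiB page-*.png book.pdf
```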
Is it possible to write the PDF without holding all the image data in memory at any given time? I poked the source and found `PingImage`/`PingImages`, which looks like it should be useful for this purpose (since it doesn't load the image data, only metadata and a reference to the on-disk data). However, when I test the utility with the `-ping` flag it doesn't write the image data out to the resulting PDF, only the dimensions.
I understand that in many situations - for instance, when you need to perform multiple transformations on a number of images before writing them - it's much more efficient to hold all the image data in memory rather than reading and writing it to disk multiple times. However, I'm wondering if it's possible to ask the utility (or the API) to optimize for memory efficiency in this case.
Cheers.
-wbn