PDF to images as STDOUT

ldegruchy
Posts: 3
Joined: 2013-03-08T12:38:03-07:00
Authentication code: 6789

PDF to images as STDOUT

Post by ldegruchy »

I'm trying to use the convert command to convert a PDF into a series of images on STDOUT. Thus far, I've only been successful when I call the command with a destination filename.

Code:

convert test.pdf test.jpg

This results in the following files on the filesystem:

test-0.jpg
test-1.jpg
test-2.jpg

However, this doesn't work:

Code:

convert test.pdf jpg:- > test.jpg

Instead of getting the combined bytes of all three JPG files, as I would expect, I can open the resulting test.jpg in an image viewer, and it shows the first page of the PDF. Calling the process from Java yields the same results.

This happens with either JPG or PNG as the destination format.

I referred to the following post to try to get this to work, but the poster was working with a TIFF, not a PDF:

viewtopic.php?f=1&t=15913

The reason I'm doing this is that I want to call convert from a Java process that needs to send the bytes to a calling application. I'd rather not constantly write files to the filesystem, read them and then delete them.

Is there a command line option I'm missing, or am I doing something else incorrectly?
magick
Site Admin
Posts: 11064
Joined: 2003-05-31T11:32:55-07:00

Re: PDF to images as STDOUT

Post by magick »

ImageMagick creates 3 jpeg images and concatenates them. You only see one image with your viewers because that is all that is supported for JPEG. It is not a multi-frame image format like TIFF or GIF, for example.

Re: PDF to images as STDOUT

Post by ldegruchy »

You misunderstood me. I was able to create 3 separate JPEGs when writing the results to files. When I send the results to STDOUT, I only get the first page of the PDF. Nothing is concatenated in either case.

Also, from what I understand, ImageMagick uses Ghostscript behind the scenes for PDF conversion. The following command worked when run from Java: it wrote 3 JPG files to STDOUT, just as convert wrote them directly to the filesystem.

Code:

/usr/bin/gs -dSAFER -dBATCH -dNOPAUSE -r150 -sDEVICE=jpeg -dTextAlphaBits=4 -sOutputFile=- -f test.pdf jpg:-

I'm evaluating multiple tools for PDF to image conversion for my company, and part of my evaluation covers the different types of open source licenses, which is why I didn't just drop everything and choose Ghostscript.

Re: PDF to images as STDOUT

Post by magick »

What version of ImageMagick are you using? We're using ImageMagick 6.8.3-8 and we're getting the same results as your Ghostscript command, that is, 3 JPEG images concatenated together. Here is our use case:

Code:

convert rose: wizard: logo: test.pdf
gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -r150 -sDEVICE=jpeg -dTextAlphaBits=4 -sOutputFile=- -f test.pdf jpg:- > gs.jpg
convert test.pdf jpg:- > im.jpg

If we inspect gs.jpg and im.jpg, we find 3 JFIF markers in each, suggesting 3 concatenated JPEG image files.
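
One rough way to repeat that inspection from a shell is to count occurrences of the ASCII string "JFIF" in the output file. This is only a sketch: the helper name count_jfif is made up for illustration, and the count is a heuristic, since the string can also appear in embedded thumbnails or by coincidence in compressed data.

```shell
# count_jfif FILE: print how many times the ASCII string "JFIF" occurs in FILE.
# Hypothetical helper, not part of ImageMagick or Ghostscript.
# LC_ALL=C makes grep treat the file as raw bytes; -a forces text mode on
# binary input and -o prints each match on its own line so wc can count them.
count_jfif() {
    LC_ALL=C grep -a -o 'JFIF' "$1" | wc -l | tr -d ' '
}

# Usage (assumes im.jpg was produced by: convert test.pdf jpg:- > im.jpg):
#   count_jfif im.jpg
```

A count above 1 suggests the file holds several JPEG streams back to back, even though a viewer displays only the first.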

Re: PDF to images as STDOUT

Post by ldegruchy »

Thanks for responding so quickly and helpfully.

I was running older versions of Ghostscript and ImageMagick. I'm on RHEL 6, where the default versions of those programs are older.

I installed the newer versions of those programs from source:

Code:

$ gs -version
GPL Ghostscript 9.07 (2013-02-14)
Copyright (C) 2012 Artifex Software, Inc.  All rights reserved.

Code:

$ convert -version
Version: ImageMagick 6.8.3-3 2013-03-08 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2013 ImageMagick Studio LLC
Features: DPC OpenMP
Delegates: bzlib fontconfig freetype jng jp2 jpeg lcms pango png ps tiff x xml zlib

However, I'm not getting the same results as you are. (I ran the commands below in a newly created directory to ensure there were no leftover artifacts.)

Code:

$ convert rose: wizard: logo: test.pdf

Code:

$ evince test.pdf

I see: rose image on page 1, wizard at table image on page 2, wizard standing up on page 3.

Code:

$ gs -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -r150 -sDEVICE=jpeg -dTextAlphaBits=4 -sOutputFile=- -f test.pdf jpg:- > gs.jpg
GPL Ghostscript 9.07: Unrecoverable error, exit code 1

Code:

$ convert test.pdf jpg:- > im.jpg

Code:

$ ls -la im.jpg
-rw-r--r-- 1 {OMITTED} {OMITTED} 104744 Mar  8 17:30 im.jpg

Code:

$ eog im.jpg

I see the rose image in the JPG.

I compiled ImageMagick before Ghostscript, but I doubt that has anything to do with it, since I'm getting a failure with Ghostscript that you are not.

Again, thanks for your help. If you have any diagnostics you think I should run, I'll try them out.

Re: PDF to images as STDOUT

Post by magick »

You will get the same results from gs.jpg as well. Most, if not all, JPEG viewers expect exactly one JPEG image per file. In both the Ghostscript and ImageMagick cases, 3 JPEG images are concatenated into 1 file, but the viewers only display the first one.
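
If each page is needed as its own byte stream (for example, by a calling Java process), one option that avoids splitting the concatenated output is to request a single page per convert call, using ImageMagick's `file[N]` frame selection. The following is a minimal sketch, assuming identify and convert are on the PATH; the function names pdf_page_count and pdf_page_to_stdout are hypothetical, and rendering the PDF once per call may be slow for large documents.

```shell
# pdf_page_count FILE: print the number of pages in FILE.
# identify prints %n (images in the sequence) once per page,
# so only the first line is kept.
pdf_page_count() {
    identify -format '%n\n' "$1" | head -n 1
}

# pdf_page_to_stdout FILE N: write page N (zero-based) of FILE to STDOUT
# as a single JPEG. 'FILE[N]' selects one frame; jpg:- means JPEG on STDOUT.
pdf_page_to_stdout() {
    convert "${1}[${2}]" jpg:-
}

# Usage: hand each page to a consumer separately (wc -c stands in for one).
#   n=$(pdf_page_count test.pdf)
#   i=0
#   while [ "$i" -lt "$n" ]; do
#       pdf_page_to_stdout test.pdf "$i" | wc -c
#       i=$((i + 1))
#   done
```

Each invocation then yields exactly one JPEG, so the caller never has to locate image boundaries inside a combined stream.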