Converting PDF to JPG when using searchable images

Scarred Sun
Posts: 4
Joined: 2018-04-01T20:44:25-07:00

Post by Scarred Sun »

I have a PDF-to-multiple-JPG output set up on my website, and I have been able to convert PDFs to multiple JPGs just fine up until now, when I moved to using searchable image PDFs. These PDFs are set up in a very straightforward way: one JPG image per page plus the searchable text markup. Ideally I'd keep the searchable text in the PDFs intact when running them through ImageMagick; is there any way I can have the converter ignore the text inside and just make JPGs?

Edit: the exact error message I receive when trying to do this is
Error creating thumbnail: convert: no decode delegate for this image format `' @ error/constitute.c/ReadImage/504. convert: no images defined `/tmp/transform_2dc9f6e73735.jpg' @ error/convert.c/ConvertImageCommand/3258.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Post by fmw42 »

Scarred Sun
Posts: 4
Joined: 2018-04-01T20:44:25-07:00

Post by Scarred Sun »

So, I tried switching from ghostscript to pdfimages and am running into a similar problem: upon running

Code:

(/usr/bin/pdfimages -f 1 -l 1 -j -p /path/to/myfile.pdf pdf | '/usr/bin/convert' '-depth' '8' '-quality' '95' '-resize' '190' '-' '/tmp/transform_6ee557c14986.jpg')
I still get

Code:

Error creating thumbnail: I/O Error: Couldn't open image file 'pdf-001-000.jpg' convert: no decode delegate for this image format `' @ error/constitute.c/ReadImage/504. convert: no images defined `/tmp/transform_17fcb2522cee.jpg' @ error/convert.c/ConvertImageCommand/3258.
When I run

Code:

identify -list format
I get JPG as rw- and

Code:

convert -list configure
does list a JPEG delegate, so I'm a bit at a loss on how to debug from here.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Post by snibgo »

pdfimages writes to files. Your command assumes it writes to stdout and so can be piped, but it does not.
Scarred Sun wrote: '/usr/bin/convert' '-depth' '8' '-quality' '95' '-resize' '190' '-' '/tmp/transform_6ee557c14986.jpg'
What version of IM are you running? In v7, you cannot operate on an image before reading it, so the "-resize" should come after "-".
snibgo's IM pages: im.snibgo.com
Scarred Sun
Posts: 4
Joined: 2018-04-01T20:44:25-07:00

Post by Scarred Sun »

I'm using 6.9.7-4 in this case.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Post by fmw42 »

I do not think pdfimages has a stdout to pipe to convert (as snibgo said). So try separating the pdfimages command from the convert command. The output files from pdfimages will be created automatically. Also, as snibgo said, it is better to read the input right after convert, which is the proper IM syntax.

For example:

Code:

pdfimages -f 1 -l 1 -png lena1.pdf lena1
creates lena1-000.png

It seems that it does not respect the lack of -p and still writes page numbers anyway.

Then do

Code:

convert lena1-000.png ...
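
A minimal sketch of the two-step approach described above, assuming pdfimages and convert are both on the PATH. The function name pdf_page_thumb and the temporary-directory layout are illustrative and not from the thread; the width of 190 and quality of 95 are taken from the earlier failing command.

```shell
#!/bin/sh
# Hypothetical helper: extract the embedded image from one PDF page with
# pdfimages, then thumbnail it with convert.
# Usage: pdf_page_thumb INPUT.pdf PAGE OUTPUT.jpg
pdf_page_thumb() (
    pdf=$1 page=$2 out=$3
    tmp=$(mktemp -d) || exit 1

    # pdfimages writes files named <root>-NNN.<ext>; nothing goes to stdout.
    # -j saves JPEG-encoded images as .jpg files without re-encoding them.
    pdfimages -f "$page" -l "$page" -j "$pdf" "$tmp/img" || { rm -rf "$tmp"; exit 1; }

    # Pick the first extracted file (a scanned page usually holds one image).
    set -- "$tmp"/img-*
    if [ ! -e "$1" ]; then
        echo "no images found on page $page of $pdf" >&2
        rm -rf "$tmp"
        exit 1
    fi

    # Read the input first, then apply the operators. IM 7 requires this
    # order, and it is also valid in IM 6.
    convert "$1" -depth 8 -resize 190 -quality 95 "$out"
    status=$?
    rm -rf "$tmp"
    exit "$status"
)
```

It could be called as `pdf_page_thumb /path/to/myfile.pdf 1 /tmp/thumb.jpg`. Because only the embedded scan is extracted, the text layer is never involved and the source PDF is left unchanged.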
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Post by snibgo »

pdfimages always includes the image number within the filename. With option "-p", it also includes the page number. (A pdf file may contain pages with more than one image, and pages with no images.)
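
To illustrate the naming rule just described, here is a small sketch that predicts the output file name. The helper pdfimages_name is hypothetical, and the three-digit zero padding is inferred from the two file names that appear in this thread (lena1-000.png and pdf-001-000.jpg), so treat it as an assumption for other versions of pdfimages.

```shell
#!/bin/sh
# Hypothetical helper: predict the file name pdfimages gives an extracted image.
# Usage: pdfimages_name ROOT IMAGE_NUMBER EXT [PAGE_NUMBER]
# Pass PAGE_NUMBER only when pdfimages is run with -p.
# Numbers must be plain decimals without leading zeros (printf would read
# a leading zero as octal).
pdfimages_name() {
    if [ -n "${4:-}" ]; then
        # With -p: root, page number, image number.
        printf '%s-%03d-%03d.%s\n' "$1" "$4" "$2" "$3"
    else
        # Without -p: root and image number only.
        printf '%s-%03d.%s\n' "$1" "$2" "$3"
    fi
}

pdfimages_name lena1 0 png    # prints lena1-000.png
pdfimages_name pdf 0 jpg 1    # prints pdf-001-000.jpg
```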
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Post by fmw42 »

snibgo wrote: 2018-04-07T19:24:17-07:00 pdfimages always includes the image number within the filename. With option "-p", it also includes the page number. (A pdf file may contain pages with more than one image, and pages with no images.)
Thanks for the clarification. I misunderstood the meaning of -000.
Scarred Sun
Posts: 4
Joined: 2018-04-01T20:44:25-07:00

Post by Scarred Sun »

The lack of a pipe leaves me in a bind--I'm actually using this along with MediaWiki to convert PDFs to JPG (thus the earlier gs use). Are there any other options besides gs and pdfimages for this task? I'm assuming ImageMagick can't do this natively.
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Post by fmw42 »

ImageMagick uses Ghostscript to process PDF files, so no, it cannot do this natively.
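
Since a pipe is what the setup needs, one possible sketch is to have Ghostscript render the page to stdout and pipe that into convert. This is not something proposed in the thread. It assumes gs and convert are on the PATH; the function name pdf_page_to_jpg and the 150 dpi resolution are illustrative choices. It has not been tested against the searchable PDFs in question, so it may hit the same error as the original gs setup.

```shell
#!/bin/sh
# Hypothetical helper: render one PDF page with Ghostscript and pipe the
# resulting JPEG into convert for resizing.
# Usage: pdf_page_to_jpg INPUT.pdf PAGE OUTPUT.jpg
pdf_page_to_jpg() (
    pdf=$1 page=$2 out=$3
    # -q keeps Ghostscript's messages off stdout so only image data is piped.
    # -sOutputFile=- sends the rendered page to stdout.
    gs -q -dSAFER -dBATCH -dNOPAUSE -sDEVICE=jpeg -r150 \
       -dFirstPage="$page" -dLastPage="$page" \
       -sOutputFile=- "$pdf" |
    # jpeg:- names the format explicitly, since stdin has no file extension.
    convert jpeg:- -resize 190 -quality 95 "$out"
)
```

Rendering the page this way rasterizes whatever is visible on it and leaves the source PDF, including its text layer, untouched. ImageMagick can also be pointed at a single page directly, for example `convert -density 150 'myfile.pdf[0]' -resize 190 out.jpg` (pages are zero-indexed), but that still goes through the Ghostscript delegate.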