Converting PDF to JPG when using searchable images
Posted: 2018-04-01T20:47:38-07:00
by Scarred Sun
I have a PDF-to-multiple-JPG output set up on my website, and I have been able to convert PDFs to multiple JPGs just fine up until now--when I moved to using searchable image PDF setups. The way these PDFs are set up is very straightforward--one JPG image per page and the searchable text markup. I'd ideally keep the searchable text intact in the PDFs themselves when running them through ImageMagick; is there any way I can just have the converter ignore the text inside and just make JPGs?
Edit: the exact error message I receive when trying to do this is
Error creating thumbnail: convert: no decode delegate for this image format `' @ error/constitute.c/ReadImage/504. convert: no images defined `/tmp/transform_2dc9f6e73735.jpg' @ error/convert.c/ConvertImageCommand/3258.
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-01T21:42:17-07:00
by fmw42
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-07T13:10:09-07:00
by Scarred Sun
So, I tried switching from ghostscript to pdfimages and am running into a similar problem: upon running
Code:
(/usr/bin/pdfimages -f 1 -l 1 -j -p /path/to/myfile.pdf pdf | '/usr/bin/convert' '-depth' '8' '-quality' '95' '-resize' '190' '-' '/tmp/transform_6ee557c14986.jpg')
I still get
Code:
Error creating thumbnail: I/O Error: Couldn't open image file 'pdf-001-000.jpg' convert: no decode delegate for this image format `' @ error/constitute.c/ReadImage/504. convert: no images defined `/tmp/transform_17fcb2522cee.jpg' @ error/convert.c/ConvertImageCommand/3258.
When I run
I get JPG as rw- and
does list a JPEG delegate, so I'm a bit at a loss on how to debug from here.
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-07T13:36:12-07:00
by snibgo
pdfimages writes to files. Your command assumes it writes to stdout and can therefore be piped.
Scarred Sun wrote:'/usr/bin/convert' '-depth' '8' '-quality' '95' '-resize' '190' '-' '/tmp/transform_6ee557c14986.jpg'
What version of IM are you running? In v7, you cannot operate on an image before reading it, so the "-resize" should come after "-".
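A minimal sketch of the two-step approach, assuming a POSIX shell with `pdfimages` and `convert` on the PATH. The function name `pdf_page_thumb` and the output root `img` are made up for illustration; the options are the ones from the quoted command, reordered so the input is read before `-resize`.

```shell
#!/bin/sh
# Hypothetical helper: extract the image(s) of one PDF page to a scratch
# directory, then resize the first extracted file with convert.
pdf_page_thumb() {
    pdf=$1 page=$2 out=$3
    tmpdir=$(mktemp -d) || return 1

    # pdfimages writes files under the given root; it does not write
    # image data to stdout, so there is nothing to pipe.
    if ! pdfimages -f "$page" -l "$page" -j -p "$pdf" "$tmpdir/img"; then
        rm -rf "$tmpdir"
        return 1
    fi

    # Take the first file that was written (a page may hold no images).
    set -- "$tmpdir"/img-*
    if [ ! -e "$1" ]; then
        rm -rf "$tmpdir"
        return 1
    fi

    # Read the input first, then apply the operators.
    convert "$1" -depth 8 -resize 190 -quality 95 "$out"
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}
```

It would be called as `pdf_page_thumb /path/to/myfile.pdf 1 /tmp/thumb.jpg`. Note that `-j` only saves images that are already JPEG-encoded as .jpg; other images come out as PPM or PBM, which is why the sketch globs on the root rather than on a fixed extension.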
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-07T17:29:02-07:00
by Scarred Sun
I'm using 6.9.7-4 in this case.
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-07T18:01:31-07:00
by fmw42
I do not think pdfimages has a stdout to pipe to convert (as snibgo said). So try separating the pdfimages command from the convert command. The output files from pdfimages will be created automatically. Also, as snibgo said, it is better to read the input right after convert, as that is proper IM syntax.
For example:
Code:
pdfimages -f 1 -l 1 -png lena1.pdf lena1
creates lena1-000.png
Seems that it does not respect the lack of -p and still seems to write page numbers anyway.
Then do
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-07T19:24:17-07:00
by snibgo
pdfimages always includes the image number within the filename. With option "-p", it also includes the page number. (A pdf file may contain pages with more than one image, and pages with no images.)
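To make the naming concrete, here is a small sketch that builds the filename pdfimages would be expected to write, based only on the two names seen in this thread (`pdf-001-000.jpg` with "-p", `lena1-000.png` without). The helper name `pdfimages_name` and its argument order are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the expected pdfimages output name.
# Usage: pdfimages_name ROOT EXT IMAGE_NUMBER [PAGE_NUMBER]
# Pass plain decimal numbers (no leading zeros).
pdfimages_name() {
    root=$1 ext=$2 image=$3 page=${4:-}
    if [ -n "$page" ]; then
        # With -p: root-PAGE-IMAGE.ext
        printf '%s-%03d-%03d.%s\n' "$root" "$page" "$image" "$ext"
    else
        # Without -p: root-IMAGE.ext
        printf '%s-%03d.%s\n' "$root" "$image" "$ext"
    fi
}

pdfimages_name pdf jpg 0 1    # name seen in the error message above
pdfimages_name lena1 png 0    # name from fmw42's example
```

Since the image number counts across the selected pages, a script that needs "the image on page N" is probably safer globbing on `root-*` after running pdfimages with "-f N -l N" than guessing the exact number.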
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-07T19:48:16-07:00
by fmw42
snibgo wrote: ↑2018-04-07T19:24:17-07:00
pdfimages always includes the image number within the filename. With option "-p", it also includes the page number. (A pdf file may contain pages with more than one image, and pages with no images.)
Thanks for the clarification. I misunderstood the meaning of -000.
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-08T13:33:45-07:00
by Scarred Sun
The lack of a pipe leaves me in a bind--I'm actually using this along with MediaWiki to convert PDFs to JPG (thus the earlier gs use). Are there any other options besides gs and pdfimages for this task? I'm assuming ImageMagick can't do this natively.
Re: Converting PDF to JPG when using searchable images
Posted: 2018-04-08T13:49:28-07:00
by fmw42
ImageMagick uses Ghostscript to process PDF files, so no, it cannot do this natively.
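If a pipe is a hard requirement, one possible sketch is to call Ghostscript directly and have it write the rendered page to stdout. This assumes `gs` and `convert` are on the PATH; the function name `pdf_page_to_jpg` and the 150 dpi resolution are arbitrary choices for illustration, not something from this thread. Whether it avoids the original error depends on why the Ghostscript step failed in the first place, which has not been established here.

```shell
#!/bin/sh
# Hypothetical helper: render one PDF page with Ghostscript and pipe the
# JPEG stream into convert for resizing.
pdf_page_to_jpg() {
    pdf=$1 page=$2 out=$3
    # -q keeps Ghostscript's own messages out of the image stream;
    # -sOutputFile=- sends the rendered page to stdout.
    gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=jpeg -r150 \
       -dFirstPage="$page" -dLastPage="$page" \
       -sOutputFile=- "$pdf" |
    convert - -depth 8 -resize 190 -quality 95 "$out"
}
```

It would be called as `pdf_page_to_jpg /path/to/myfile.pdf 1 /tmp/thumb.jpg`. Unlike pdfimages, this rasterizes the whole page rather than extracting the embedded JPG, so the result is re-encoded.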