I am trying to get a "screenshot" of the first page of a PDF file (a thumbnail for a web page).
These PDF files are scanned books that were later processed with an OCR tool.
The convert only works if there are no text fields on the cover (first page) of the book.
Am I missing some parameters?
Thanks in advance.
Convert command:
convert -colorspace rgb -quality 80 -thumbnail 150x150 ./BE-KBR00_A-0580834_0000-00-00.pdf[0] ./BE-KBR00_A-0580834_0000-00-00.jpg
OS: Linux
Files:
Thumbnail:
https://docs.google.com/open?id=0ByL7Fu ... nVjZkpwUHM
Original PDF (download for good resolution):
https://docs.google.com/open?id=0ByL7Fu ... UlSdUsxTlE
convert problem for pdf with text
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: convert problem for pdf with text
I do not seem to be able to get to that PDF file.
Re: convert problem for pdf with text
Same PDF; maybe this one is downloadable:
https://docs.google.com/open?id=0ByL7Fu ... DRjbVdvbjA
- fmw42
Re: convert problem for pdf with text
vdvb wrote:same PDF maybe this is downloadable
https://docs.google.com/open?id=0ByL7Fu ... DRjbVdvbjA
identify same.pdf
same.pdf[0] PDF 662x975 662x975+0+0 16-bit Bilevel Gray 569KB 0.010u 0:00.009
same.pdf[1] PDF 668x968 668x968+0+0 16-bit Bilevel Gray 569KB 0.000u 0:00.009
same.pdf[2] PDF 678x964 678x964+0+0 16-bit Bilevel Gray 569KB 0.000u 0:00.009
same.pdf[3] PDF 667x951 667x951+0+0 16-bit Bilevel Gray 569KB 0.000u 0:00.009
same.pdf[4] PDF 669x961 669x961+0+0 16-bit Bilevel Gray 569KB 0.000u 0:00.000
same.pdf[5] PDF 671x969 671x969+0+0 16-bit Bilevel Gray 569KB 0.000u 0:00.000
same.pdf[6] PDF 674x971 674x971+0+0 16-bit Bilevel Gray 569KB 0.000u 0:00.000
shows 7 pages, but
convert same.pdf +adjoin same_%d.tif
or
convert same.pdf[0-6] +adjoin same_%d.tif
Only gets the first page.
This probably indicates that your PDF file has alpha data (though with a perfectly opaque alpha channel). So you need to find your delegates.xml file and edit the ps:alpha line so that the device is pnmraw rather than pngalpha. IM cannot do both at the same time. It is one or the other: transparency in a single page, or no transparency in multiple pages.
On my system:
find /usr | grep "delegates.xml"
/usr/local/etc/ImageMagick/delegates.xml
<delegate decode="ps:alpha" stealth="True" command="&quot;gs&quot; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &quot;-sDEVICE=pnmraw&quot; -dTextAlphaBits=%u -dGraphicsAlphaBits=%u &quot;-r%s&quot; %s &quot;-sOutputFile=%s&quot; &quot;-f%s&quot; &quot;-f%s&quot;"/>
I edited it as above and then ran:
convert same.pdf +adjoin same_%d.tif
and got all seven tif files labeled same_0.tif ... same_6.tif
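For anyone who prefers to script that edit, here is a minimal sketch. It assumes the ps:alpha delegate sits on a single line and currently uses -sDEVICE=pngalpha. The function name patch_ps_alpha is made up for illustration, and the path to delegates.xml will differ between systems.

```shell
#!/bin/sh
# Hypothetical helper: switch the Ghostscript device on the ps:alpha
# delegate line from pngalpha to pnmraw.
# Assumption: the whole <delegate decode="ps:alpha" .../> entry is on one line.
patch_ps_alpha() {
    file=$1
    # Keep a backup next to the original before changing anything.
    cp "$file" "$file.bak" || return 1
    # Only lines that mention decode="ps:alpha" are rewritten;
    # every other delegate is left as it was.
    sed '/decode="ps:alpha"/s/-sDEVICE=pngalpha/-sDEVICE=pnmraw/' "$file.bak" > "$file"
}
```

It would be called as `patch_ps_alpha /usr/local/etc/ImageMagick/delegates.xml`, most likely with root privileges, and the `.bak` copy can be moved back if the change does not help.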
Re: convert problem for pdf with text
Thanks for the help so far.
I changed the delegates.xml but had no success.
The problem is not that the convert does not work, but the quality of the result.
Check a screenshot of the converted PDF:
https://docs.google.com/open?id=0ByL7Fu ... Vp1eXEtbXc
and compare it with a screenshot of the original PDF:
https://docs.google.com/open?id=0ByL7Fu ... jJ0d0tuUmc
I don't know if you can see the images (there is a problem with Google Drive), but you can download them and see the difference.
Thanks in advance.
- fmw42
Re: convert problem for pdf with text
After editing my delegates.xml file to set sDEVICE=pnmraw, everything works just fine for me. The results are noise-free with or without the thumbnail. But note that -thumbnail must come after the input PDF on the command line, so that it is applied once the file has been read. I am on IM 6.8.1.0 Q16, Mac OS X Snow Leopard.
convert BE-KBR00_A-0580834_0000-00-00.pdf -thumbnail 150x150 -quality 80 +adjoin BE-KBR00_A-0580834_0000-00-00_%d.jpg
convert BE-KBR00_A-0580834_0000-00-00.pdf -quality 80 +adjoin BE-KBR00_A-0580834_0000-00-00_%d.jpg
Perhaps you need to upgrade your version of ImageMagick, Ghostscript and/or libjpeg
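Since the original question only needs the cover, the same option order can be combined with the [0] page selector. This is a rough sketch, not a tested recipe for every ImageMagick version; first_page_thumb is a hypothetical wrapper name, and the size and quality values are simply the ones used in this thread.

```shell
#!/bin/sh
# Hypothetical wrapper: write a JPEG thumbnail of the first page of a PDF.
# Usage: first_page_thumb input.pdf [output.jpg]
first_page_thumb() {
    pdf=$1
    out=$2
    # Default output name: same path with .pdf replaced by .jpg.
    [ -n "$out" ] || out="${pdf%.pdf}.jpg"
    # "[0]" selects the first page; the quotes keep the shell from
    # treating the brackets as a glob pattern.
    # -thumbnail and -quality follow the input so they act on the page
    # that has just been read.
    convert "${pdf}[0]" -thumbnail 150x150 -quality 80 "$out"
}
```

For example, `first_page_thumb ./BE-KBR00_A-0580834_0000-00-00.pdf` would write `./BE-KBR00_A-0580834_0000-00-00.jpg` next to the PDF.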
Re: convert problem for pdf with text
It worked for me too: after editing my delegates.xml file to set sDEVICE=pnmraw, my PNG thumbnail generation from a PDF file works again with ImageMagick 6.8, as it used to with ImageMagick 6.6.
Just a quick note: I don't understand exactly where this problem comes from, or whether ImageMagick or Ghostscript is the cause, but I have the feeling it will break quite a lot of software, and quickly. If it is feasible, I would suggest making the change in ImageMagick as soon as possible...
- fmw42
Re: convert problem for pdf with text
I believe it is a Ghostscript issue and not IM.