convert produces unreadable pdf
Posted: 2017-10-19T14:00:58-07:00
by scott@pareto.net
I am using ImageMagick on a Red Hat EL7 system:
$ convert -version
Version: ImageMagick 6.7.8-9 2016-06-03 Q16
I have a file (my.pdf) that Windows file properties reports is a screenshot from snagit.
Viewing my.pdf with Adobe Reader DC displays a readable, although slightly faint, image.
Converting it to anything, even another PDF, produces an unreadable image, e.g.
$ convert my.pdf new.pdf
new.pdf is super faint and unreadable.
$ convert my.pdf my.tif
...is also unreadable.
I do not have control over the input PDFs (they are received email attachments).
Suggestions on how to use 'convert' on this PDF file and end up with a readable result?
Alternatively, if the input PDF file is hopeless, is there a way to use ImageMagick or another tool to identify these (bad) PDFs so I could just pass them through without attempting to process them?
Thanks,
Scott Ballinger
Re: convert produces unreadable pdf
Posted: 2017-10-19T15:30:54-07:00
by fmw42
IM 6.7.8-9 is ancient. You would be best to upgrade. Also upgrade Ghostscript, since ImageMagick uses that to read PDF files.
For us to be able to help, you will need to provide an example PDF. You can post it to some free hosting service that will not change the format (such as dropbox.com) and put the URL here. If it is a screen snap, then it is not likely PDF any more. Perhaps it is mislabeled as PDF. In any case, please post an example.
Re: convert produces unreadable pdf
Posted: 2017-10-19T17:24:50-07:00
by snibgo
Screenshots are usually raster images. Using IM to read a PDF will convert each page to a raster file at whatever resolution you specify. But if you simply want to extract each raster image, unchanged, then pdfimages is the appropriate tool, not IM.
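A minimal sketch of the pdfimages route, assuming poppler-utils is installed and that the build supports the -all option. The helper name extract_pdf_images is hypothetical and used only for illustration; it lists the embedded images first, then writes them out unchanged.

```shell
# extract_pdf_images is a hypothetical helper name, not part of poppler.
# $1 = input PDF, $2 = prefix for the extracted image files
extract_pdf_images() {
    # -list prints one row per embedded image (dimensions, color space,
    # bits per component, encoding) without writing any files
    pdfimages -list "$1" || return 1
    # -all writes each embedded image in its native format,
    # named <prefix>-000.<ext>, <prefix>-001.<ext>, ...
    pdfimages -all "$1" "$2"
}
```

It could be called as `extract_pdf_images my.pdf my-image`. The listing may also help show how this PDF differs from the ones that convert cleanly.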
Re: convert produces unreadable pdf
Posted: 2017-10-20T08:23:29-07:00
by scott@pareto.net
Hello, thanks for the replies.
This is a medical insurance application that processes medical claims received via secure email. The medical claim is the PDF attachment. The body of the email message is extracted as ASCII file message.txt and is then combined with the PDF attachment to create the final claim document, e.g.
$ convert message.txt attachment.pdf claim-document.pdf
The system has over 10 million claim documents stored as PDFs, and dozens of programs that use ImageMagick in various intake processes. I am hesitant to upgrade ImageMagick because it is my understanding that the command line syntax has changed in V7.
HIPAA rules preclude sharing or posting the problematic PDF. I believe it to be a legitimate PDF file: Adobe Reader DC File>Properties reports that it is "PDF Producer: SnagIt by TechSmith... PDF Version: 1.4 (Acrobat 5.x)... Fast Web View: Yes"
Perhaps ImageMagick is not the appropriate tool for this job. What I need to do is combine an ASCII text file and an old PDF into a new PDF. Suggestions for alternate ways to accomplish this using command line tools would be appreciated.
Re: convert produces unreadable pdf
Posted: 2017-10-20T09:02:01-07:00
by fmw42
Your IM version is ancient and likely so is your Ghostscript. I would suggest again you upgrade both. Perhaps you have a PDF file version that is not compatible with Ghostscript. You could try converting directly with Ghostscript.
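A sketch of converting directly with Ghostscript, bypassing ImageMagick. The helper name gs_render_tiff, the tiff24nc output device, and the 300 dpi default are assumptions chosen for illustration, not settings taken from this thread.

```shell
# gs_render_tiff is a hypothetical helper name.
# $1 = input PDF, $2 = output TIFF, $3 = resolution in dpi (optional)
gs_render_tiff() {
    # -dNOPAUSE -dBATCH run without prompting and exit when done;
    # tiff24nc writes 24-bit RGB pages into a single multi-page TIFF;
    # the 300 dpi default is an assumption, adjust as needed
    gs -q -dNOPAUSE -dBATCH -dSAFER \
       -sDEVICE=tiff24nc -r"${3:-300}" \
       -sOutputFile="$2" "$1"
}
```

For example, `gs_render_tiff my.pdf my.tif` renders at the default resolution. If the result is still faint, the problem is more likely in Ghostscript or the PDF itself than in ImageMagick. `gs --version` reports the installed Ghostscript version.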
Can you post the information from your original PDF from
Does it happen with other PDF files that are not from that source?
Re: convert produces unreadable pdf
Posted: 2017-10-20T09:02:42-07:00
by Bonzo
You do not need to upgrade to a V7 version; there are some V6 versions that are not very old.
Re: convert produces unreadable pdf
Posted: 2017-10-20T09:04:49-07:00
by fmw42
scott@pareto.net wrote: I am hesitant to upgrade ImageMagick because it is my understanding that the command line syntax has changed in V7.
You can still upgrade to a version of IM 6.
The major change with IM 7 is to replace convert with magick. You can also set up a symbolic link from convert to magick, so you can still use convert.
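A sketch of the symbolic-link approach. The helper name link_convert_to_magick and both paths are hypothetical; the target directory is assumed to be on PATH ahead of any older convert.

```shell
# link_convert_to_magick is a hypothetical helper name.
# $1 = full path to the IM 7 magick executable
# $2 = directory on PATH in which to create the "convert" link
link_convert_to_magick() {
    [ -x "$1" ] || { echo "not executable: $1" >&2; return 1; }
    # magick checks the name it was invoked under, so when called through
    # a link named "convert" it runs in legacy convert mode
    ln -s "$1" "$2/convert"
}
```

For example, `link_convert_to_magick /usr/local/bin/magick /usr/local/bin` (paths are illustrative only). Existing scripts should still be tested afterwards, since some option handling differs between IM 6 and IM 7.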
Re: convert produces unreadable pdf
Posted: 2017-10-27T07:48:33-07:00
by scott@pareto.net
Hello,
1. Upgrading IM and ghostscript is a good idea.
2. This is a one-off problem specific to a single PDF. I have literally millions of PDFs that were imported using IM without issue.
3. I am fiddling around with identify to see which characteristics of this file cause the problem.
4. However, the work-around is to use IM to create separate PDFs from received TXT, JPG, TIF, BMP, PNG files as necessary and then use pdfunite (instead of IM) to combine them.
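The work-around in item 4 might look roughly like the sketch below. The helper name build_claim and the temporary-directory handling are assumptions; the file names follow the earlier convert example. ImageMagick touches only the text file, and pdfunite merges the PDFs, so the received attachment is never re-rasterized.

```shell
# build_claim is a hypothetical helper name.
# $1 = message text file, $2 = received PDF attachment, $3 = output PDF
build_claim() {
    tmp=$(mktemp -d) || return 1
    # Render only the text body to PDF with ImageMagick, then let
    # pdfunite concatenate it with the attachment (inputs first, output last)
    convert "$1" "$tmp/message.pdf" &&
        pdfunite "$tmp/message.pdf" "$2" "$3"
    status=$?
    rm -rf "$tmp"
    return $status
}
```

Usage would be `build_claim message.txt attachment.pdf claim-document.pdf`.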
Thanks.
Re: convert produces unreadable pdf
Posted: 2017-10-27T09:55:52-07:00
by fmw42
I suggest you zip your text and PDF files and post to some free hosting service such as dropbox.com and put the URL here. That way we can test your same command.