Convert PDF to Tiff produces blank page
I regularly have to convert PDFs to TIFF files for use by our public records database. Occasionally, when I convert a PDF, I get a completely white TIFF instead of the page. It can occur anywhere in a multi-page PDF, or occasionally with a single-page PDF, as is the case with the one attached. All the PDFs are produced by scanning on our Xerox WorkCentre copiers (various models). I can't find any rhyme or reason for the blank pages. Re-running the command produces identical output: if, say, page 3 of the PDF produces a blank TIFF even though the PDF has an image on that page, it will always produce a blank TIFF. Opening the PDF in IrfanView and exporting all pages will always produce the correct TIFFs.
This is the conversion command I use:
convert -compress Group4 -density 240 -scene 49 -verbose \\path\to\server\containing\scans\DOC_20141202151240.PDF E:\destinatio\path\pro14p%04d.tif
I can't seem to attach files yet, or I would attach an example of the problem PDF and result produced.
- snibgo
- Posts: 12159
- Joined: 2010-01-23T23:01:33-07:00
- Authentication code: 1151
- Location: England, UK
Re: Convert PDF to Tiff produces blank page
You can upload the PDF to somewhere like dropbox.com and paste the URL here.
I suspect the problem is "-compress Group4". First, move that to after the input filename, not before. Group4 compression quantizes to two levels: black and white. If all the input pixels are closer to white than black, the output will be entirely white.
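As a hedged illustration of the reordering suggested above, this Python sketch builds the argument list for the same command with -compress Group4 moved after the input file. The function name build_convert_command and its parameters are hypothetical. -density is kept before the PDF on the assumption that it should apply when the PDF is rasterized, as in the original command.

```python
def build_convert_command(pdf_path, out_pattern, density=240, scene=49):
    """Return the convert argument list with -compress placed after the input.

    Hypothetical helper: only the option order differs from the command
    posted in the thread.
    """
    return [
        "convert",
        "-density", str(density),  # read-time setting, stays before the PDF
        "-scene", str(scene),
        "-verbose",
        pdf_path,
        "-compress", "Group4",     # moved after the input, per the advice above
        out_pattern,
    ]
```

The list can be handed to subprocess.run(cmd, check=True). Building a list rather than a single string avoids quoting problems with UNC paths and the %04d pattern.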
snibgo's IM pages: im.snibgo.com
Re: Convert PDF to Tiff produces blank page
Here is a link to the problem PDF: https://www.dropbox.com/s/53lpnfm439aku ... 0.PDF?dl=0
And this is the output: https://www.dropbox.com/s/zdt1eh2ofe48f ... 9.tif?dl=0
I tried removing the -compress Group4 and ran this command: convert \\path\to\DOC_20141202151240.PDF -density 300x300 -scene 47 E:\scan\pro14\pro14p%04d.tif
And got this output: https://www.dropbox.com/s/4z16kcg8pdyr8 ... 7.tif?dl=0
Essentially the same image but with more bits per pixel. I should also note that opening this PDF in IrfanView DOES NOT produce the correct image as it usually does. It gives me the solid white image. I suspect this is actually a Ghostscript issue, but have not been able to find any other reports of it.
Last edited by BrentD on 2014-12-05T12:56:52-07:00, edited 1 time in total.
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Convert PDF to Tiff produces blank page
Your links are not good. They are broken by spaces and "..." in them, so we have no way to find out what the "..." represents. Check for spaces in the actual links and replace them with %20.
Re: Convert PDF to Tiff produces blank page
Post edited. I replied to the wrong thread earlier and just copied the text and pasted it here without realizing it shortens the links.
- snibgo
- Posts: 12159
- Joined: 2010-01-23T23:01:33-07:00
- Authentication code: 1151
- Location: England, UK
Re: Convert PDF to Tiff produces blank page
The PDF contains a single page, which is a single image, which is encoded in jbig2. I don't know if GS can read jbig2. pdfimages reads it okay.
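When pdfimages is not at hand, a rough check for JBIG2 content is to search the raw PDF bytes for the /JBIG2Decode filter name. The sketch below is a heuristic only, and the function names are hypothetical. It assumes the filter name is written literally in the image stream dictionaries, so it can miss unusual files, for example ones that spell the name with # escapes.

```python
from pathlib import Path

# Filter name that PDF stream dictionaries use for JBIG2-encoded data.
JBIG2_FILTER = b"/JBIG2Decode"


def count_jbig2_filters(pdf_bytes: bytes) -> int:
    """Count literal occurrences of the JBIG2 filter name in raw PDF bytes."""
    return pdf_bytes.count(JBIG2_FILTER)


def pdf_uses_jbig2(path) -> bool:
    """Heuristic: True if the file at `path` mentions the JBIG2 filter at all."""
    return count_jbig2_filters(Path(path).read_bytes()) > 0
```

A non-zero count only says that JBIG2 streams are probably present. It does not say which page they belong to or whether Ghostscript will fail on them.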
snibgo's IM pages: im.snibgo.com
Re: Convert PDF to Tiff produces blank page
I'm trying to find a multi-page image that exhibits the same problem, but it doesn't look like I have saved any. I can scan several pages on the copier, run the convert command, get a blank page in the middle of the output, then go back and scan the same images again on the same copier and run the same conversion command and it will convert all the pages. Would it be possible that the copier is randomly choosing a different compression for each page in the PDF?
Re: Convert PDF to Tiff produces blank page
Doesn't look like JBIG is the problem: http://www.imagemagick.org/script/formats.php#supported
That doesn't specify specific encodings, though. Copier has options to allow "Huffman Encoding" and "Arithmetic Encoding" under the "JBIG2" options.
- snibgo
- Posts: 12159
- Joined: 2010-01-23T23:01:33-07:00
- Authentication code: 1151
- Location: England, UK
Re: Convert PDF to Tiff produces blank page
BrentD wrote: Would it be possible that the copier is randomly choosing a different compression for each page in the PDF?
Could be. When I scan, I never save to PDF. I always save to an image format such as PNG, TIFF or (rarely) JPG. For most purposes, JPG is good enough.
BrentD wrote: Doesn't look like JBIG is the problem: http://www.imagemagick.org/script/formats.php#supported
That page shows formats readable by IM. But IM doesn't read the PDF. Ghostscript reads the PDF.
snibgo's IM pages: im.snibgo.com
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Convert PDF to Tiff produces blank page
You actually need to run
convert -list format
and
convert -version
to see what formats your compile of IM can use and what delegates you have installed. Many formats require a delegate library and do not come by default with a source compile of IM. IM binaries come with the most frequently needed ones, but perhaps not all you may need. I do not recall if JBIG requires a delegate or not.
On my system, convert -version lists JBIG, so I suspect it needs a delegate library to be installed.
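The delegate check described above can be scripted. This is a minimal sketch, assuming the output of convert -version contains a line starting with "Delegates" followed by a colon and a space-separated list of names. The function names are hypothetical. A listed jbig delegate describes what ImageMagick itself can read and says nothing about what Ghostscript can decode inside a PDF.

```python
import subprocess


def parse_delegates(version_text):
    """Extract delegate names from the text printed by `convert -version`."""
    for line in version_text.splitlines():
        if line.startswith("Delegates"):
            # Everything after the first colon is the space-separated list.
            _, _, names = line.partition(":")
            return set(names.split())
    return set()


def installed_delegates(convert_exe="convert"):
    """Run `convert -version` and return the delegate names it reports."""
    result = subprocess.run([convert_exe, "-version"],
                            capture_output=True, text=True, check=True)
    return parse_delegates(result.stdout)
```

For example, "jbig" in installed_delegates() tests for the JBIG delegate on the machine where the script runs.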
Re: Convert PDF to Tiff produces blank page
Was this resolved? I just encountered the same problem testing a client-provided 28-page PDF. Page 23 results in a blank page when converting from PDF to TIFF. It also converts to a blank page when converted by a C# Windows service using Snowbound Software's RasterMaster product. Lacking a resolution, is there any way to tell with ImageMagick when an image is going to convert to a blank page? Our C#/RasterMaster code can detect this and sends an email alert so that the document can be scanned and resubmitted in TIFF format.
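One possible way to flag blank pages after conversion, sketched under assumptions: ask ImageMagick for each page's mean intensity with the %[fx:mean] format escape and treat values at or near 1.0 as all white. The function names and the 0.999 threshold are hypothetical, and the check cannot tell a failed decode from a sheet that really is blank.

```python
import subprocess


def page_means(tiff_path, convert_exe="convert"):
    """Return the mean intensity (0.0 to 1.0) of each page in a TIFF file."""
    result = subprocess.run(
        [convert_exe, tiff_path, "-format", "%[fx:mean]\\n", "info:"],
        capture_output=True, text=True, check=True)
    # ImageMagick prints one value per page, separated by newlines.
    return [float(value) for value in result.stdout.split()]


def blank_pages(means, threshold=0.999):
    """Return zero-based indices of pages whose mean is at or above threshold."""
    return [i for i, mean in enumerate(means) if mean >= threshold]
```

A caller could run blank_pages(page_means("out.tif")) after each conversion and raise an alert when the list is not empty.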
- snibgo
- Posts: 12159
- Joined: 2010-01-23T23:01:33-07:00
- Authentication code: 1151
- Location: England, UK
Re: Convert PDF to Tiff produces blank page
Ghostscript v9.15 won't convert the image in https://www.dropbox.com/s/53lpnfm439aku ... 0.PDF?dl=0 . This is a single jbig2 image.
If that is what your document contains on page 23, then, no, the problem has not been resolved. Perhaps you could talk to the Ghostscript people.
To find what images are in your document:
pdfimages XX.pdf -list
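The listing from the command above can also be scanned by a script to find which pages hold JBIG2 images. This is a sketch that assumes the column layout printed by poppler's pdfimages, where each data row starts with the page number and the ninth column ("enc") names the encoding. The function names are hypothetical.

```python
import subprocess


def jbig2_pages(listing):
    """Return sorted page numbers whose images are listed with enc 'jbig2'."""
    pages = set()
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with a page number; the header and rule lines do not.
        if len(fields) >= 9 and fields[0].isdigit() and fields[8].lower() == "jbig2":
            pages.add(int(fields[0]))
    return sorted(pages)


def list_jbig2_pages(pdf_path, pdfimages_exe="pdfimages"):
    """Run `pdfimages -list` on a PDF and report the pages using JBIG2."""
    result = subprocess.run([pdfimages_exe, "-list", pdf_path],
                            capture_output=True, text=True, check=True)
    return jbig2_pages(result.stdout)
```

Pages reported here are candidates for the blank-page problem and could be checked or rescanned before conversion.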
snibgo's IM pages: im.snibgo.com