Group4 Compression not Working with PDF

Posted: 2014-04-27T15:28:57-07:00
by x0054
I am trying to compress some PDF files with Group4 compression. This command works:

convert -density 300 -threshold 90% -compress group4 input.pdf tif:- | convert - -compress group4 output.pdf

But these do not:

convert -density 300 -threshold 90% -monochrome -compress group4 input.pdf output.pdf
convert input.pdf -density 300 -threshold 90% -monochrome -compress group4 output.pdf
convert -density 300 -threshold 90% -compress group4 input.pdf output.pdf

The input is a color PDF scan. Two problems occur. First, in the first and second commands above, the monochrome filter is processed before the threshold filter. Second, in all of the above commands the resulting PDF does not have Group4 compression. I am not sure what's going on. Those commands worked just fine on an older copy of IM on Windows. However, with the latest copy of IM I have to first convert the color PDF into a multipage TIFF file and then convert that into a PDF, which basically doubles the processing time.

Any idea how to fix this? I think it might be a bug, as it works fine with the older copy of IM I have on my Windows machine. I tested this on OS X and Linux.

- Bogdan

Re: Group4 Compression not Working with PDF

Posted: 2014-04-27T16:33:37-07:00
by fmw42
Try putting -compress group4 between your PDF input and your TIFF output.

Re: Group4 Compression not Working with PDF

Posted: 2014-05-02T00:46:02-07:00
by x0054
The thing is, I am not trying to get a TIFF. I am trying to get a PDF whose individual pages are TIFF images compressed with Group4.

Re: Group4 Compression not Working with PDF

Posted: 2014-05-02T04:43:50-07:00
by snibgo
Not all compressions apply to all output formats. For example, zip doesn't apply to png.

I don't know much about pdf, but I think that Group4 compression doesn't apply to pdf. At least, I can't create a pdf that "identify -verbose" reports as "compression: Group4" either with current IM or a couple of older versions.

Re: Group4 Compression not Working with PDF

Posted: 2014-05-02T09:56:34-07:00
by fmw42
convert -density 300 -threshold 90% -compress group4 input.pdf tif:- | convert - -compress group4 output.pdf
This is probably the only way to do it. You have to embed a TIFF raster image into a PDF vector shell. Group 4 only applies to TIFF.

In fact, you probably do not need the final group4, so I would expect that this should work

convert -density 300 -threshold 90% -compress group4 input.pdf tif:- | convert - output.pdf

Re: Group4 Compression not Working with PDF

Posted: 2014-05-02T16:36:16-07:00
by x0054
Well, the only reason I am positing that it's a bug is that it worked just fine in the older IM I have on my Windows machine. That machine is not operational, so I cannot check its version, but it was from some time in 2010. Oh, and the last group4 is necessary: without it the resulting PDF file is about 8 times larger than the source TIFF file.

Actually, for whomever this may be useful to in the future: I ended up using convert to go from scanned PDF to TIFF, and then used tiff2pdf to convert the TIFF into a PDF. The tiff2pdf tool is a lot faster. It can basically convert the file as fast as your disk can write it, while IM takes 1-1.5 seconds per page. The big downside to tiff2pdf is that it does not take stdin as input.
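
A minimal sketch of that two-step workflow, assuming ImageMagick's convert and libtiff's tiff2pdf are both on the PATH. The function name pdf_to_g4_pdf and the temporary-file handling are illustrative and not from the thread. A temporary file stands in for the pipe because tiff2pdf does not read stdin.

```shell
#!/bin/sh
# pdf_to_g4_pdf SRC.pdf DST.pdf
# Hypothetical helper: rasterize a scanned PDF to a multipage Group4 TIFF
# with ImageMagick, then wrap that TIFF in a PDF with tiff2pdf.
pdf_to_g4_pdf() {
    src=$1
    dst=$2
    # tiff2pdf needs a real file, so use a temporary one instead of a pipe
    tmp=$(mktemp "${TMPDIR:-/tmp}/g4tiff.XXXXXX") || return 1

    # -density before the input sets the rasterization resolution;
    # -threshold and -compress after it apply to the decoded pages.
    # The tif: prefix forces TIFF output regardless of the file name.
    convert -density 300 "$src" -threshold 90% -compress group4 "tif:$tmp" &&
        tiff2pdf -o "$dst" "$tmp"
    status=$?

    rm -f "$tmp"
    return $status
}
```

Usage would look like `pdf_to_g4_pdf input.pdf output.pdf`. The 300 dpi density and 90% threshold are the values used earlier in this thread and may need adjusting for other scans.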

- Bogdan

Re: Group4 Compression not Working with PDF

Posted: 2014-05-02T17:39:07-07:00
by fmw42
I tested under IM 6.7.5.5 but it does not show group4. That is as far back as I can go.

convert logo: logo.pdf
im6755update convert -density 300 -threshold 90% -monochrome -compress group4 logo.pdf logo2.pdf

Re: Group4 Compression not Working with PDF

Posted: 2014-05-03T00:17:56-07:00
by dlemstra
PDF does support writing the images in a PDF with group4 compression. The following command should work on Windows:

convert -density 300 input.pdf -threshold 90% -compress group4 output.pdf

Identify will not report the compression because it is not being read by the PDF coder. If you open the PDF in a text editor you should see this:

/Filter [ /CCITTFaxDecode ]

What do you mean by "the monochrome filter is processed before the threshold filter"? The -monochrome option only tells the PDF reader to use specific Ghostscript options. If you want a grayscale image, you should use the option -type Grayscale.
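
The text-editor check can also be scripted. This is a rough sketch, assuming the PDF stores its image dictionaries as plain text, as in the /Filter line quoted above. The helper name has_group4 is hypothetical and not part of ImageMagick.

```shell
#!/bin/sh
# has_group4 FILE.pdf
# Hypothetical helper: succeeds (exit status 0) if the file contains the
# CCITT fax filter name that marks Group3/Group4-compressed image streams.
has_group4() {
    # -a makes grep treat the binary PDF as text; -q prints nothing and
    # reports the result through the exit status only.
    grep -a -q '/CCITTFaxDecode' "$1"
}
```

For example, `has_group4 output.pdf && echo "CCITT filter found"`. A match only shows that the filter name appears somewhere in the file. It does not distinguish Group 3 from Group 4, and it would miss dictionaries stored inside compressed object streams.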