
Convert images to PDF, remove transparency?

Posted: 2016-10-27T18:53:59-07:00
by mhulse
Hello,

I am using these commands to create a PDF from a directory full of JPGs and PNG32s:

Code: Select all

convert \
${args.inputFiles} \
-quality 100 \
-density 300x300 \
-compress jpeg \
-units PixelsPerInch \
-background white \
-alpha remove \
+repage \
${args.outputPdf}
For some reason, the system I am uploading the PDF to thinks there's transparency.

My goal is to create a print-quality PDF with no transparency.

Can anyone suggest a more robust set of commands to do this? If I remember correctly, I tried using -flatten but I ended up getting a single page PDF.

Thanks!

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-27T19:28:36-07:00
by fmw42
What is your IM version and platform? Please always provide that information when asking questions. See viewtopic.php?f=1&t=9620

Can you post one of your PNG files? Do they have transparency? Does it happen with JPG input images?

I am not sure, but I do not think PDFs allow JPG compression, nor quality, but I could be wrong. When I do your command on a jpg file, the result does have a fully opaque alpha channel. Also the compression shows up as undefined. So that seems to confirm that PDF does not permit JPG compression.

But adding -alpha off in the following does not remove the alpha channel.

Code: Select all

convert \
lena.jpg \
+repage \
-density 300x300 \
-units PixelsPerInch \
-alpha off \
tmp.pdf
Even piping the result into a second convert that writes a new PDF fails to remove the alpha channel.

Code: Select all

convert \
lena.jpg \
+repage \
-density 300x300 \
-units PixelsPerInch \
-alpha off \
miff:- |\
convert - \
-alpha off \
tmp.pdf
So I think it may be due to the sDEVICE being used for writing PDF in the delegates.xml file.

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-27T20:42:31-07:00
by mhulse
Hi fmw42, thank you so much for your help, I really appreciate it!!! :)
fmw42 wrote:What is your IM version and platform? Please always provide that information when asking questions. See viewtopic.php?f=1&t=9620
Oh, shoot! Sorry about that! I will be sure to include this information for all future posts. Thank you for the kick in the pants! :)

I'm on macOS, Sierra:

Code: Select all

Version: ImageMagick 6.9.6-2 Q16 x86_64 2016-10-11 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2016 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Features: Cipher DPC Modules
Delegates (built-in): bzlib freetype jng jpeg ltdl lzma png tiff xml zlib
fmw42 wrote:Can you post one of your PNG files? Do they have transparency? Does it happen with JPG input images?
I can't attach the originals, but I can post a few test images (one of them is completely transparent as it is how I make blank pages in the PDF):

Image

Image
fmw42 wrote:I am not sure, but I do not think PDFs allow JPG compression, nor quality, but I could be wrong. When I do your command on a jpg file, the result does have a fully opaque alpha channel. Also the compression shows up as undefined. So that seems to confirm that PDF does not permit JPG compression.
Ahhhh, interesting!

You know, I have not played with that setting on the PDF creation. I just assumed it applied to the quality of the PDF. I want the PDF to be as high quality as possible, so I'd gladly remove those commands if they're not doing anything useful. :)

Out of curiosity, how did you see the compression showing up as "undefined"? Is there a verbose mode when running the convert command (reading docs now)?
fmw42 wrote:But adding -alpha off in the following does not remove the alpha channel.

Even this fails to remove the alpha channel by piping to a new pdf.

So I think it may be due to the sDEVICE being used for writing PDF in the delegates.xml file.
Interesting!

So, I wonder if it would be best for me to not use PNGs with alpha channels?

What is the best way for me to detect if a PDF has a transparency?

Thanks a billion for your help!!!!

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-27T21:14:02-07:00
by fmw42
It is not your input image. It did not matter with JPG which does not support transparency. It is the delegate being used to write the PDF. IM uses Ghostscript for PDF. Not sure if just reading or both reading and writing. The delegates.xml file shows


<delegate decode="pdf" encode="eps" mode="bi" command="&quot;gs&quot; -q -dQUIET -dSAFER -dBATCH -dNOPAUSE -dNOPROMPT -dMaxBitmap=500000000 -dAlignToPixels=0 -dGridFitTT=2 &quot;-sDEVICE=eps2write&quot; -sPDFPassword=&quot;%a&quot; &quot;-sOutputFile=%o&quot; &quot;-f%i&quot;"/>

So it might be that one of these forces it to write the fully opaque alpha channel rather than no alpha channel.

I do not know enough about either, but you might be able to use gs to write your output directly or modify the above line in your delegates.xml file to tell it not to write the alpha channel. Sorry, I just don't know enough in this area.

You can find the compression setting and if there is an alpha channel in the verbose information for your output, via

Code: Select all

identify -verbose youroutputimage
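
As a rough sketch of how that verbose output could be filtered, the helper below reads `identify -verbose` text on stdin and reports whether any `Type:` line ends in `Alpha` (as in `TrueColorAlpha` or `PaletteAlpha`). The function name `has_alpha_type` is made up for illustration. Note the assumption: since ImageMagick rasterizes a PDF in order to read it, this most likely describes the rendered raster rather than the PDF's internal objects, so treat the answer as a hint only.

```shell
# Hypothetical helper: reads `identify -verbose` output on stdin.
# Prints "alpha" if a "Type:" line names an alpha-bearing type,
# otherwise prints "no-alpha".
has_alpha_type() {
  if grep -q -E '^[[:space:]]*Type:[[:space:]]*[A-Za-z]*Alpha[[:space:]]*$'; then
    echo "alpha"
  else
    echo "no-alpha"
  fi
}
```

It would be used as `identify -verbose youroutputimage | has_alpha_type`.
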
Best quality in a PDF is really what density you use when converting to raster formats, but it may also depend upon the density you use to convert your raster into the PDF. The PDF is a vector format that does not have a quality or size per se. But in this case you are embedding a raster image into a vector PDF shell. So both might matter.

Your compression and quality may only be relevant in defining the raster image embedded in the PDF shell. But note that PNG compression and quality are totally different from JPG. So if it matters, you should specify them differently for PNG and for JPG.

See
http://www.imagemagick.org/script/comma ... hp#quality
http://www.imagemagick.org/script/comma ... p#compress

Personally, I would not specify either for PNG. You should run a test at low quality for JPG and see if the PDF shows that low quality. If not, then quality does not matter for JPG going to PDF either.

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-27T21:29:53-07:00
by fmw42
A quick test does show that quality makes a difference for JPG

Image

Code: Select all

convert \
lena.jpg \
-quality 100 \
tmp100.pdf

Code: Select all

convert \
lena.jpg \
-quality 10 \
tmp10.pdf
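
One rough, non-visual way to double-check a test like this is to compare the sizes of the two output files. The sketch below assumes that a lower JPEG quality yields a smaller PDF, which is typical but not guaranteed. The function name `smaller_file` is hypothetical.

```shell
# Hypothetical helper: prints the name of the smaller of two files,
# or "same" when their byte counts are equal.
smaller_file() {
  # wc -c may pad its output with spaces on some systems, so strip them
  a=$(wc -c < "$1" | tr -d '[:space:]')
  b=$(wc -c < "$2" | tr -d '[:space:]')
  if [ "$a" -lt "$b" ]; then
    echo "$1"
  elif [ "$b" -lt "$a" ]; then
    echo "$2"
  else
    echo "same"
  fi
}
```

For the files above that would be `smaller_file tmp100.pdf tmp10.pdf`. If it prints `same`, the quality setting probably had no effect on the embedded image.
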

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-27T21:34:34-07:00
by fmw42
I do not see much difference in the "quality" of these two when setting the density --- both scale nicely when the window is enlarged as it should be in a PDF. I suspect the only difference would be if you extracted the two embedded images and then printed them.

Code: Select all

convert \
lena.jpg \
-quality 100 \
-density 72 \
tmp100_72.pdf

Code: Select all

convert \
lena.jpg \
-quality 100 \
-density 600 \
tmp100_600.pdf

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-27T22:11:48-07:00
by mhulse
fmw42 wrote:It is not your input image. It did not matter with JPG which does not support transparency. It is the delegate being used to write the PDF. IM uses Ghostscript for PDF. Not sure if just reading or both reading and writing. The delegates.xml file shows

So it might be that one of these forces it to write the fully opaque alpha channel rather than no alpha channel.
Ahhhh, I see. that's very interesting. Thank you for clarifying (I would have never figured this out on my own). :)
fmw42 wrote:I do not know enough about either, but you might be able to use gs to write your output directly or modify the above line in your delegates.xml file to tell it not to write the alpha channel. Sorry, I just don't know enough in this area.
No worries at all! I really appreciate all of your help!

I actually did try to learn how to use GS to create a PDF, but unfortunately, from what I could tell, GS can't make a PDF from a globbed directory of image files. With that said, I will do some more investigating. :)

fmw42 wrote:You can find the compression setting and if there is an alpha channel in the verbose information for your output, via

Code: Select all

identify -verbose youroutputimage
Interesting! I ended up running the above command on two PDFs.

ImageMagick-created (relevant info):

Code: Select all

  Format: PDF (Portable Document Format)
  Mime type: application/pdf
  Class: DirectClass
  Geometry: 621x630+0+0
  Resolution: 72x72
  Print size: 8.625x8.75
  Units: Undefined
  Type: TrueColorAlpha
  Endianess: Undefined
  Colorspace: sRGB
  Depth: 16/8-bit
  Channel depth:
    red: 8-bit
    green: 8-bit
    blue: 8-bit
    alpha: 1-bit
  Channel statistics:
    Pixels: 391230
    Red:
      min: 0 (0)
      max: 65535 (1)
      mean: 17971 (0.27422)
      standard deviation: 23984.7 (0.365984)
      kurtosis: -0.821928
      skewness: 0.926717
      entropy: 0.702296
    Green:
      min: 2570 (0.0392157)
      max: 65535 (1)
      mean: 31195.9 (0.476019)
      standard deviation: 19404.1 (0.296088)
      kurtosis: -1.2913
      skewness: 0.55737
      entropy: 0.936192
    Blue:
      min: 0 (0)
      max: 65535 (1)
      mean: 29607.2 (0.451777)
      standard deviation: 19640.6 (0.299696)
      kurtosis: -1.29059
      skewness: 0.567892
      entropy: 0.932274
    Alpha:
      min: 65535 (1)
      max: 65535 (1)
      mean: 65535 (1)
      standard deviation: 0 (0)
      kurtosis: 0
      skewness: 0
      entropy: 0
And here's a PDF where I used GS to optimize the ImageMagick-created PDF:

Code: Select all

  Format: PDF (Portable Document Format)
  Mime type: application/pdf
  Class: DirectClass
  Geometry: 621x630+0+0
  Resolution: 72x72
  Print size: 8.625x8.75
  Units: Undefined
  Type: PaletteAlpha
  Endianess: Undefined
  Colorspace: sRGB
  Depth: 16/8-bit
  Channel depth:
    red: 8-bit
    green: 8-bit
    blue: 8-bit
    alpha: 1-bit
  Channel statistics:
    Pixels: 391230
    Red:
      min: 0 (0)
      max: 65535 (1)
      mean: 50281.4 (0.767246)
      standard deviation: 18430 (0.281224)
      kurtosis: 0.456069
      skewness: -1.40546
      entropy: 0.838868
    Green:
      min: 0 (0)
      max: 65535 (1)
      mean: 50288.3 (0.76735)
      standard deviation: 18432.3 (0.281259)
      kurtosis: 0.456317
      skewness: -1.40567
      entropy: 0.83818
    Blue:
      min: 0 (0)
      max: 65535 (1)
      mean: 50280.7 (0.767234)
      standard deviation: 18429.8 (0.281221)
      kurtosis: 0.455978
      skewness: -1.40539
      entropy: 0.838893
    Alpha:
      min: 65535 (1)
      max: 65535 (1)
      mean: 65535 (1)
      standard deviation: 0 (0)
      kurtosis: 0
      skewness: 0
      entropy: 0
This is the GS command I am using to test:

Code: Select all

gs \
-dNOPAUSE \
-dBATCH \
-dQUIET \
-sDEVICE=pdfwrite \
-dAutoRotatePages=/None \
-dColorConversionStrategy=/LeaveColorUnchanged \
-dColorImageDownsampleType=/Bicubic \
-dColorImageResolution=304 \
-dCompatibilityLevel=1.4 \
-dDoThumbnails=true \
-dPreserveOverprintSettings=true \
-sOutputFile=${args.gsConvertedPdf} \
${args.imageMagickPdf}
Strange that they both say 72x72 for resolution? Also, I assume "Type: TrueColorAlpha" and "Type: PaletteAlpha" are the lines that would tell me if there's an alpha channel?
fmw42 wrote:Best quality in a PDF is really what density you use when converting to raster formats, but it may also depend upon the density you use to convert your raster into the PDF. The PDF is a vector format that does not have a quality or size per se. But in this case you are embedding a raster image into a vector PDF shell. So both might matter.
Interesting! So, that might explain why the PDF info above says 72x72 … That's the PDF resolution, but not the resolution of the images?
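
For reference, the "Print size" line in the identify output above is simply the pixel geometry divided by the reported resolution. A minimal sketch of that arithmetic follows; `print_size` is a hypothetical helper name and the numbers are taken from the output above.

```shell
# Print size in inches = pixel dimensions / density (pixels per inch).
# Arguments: width in pixels, height in pixels, density.
print_size() {
  awk -v w="$1" -v h="$2" -v d="$3" 'BEGIN { printf "%gx%g\n", w / d, h / d }'
}

# Geometry 621x630 at Resolution 72x72, as reported above
print_size 621 630 72   # prints 8.625x8.75
```

Calling it with a different density, for example `print_size 621 630 304`, shows how the same pixels would map to a much smaller printed area.
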
fmw42 wrote:Your compression and quality may only be relevant in defining the raster image embedded in the PDF shell. But note that PNG compression and quality are totally different from JPG. So if it matters, you should specify them differently for PNG and for JPG.

See
http://www.imagemagick.org/script/comma ... hp#quality
http://www.imagemagick.org/script/comma ... p#compress
Excellent tips! Thank you so much for the clarification, linkages and help! I really owe you one (er, I owe you many)! :)

Thank you!

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-27T22:17:27-07:00
by mhulse
fmw42 wrote:A quick test does show that quality makes a difference for JPG
Ah, that's good to know! Thank you for testing.
fmw42 wrote:I do not see much difference in the "quality" of these two when setting the density --- both scale nicely when the window is enlarged as it should be in a PDF. I suspect the only difference would be if you extracted the two embedded images and then printed them.

Code: Select all

convert \
lena.jpg \
-quality 100 \
-density 72 \
tmp100_72.pdf

Code: Select all

convert \
lena.jpg \
-quality 100 \
-density 600 \
tmp100_600.pdf
Interesting! I did check images in my PDF by opening in Photoshop. They got extracted with a resolution of 304 (my input density when making the images using ImageMagick).

This is very interesting! I'll keep experimenting and reading the docs. Thank you so much for your help fmw42! This is of great help to me. :)

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-27T22:33:07-07:00
by mhulse
This is interesting news:

http://stackoverflow.com/a/31593921/922323
Use Ghostscript with the pdfwrite device to produce a new PDF based on the input PDF. Set CompatibilityLevel to 1.3 so that transparency will be 'flattened' (ie rendered to bitmap).

Or use one of the rendering devices to produce a bitmap (eg JPEG).

An appropriate command could be as simple as this:

gs -o out.pdf \
-sDEVICE=pdfwrite \
-dCompatibilityLevel=1.3 \
in.pdf
So, I guess that makes sense: PDF 1.3 has no transparency. I assumed the same to be true for 1.4 (the version that I assumed IM outputs). I also assumed there was a PDFA 1.4 (A for alpha). Though, Googling for "PDFA 1.4" does not yield anything useful, so maybe I am confused about PDF versioning. :D

If worse comes to worst, maybe I can just use GS to convert the IM PDF and peg the PDF version to 1.3, thus force-removing the alpha purely by changing the PDF version?
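
A minimal sketch of what that fallback could look like as a small wrapper, using only the Ghostscript options quoted above (`-o`, `-sDEVICE=pdfwrite`, `-dCompatibilityLevel`). The function name `flatten_pdf` and the `GS` override variable are hypothetical. Whether this actually satisfies the upload system's transparency check is untested here.

```shell
# Hypothetical wrapper: rewrite a PDF through Ghostscript's pdfwrite
# device at a given compatibility level (default 1.3).
# Usage: flatten_pdf input.pdf output.pdf [level]
flatten_pdf() {
  src=$1
  dst=$2
  level=${3:-1.3}
  # Refuse to read and write the same file in one pass
  if [ "$src" = "$dst" ]; then
    echo "flatten_pdf: input and output must be different files" >&2
    return 1
  fi
  # GS may be set to point at a different Ghostscript binary
  "${GS:-gs}" -o "$dst" -sDEVICE=pdfwrite "-dCompatibilityLevel=$level" "$src"
}
```

For example, `flatten_pdf im.pdf im-1.3.pdf` would write a PDF 1.3 copy next to the ImageMagick output.
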

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-28T00:23:56-07:00
by fmw42
This page talks about PDF (reading) == http://www.imagemagick.org/script/formats.php

Writing to PDF may be done internally to IM. It is not clear to me that IM uses GS to write PDF, but that could be the case. There may be a way to modify the delegates.xml file to force it to write to PDF 1.3?

Ghostscript or IM may be using pdfwrite to save the output. You might look into that.

One of the other users or IM developers may be able to point you in the right direction.

Re: Convert images to PDF, remove transparency?

Posted: 2016-10-28T00:36:27-07:00
by snibgo
I'm fairly certain that IM doesn't use Ghostscript to write PDFs.