
Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T02:35:34-07:00
by jexler
Overview

I want to convert a PDF to a PNG while also trimming any white space. The resulting PNG does not display everywhere (e.g. Firefox shows an error), and I also get a warning when I try to determine the image width with the "identify" command. Since this only happens with one of about 20 similar PDFs, I suspect this might be a bug. In addition, the generated PNG is quite large, and when I formally resize it to its own width ("convert -resize <same-width>"), the copy is much smaller. This may rather be a usage issue, but I am not sure.

Step by step (URLs to the files further below)

Code:

$ convert -density 300 -flatten -trim +repage moebius.pdf PNG32:moebius.png
$ ls -l moebius.pdf moebius.png
-rw-r--r--@ 1 alain  staff  4409825 Feb 23 16:01 moebius.pdf
-rw-r--r--@ 1 alain  staff  7787908 Feb 24 09:34 moebius.png
=> No errors or warnings, creates a 7.8 MB moebius.png
That moebius.png opens fine in Photoshop (CS5.1) and in the Safari and Chrome browsers, but in Firefox (current, 65.0.1) I get the error "The image [url] cannot be displayed because it contains errors."
When I open it in Apple's Preview application, it cuts off exactly 11 pixels at the right border of the image.
(After I click anywhere in the image, it also shows these 11 pixels until I close the image again.)

Code:

$ identify -format "%w" moebius.png
identify: zTXt: truncated `moebius.png' @ warning/png.c/MagickPNGWarningHandler/1744.
1012
I get the correct width (1012, with no text cut off), but also a warning.
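The width can also be read straight from the PNG header, without going through "identify" at all. The following is a minimal sketch, assuming a well-formed file whose first chunk is IHDR (as the PNG format requires); the function name png_dimensions is made up for illustration and is not part of ImageMagick.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk at the start of a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Bytes 8-11 hold the IHDR data length, bytes 12-15 the chunk type,
    # and bytes 16-23 the width and height as big-endian 32-bit integers.
    if data[12:16] != b"IHDR":
        raise ValueError("first chunk is not IHDR")
    width, height = struct.unpack(">II", data[16:24])
    return width, height

# Usage with a local file (the path is only an example):
# with open("moebius.png", "rb") as f:
#     print(png_dimensions(f.read(24)))
```

Only the first 24 bytes are needed, so later chunks such as zTXt are never touched.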
Then I tried this:

Code:

$ convert -resize 1012 moebius.png moebius2.png
convert: zTXt: truncated `moebius.png' @ warning/png.c/MagickPNGWarningHandler/1744.
$ ls -l moebius2.png
-rw-r--r--@ 1 alain  staff  2779622 Feb 24 09:34 moebius2.png
$ identify -format "%w" moebius.png
identify: zTXt: truncated `moebius.png' @ warning/png.c/MagickPNGWarningHandler/1744.
1012
So I get a much smaller moebius2.png, 2.8 MB instead of 7.8 MB.
This is close to the size I get when copying, pasting and exporting moebius.png in Photoshop (2.4 MB). That is roughly what I had hoped for, and I think it is also roughly what I got with some older version of ImageMagick a couple of years back.
Interestingly, moebius2.png displays without error in Firefox (and also fine in Safari and Chrome).
Behavior in Apple Preview and the warning that "identify" shows remain the same, however.

I used the current version of ImageMagick (7.0.8-9), downloaded as a tgz from this site, after I had seen the same behavior (with a different line number in the warning) with a 6.x.x version installed via MacPorts (all MacPorts versions are now uninstalled). I am on OS X 10.12.6 "Sierra".

Code:

$ convert -version | grep Version
Version: ImageMagick 7.0.8-9 Q16 x86_64 2018-08-04 https://www.imagemagick.org
URLs to the files

in: https://www.artecat.ch/jexler/moebius/moebius.pdf
out 1: https://www.artecat.ch/jexler/moebius/moebius.png
out 2: https://www.artecat.ch/jexler/moebius/moebius2.png

Questions

- Am I maybe doing something wrong or unusual? (As I said, about 20 very similar PDFs convert without warnings and open without problems in Firefox and other browsers, although all of them are larger than expected.)
- Besides the possible bug, is there a way to get a smaller result with a single call to convert?

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T03:11:11-07:00
by jexler
Correction: The large size of the PNG (moebius.png) and the warning are correlated. Converting twice does not reduce the size of the other PNGs, which I could convert from PDFs without warnings and which open without errors in Firefox. (I have not tried 100% of them, but at least the ones most similar in pixel height.)

So, the two issues seem to be closely related.

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T05:45:49-07:00
by jexler
I have run pngcheck (2.3.0, installed via macports):

Code:

$ pngcheck -vt moebius.png
File: moebius.png (7787908 bytes)
  chunk IHDR at offset 0x0000c, length 13
    1012 x 13638 image, 32-bit RGB+alpha, non-interlaced
  chunk gAMA at offset 0x00025, length 4: 0.45455
  chunk cHRM at offset 0x00035, length 32
    White x = 0.3127 y = 0.329,  Red x = 0.64 y = 0.33
    Green x = 0.3 y = 0.6,  Blue x = 0.15 y = 0.06
  chunk bKGD at offset 0x00061, length 6
    red = 0x00ff, green = 0x00ff, blue = 0x00ff
  chunk pHYs at offset 0x00073, length 9: 300x300 pixels/unit (1:1)
  chunk tIME at offset 0x00088, length 7: 24 Feb 2019 10:33:54 UTC
  chunk zTXt at offset 0x0009b, length 5008274, keyword: Raw profile type xmp
    (compressed zTXt text)
  chunk IDAT at offset 0x4c6c39, length 32768
    zlib: deflated, 32K window, maximum compression
  chunk IDAT at offset 0x4cec45, length 32768
  [... quite a few IDAT chunks of length 32768 ...]
  chunk IDAT at offset 0x767029, length 25797
  chunk tEXt at offset 0x76d4fa, length 37, keyword: date:create
    2019-02-24T09:33:54+01:00
  chunk tEXt at offset 0x76d52b, length 37, keyword: date:modify
    2019-02-24T09:33:54+01:00
  chunk tEXt at offset 0x76d55c, length 20, keyword: pdf:Version
    PDF-1.5 
  chunk IEND at offset 0x76d57c, length 0
No errors detected in moebius.png (96 chunks, 85.9% compression).
There is a 5 MB chunk of type "zTXt" with the keyword "Raw profile type xmp" (compressed text).
That would correspond to "zTXt" in the warning and explain the size difference, 7.8 MB - 5 MB = 2.8 MB.
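The size estimate can be made exact with the byte counts from the ls and pngcheck output above. This is a small worked calculation; the only addition is the 12 bytes of framing that the PNG format puts around every chunk (4-byte length, 4-byte type, 4-byte CRC).

```python
# Sizes taken from the ls and pngcheck output in this thread.
FILE_SIZE = 7_787_908         # moebius.png, in bytes
ZTXT_DATA_LENGTH = 5_008_274  # zTXt data length reported by pngcheck
CHUNK_OVERHEAD = 4 + 4 + 4    # length field + chunk type + CRC

size_without_ztxt = FILE_SIZE - (ZTXT_DATA_LENGTH + CHUNK_OVERHEAD)
print(size_without_ztxt)  # 2779622
```

That is the same byte count that ls reports for moebius2.png earlier in the thread.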

moebius2.png has no "zTXt" chunk. The PNGs from the other PDFs, which convert without warnings, do appear to have "zTXt" chunks; for some of the images with a large height, that chunk has a size on the order of 1-1.5 MB.
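For comparing many files, the per-chunk sizes can be summed with a few lines of code instead of reading pngcheck output by eye. This is a minimal sketch, assuming intact chunk framing; it does not verify CRCs, and the function names are invented for illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_chunks(data):
    """Yield (chunk_type, data_length) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype.decode("latin-1"), length
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC

def chunk_totals(data):
    """Sum the data lengths per chunk type, e.g. {'IHDR': 13, 'zTXt': ...}."""
    totals = {}
    for ctype, length in iter_chunks(data):
        totals[ctype] = totals.get(ctype, 0) + length
    return totals

# Usage with a local file (the path is only an example):
# with open("moebius.png", "rb") as f:
#     print(chunk_totals(f.read()))
```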

Maybe I can just remove/omit "zTXt" chunks as a workaround? I will search for such an option...

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T06:02:50-07:00
by snibgo
As you are reading a PDF file, which IM delegates to Ghostscript, you should say what version of GS you use. I use v9.19.

As you are using IM v7, I suggest you use "magick", not "convert". This requires that you use the correct command order: read the image, then process it, then write it, eg:

Code:

magick -density 300 moebius.pdf -flatten -trim +repage -crop 100x100+0+0 +repage -define png:exclude-chunk=zTXt x.png
I confirm that Firefox can't display the converted PNG file, and "identify" says:

Code:

identify.exe: zTXt: truncated `x.png' @ warning/png.c/MagickPNGWarningHandler/1665.
"-define png:exclude-chunk=zTXt" doesn't seem to remove the very large zTXt chunk. "-strip" does, and then "identify" gives no warning, and Firefox can read it.

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T06:05:08-07:00
by jexler
This worked (-define png:include-chunk=none):

Code:

convert -define png:include-chunk=none -density 300 -flatten -trim +repage moebius.pdf PNG32:moebius.png
The result then lacks some metadata, such as the creation date tEXt chunk, but that is fine for my use case.

I also tried just to exclude the zTXt chunk with -define png:exclude-chunk=zTXt, but unfortunately that still gave the same warning as without that define...

Overall, I have a workaround now...

@snibgo: Just saw your reply now, will reply shortly...

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T06:21:37-07:00
by jexler
Ghostscript 9.26:

Code:

$ gs -version
GPL Ghostscript 9.26 (2018-11-20)
Copyright (C) 2018 Artifex Software, Inc.  All rights reserved.
magick moebius.pdf -density 300 -flatten -trim +repage -define png:exclude-chunk=zTXt PNG32:moebius.png
=> 5.3 MB, warning at identify afterwards, image has too low a resolution => I guess "-density 300" has to come before moebius.pdf

magick -density 300 moebius.pdf -flatten -trim +repage -define png:exclude-chunk=zTXt PNG32:moebius.png
=> 7.8 MB, warning at identify afterwards => with magick, too, excluding just the "zTXt" chunk appears not to work (at least in the ways I tried)

magick -density 300 moebius.pdf -flatten -trim +repage -define png:include-chunk=none PNG32:moebius.png
=> 2.8 MB, no warning at identify afterwards

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T06:39:40-07:00
by snibgo
jexler wrote: I guess "-density 300" has to come before moebius.pdf
Yes. I saw my mistake, and corrected it.
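For scripted batch conversion, the argument order discussed here (settings, then read, then process, then write) can be fixed in one place. The sketch below only assembles the command that was reported to work in this thread; the function names are hypothetical, and actually running it assumes an ImageMagick 7 "magick" executable and Ghostscript are installed.

```python
import shutil
import subprocess

def build_pdf_to_png_command(pdf_path, png_path, density=300):
    """Build a magick argument list: settings, read, process, write."""
    return [
        "magick",
        "-density", str(density),  # before the PDF, so it applies when rasterizing
        pdf_path,
        "-flatten", "-trim", "+repage",
        "-define", "png:include-chunk=none",  # workaround reported in this thread
        "PNG32:" + png_path,
    ]

def convert_pdf_to_png(pdf_path, png_path, density=300):
    """Run the command; raises if magick is missing or exits non-zero."""
    if shutil.which("magick") is None:
        raise FileNotFoundError("'magick' executable not found on PATH")
    subprocess.run(build_pdf_to_png_command(pdf_path, png_path, density), check=True)

# Usage (file names are only examples):
# convert_pdf_to_png("moebius.pdf", "moebius.png")
```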

Is there a bug? "-define png:exclude-chunk=zTXt" doesn't work, and I think that is a bug.

Writing a png with that chunk, "identify" gives just a warning, but Firefox can't read the file. I'm not sure if IM has a bug (making a bad png), or Firefox has a bug (unable to read large zTXt chunks). I suspect the problem is in Firefox.

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T14:35:46-07:00
by jexler
I took the liberty of filing a bug report with Mozilla, essentially asking for feedback or whether there is an easy way to find out why Firefox considers the PNG not OK. (I also asked if this might be a security measure against attacks using images with very large non-image chunks.)

https://bugzilla.mozilla.org/show_bug.cgi?id=1530222

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T21:37:30-07:00
by jexler
There's a 4'000'000-byte limit in libpng on "not idat/fdat chunks", which Chrome currently avoids by patching libpng: https://bugzilla.mozilla.org/show_bug.cgi?id=1530222#c2

So, for all that it appears the generated PNG with 5 MB zTXt chunk is OK, no bug in that regard in ImageMagick.

What remains is the issue that excluding just the zTXt chunk did not work. (That looks like a bug to me, too.)
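Until that is sorted out, a single chunk type can also be removed after the fact, outside ImageMagick. The following is a rough sketch of that idea, not an ImageMagick feature: it copies every chunk verbatim except the unwanted ancillary ones, which is possible because each PNG chunk carries its own CRC. The function name drop_chunks is invented for illustration.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def drop_chunks(data, unwanted=(b"zTXt",)):
    """Return a copy of a PNG byte string without the chunk types in `unwanted`."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Critical chunks (uppercase first letter, e.g. IHDR, IDAT) must stay.
    if any(ctype[:1].isupper() for ctype in unwanted):
        raise ValueError("refusing to drop a critical chunk")
    out = bytearray(PNG_SIGNATURE)
    pos = 8
    while pos + 12 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        end = pos + 12 + length  # 4 length + 4 type + data + 4 CRC
        if ctype not in unwanted:
            out += data[pos:end]  # copied verbatim, so the stored CRC stays valid
        pos = end
    return bytes(out)

# Usage (file names are only examples):
# with open("moebius.png", "rb") as src, open("moebius-noztxt.png", "wb") as dst:
#     dst.write(drop_chunks(src.read()))
```

Unlike "-strip" or png:include-chunk=none, this keeps the other metadata chunks such as tEXt.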

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-24T21:46:30-07:00
by jexler
jexler wrote: 2019-02-24T21:37:30-07:00 So, for all that it appears the generated PNG with 5 MB zTXt chunk is OK, no bug in that regard in ImageMagick.
Except maybe the "truncated" warning. I guess there is no limit on the zTXt chunk size in RFC 2083?
https://tools.ietf.org/html/rfc2083#page-27
https://tools.ietf.org/html/rfc2083#page-64
(Or is there maybe still something wrong with the data in zTXt chunk?)

Let me leave it at that from my side. I am not a specialist in image formats, and excluding all chunks works fine for my use case.

PS: If the warning makes sense, then I guess it would ideally already be displayed at the first convert from PDF to PNG?

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-02-25T03:47:28-07:00
by snibgo
Thanks for your research.

IM uses libpng. I suppose when IM reads the large zTXt, libpng truncates it to 4 Mbytes. IM notices this and reports it as a warning.

But when Firefox does the same, it decides the image may be corrupt so doesn't display it.

Perhaps libpng needs fixing. If zTXt can validly contain 5MB, libpng should be able to read it.
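To find out in advance which files would run into such a limit, the chunk headers can be scanned for oversized non-image chunks. This is only an approximation under stated assumptions: the 4,000,000-byte figure is the one quoted in this thread rather than a value read from libpng, the check compares the stored (compressed) chunk length, and the real limit may apply to allocated or decompressed size instead. The function name is hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
ASSUMED_LIMIT = 4_000_000  # bytes; figure quoted in this thread, an assumption

def oversized_chunks(data, limit=ASSUMED_LIMIT, exempt=(b"IDAT", b"fdAT")):
    """Return [(chunk_type, length)] for non-exempt chunks longer than `limit`."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    found = []
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        if ctype not in exempt and length > limit:
            found.append((ctype.decode("latin-1"), length))
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    return found

# Usage with a local file (the path is only an example):
# with open("moebius.png", "rb") as f:
#     print(oversized_chunks(f.read()))
```

With the chunk sizes from the pngcheck listing above, the 5008274-byte zTXt chunk would be the only one flagged.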

Re: Possible Bug: PDF to PNG "identify: zTXt: truncated"

Posted: 2019-03-01T13:33:48-07:00
by jexler
The guys at Mozilla noted that the max chunk size is configurable in libpng and fixed it in their code with a call to png_set_chunk_malloc_max(...):

https://hg.mozilla.org/integration/mozi ... 78699bdf7b