Image size can greatly increase when alpha is on (PNG, JP2)

Post by **Misaki** » 2015-11-07T23:04:53-07:00

This might be a bug in libraries that are managed by other people. Even when the alpha channel of a PNG or JPEG-2000 image is fully opaque, it can greatly increase the file's size.

My imagemagick is 6.7.7-10 2014-08-21 Q16, the default installation from Ubuntu 14.10. (I know it's not the latest version of IM, and this version of Ubuntu no longer gets security updates.)

Some examples:

This image was made to test and illustrate colour distortion when displaying images in Eye of Gnome: the transition from black to almost black should be imperceptible, but it is obvious when zoomed in.

identify -verbose reports this image to be PaletteAlpha, but exiftool says it's RGB with Alpha. It was produced by saving from GIMP, after adding an alpha channel. The no alpha output from GIMP is 21.5 kB, the alpha output from GIMP is 23.6 kB, the output from IM with 'png32:' prefixed to the filename is 25.3 kB, and the output with png24: prefixed is 23.4 kB.

identify reports that these are still PaletteAlpha and Palette, but exiftool says they're RGB with Alpha and RGB. The default is an actual palette output, which is 17.5 kB (IM doesn't report it as PaletteAlpha maybe because no pixels are transparent). Adding '-type TrueColor[Matte]' as suggested here results in the same 17.5 kB file, maybe there is an option somewhere to avoid optimizing the type but luckily using the prefix worked, even though 'identify' doesn't correctly report it.

So the file with alpha channel is 1.5~2 kB larger. In comparison, a completely white image of the same size is 1.4 kB when output from GIMP, ~400 bytes when converted with IM, and 192 bytes when converted with IM with -strip applied. So there's a difference, but not very large for this file.

Imgur doesn't host jp2 files, but my results are that the jp2 file with no alpha channel is 28.7 kB, while the jp2 with an alpha channel is 38.9 kB. (Eye of Gnome doesn't correctly display jpeg-2000 with alpha but that's another issue.) The alpha was turned off using "-alpha off", which is not the correct way to do it in the latest version of IM but works in my version.

Another image:

(Some of the passengers on a recent plane crash.)

The original is jpg. When saved from GIMP as PNG with an alpha channel, it's 311 kB. With PNG and no alpha, it's 282 kB. It actually has fewer pixels than the first image. When converted in IM with no alpha channel, it's 284 kB; with an alpha channel, it's 314 kB. pngcrush reduces these to 281 kB and 309 kB. (Have seen a case where IM > pngcrush reduced by more than pngcrush alone, even with brute force option... reordering palette or something)

For jpeg-2000, with alpha it's 291 kB. Without, it's 196 kB.

It seems likely that the alpha channel is interfering with the encoding of the other channels, instead of being encoded separately. But all programs seem affected by this: pngcrush, imagemagick, and GIMP. Even with ffmpeg, the image with alpha has a size of 475 kB, while no alpha is 454 kB. (pngcrush reduces these to the same size as the crushed output from IM.)

With recent images I was looking at, one went from about 2.6 MB as output from IM with alpha, to 2.2 MB with alpha turned off.

Is this just a 'feature' of the PNG format, that the alpha channel is encoded together with other channels and disrupts their patterns when it isn't correlated with them?

Post by **snibgo** » 2015-11-08T00:52:29-07:00

Misaki wrote: ... even though 'identify' doesn't correctly report it.

"identify" reports what the data looks like after it has been read, decoded and stored in memory. It does not report what the data looks like in the file. To see that information, exiftool is useful.

Your results show that storing alpha takes more space than not storing alpha. Does this surprise you? Unless I miss something, your results show the space for alpha is generally less than one-third on top of the space for three colour channels. Which is not surprising when the alpha of all the pixels is the same.
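As a rough check of that "one-third" figure, the sizes quoted in the first post can be turned into overhead fractions. This is only a sketch: the dictionary labels are descriptive names made up here, and the numbers are the rounded kB values from the post.

```python
# Sizes in kB as quoted in the first post: (without alpha, with alpha).
# The labels are just descriptive names chosen for this sketch.
sizes_kb = {
    "test image, PNG from GIMP": (21.5, 23.6),
    "test image, PNG from IM (png24 vs png32)": (23.4, 25.3),
    "test image, JP2": (28.7, 38.9),
    "photo, PNG from GIMP": (282, 311),
    "photo, PNG from IM": (284, 314),
    "photo, JP2": (196, 291),
}

def alpha_overhead(no_alpha, with_alpha):
    """Extra size from the alpha channel, as a fraction of the no-alpha size."""
    return (with_alpha - no_alpha) / no_alpha

overheads = {name: alpha_overhead(*pair) for name, pair in sizes_kb.items()}

for name, frac in overheads.items():
    marker = "above" if frac > 1 / 3 else "below"
    print(f"{name}: +{frac:.1%} ({marker} one-third)")
```

On these figures the PNG cases come out at roughly 8 to 11 percent, while the two JP2 cases come out above one-third.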

I'll leave the PNG questions to someone else.

Post by **fmw42** » 2015-11-08T01:26:53-07:00

PNG8: means one palette or palettealpha channel, PNG24: means 3 channels (rgb), PNG32: means 4 channels (rgba)

see
http://www.imagemagick.org/Usage/formats/#png_formats

file sizes vary according to the number of channels

Post by **glennrp** » 2015-11-08T05:52:56-07:00

fmw42 wrote: PNG8: means one palette or palettealpha channel

Also, to ImageMagick, PNG8 means binary transparency. The palette has only one fully transparent entry and the rest of the colors are fully opaque. If there are multiple transparent colors, or if semitransparency is present, it's simply called an "indexed-color PNG", as in the PNG specification. The names PNG8, PNG24, PNG32, etc., are not defined or used in the PNG spec.

Post by **Misaki** » 2015-11-08T19:25:23-07:00

Doesn't IM always store pixels at a value specific to the build? For Q16, 16 bits per channel per pixel, regardless of whether the source image was an 8-bit palette or a 1-bit bilevel PNG. It may be that Imagemagick is reading a PNG that has 256 or fewer colors, converting it to palette automatically, and then displaying this processed result, but it seems unnecessary to do this before output and that's why it seems like a bug (that is, I am skeptical that IM converts to palette in memory automatically). Palette is obviously not suitable for things like resizing when pixels are resampled.

It is reasonable that adding an alpha channel increases an image's size. But there is no reason for the increase to be larger than the size of a grayscale or palette image that represents the alpha channel, if it were to be combined separately with the image. More generally, a completely white, rectangular image or channel is easy to describe, and should not greatly increase a file's size.

I tried to check the size of a PNG48 file, but my IM version might not be able to output it, lol. "The PNG48, PNG64 and PNG00 styles were added as of IM v6.8.2-0". I tried using -depth 16, which works for jp2, but maybe it was automatically optimized down to PNG24. The 'difference of differences' encoding mode of PNG could mean that PNG48 is close to being the same size as png24, since the differences between pixels will still be divisible by 256 (or whatever), I think? Reducing bit depth isn't quite the same because the range represents the brightest and darkest, so the step size if converted back to 0~255 isn't an integer multiple (off topic: I accidentally closed this window with Ctrl-W, and when it was reopened, this reply hadn't been cleared, nice) which means a linear increase of value would no longer be linear. Maybe someone else could test PNG48 size though, as a way to show that even if the maximum possible number of bits per pixel increases (whether from bit depth or channel count), the output size does not have to.
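A small sketch of the scaling arithmetic behind the PNG48 guess above. It assumes 8-bit samples are widened to 16 bits by multiplying by 257 (65535 / 255), which has not been checked against this IM version; the function names are made up for illustration. Under that assumption each 16-bit sample is the original byte repeated, so differences between samples are multiples of 257 rather than 256, and the round trip back to 8 bits is exact. The last part shows a depth where the step size is not an integer.

```python
# Assumption: 8-bit samples are widened to 16 bits by multiplying by 257
# (65535 / 255), which is the same as repeating the byte.
def widen_8_to_16(v):
    return v * 257

def narrow_16_to_8(v16):
    return v16 // 257

# Every widened sample consists of two identical bytes...
replicated = all(widen_8_to_16(v) == (v << 8) | v for v in range(256))
# ...and the round trip back to 8 bits is exact.
round_trip = all(narrow_16_to_8(widen_8_to_16(v)) == v for v in range(256))
print(replicated, round_trip)

# By contrast, 255 is not a multiple of 31, so 5-bit samples cannot map
# onto evenly spaced 8-bit values: the gaps between neighbours vary.
steps_5bit = [round(v * 255 / 31) for v in range(32)]
gaps = {b - a for a, b in zip(steps_5bit, steps_5bit[1:])}
print(sorted(gaps))
```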

I appreciate the replies so far, but no one has acknowledged that inefficient encoding of alpha channels is a problem, much less confirmed that it exists in their software versions or said how the problem could be fixed.

I accidentally discovered that imagemagick has the -interlace option. This increased file size for all PNG images I tested though, except maybe a completely white image, for which the noninterlaced version is 4,794 bytes while interlaced is 4,737 bytes. I found no changes between different interlace types for any output format in my IM version. But if it was possible to keep the RGB values non-interlaced, and then use 'Plane' interlacing for the alpha channel, this would reduce file size for most large files with an alpha channel. Often, an alpha channel is included when it isn't needed (due to being completely white), and even for large files that do make use of their alpha channel, it will probably be an outline around some object, and would be the same size or smaller with 'Plane' interlacing than if it were noninterlaced.

However, I don't know if the PNG specification allows for the RGB channels to be noninterlaced while having the alpha channel interlaced.

I tried testing to see if the webp format would be smaller without an alpha channel, but it seems that only the lossless webp files have one, and even if the input file for a lossless webp doesn't have an alpha channel, ffmpeg still reports that the webp uses the pixel format (or whatever it's technically called) of argb.

Specifically, using this image:

(Click for high-resolution version, source: US Government via Wikipedia)

convert '/home/misaki/stolen & temp/UGM-109_hits_target_on_San_Clemente_Island_1986.jpg' png24:output-png24.png

size: 5885 kB

convert '/home/misaki/stolen & temp/UGM-109_hits_target_on_San_Clemente_Island_1986.jpg' png32:output-png32.png

size: 6552 kB

cwebp '/tmp/output-png24.png' -o '/tmp/output-png24.webp'
cwebp '/tmp/output-png32.png' -o '/tmp/output-png32.webp'

Identical, size: 4413 kB

(PNG24 with -interlace PNG is 7887 kB, jp2 at -quality 100 is 4779 kB, jp2 from the png32 with alpha channel is 5579 kB)

I tried converting the png32 with alpha to jpeg network graphics format .jng, which supposedly has alpha support, but it doesn't seem to have retained the alpha channel as seen when converting back to png.

The only difference between png24 and png32 is the alpha channel. This alpha channel increased the image size by 667,485 bytes. Extracting the alpha channel confirms that it is pure white, and IM outputs it as a 4.8 kB PNG (GIMP at 24 kB, ffmpeg at 11 kB). In other formats, this completely white image is 38 kB for jpg, and 309 bytes for jp2. (311 bytes if -depth 16 operation is performed, identify reports "Depth: 16/1-bit Channel depth: gray: 1-bit", Histogram: 6470000: (65535,65535,65535) #FFFFFFFFFFFF rgb(255,255,255) )

That is a large difference: 4.8 kB is what you would expect if IM encoded the alpha channel as a separate plane, compared with the 667 kB it actually added.
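For a sense of scale, here is a minimal sketch that deflates a fully opaque plane of the same pixel count with Python's zlib. It is not what IM or libpng does: there is no PNG row filtering and no chunk overhead, and one byte per pixel is assumed.

```python
import zlib

# Pixel count taken from the histogram line quoted above (6470000 pixels).
PIXELS = 6_470_000

# One byte per pixel, all 255: a fully opaque 8-bit alpha plane.
alpha_plane = bytes([255]) * PIXELS

compressed = zlib.compress(alpha_plane, 9)
print(f"raw: {len(alpha_plane)} bytes, deflated: {len(compressed)} bytes")
```

The deflated size should come out at a few kilobytes, the same order of magnitude as the 4.8 kB PNG and far below 667 kB.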

Post by **snibgo** » 2015-11-08T20:14:34-07:00

I know practically nothing about the 99 compression algorithms IM can use for PNG, but I think the following are true:

1. The best-compression algorithms for graphics (solid colours) do not give the best compressions for photos (noisy colours), and vice versa.

2. The same algorithm is used for all channels.

Thus, for some images, the best space-saver would use different algorithms for alpha and the colour channels. IM has no facility for this. I don't know if the PNG format has a facility for this.

Post by **Misaki** » 2015-11-08T20:39:37-07:00

I know a bit, but basically if people confirm that the problem exists, someone would still need to have the motivation to fix it. This problem does not affect most people. If a large PNG file has an unnecessary alpha channel, converting to JPG will automatically remove it. If converting to JP2, it's necessary to add an extra operation, which requires knowledge, but knowing how to use IM at all also requires knowledge. I think I can only remember one case where I had a PNG image where the transparency was actually useful and I wanted to convert it to JP2, but didn't because Eye of Gnome can't display JPEG-2000 with transparency correctly (along with this other problem of alpha greatly increasing size of JP2 files, even if it isn't used for anything). So I just left it as PNG or maybe converted it from 16-bit-depth PNG to 8-bit.

There are multiple compression methods not only for a balance of speed and size, but also because it's a general rule that compression effectiveness is not predictable; if you compress every possible combination of 32 bits (so 4 billion combinations in all?), the outcomes must all be unique and if some of the results are smaller than 32 bits, others are larger even before adding overhead.
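The counting argument can be written out directly. A minimal sketch (the function name is made up for illustration): there are 2^32 distinct 32-bit inputs, but only 2^32 - 1 distinct bit strings shorter than 32 bits, so a lossless compressor cannot map every input to a shorter output.

```python
def strings_shorter_than(n_bits):
    """Count all bit strings of length 0 .. n_bits - 1."""
    return sum(2 ** k for k in range(n_bits))

n = 32
inputs = 2 ** n                       # distinct 32-bit inputs
shorter = strings_shorter_than(n)     # distinct outputs shorter than 32 bits

# Prints the two counts and the shortfall between them.
print(inputs, shorter, inputs - shorter)
```

Since decompression has to recover each input uniquely, at least one input must stay at 32 bits or grow, and the more inputs that shrink by a lot, the more that must not shrink at all.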

But all common compressed formats encode long strings of the same value efficiently, even if it's basically the same encoding used by the Zip format or whatever. When one PNG method is better than another, it's because of things like encoding a row efficiently means the columns are encoded inefficiently, or something.

Oh well, I just tried something but the results weren't as I expected. Using '-interlace png', or '-interlace plane' (doesn't matter, line and plane give same result) on the recent image with alpha resulted in a significantly larger file than without alpha. I have a vague understanding of what some of the values for zs and fm mentioned here could be, but not really; so I tried setting them all to one and compared. The -interlace line, plane, and png options are all the same: interlacing still increases the file size significantly, and turning off alpha still decreases file size significantly. (8.9 MB interlaced with alpha off, 10.3 MB interlaced with alpha on.)

So probably, PNG is not interlacing by plane at all, but by line or something.

Post by **snibgo** » 2015-11-08T21:49:37-07:00

Misaki wrote: I know a bit, but basically if people confirm that the problem exists, someone would still need to have the motivation to fix it.

If the problem you want solving is that IM doesn't always make the smallest file possible then, yes, I agree that problem exists. Making the smallest possible output takes far more time, and writing a PNG file from a photo is already slow. If IM contained a flag that said "run pngcrush on the output", I'd be happy. If it always did this, I'd be unhappy, because most of the time I don't care how large the PNG file is.

If you have a patch that improved IM's PNG compression, without adding time, I'd be happy.

You may be interested in time and space trials I've done for PNG and some other formats: http://im.snibgo.com/spdsiz.htm

Post by **Misaki** » 2015-11-08T22:19:09-07:00

If it's possible for PNG to encode alpha as a separate plane, then a simple test could determine whether this should be done. One issue is that while you might have an image where alpha changes with anti-aliasing at the border of a figure, and the image also changes at that point and has a background of the same colour for all fully-transparent pixels, changes to the image at that edge will probably not exactly match changes to the alpha channel. I was thinking of doing a test of this (like slightly blurring an image, subtracting from original, then finding size out of output) but it's probably not worth it. In the past I tried subtracting a jpg encoding from an image (then adding 128 to values), and the resulting image, which looked totally gray but had tiny variations, was about the same size as the original PNG.

So it's possible that for typical images, where the alpha channel is generated manually by a human and is thus likely to be simple, there is no benefit in making the image's and the alpha channel's encoding interact at all. With some cases it would be worse, with others it would be better, but the average benefit would be small, and would not justify either the time taken to test for it or the penalty incurred when no test is done.

That is, maybe you could assume that an alpha channel is mostly flat, and when it changes the image will also change, though you don't know the direction. Maybe it's a white shape on a transparent black background, or a black shape on a transparent white background. In other cases, like with the moon used in Imagemagick examples, the underlying image doesn't change in any predictable way when alpha changes. If you efficiently specify the direction of change, using the alpha as a 'base' could maybe improve the image's compression; but this is more likely for a lossy encoding format that can be 'good enough', and PNG isn't lossy.

For the few cases where the change is totally predictable, and there is no anti-aliasing of alpha, it's already possible to use PNG24 with some colours being specified as transparent.

But if a test were done, it would probably be something like "How large is the alpha channel if you do a very fast compression attempt on it? If it's larger than a certain value for the image size, assume it should be encoded together with the other channels even if this guess is wrong."
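The test described above could look something like the following. This is a hypothetical sketch, not anything IM does: the function name and the 5% threshold are made up, and zlib level 1 stands in for "a very fast compression attempt".

```python
import os
import zlib

def alpha_looks_simple(alpha_plane, max_ratio=0.05):
    """Hypothetical heuristic: True if a fast deflate pass shrinks the alpha
    plane to at most max_ratio of its raw size."""
    if not alpha_plane:
        return True
    fast = zlib.compress(alpha_plane, 1)  # level 1 = fastest real compression
    return len(fast) / len(alpha_plane) <= max_ratio

width, height = 256, 256
opaque = bytes([255]) * (width * height)   # flat, fully opaque alpha
noisy = os.urandom(width * height)         # worst case: incompressible alpha

print(alpha_looks_simple(opaque))
print(alpha_looks_simple(noisy))
```

A flat alpha plane passes the check and would be a candidate for separate encoding; a noisy one fails it and would be left interleaved with the colour channels.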

But mostly, I don't know if other software versions have this problem. My version of IM is from June 2012, almost three years old! I also don't know if there are more recent, or better versions of the libraries that handle JPEG-2000 or PNG images.

Post by **Misaki** » 2015-11-08T23:23:15-07:00

snibgo wrote: You may be interested in time and space trials I've done for PNG and some other formats: http://im.snibgo.com/spdsiz.htm

The PNG chart definitely looks useful for someone who frequently saves PNG images. As a sort of contrast though, this is for IM; I also sometimes use GIMP, which has a 'Compression' slider when saving PNG files from 0 to 9. I think this corresponds to the vertical values in a column in your chart, since the first value is named compression-level and, other than the default of 00 and -quality 20, time generally increases as you descend. Other than the 'x5' column (compression-filter=5?) there aren't large changes in size, so this chart might tell you that you could save time by using less compression in GIMP but you might not be able to control the compression-filter ("fm") in that program.

A random search result says that some compression strategies are better or worse for photographic images, which would be more useful information if software would automatically identify such images when determining compression strategy. I just skimmed a few bits of it~

The "good" highlighting doesn't seem to work. It shows up in the source, but there is no highlighting on my browser. RMSE difference for jpg files are all at 0 in the first chart, and all have the same value in the second chart.

Video uses 'post-processing' to deal with artifacts similar to those in jpeg, the deblock loop filter thing in h264. It would probably increase quality for jpeg images for diagram-type images, and could be applied by viewing software. In the x264 video encoder for h264 codec, the "animation" preset (cartoon-style animation, not 3D graphics) increases the strength of the deblocking filter, which decreases artifacts around sharp edges.

BPG (other) probably uses these smoothing mechanisms, which avoid smoothing if the difference between pixels is above certain values, and did well in a comparison study by Mozilla. I think it has a lossless form but they didn't test speed, just quality. A lot of the compression methods you tried didn't work in those formats; when I tried using "-compress" with jpg and png files they didn't do anything. I have never used any of those formats other than JPG and PNG, but I think World of Warcraft screenshots used to be saved as tiff.

The reason jpg files did better with -quality ≥ 90 in your study is that when the input is RGB, Imagemagick selects -sampling-factor 1x1 if quality is manually set to ≥ 90. The default from RGB input is quality 92, sampling-factor 2x2 (same as 4:2:0). Using 2x2 chroma subsampling (or others; sometimes I have selected 2x1, but 1x2 seemed like it might have introduced more artifacts for some reason maybe) seems sort of like a habit from when video signals were sent uncompressed using radio waves. The x264 video codec increases the bits allocated to chroma channels when using yuv420 format, and decreases them when using yuv444 format (1x1 chroma pixel things), so that despite yuv444 having four times as many chroma samples, the bitrate is about the same when other quality settings are kept constant. (It also lets you increase quality separately, mapped to '-chromaoffset' in ffmpeg, but this could interact better with bitrate and quantizer limits as could other modifiers like a setting that affects sharp edges vs flat areas.) But a larger advance will be in selectively decreasing quality for 'uninteresting' parts of a video frame, like backgrounds.
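The "four times as many chroma samples" figure follows from the subsampling factors. A minimal sketch of the sample counts, assuming a 1920x1080 frame (the frame size and the function name are chosen only for illustration):

```python
def samples_per_frame(width, height, h_sub, v_sub):
    """Return (luma, chroma) sample counts for one frame.

    h_sub and v_sub are the horizontal and vertical chroma subsampling
    factors: 1, 1 for 4:4:4 and 2, 2 for 4:2:0.
    """
    luma = width * height
    # Two chroma planes (Cb and Cr), each reduced by the subsampling factors.
    # Odd dimensions round up.
    chroma_w = -(-width // h_sub)
    chroma_h = -(-height // v_sub)
    return luma, 2 * chroma_w * chroma_h

luma_444, chroma_444 = samples_per_frame(1920, 1080, 1, 1)
luma_420, chroma_420 = samples_per_frame(1920, 1080, 2, 2)

print(chroma_444 / chroma_420)                              # chroma ratio
print((luma_444 + chroma_444) / (luma_420 + chroma_420))    # total ratio
```

So 1x1 sampling carries four times the chroma samples but only twice the total samples of 2x2, because the luma plane is the same size in both.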

The size differences between PNG and uncompressed formats for the photo would be larger if comparing 8-bit files. Harder to find patterns with 16-bit channels. This page says normal JPG can be 12-bits, which with gamma colorspaces or whatever, would let you save intermediate photos as JPG at much smaller file sizes without any effect on final quality. I was reading people arguing (via) and they at least agree that 12 bits is enough, if a 12-bit JPG still had two bits of 'noise' it would still be better than the 8-bit images found on the web. Other PNG compression methods are also only slightly slower than the fastest, but offer significantly better compression. For example, you select TIFF using ZIP for long-term storage, which takes 14.6 seconds for 77% output size. But PNG -quality 04 takes 4.1 seconds for 77.8% output size, and -quality 03 takes 3.1 seconds for 79% output size. Using -quality 11 to 14, instead of 01 to 04, just takes longer on the photo, but with the diagram it makes the output size for PNG comparable to that of TIFF with ZIP, while still being significantly faster and of similar size to TIFF with ZIP for the photo.

Post by **Misaki** » 2015-11-09T02:09:12-07:00

Another example of how it should work. Taking the most recent photo (missile above a plane), resizing it to x500 using '-resize x500' and adding alpha with '-resize x500 -alpha opaque', and uploading the two resulting images to BPG Web Encoder:

output-x500.png
569.73 KB BPG size : 17.65 KB

output-x500-alpha.png
662.03 KB BPG size : 17.81 KB

At this quality level, much of the 'grain' or detailed noise is no longer visible, but there are no ringing artifacts at edges and, more relevant to this thread, the BPG file with alpha is only ~160 bytes larger than the one without. (The PNG images output by the Javascript decoder are both rgba/TrueColorAlpha format, though, and the same size.)

Post by **glennrp** » 2015-11-10T06:07:27-07:00

The PNG format specification does not accommodate storing the alpha channel in a separate plane. PNG32 is always composed of RGBA pixels, stored as
RGBARGBARGBA...
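The effect of that interleaved layout on a general-purpose compressor can be illustrated with a rough experiment. This is a sketch under stated assumptions, not a model of libpng: random bytes stand in for noisy photographic data, no PNG row filters are applied, and Python's zlib is used directly.

```python
import random
import zlib

rng = random.Random(0)          # fixed seed so runs are repeatable
n_pixels = 100_000

# Stand-in for noisy photographic data: random 8-bit R, G, B samples.
rgb = bytes(rng.getrandbits(8) for _ in range(3 * n_pixels))
alpha = bytes([255]) * n_pixels  # fully opaque alpha

# Interleave as RGBARGBA..., the layout PNG32 uses.
rgba = bytearray(4 * n_pixels)
rgba[0::4] = rgb[0::3]
rgba[1::4] = rgb[1::3]
rgba[2::4] = rgb[2::3]
rgba[3::4] = alpha
rgba = bytes(rgba)

size_rgb = len(zlib.compress(rgb, 9))
size_rgba = len(zlib.compress(rgba, 9))
size_alpha = len(zlib.compress(alpha, 9))

print("RGB only:                  ", size_rgb)
print("RGBA interleaved:          ", size_rgba)
print("cost of alpha, interleaved:", size_rgba - size_rgb)
print("cost of alpha, own plane:  ", size_alpha)
```

With noise as input, the constant alpha byte costs far more when it is interleaved with the colour samples than when it is compressed as a plane of its own. Real images and PNG filtering will change the magnitude, so only the direction of the result should be read into this.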

Post by **glennrp** » 2015-11-10T06:22:45-07:00

Misaki wrote: I accidentally discovered that imagemagick has the -interlace option. This increased file size for all PNG images I tested

In PNG, interlacing means something different from what you expected. It sends the RGBA pixels in an order that can be displayed progressively, so a low-resolution image appears first, followed by successively higher resolutions. An interlaced PNG is almost always somewhat larger in filesize.
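The progressive ordering PNG uses is called Adam7 in the PNG specification: the image is split into seven reduced images that are sent one after another. A minimal sketch of how the passes divide up the pixels (the function name is made up; the pass offsets and steps are those given in the specification):

```python
# Adam7 pass parameters from the PNG specification:
# (x_start, y_start, x_step, y_step) for passes 1..7.
ADAM7 = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]

def pass_dimensions(width, height):
    """Return the (width, height) of each of the seven reduced images."""
    dims = []
    for x0, y0, dx, dy in ADAM7:
        w = max(0, (width - x0 + dx - 1) // dx)
        h = max(0, (height - y0 + dy - 1) // dy)
        dims.append((w, h))
    return dims

dims = pass_dimensions(8, 8)
print(dims)
print(sum(w * h for w, h in dims))  # every pixel belongs to exactly one pass
```

In the early passes, neighbouring samples come from pixels several positions apart in the full image, so they tend to be less alike than adjacent pixels. That is one plausible reason the interlaced file usually ends up larger.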

Legacy ImageMagick Discussions Archive
