Size-optimizing colors in jpeg image; also cropping oddities

Post by **tsftd** » 2012-01-14T12:40:08-07:00

I have a large number of (legal) comic book image files which I'm batch converting to fit on my iPhone (since the originals are quite large and Apple charges an arm and a leg for each GB of storage space). The source files are JPEG, and the target format is also JPEG. I've got all the basics down (resizing, converting to grayscale, etc.), but I'm trying to squeeze some more space out.

Given the nature of the comics, I don't need a huge range of color. Some pictures (mostly covers) I'm keeping in color, but those generally don't have a huge range, and quite frankly, I'm not all that picky. Most of them are grayscale. I have tried 2 things:

1) Reducing the number of colors/bit depth. This doesn't appear to help, since ImageMagick appears to have a minimum of 8 bits for JPEG images, and the source and output are both already 8-bit. It does vary the file sizes somewhat (occasionally slightly smaller, but most often slightly larger), probably because *internally* it does convert the bit depth, changing the colors somewhat, and thus changing the output file more or less randomly.

2) Reducing the number of colors, on the principle that it should make the files more "compressible": many of the images have slight variations in color that are simply artifacts of the scanning process, so reducing the number of colors should theoretically cause those colors to become the same, increasing gains from pattern compression. This was not, in fact, the case, either for the output files or for zipped copies of the output files. This is especially confusing since reducing an image to one color in fact DOES reduce the size, meaning that there must be some sort of pattern compression going on. Also note that when using -colors [x] or -posterize [x], identify -format %k [x] does NOT confirm that the number of colors was reduced; in fact, the number usually increases for color images (and stays at 256 for grayscale).

I am also confused about the results of the trim command. Many of the images have spare white space to the left or right, which I can use the trim command to remove. However, the output file is often LARGER. And no, I'm not making the mistake of comparing the original to the output; since trim isn't lossless, I'm outputting an untrimmed file for comparison. Shouldn't a SMALLER image actually be smaller in file size, even if the trimmed data is fairly repetitive?

I certainly am no JPEG expert, so anyone who can explain why these things are happening and/or how to fix whatever I may be doing wrong would have my gratitude.

Post by **fmw42** » 2012-01-14T13:32:25-07:00

IM decompresses and recompresses JPEGs, so you will lose some quality. However, I suspect your problem is that the output recompressed by IM is using the default -quality setting, which may not match the input image's quality setting. You can reduce the size at the expense of lower quality by adding -quality XX to your command and finding the smallest XX that is tolerable for you. Also, it is best not to save the image as JPG, then trim, and then write it to JPG again. If you have to go through an intermediate step to know where to trim, then save the pre-trimmed image as PNG (with +repage after the trim) and then convert to JPG again.

see
http://www.imagemagick.org/script/comma ... f0#quality

Post by **tsftd** » 2012-01-14T15:51:45-07:00

Thanks, but that's what I meant by "And no, I'm not making the mistake of comparing the original to the output; since trim isn't lossless, I'm outputting an untrimmed file for comparison." That is, when comparing, I output the files like so:

convert file.jpg -colorspace gray control.jpg
convert file.jpg -colorspace gray -colors 128 testcolors.jpg
convert file.jpg -colorspace gray -trim -fuzz 75% testtrim.jpg

and then compare the sizes of "control.jpg" vs "testcolors.jpg" and "control.jpg" vs "testtrim.jpg". Quite frankly, even if I were to use the same quality setting (easily obtainable from the identify command) and I knew that the .jpg had been made with ImageMagick (since different programs have different underlying values for the 'quality' setting), comparing the output to the original wouldn't be fitting for my case. Since I am already running it through another transform, it's going to go through one more generation anyway, so the real test is comparing it to an image identical in every way except without the transform in question.

Also, when running images through chains of transforms, I convert them to .pnm files; in fact, I use cjpeg to do the final encode to JPEG rather than ImageMagick. I'm just doing JPEG to JPEG in these tests because it gives me what I need to know without having to go through a big chain.

Post by **fmw42** » 2012-01-14T16:28:02-07:00

All I can say is that if you want to control image size, you can add -quality XX to your command. IM may not get the right quality from your input and may default to -quality 92.

Also, in general the -fuzz setting should come before -trim, since a setting only affects the operators that follow it.
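A minimal sketch of that ordering, assuming the commands are being assembled from a Python batch script. `trim_command` and `run_trim` are hypothetical helper names, and the 75% fuzz value is simply the one from the test command earlier in the thread. The `+repage` and optional `-quality` arguments follow the earlier suggestions.

```python
import subprocess

def trim_command(src, dst, fuzz_percent=75, quality=None):
    """Build an ImageMagick `convert` argument list with -fuzz ahead of -trim."""
    args = [
        "convert", src,
        "-colorspace", "gray",
        "-fuzz", f"{fuzz_percent}%",  # setting: placed before the operator it should affect
        "-trim",
        "+repage",                    # discard the canvas offset left behind by the trim
    ]
    if quality is not None:
        args += ["-quality", str(quality)]  # explicit JPEG quality instead of the default
    args.append(dst)
    return args

def run_trim(src, dst, **kwargs):
    """Run the command; needs ImageMagick's `convert` on PATH."""
    subprocess.run(trim_command(src, dst, **kwargs), check=True)

# Print the command line instead of running it.
print(" ".join(trim_command("file.jpg", "testtrim.jpg")))
```

Writing the trimmed result to a PNG and doing the JPEG encode only once at the end would just mean changing the `dst` extension here.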

Post by **NicolasRobidoux** » 2012-01-16T09:40:59-07:00

For things like comics, JPEG is absolutely the wrong format. Go with PNG. If you make sure that you have 256 colours or less in each image (through careful dithering, if necessary), PNG will use an 8-bit palette (PNG8), and this should lead to very small files if you experiment with the various types of compression (PNG, unlike JPG, has "quality settings" which are actually altogether different methods). The book Even Faster Web Sites by Steve Souders (which generally has on-target advice, although some of the proposed software tools are not always the best) discusses this.

The other thing is that you need to choose a resizing method which fits the content and resizing ratio (and, if you decide to stick with JPEG, the details of how you are using JPEG compression: I have manipulated the details of the encoding in the past to get more bang for the buck; ideally however, the resizing method and the details of the JPEG encoding need to match).

If you are willing to hire a consultant, I've done similar work for a large company (all images on their site use my code and they get hundreds of thousands of page views per day) and probably could get you going really quickly and reasonably cheaply.

Post by **NicolasRobidoux** » 2012-01-16T09:53:53-07:00

More often than not, JPEG will produce a larger file for a given visual quality if you apply it to an image which has been forced to use a small palette. The reason is that such an image will have blocks of flat colour, and at the sharp boundaries between such blocks you'll need a lot of discrete cosine "modes" to capture the interface well. This will be less pronounced if you use "high quality dithering" than otherwise, because then the interfaces between blocks will be smoother; nonetheless this is unavoidable.
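A rough way to see this is to compare a hard edge with a gradual transition across one 8-sample row. This is only a sketch under stated assumptions: it uses a 1-D DCT and a single flat quantizer step of 16 (a made-up value), whereas a real JPEG encoder applies a 2-D DCT to 8x8 blocks with frequency-dependent quantization tables.

```python
import math

def dct8(samples):
    """Orthonormal 1-D DCT-II (JPEG applies the 2-D version to 8x8 blocks)."""
    n = len(samples)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(samples)
        ))
    return out

def nonzero_after_quantization(samples, step=16):
    """Count coefficients that survive a flat quantizer (hypothetical step size)."""
    return sum(1 for c in dct8(samples) if round(c / step) != 0)

hard_edge = [0, 0, 0, 0, 255, 255, 255, 255]   # boundary between two flat colours
soft_ramp = [255 * i / 7 for i in range(8)]    # same range, gradual transition

print("hard edge:", nonzero_after_quantization(hard_edge))
print("soft ramp:", nonzero_after_quantization(soft_ramp))
```

The hard edge keeps more nonzero coefficients than the ramp, and each surviving coefficient costs bits in the entropy-coded output.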

Summary: JPEG generally does not benefit from reducing the number of colours in an image.

This does not mean that you can't get acceptable results by compressing (resized) comics with JPEG. Just that reducing the number of colours is not helpful in this case.

GIF and PNG (if there are sufficiently few colours: 256 or less) do benefit, but not JPEG.
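The lossless side of that comparison can be sketched with zlib, which implements the same DEFLATE compression that PNG uses. The pixel data below is synthetic (two flat gray regions with a little simulated scanner noise), PNG's row filters and palette are left out, and `posterize` is a hypothetical stand-in for a colour-reduction step rather than ImageMagick's implementation.

```python
import random
import zlib

random.seed(0)  # fixed seed so the comparison is repeatable

# Synthetic "scan": a dark region and a light region, each with +/-3 levels of noise.
noisy = bytes(base + random.randint(-3, 3) for base in [20] * 4096 + [235] * 4096)

def posterize(pixels, levels=4):
    """Snap each 8-bit value to the nearest of `levels` evenly spaced grays."""
    step = 255 / (levels - 1)
    return bytes(round(round(p / step) * step) for p in pixels)

flat = posterize(noisy)

print("distinct values:", len(set(noisy)), "->", len(set(flat)))
print("zlib bytes:", len(zlib.compress(noisy, 9)), "->", len(zlib.compress(flat, 9)))
```

Once the noise collapses into a couple of exact values, the long identical runs compress to almost nothing. That is the gain from redundancy that a dictionary-based lossless coder can exploit and that JPEG's block transform does not.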

Post by **NicolasRobidoux** » 2012-01-16T10:01:49-07:00

Another thing: If your input image is JPEG, trimming by a size which is not an integer multiple of the size of the original JPEG blocks pretty much guarantees that you'll get a larger file (unless you trim by a lot).

JPEG works by tiling the image into small blocks, and for maximum compression/best quality you need to take the block structure into account both at the input and the output stage.
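One way to respect the block grid is to grow the trim box outward until it sits on block boundaries before cropping. The helper below is a hypothetical sketch, not an ImageMagick feature. It assumes 8x8 blocks; for colour JPEGs with chroma subsampling the effective unit can be 16x16, so pass `block=16` in that case.

```python
def align_crop_to_blocks(x, y, width, height, block=8):
    """Grow a crop box outward so that its edges sit on the block grid.

    Returns (x, y, width, height). The right and bottom edges are rounded up
    to the next block boundary; clamp them to the image size before cropping.
    """
    left = (x // block) * block
    top = (y // block) * block
    right = -(-(x + width) // block) * block    # ceiling to a multiple of block
    bottom = -(-(y + height) // block) * block
    return left, top, right - left, bottom - top

# Example trim box, e.g. as reported by a separate trim-detection pass.
x, y, w, h = align_crop_to_blocks(37, 5, 700, 1000)
print(f"-crop {w}x{h}+{x}+{y} +repage")
```

The box only ever grows, so nothing the trim wanted to keep is lost; the cost is a few extra pixels of margin.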

Also, you are likely to get significantly different file sizes (and visual quality) depending on where in the chain of operations you trim. Basically, trimming should be the very first thing you do (unless you explicitly trim along block boundaries) and you should NOT store the result of trimming into JPEG, only the very final result. (If you use JPEG as input and final output, you probably should use at least a 16-bit toolchain, although this may not make a huge difference.)

If you decide to stick to JPEG as final storage format, it could be that I can manipulate the JPEG quantization tables so that you get very good quality with minuscule files. (But again, your first pass should be PNG8 unless you're trapped in JPEG for some reason.)

Again: I've sorted out this stuff for a large client (that has both resized photographs and images like country flags, which are somewhat like comics covers) and I consult.

Post by **NicolasRobidoux** » 2012-01-16T11:31:40-07:00

... and I too use cjpeg, because it offers more control, because I can modify its source code so it does exactly what I want, and because it's fast.

Post by **NicolasRobidoux** » 2012-01-16T11:53:13-07:00

You know that cjpeg can produce grayscale JPEGs, yes? http://linuxcommand.org/man_pages/cjpeg1.html

Post by **tsftd** » 2012-09-12T21:34:23-07:00

I finally found out what causes this behavior: posterizing introduces sharper edges between different-colored sections of the image, which in turn increases the complexity of the image for JPEG (and thus causes it to use more bits). Dithering theoretically should help, since it softens the edges introduced, but it therefore also negates the primary benefit that would theoretically be gained (increased compression due to data redundancy).

Thus, posterizing a jpeg primarily helps an image which either 1) already has sharp lines between colors, 2) has an excessive number of extremely similar colors, or 3) can use an exceptionally small number of colors. In all three cases, you're looking for the benefit of increased redundancy to exceed the penalty induced by sharper distinctions between the colors.

Post by **anthony** » 2012-09-12T22:11:42-07:00

In an image with only a few colors, blurring the image very slightly before saving to jpeg will reduce the jpeg image file enormously.

Also, resizing images using something like Lanczos with more 'lobes' should produce smaller files, as it generates a good frequency response. But that is in theory; I have not actually tried it.

However, Nicolas is right: PNG or even GIF with a very small palette should compress well and without loss, though the result may not be as small as a low-quality lossy JPEG.

Basically try things. And please report what you find works! People here are interested to know what you find works, or did not work.

Post by **tsftd** » 2012-09-15T07:15:18-07:00

Actually, in my tests, blurring does not increase compression (decrease output size). Keep in mind, however, that these are grayscale images (with grayscale source, such as black-and-white comics) with very, very low color counts, on the order of 4-16, so unfortunately there aren't enough colors to blur properly. Also, my goal is to not compromise quality (as you could simply encode at a lower quality if you wanted to do that), and in my source material, blurring is a very, very bad thing to do.

In general, however, you are correct. In fact, several passes of blur-posterize-blur-posterize would likely yield good results, although, of course, you are sacrificing quality at some point.

Legacy ImageMagick Discussions Archive
