
Re: image size convention in resize.c

Posted: 2010-11-12T08:32:53-07:00
by NicolasRobidoux
I'll get going on this slowly.

There are, actually, three desirable behaviors:

corners (current behavior, and default; it extrapolates when upsampling)

centers (it extrapolates when downsampling)

and

switch (mix and match depending on whether one is upsampling or downsampling in each direction; it never extrapolates).
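
To make the geometry concrete, here is a toy sketch (my own notation, not resize.c itself) of where the output pixel centers land in input coordinates under the two fixed conventions:

Code: Select all

/* Toy sketch: output pixel centers in input coordinates under the
 * "corners" (pixel areas align) and "centers" (first and last pixel
 * centers align) conventions.  Not resize.c code. */
#include <stdio.h>

int main(void)
{
    const double w = 4.0;   /* input width  (pixels) */
    const double W = 8.0;   /* output width (pixels); W > w: upsampling */
    int i;

    for (i = 0; i < (int) W; i++)
        printf("i=%d  corners=%+.3f  centers=%+.3f\n", i,
               (i + 0.5) * (w / W) - 0.5,   /* corners convention */
               i * (w - 1.0) / (W - 1.0));  /* centers convention */

    /* Under "corners", i=0 maps to -0.25, outside the range [0, w-1]
     * spanned by the input pixel centers: extrapolation when
     * upsampling.  Under "centers", the endpoints map exactly to 0 and
     * w-1 when upsampling; it is when downsampling that the output
     * pixel areas reach past the input image's extent. */
    return 0;
}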

For a quick discussion of why extrapolation should be avoided if one wants accuracy, see http://en.wikipedia.org/wiki/Extrapolation. Basically, if you want accuracy, you should minimize the impact of the abyss. (Note that the orthogonal resize methods indirectly "feel" the abyss: the piece of the filter kernel which sticks out past the centers of the boundary pixels is "chopped off", which destroys the left/right and up/down symmetry of the kernels. Since this symmetry contributes greatly to the good properties of the filters (accuracy being at the top of the list), this justifies wanting to minimize the number of pixels for which kernel truncation occurs.)

One may ask: "Do we really want accuracy?" and this is a valid question. For example, if you are blending into transparency, accuracy is clearly a secondary consideration.

(Enough pontificating. I'll write an explanatory blurb for IM Examples once the -define is programmed.)

nicolas

Re: image size convention in resize.c

Posted: 2010-11-12T12:14:36-07:00
by fmw42
Nicolas,

Not sure I follow all this, but it looks like this concerns how the filter kernel is extended at the edges of the image, or more precisely how the image is extended so that a full kernel can be used. I am not absolutely sure -resize works with -virtual-pixel, but -virtual-pixel allows one to extend the image (for convolution, fx and distorts) in many ways:


convert -list virtual-pixel
Background
Black
CheckerTile
Dither
Edge
Gray
HorizontalTile
HorizontalTileEdge
Mirror
Random
Tile
Transparent
VerticalTile
VerticalTileEdge
White


I may be off base here, but Anthony can fill you in more about this and whether it is relevant to resize.

Fred

Re: image size convention in resize.c

Posted: 2010-11-12T18:50:21-07:00
by NicolasRobidoux
fmw42 wrote:Nicolas,

Not sure I follow all this, but it looks like this concerns how the filter kernel is extended at the edges of the image, or more precisely how the image is extended so that a full kernel can be used. I am not absolutely sure -resize works with -virtual-pixel, but -virtual-pixel allows one to extend the image (for convolution, fx and distorts) in many ways

Fred
In my understanding, no abyss is used for orthogonal resize.

To some extent, this is beside the point (pardon the bad pun!) anyway.

Whether the scheme is extended with artificially created abyss values, or the filter kernel has the part sticking out "chopped off", or the filter kernel is "folded," the result near the boundary simply can't be as accurate as when it's used in the interior of the image. Things get worse if you extrapolate.

nicolas

Re: image size convention in resize.c

Posted: 2010-11-12T19:31:29-07:00
by fmw42
OK, thanks for the clarification. I was not sure if any of the multiple-lobed schemes needed extended data. As I recall, the Keys cubic convolution uses a 4x4 array of points to interpolate between the central 2x2. But I never really thought about, or knew, what it did at the edges of an image.

Fred

Re: image size convention in resize.c

Posted: 2010-11-12T20:41:43-07:00
by NicolasRobidoux
The resize methods which don't involve the resample code (a.k.a. distort) simply give weights only to the points within the support which actually fall within the image (this is not equivalent to using the nearest-neighbour, a.k.a. smear, abyss policy).

This is like chopping off the piece of the kernel which sticks out of the data.

It's a reasonable and practical way of dealing with boundary issues.

It's also fairly inaccurate.
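
In pseudo-C, the policy looks something like this (a minimal sketch, not the actual resize.c code):

Code: Select all

#include <math.h>

/* Weigh only the taps that fall inside the image, then renormalize so
 * the surviving weights sum to 1.  Sketch only; not resize.c. */
static double
filtered_value(const double *pixels, int width, double center,
               double support, double (*kernel)(double))
{
    const int start = (int) ceil(center - support);
    const int stop = (int) floor(center + support);
    double sum = 0.0, density = 0.0;

    for (int i = start; i <= stop; i++) {
        if (i < 0 || i >= width)
            continue;                /* tap falls in the abyss: drop it */
        double weight = kernel(i - center);
        sum += weight * pixels[i];
        density += weight;
    }
    /* Near the boundary the surviving taps are no longer symmetric
     * about the center, which is what costs accuracy. */
    return (density != 0.0) ? sum / density : 0.0;
}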

Re: image size convention in resize.c

Posted: 2010-11-14T22:16:18-07:00
by anthony
fmw42 wrote:I am not absolutely sure -resize works with -virtual-pixel, but -virtual-pixel allows one to extend the image (for convolution, fx and distorts) in many ways
Fred... The reason for this discussion is that resize actually works in the way Nicolas describes.

The edges of the image in a resize are regarded as being.... non-existent.

I do not mean they are thought of as transparent, or as containing virtual pixels, but as being an 'abyss' of nothingness. Resize simply ignores any contribution from 'out of bounds' pixels to the filtered value.

For orthogonal resizing this is desirable, as otherwise we would get either edges with a transparency component even when none is present, or a bolding effect at the edges and especially in the corners.

However, the thing that Nicolas is looking at is the actual positioning of the filter (the central resampling point) used for the resize, especially for enlargements.

This is not a problem with distort, which 'does the right thing', but when an abyss is involved the distort 'right thing' is not necessarily the 'right thing' for resize. This is why he is looking at adding a define to allow experimenting with the two methods.

PS: +distort also adds extra pixels due to the filter size. Currently that size is just +1 pixel; really it should be + the filter support for that edge. That is a little difficult to pre-calculate for most distorts, and in many cases +1 is good enough. Also remember distort does not resize exactly to integer pixel bounds, but can resize to sub-pixel bounds.


Notes on Resize vs Distort Image Comparison...
To compare the distort operator with the resize operator, you would need to use an affine distort with the destination pixel size that resize determined, then crop appropriately from 0,0 to W,H without a +repage (the -crop of the resulting 'layer image' replaces the +repage).

This will ensure that BOTH the X and Y scaling are the same in both cases, though not necessarily the same as each other. Resize will rarely preserve the aspect ratio perfectly, only nearly, due to integer rounding effects.

This is essentially what the second example in
http://www.imagemagick.org/Usage/distor ... ol_escapes
did by using flatten to crop the results to the source image. No +repage was used.

So this resize

Code: Select all

convert rose: -resize 500x rose_resize.png
identify rose_resize.png
rose_resize.png PNG 500x329 500x329+0+0 8-bit DirectClass 133KB 0.000u 0:00.000
and this distort

Code: Select all

convert rose: -alpha on -virtual-pixel transparent \
        +distort Affine '0,0 0,0   %w,0 500,0   0,%h 0,329' \
        -alpha off  -crop 500x329+0+0   rose_distort.png
are the actual equivalents, with exactly the same scaling. The alpha handling in distort ensures that the distort edge colors receive no extra color effects due to virtual pixels. Without the -alpha off, some transparency will leak into the image, just as some of the image leaks out into the transparency (as blur or any convolve would).

Time-wise the distort took twice as long due to the convolution effects.

The only differences should be the exact filter (defaults in the above) and the way the filter was applied: 2-pass orthogonal vs 1-pass cylindrical (radial). In this case I find the 'resize' slightly blurrier than the equivalent distort! But then the distort is also more blocky.

You need to do a "flicker_cmp" of the two images to actually see this.

[Image: Resize]
[Image: Distort]

The above two show that the current default handling of resampling points for resize and distort is identical.

The proposed -define will change this, modifying the scaling of the results very slightly.

Re: image size convention in resize.c

Posted: 2010-11-14T23:16:34-07:00
by NicolasRobidoux
I just realized that there is no strict need for a flag that allows automatic switching, since this can be emulated by doing resizes one direction at a time, using each convention separately.

(Oops! This only works when using orthogonal resizing, and even then only if one carefully controls the intermediate dimensions. Forget that!)

(Just to make sure my bad idea was clear: to enlarge from wxh to WxH, first enlarge to Wxh making sure to keep the horizontal alignment unchanged, then to WxH. And, yes, it would be slow.)

Re: image size convention in resize.c

Posted: 2010-11-15T00:06:47-07:00
by anthony
NicolasRobidoux wrote:I just realized that there is no strict need for a flag that allows automatic switching, since this can be emulated by doing resizes one direction at a time, using each convention separately.
I can't really see how you can do that!

You can't use two-pass distortion, as that would still use a cylindrical filter and technique, not a 1-dimensional orthogonal filter. And resize will still do it using its own integer bounds calculations.

The scaling is slightly different, and the rounding to integers may even result in a slight one-pixel change in the final image size.

Remember, for -resize WxH the image is resized to fit one dimension, then the other dimension is calculated using that scaling, rounded to an integer, and the scale recalculated to those 'pixel bounds'. The result is that the slight scale change can produce a slight change in the second 'calculated' dimension.
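
To see the rounding with concrete numbers (a toy calculation from my description above, not the actual resize.c code; the built-in rose: image is 70x46):

Code: Select all

#include <math.h>
#include <stdio.h>

int main(void)
{
    const double w = 70.0, h = 46.0;   /* rose: is 70x46 */
    const double W = 500.0;            /* -resize 500x */

    const double scale_x = W / w;              /* 7.142857... */
    const double H = floor(h * scale_x + 0.5); /* 328.57... -> 329 */
    const double scale_y = H / h;              /* 7.152174... */

    printf("output %gx%g  scale_x=%.6f  scale_y=%.6f\n",
           W, H, scale_x, scale_y);
    return 0;
}

The two scales come out slightly different (7.1429 vs 7.1522), which is why the earlier 500x329 rose example only nearly preserves the aspect ratio.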

No, I can't see you doing this and still having the filters applied correctly.

Re: image size convention in resize.c

Posted: 2010-11-15T11:21:00-07:00
by NicolasRobidoux
Here are the results with

[Image: Lanczos resize (tensor Sinc-Sinc 3-lobe)]

[Image: LanczosSharp distort (Clamped EWA Jinc-Jinc 3-lobe with blur=0.9812505644269356)]

IMHO, the results suggest that Clamped EWA does not suffer from "diagonal hash artifacts" (checkerboard mode undamped near diagonal interfaces and lines) like analogous tensor (orthogonal) methods with negative lobes. Look, in particular, at the outline of the rose.

[Image: a sharper Lanczos distort (Clamped EWA Jinc-Jinc 3-lobe) with blur=.8956036897402794]

Code: Select all

convert rose: -filter lanczos -resize 500x rose_lanczos.png

convert rose: \
        -alpha on \
        -virtual-pixel transparent \
        -filter LanczosSharp \
        +distort Affine '0,0 0,0   %w,0 500,0   0,%h 0,329' \
        -alpha off \
        -crop 500x329+0+0 \
        rose_LanczosSharp.png

convert rose: \
        -alpha on \
        -virtual-pixel transparent \
        -filter Lanczos \
        -define filter:blur=.8956036897402794 \
        +distort Affine '0,0 0,0   %w,0 500,0   0,%h 0,329' \
        -alpha off \
        -crop 500x329+0+0  \
        rose_Lanczosp8956.png
(Thank you for the better code, Anthony.) Warning: You need a really recent IM to get the same distort results.

Re: image size convention in resize.c

Posted: 2010-11-15T18:59:04-07:00
by anthony
Adding -filter Lanczos to my original resize and distort lines (with IM's auto-switch from Sinc-Sinc for resize to Jinc-Jinc for distort).

[Image: resize_lanczos]
[Image: distort_lanczos]

A flicker compare shows that the images are basically identical, except for the diagonal along the lower edge of the rose (very obvious even without flicker compare). Here distort is far smoother, with less blocky diagonals.

And this is without any 'blur factor sharpening' added.

Seems that for diagonals, distort is now superior to resize. Which is not surprising, as 2-pass orthogonal filtering is only identical to 1-pass cylindrical filtering for Gaussian filters (the Gaussian being the only rotationally symmetric kernel that factors into a product of 1D kernels).

Of course distort is slower!

Re: image size convention in resize.c

Posted: 2010-11-16T07:04:54-07:00
by NicolasRobidoux
There is something somewhat miraculous with Jinc-Jinc 3-lobe Clamped
EWA which does not happen, for example, with the 2-lobe version.

Assume bounded data (-1 to 1, or 0-255 or whatever).

Let "no-op" ("no operation") mean: filter only at the original pixel
locations (that is: don't enlarge, don't shrink, don't rotate, don't
translate, use the resampling filter as non-resampling
filters (like Gaussian blur or unsharp mask) are normally used.

Ideally, no-op should return exactly the same image. Resize (Sinc-Sinc)
Lanczos, Lanczos2, Catrom, Hermite, triangle, and many other resize
filters do have this property. Note however that the "most common
situations" resize default, namely Mitchell, does NOT return the same
image under no-op: there is some blur built in (Mitchell-Netravali is
a blend of B-spline smoothing and Catmull-Rom).
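
Here is a quick check of that property (toy code of mine, not IM's): evaluate the two kernels at integer offsets.

Code: Select all

#include <math.h>
#include <stdio.h>

#define PI 3.14159265358979323846

static double sinc(double x)
{
    return (x == 0.0) ? 1.0 : sin(PI * x) / (PI * x);
}

static double lanczos3(double x)   /* 3-lobe sinc-windowed sinc */
{
    return (fabs(x) < 3.0) ? sinc(x) * sinc(x / 3.0) : 0.0;
}

static double mitchell(double x)   /* Mitchell-Netravali, B = C = 1/3 */
{
    const double B = 1.0 / 3.0, C = 1.0 / 3.0;
    x = fabs(x);
    if (x < 1.0)
        return ((12.0 - 9.0 * B - 6.0 * C) * x * x * x
                + (-18.0 + 12.0 * B + 6.0 * C) * x * x
                + (6.0 - 2.0 * B)) / 6.0;
    if (x < 2.0)
        return ((-B - 6.0 * C) * x * x * x
                + (6.0 * B + 30.0 * C) * x * x
                + (-12.0 * B - 48.0 * C) * x
                + (8.0 * B + 24.0 * C)) / 6.0;
    return 0.0;
}

int main(void)
{
    /* lanczos3 gives 1 at offset 0 and (numerically) 0 at the other
     * integers, so no-op returns the image unchanged; mitchell gives
     * 8/9 at 0 and 1/18 at the immediate neighbours: built-in blur. */
    for (int n = -2; n <= 2; n++)
        printf("offset %+d: lanczos3 = %+.6f  mitchell = %+.6f\n",
               n, lanczos3((double) n), mitchell((double) n));
    return 0;
}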

Cylindrical (distort) methods, however, basically never satisfy this
condition (and forcing them to appears to generally yield a sucky
scheme). The question thus arises of how to tune the distort filters
so as to minimize the difference between the input and output images
under no-op. The way I've gone about this is by selecting a blur,
that is, by rescaling the support of the filter kernel, which has the
opposite scaling effect on the frequency response surface. I am
guessing that this could also be done by carefully building the
windowing function, but I don't have any idea of how to do that
intelligently.
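
In code, the rescaling amounts to this (my reading of the blur knob as described above, not the actual implementation):

Code: Select all

#include <math.h>
#include <stdio.h>

/* A stand-in kernel (hat function, support 1), just to exercise the idea. */
static double hat(double x)
{
    x = fabs(x);
    return (x < 1.0) ? 1.0 - x : 0.0;
}

/* Evaluating the kernel at x / blur stretches its support by the factor
 * blur (blur > 1 smooths, blur < 1 sharpens), with the opposite scaling
 * effect on the frequency response. */
static double blurred_kernel(double x, double blur, double (*kernel)(double))
{
    return kernel(x / blur);
}

int main(void)
{
    printf("%f %f\n", hat(0.9), blurred_kernel(0.9, 1.25, hat));
    /* Prints 0.100000 0.280000: with blur = 1.25 the same tap carries
     * more weight, since the kernel now reaches further. */
    return 0;
}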

Result (for distort Lanczos = Clamped EWA 3-lobe Jinc-windowed Jinc):

If you want to minimize the worst case deviation over all input
images, you should actually blur Jinc-Jinc Lanczos a little: -define
filter:blur=1.0013153655794474036.

If you want to minimize the worst case deviation over all input images
which are constant on columns (or rows), you should sharpen Jinc-Jinc
Lanczos a little: -define filter:blur=0.98125056442693560925. This is
exactly what LanczosSharp is a shortcut to (a scheme which is nearly
indistinguishable from Lanczos in any case).


The minimizers do not improve things much, however. Assuming that the
image data is between -1 and 1:

The worst case deviation with the optimal blur is 0.61379977472511893,
a very slight reduction from plain Jinc-Jinc 3-lobe, which gives a
worst case deviation of 0.61500130396286701. With purely vertical (or
horizontal) data, things are much better: the worst case deviation
with the optimal blur is 0.0092922495762062882, about a 60% reduction
from plain Jinc-Jinc 3-lobe's worst case deviation of
0.024833234807352726. These numbers suggest that "macroscopic"
features are well preserved, but that fully 2D high frequency patterns
("pixel hash" and like noise) are strongly damped. This is confirmed visually.

What is (to me) absolutely amazing is how close these tuned variants
of Jinc-Jinc Lanczos are to the "unblurred and unsharpened" original.
The difference is almost always invisible to the naked eye
(numerically, we are talking a maximum difference of a few percent in
the results).

That is: distort Lanczos (unlike distort Lanczos2) is just about as
good as it can be, just as it is.

Unexpected freebies are a sign
that you've hit the spot.

Re: image size convention in resize.c

Posted: 2010-11-16T09:18:50-07:00
by NicolasRobidoux
Quick note: Some people recommend using an unsharp mask with a very tight radius as the very last step of a chain of transformations applied to a resized (or warped) image.

IMHO, a method which is a bit blurrier but only very moderately aliased, like distort Lanczos, is a good candidate for this approach to producing sharp-looking resized/warped (and possibly further processed) images, in both the upsampling and downsampling cases. Unsharp mask will not be so kind to heavily aliased results (whether obtained with resize or distort), especially enlargements.

Re: image size convention in resize.c

Posted: 2010-11-18T10:11:06-07:00
by NicolasRobidoux
Fred:

FYI: Here are the results with my "fully 2D" methods upsmooth (VSQBS) and upsharp (Nohalo+LBB). Sorry the alignment is not quite right: John Cupitt (of VIPS/NIP2) and I have not yet fixed things so that it's easy to get this just right.

[Image: NIP2 upsharp (Nohalo+LBB, that is, halo-free sharpening subdivision + bounded interpolation) result]

[Image: NIP2 upsmooth (VSQBS = diagonal-preserving smoothing subdivision + quadratic B-spline smoothing) result]

There is absolutely no halo in the second picture, and almost none in the first.

IMHO, upsharp gives results superior to resize Lanczos (Sinc-Sinc orthogonal Lanczos 3-lobes). Whether it gives results superior to distort Lanczos (Jinc-Jinc cylindrical Lanczos 3-lobes) is a matter of taste (sharper or better antialiased?). In the upsharp result, you can definitely see "nonlinear jaggedness sharpening artifacts" at the bottom of the rose (where it is red on white, which is where the Lanczoses add tons of haloing). (As mentioned previously, I am not sure that upsmooth has any real advantage over IM Cubic for most images, except that its footprint is smaller, and hence, suitably programmed, it is faster.)

P.S. I may as well add the result with upsize:

[Image: NIP2 upsize (LBB = Locally Bounded Bicubic) result]

Re: image size convention in resize.c

Posted: 2010-11-18T17:11:53-07:00
by anthony
I am sorry, I am not really familiar with upsmooth or exactly what you did in the last example.

For the first image, I gathered you used a normal resize for both images (with diagonal jaggies) and then applied an unsharp filter, just as Photoshop would normally do.

But I am not certain what an upsmooth is. A Gaussian resize with unsharp? I really don't know.

You must remember that while I know a lot about this area, I don't know much of the mathematics, or the newer techniques. Most research papers look like gobbledygook to me. What I am is a programmer with a university degree in computer science. If I can understand the technique, I can implement it.

Do you have a good parameter calculation for the unsharp in a 'resize-unsharp' resize?

It seems to me that a '-resize-unsharp' type of resizing option might be a good addition.

Re: image size convention in resize.c

Posted: 2010-11-18T17:44:24-07:00
by NicolasRobidoux
anthony wrote:I am sorry, I am not really familiar with upsmooth or exactly what you did in the last example.
(No apologies needed: I was not very clear.)

NIP2 upsharp, upsize and upsmooth are generic names for "state-of-the-art" resampling methods tuned for upsampling that my collaborators and I are developing. They follow an approach which is almost orthogonal to what IM does.

These methods are so new that only one of them (upsharp) has a published article associated with it. (And then, only an old prototype version.)

Let me explain a bit more carefully how upsharp works. I cannot give full details in a short space (the code for it is a few hundred lines long, not including memory access and comments).

First, it doubles the density of the image by computing new values at all the midpoints of neighbouring input pixel locations (horizontally, vertically and diagonally). It does so using a (nonlinear) "sharpening" subdivision method (based on the "minmod" function) with the properties that

1) no overshoots are created, ever

2) if the input image consists of a "soft" diagonal line or interface (soft means that the transition between peak and trough takes two pixels: it does not occur "suddenly"), then the subdivided image is also constant on diagonals, and

3) if the input image is a linear gradient, so is the subdivided image.

Then, it interpolates the result using a nonlinear variant of Catmull-Rom which has the property that the reconstructed surface is always within the max and min of nearby values. This does not mean that there can't be a halo, just that the halo cannot overshoot or undershoot the original data (locally: it is constrained within the min and max pixel values of a 4x4 patch).

To summarize, upsharp, which is the NIP2 nickname for Nohalo subdivision finished with Locally Bounded Bicubic interpolation, consists of:

Step 1) A "soft diagonal preserving" (nonlinear) sharpening binary subdivision scheme such that new pixel values are always between the values at the nearest two (for horizontal or vertical insertions) or four (for diagonal insertions) input pixels.

Step 2) A nonlinear "Catmull-Rom with strongly clamped over/undershoot" to get values at locations other than the subdivisions.

I could also describe the scheme as follows: First, the density of the image is doubled using a rudimentary edge-detecting subdivision method which never overshoots. Then, the resulting double-density version of the image is interpolated using a variant of Catmull-Rom which very strongly dampens the Gibbs phenomenon without adding blur.

Result: A sharp and reasonably smooth scheme with nearly non-existent haloing.
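
Here is a deliberately simplified 1D sketch of the flavour of the two steps (toy code; the published scheme is 2D and differs in details):

Code: Select all

#include <math.h>
#include <stdio.h>

/* minmod returns the smaller-magnitude argument when the two agree in
 * sign, and 0 otherwise; this is what prevents overshoots. */
static double minmod(double a, double b)
{
    return (a * b > 0.0) ? ((fabs(a) < fabs(b)) ? a : b) : 0.0;
}

/* Step 1 flavour: a new value halfway between pixels b and c, given
 * their neighbours a and d.  |correction| <= |c - b| / 8, so the
 * result always stays within [min(b,c), max(b,c)]: no overshoot,
 * ever.  On a linear gradient the correction vanishes. */
static double midpoint(double a, double b, double c, double d)
{
    const double correction =
        (minmod(b - a, c - b) - minmod(d - c, c - b)) / 16.0;
    return 0.5 * (b + c) + correction;
}

/* Step 2 flavour: a hard clamp to the local min/max.  (LBB builds the
 * bound into the interpolated surface rather than clamping after the
 * fact, but the effect on over/undershoots is similar.) */
static double clamp_local(double value, double lo, double hi)
{
    return value < lo ? lo : (value > hi ? hi : value);
}

int main(void)
{
    /* Near a step edge (values 0,0,0,1), plain Catmull-Rom's midpoint,
     * (-a + 9b + 9c - d)/16, undershoots to -1/16; the minmod-limited
     * midpoint stays put at 0. */
    printf("limited midpoint = %f\n", midpoint(0.0, 0.0, 0.0, 1.0));
    printf("clamped value    = %f\n", clamp_local(-1.0 / 16.0, 0.0, 1.0));
    return 0;
}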

The big difference with what you were describing is that the sharpening step comes first, not last.

The "up" of "upsharp" refers to the fact that this is a scheme tuned for upsampling. The "sharp" corresponds to the fact that the corresponding method is not very blurry.

I hope this helps make sense of it. If you are a masochist, you can have a look at http://portal.acm.org/citation.cfm?id=1557657

Just to make sure it's clear: I did not use IM for this. I used the competing library/system VIPS/NIP2. (These methods are also programmed for the GEGL library, although my students and I have not completely finished the job.) The approach taken is not very resize.c friendly.