Page 2 of 6

Re: Cheaper sinc computation for resize.c

Posted: 2010-08-30T02:58:29-07:00
by NicolasRobidoux
More questions :

Q1:

Some IM constants (for example, those defined in image-private.h) are given with 50 decimal digits after the decimal point:

#define MagickPI 3.14159265358979323846264338327950288419716939937510
#define Magick2PI 6.28318530717958647692528676655900576839433879875020
#define MagickPI2 1.57079632679489661923132169163975144209858469968755
#define MagickSQ1_2 0.70710678118654752440084436210484903928483593768847
#define MagickSQ2 1.41421356237309504880168872420969807856967187537695
#define MagickSQ2PI 2.50662827463100024161235523934010416269302368164062

Is there anything special about 50 digits that I should know about, or can I assume that a long double never is larger than an IEEE binary128, which means that anything above 40 (39, really) decimal digits (before and after the decimal point) is a waste (according to http://en.wikipedia.org/wiki/IEEE_754-2008)? (40 decimal digits being twice what's needed for standard x86 extended precision, so that's covered too.)

(Note: I realize that, possibly, MagickPI will always be rounded to double when used macro-like. I'm just checking.)

Q2:

Does it make sense to "you" to include the key parameters of our methods in the code as follows?

MagickRealType c0,c1,c2,...,c13;
...
c1=-0.2777777749828003702607348742700210983971e-1L;
...
alpha=c0+x_squared*(c1+x_squared*(c2+...

(I'm trying to get a feel for IM coding style yet get the accuracy I know I can get. For example, I'm a const keyword addict, but clearly this is not standard IM practice, and it makes no difference with a good compiler so bye bye const.)

Re: Cheaper sinc computation for resize.c

Posted: 2010-08-30T06:49:19-07:00
by magick
The constants are copied directly from the C header files for 64-bit Linux machines.

Constants are fine but you must ensure you do not have any inline declarations. ANSI compiler cannot grok them:
  • const int a=3;
    c=a*a;
    const int b=2;
Using a variable assignment or an inline constant should be just a matter of coding style preference. I suspect the inline constant may in some cases be faster, certainly it would be if the code is not optimized.

Re: Cheaper sinc computation for resize.c

Posted: 2010-08-30T07:08:01-07:00
by NicolasRobidoux
RE:
fmw42 wrote:Forgive me again for sticking my nose in.

I would suggest that it might be useful for evaluation purposes to see a graph showing the difference in the "fast" sinc from the more formal "exact" one and a difference graph and error measure. Thus, if differences are significant enough (or results differ when using in -resize), one could offer it as an alternate sinc filter for -resize so that users can choose speed over quality.

Anthony has graphed the various sinc-like filters that are used with -resize. See http://www.imagemagick.org/Usage/resize ... r_windowed as well as the windowing functions slightly below that section and other filters above that.
This is totally reasonable. We produce such graph ourselves. Note however that the most accurate of our approximations are MORE accurate than what you'll get using the double precision standard library sin function when "our" polynomials are evaluated using long doubles, so some of these plots should be taken with a large grain of salt.

Course outline to finish, grad student progress reports to evaluate, and I'm on it.

Re: Cheaper sinc computation for resize.c

Posted: 2010-08-30T07:28:00-07:00
by NicolasRobidoux
Another question:

I assumed so far that the most important use for the Sinc function is lanczos 3, hence the [-3,3] interval of approximation.

But http://www.imagemagick.org/Usage/resize ... r_windowed suggests that actually Sinc is often used over [-4,4].

Should we produce approximations of [-4,4] instead? The larger the interval, the more expansive the approximations. But not by a lot.

------

Note:

We have special purpose lanczos 2, 3, 4 approximations too, but I don't quite see how to integrate them in the resize.c code. (The lanczoses are actually easier to approximate cheaply than sinc.) We'll bring this up later. Just know that us approximating sinc does not mean that lanczos itself cannot be optimized further. It can, and we know how.

Re: Cheaper sinc computation for resize.c

Posted: 2010-08-31T07:05:42-07:00
by NicolasRobidoux
There is something that we have not done (yet). We have no issues about the accuracy of the our formulas, but we have not done speed benchmarking. (The sin function often uses a co-processor implementation when compiled with gcc, so it is pretty hard to beat, irregardless of the quality of an approximation.)

Are you able to suggest IM jobs which would be appropriate for that?

Re: Cheaper sinc computation for resize.c

Posted: 2010-08-31T21:48:17-07:00
by NicolasRobidoux
I just put an updated version of resize.c at http://web.cs.laurentian.ca/nrobidoux/misc/IM/resize.c

I'm a bit short on time right now, but I would like to explain a few things, very quickly:

1) A polynomial is used if |x|<=4 which, I understand, is the overwhelming majority of calls to Sinc. If |x|>4, the usual sin(pi*x)/(pi*x) formula is used.

2) There are two versions of the approximation: One which will give, when filtering with a Sinc which is NOT windowed with a Sinc, an answer which is always within 1 of the one obtained using an infinite precision sinc if the final image is 16 bit, and one which is within 1 if the final image is 32 bit (if I understood QuantumDepth correctly). If the filtering is done with a Sinc windowed with a Sinc (lanczos), then the bound of 1 for 32 bit holds, but the bound is 2 for 16 bit images. (Note that we have direct lanczos approximations which we are hoping to integrate into IM as well, so this point may be moot.)

The current default in the code (as per #ifdef 1) is the 16 bit version. Feel free to have some conditional condition business which switches to the 32 bit version when QuantunDepth is 32.

3) In my opinion (and, if I understand properly, Don Munsil, among others), the above tight precisions are unneeded. If you are interested in even faster approximations, let me know. The key point is that what you want is something which mimics the frequency response of the corresponding windowed Sinc, and to get that, one does not need to be this close to their <<exact>> versions. But I said I was going to provide a cheap Sinc, so I'm not going to deviate from it more than essentially what rounding does when converting back to ints.

4) IMPORTANT: If you want filtering results maximally close to those obtained with "infinite precision sinc," do NOT put something like

Code: Select all

if x==0
  return (1.0);
in the code. I can give a detailed explanation of why (in a few days) if you want one, but basically, if you do this, you will

a) introduce a (small) discontinuity in the computed Sinc values

b) actually increase the error in the final result of filtering.

c) (and add a branch.)

I could actually fix things so that forcing Sinc to return 1 when x=0 does not introduce a discontinuity, but doing so would lower (slightly) the quality of the results. I realize that this may seem unintuitive, but there is a medium length technical explanation which I'll provide (in a few days) if you ask.

5) If you are happy with this, we'll replace other approximations as well. Feedback, positive or negative, welcome, of course.

Re: Cheaper sinc computation for resize.c

Posted: 2010-08-31T22:28:01-07:00
by NicolasRobidoux
fmw42 wrote:Forgive me again for sticking my nose in.

I would suggest that it might be useful for evaluation purposes to see a graph showing the difference in the "fast" sinc from the more formal "exact" one and a difference graph and error measure. Thus, if differences are significant enough (or results differ when using in -resize), one could offer it as an alternate sinc filter for -resize so that users can choose speed over quality.

Anthony has graphed the various sinc-like filters that are used with -resize. See http://www.imagemagick.org/Usage/resize ... r_windowed as well as the windowing functions slightly below that section and other filters above that.
I put a (very ugly: to get this done quickly, I used the FLOSS computer algebra system Axiom, which I can program fast) plot of the absolute (what kinda looks like an upside down sinc) and relative (looks like an upside down modulated cosine) errors over the interval where the polynomial is used in the resize.c code for the cheapest of the two approximations. Because the sinc and the approximation are even, I only need to show [0,4] even though the approximation is "on" over [-4,4].

You will find this plot here: http://web.cs.laurentian.ca/nrobidoux/m ... 4order7.ps

There is no point plotting the polynomial itself: You will not see any difference with sinc.

Given that we minimize the maximum of the relative error over the whole interval, the above plot should not surprise anyone familiar with the Chebyshev/Kolgomorov/Remez theorems/method about minimax approximation. See, e.g. http://mathdl.maa.org/images/upload_lib ... nding.html and the excellent lecture notes http://www.damtp.cam.ac.uk/user/na/PartIIIat/b05.pdf.
Our approximations are actually computed using the boost library's minimax program (with some very careful reformulation of the original problem): http://www.boost.org/doc/libs/1_36_0/li ... nimax.html. If you'd like me to post the driver code, let me know.

Re: Cheaper sinc computation for resize.c

Posted: 2010-08-31T23:42:50-07:00
by NicolasRobidoux
So that people not have to download the source from my web site to look at it, here is the key new piece of resize.c.

Code: Select all

static MagickRealType Sinc(const MagickRealType x,
  const ResizeFilter *magick_unused(resize_filter))
{
  const MagickRealType xx = x*x;
  if (xx <= 16.0)
    {
      /*
	Approximations of the sinc function over the interval [-4,4]
	constructed by Nicolas Robidoux with the assistance of Chantal
	Racette with funding from the Natural Sciences and Engineering
	Research Council of Canada.
      */
#if 1
      /*
	Approximation with maximum relative error 6.3e-6 < 1/2^17 when
	computed in double or in long double precision, small enough
	for QuantumDepth = 16.
      */
      const MagickRealType c0= 0.173610016489197553621906385078711564924e-2L;
      const MagickRealType c1=-0.384186115075660162081071290162149315834e-3L;
      const MagickRealType c2= 0.393684603287860108352720146121813443561e-4L;
      const MagickRealType c3=-0.248947210682259168029030370205389323899e-5L;
      const MagickRealType c4= 0.107791837839662283066379987646635416692e-6L;
      const MagickRealType c5=-0.324874073895735800961260474028013982211e-8L;
      const MagickRealType c6= 0.628155216606695311524920882748052490116e-10L;
      const MagickRealType c7=-0.586110644039348333520104379959307242711e-12L;
      const MagickRealType p=c0+xx*(c1+xx*(c2+xx*(c3+xx*(c4+xx*(c5+xx*(c6+xx*c7
								       ))))));
#endif
#if 0
      /*
	Approximation with maximum relative error of 4.1e-11 < 1/2^33,
	small enough for QuantumDepth = 32.
      */
      const MagickRealType c0 = 0.173611111104053387736747210985091995555e-2L;
      const MagickRealType c1 =-0.384241241675270460704990597975054901693e-3L;
      const MagickRealType c2 = 0.394206107585307194760735392082304221077e-4L;
      const MagickRealType c3 =-0.250994418576941322440573445154577235099e-5L;
      const MagickRealType c4 = 0.112006375446163666148081492819921348554e-6L;
      const MagickRealType c5 =-0.374978898062694028977311290390107785130e-8L;
      const MagickRealType c6 = 0.983871412287130403267322909960351120031e-10L;
      const MagickRealType c7 =-0.208263021467529255455129616917897259775e-11L;
      const MagickRealType c8 = 0.360360141255689825413969279496845105034e-13L;
      const MagickRealType c9 =-0.500117812133871122182855704211250504815e-15L;
      const MagickRealType c10= 0.506270333308352987196209731044295839327e-17L;
      const MagickRealType c11=-0.277631746025848834036870351854616274324e-19L;
      const MagickRealType p=c0+xx*(c1+xx*(c2+xx*(c3+xx*(c4+xx*(c5+xx*(c6+xx
                                   *(c7+xx*(c8+xx*(c9+xx*(c10+xx*c11))))))))));
#endif
      return((xx+-1.0)*(xx+-4.0)*(xx+-9.0)*(xx+-16.0)*p);
    }
  else
    {
      /*
	If |x|>4, use the "direct" approach:
      */
      return(sin(MagickPI*(double) x)/(MagickPI*x));
    }
}

Re: Cheaper sinc computation for resize.c

Posted: 2010-09-01T05:47:52-07:00
by magick
Your patch will be available in ImageMagick 6.6.4-0 Beta by sometime tomorrow. Thanks.

Re: Cheaper sinc computation for resize.c

Posted: 2010-09-02T06:17:19-07:00
by NicolasRobidoux
Thank you.

-----

Sorry for being so picky about such a small piece of code, but I recommend that

Code: Select all

  if (x == 0.0)
    return(1.0);
be removed.

If, for whatever reason, you want to leave it there, I will produce a slightly inferior version of the approximations which is not affected by Sinc(0) being hardwired to 1.

SHORT TECHNICAL EXPLANATION

Hardwiring Sinc(0)=1 introduces a discontinuity in the filter's formula. This discontinuity is very small: Because the worse absolute error is actually at 0, it is about 6.3e-6 for the formula used with QuantumDepth <= 16, and 4.1e-11 otherwise. However, it is there, and discontinuous kernels add artifacts (which, here, will be, most likely, irrelevantly small).

Removing

Code: Select all

  if (x == 0.0)
    return(1.0);
will remove the discontinuity.

If you prefer to keep this code in, I'll produce coefficients which remove the discontinuity. Basically, I'll fix the coefficients so as to hardwire Sinc(0)=1. I can do this by multiplying them all by a constant. This, essentially, doubles the error in final results (in the worst case), so I may have to add one polynomial order to maintain the very tight error tolerances I imposed when I chose the approximation to use for each QuantumDepth.

If this is what you want, just let me know, and skip the following.

LONGER TECHNICAL EXPLANATION

At a location x, the relative absolute error between Sinc(x) and an approximating polynomial P(x) is defined as

e(x) = | ( Sinc(x) - p(x) ) | / | Sinc(x) | if Sinc(x) != 0,

and 0 otherwise (if Sinc(x)=0).

This definition only makes sense if p(x) = 0 whenever Sinc(x) = 0, as is the case with our approximations. Otherwise, at zeros of the denominator, the relative error is better understood in a limit sense, which makes it infinite if Sinc(x) = 0 but p(x) != 0.

The polynomial P(x) computed by the approximations is THE polynomial of its degree which minimizes the maximum of the relative absolute error over the interval [-4,4]. (In the code, the usual formula Sinc(x) = sin(pi*x)/(pi*x) is used for x values outside of [-4,4].)

It turns out that this global minimizer of the absolute relative error is not exactly equal to Sinc(0)=1 at 0.

A simple way of fixing things is replacing P(x) by Q(x)=C*P(x) where C = 1/P(0). Note that Q(0)=1.

Why did I not do this?

First, I would like to explain why the fact that P(0) != 1 does not matter when filtering. Basically, this is done by showing that filtering with Q is the same as filtering with P.

Going to 1D for simplicity (appropriate for tensor methods, which are most of resize.c's methods), the result of filtering using a footprint with 2N+1 values is

( sum_{i=-N}^N ( f_i * k(x-i) * w(x-i) ) ) / ( sum_{i=-N}^N ( k(x-i) * w(x-i) ) )

where f_i is the pixel value a i, k(x) is the filtering kernel, and w(x) is the windowing function.

The result with k(x) = Sinc(x) is what we are approximating.

If we use Q(x), we get

( sum_{i=-N}^N ( f_i Q(x-i) * w(x-i) ) ) / ( sum_{i=-N}^N ( Q(x-i) * w(x-i) ) )

Now, Q(x)=C*P(x).

Consequently C cancels out since it is a factor of the numerator and the denominator, and we get the same result using P instead of Q (or vice versa).

That is: In general, resize.c does not care whether the various functions are multiplied by individual constants.

Now, you may ask, why did I not simply give you Q in the first place?

The reason is that if the filtering window is wider than 8, which is the width of the interval [-4,4], both the polynomial approximation AND the standard formula sin(pi*x)/(pi*x) are used. The constant C does NOT appear in sin(pi*x)/(pi*x), and consequently it does not cancel out (completely). This leads to a doubling in the final result's error in the worst case.

In other words, if you allow the approximation not to be exact at 0, you get MORE accurate filtering results.

Put another way: Minimizing the approximation error over the whole domain requires sacrificing exactness at one point.

------

To summarize:

Please either remove the code which hardwires Sinc(0)=1, OR allow me to give you a slightly less accurate OR slightly more expensive approximation.

Or just ignore the above, since we are talking about a difference of at most 1 part in 2^16 or so.

Re: Cheaper sinc computation for resize.c

Posted: 2010-09-02T06:19:44-07:00
by magick
I have removed the statement as you recommend. Look for it in 6.6.4-0 Beta sometime tomorrow. Thanks.

Re: Cheaper sinc computation for resize.c

Posted: 2010-09-03T07:03:27-07:00
by NicolasRobidoux
I just double checked, and the maximum relative error for the QuantumDepth = 16 approximate sinc when used to filter with lanczos 2 or lanczos 3 is 2.5e-5 < 2/(2^16-1) = 3.1e-5. So, my earlier statement that using it is guaranteed to lead to an answer within 2 of using infinite precision lanczos2 or 3 to get a 16 bit output image holds. (My earlier computation was done doing back of the envelope error propagation. This new bound is computed directly with the formula.)

That is, I can give a 100% guarantee that the final answer will always be within 2 with 16 bit output.

Likewise, the max relative error for QuantumDepth = 32 when filtering with lanczos 2 or 3 is 1.6e-10 < 2.3e-10 = 1/(2^32-1).

So, I can also give a 100% garantee that the final answer will always be within 1 with 32 bit output.

If you feel that this is too large a tolerance (I honestly don't think that this makes any difference quality-wise), let me know and I'll add one coefficient to the Quantum_Depth = 16 approximation. Or I'll give you direct approximations of lanczos 2, 3, ... bypassing the issue for sinc-sinc filters.

If Sinc is used with a non-sinc window, the differences are even smaller, because approximation error only enters in one factor instead of two. For Quantum_Depth = 16, this gives 1.3e-5 < 1.5e-5 = 1/2^16; with Quantum_Depth = 32, this gives 8.2e-11 < 2.3e-10 = 1/2^32. So, the approximated Sinc is garanteed to yield filtering results within 1 of those obtained with an infinite precision Sinc when doing a windowed Sinc with another windowing than Sinc for both 16 and 32 bit output images.

Re: Cheaper sinc computation for resize.c

Posted: 2010-09-04T05:00:08-07:00
by NicolasRobidoux
One last thing (before I move on to approximating other kernels):

The difference between the frequency response obtained with the Quantum_Depth <= 16 Sinc approximation and the frequency response of the "infinite precision" Sinc is minuscule.

So, if it is the quality of the results which matter, as opposed to conforming to some ideal (windowed sinc (discrete) filtering) which is itself an approximation of a continuous convolution with a windowed Sinc which is itself an approximation of a continuous convolution with the "infinite precision" Sinc which itself is an idealization since it really is meant for band limited continuous signals, I would ALWAYS use the cheaper of the two approximations. This cheaper approximation, for a number of reasons, fully deserves the "Poor Man's Sinc" controlled appelation.

Re: Cheaper sinc computation for resize.c

Posted: 2010-09-04T14:42:20-07:00
by NicolasRobidoux
There is something I forgot to take into account in my error estimates :

It is that generally the filter is used twice, once in each direction. This doubles all the relative error estimates given above for the filtering results.

Re: Cheaper sinc computation for resize.c

Posted: 2010-09-05T13:16:20-07:00
by NicolasRobidoux
I just added an 8-bit quantumdepth version. The new code is at

http://web.cs.laurentian.ca/nrobidoux/misc/IM/resize.c

A minor typo is also fixed in the brand new blackman code comments.

This code is based on today's SVN, so it's "on top" of Cristy's recent svn commits.