Just for fun, I'll write up interpolation methods used in image (or signal) processing during affine transforms (scaling, rotation, etc.). An application like Photoshop offers a pull-down menu so you can choose an interpolation method when you do an affine transform.
Interpolation can be viewed as deriving a continuous signal g(x), x is real, from a discrete signal f(j), j an integer. In practice the desired result of interpolation will be another discrete signal h(k), k an integer where h(k) = g(x).
Interpolation methods commonly used for images are, in increasing order of computational cost:
* Nearest neighbor
* Linear
* Cubic
* Cubic spline
The first two methods are too trivial to discuss. "Cubic interpolation" on images is often called "Bicubic interpolation" (and Linear called Bilinear) because the interpolation method is operated TWICE, once along an X axis, once along a Y axis. But in this post I'll just consider a single dimension.
[ETA: The above description is wrong. You do FOUR 4-point interpolations along an X-axis to get four interpolated points for input to the final (fifth) interpolation along the Y axis.]
It is easy to conflate "cubic" and "cubic spline" interpolation methods, especially since BOTH methods involve splines of cubic polynomials. But the two methods are very different.
In the cubic spline method, the SIGNAL is interpolated to be a spline of cubics; in the cubic method it is an INTERPOLATION FUNCTION which is set to a spline of cubics. I never pursued
Cubic Spline Interpolation: Its computational cost seemed far too huge for processing large images quickly and, as discussed below, the higher-speed Cubic interpolation gives excellent results.
All methods except cubic spline can be described via an
interpolation function, w(i). A SINGLE interpolation function is set in advance and used for all interpolations. (Unlike cubic spline interpolation where the proper cubic is computed for each pair of adjacent samples.) The derived continuous function g(x) is simply the convolution of f() and w(), i.e. for 0 < y < 1,
. . . . . g(j+y) = w(-1-y)*f(j-1) + w(-y)*f(j) + w(1-y)*f(j+1) + w(2-y)*f(j+2)
This equation shows the convolution input as having 4 points {f(j-1), f(j), f(j+1), f(j+2)} — these are the four points in the original discrete signal which are closest to f(j+y).
In nearest neighbor, linear, and cubic interpolations, the convolution input has respectively 1, 2, and 4 points.
If the signal is band-limited and sampled above the Nyquist rate, then it is a theorem of Fourier theory that the ideal interpolation function will be the
Sinc function.
Whew! After all this prelude, we see that some solutions to the interpolation problem can be viewed as simply finding an approximation to the Sinc function, and using that approximation as our interpolation function! We can't use the sinc function itself: it's infinite, with small ripples extending forever. (Conceptually, our interpolation function will be infinite, but its value will be zero everywhere outside the central kernel area.)
Images are NOT band-limited — they contain sharp edges — but it turns out the cubic interpolation method we outline works rather well on many real images.
The Interpolation function — remember that this is ideally an approximation to sinc() — is implicitly a square wave for nearest neighbor interpolation, and a triangle wave for linear interpolation. For the remainder of this post we consider ONLY cubic interpolation, where the interpolation function has width four and looks like a finite approximation to sinc(). Because the very same interpolation function is always used it can be pre-computed, at some granularity, and the convolution filter values come from table look-ups.
So: How is that function derived? I think it's rather cute.
The filter is a spline of four cubic polynomials, but it is symmetric so only two cubics are derived. A cubic ax^3 + bx^2 + cx + d has 4 degrees of freedom; two cubics mean 8 degrees of freedoms, but seven of those freedoms are made to disappear via constraints:
* We want the function to be
continuous — this constraint kills 2 freedoms
* We choose the function to be differentiable everywhere! — this constraint kills 3 freedoms
* We want g(x) = f(j), i.e. the interpolation doesn't change known samples — this constraint kills 2 freedoms.
This leaves 1 degree of freedom I call "sharpness": it effects the magnitude of the side-lobes. With the usual parametric formula, the parameter 0.5 is usually used, but in the experiment described below I found that the "sharper" 0.6 worked somewhat better.
This concludes my explication of cubic interpolation. Raise your hand if this makes any sense at all!
~ ~ ~ ~ ~ ~
Now let's talk about rotating an image. (Or more generally, any affine transform with a rotational component.)
The obvious way to rotate is simply to map the target pixel to a location among the input pixels, extract the 4x4 pixel array near that location, and apply the 4-point interpolation twice, once on an X axis, once on a Y axis. Straightforward ... but there is a problem. Or rather, there was a problem in the olden days when memory was very limited.
With limited memory, we'd like to do the image processing just a few lines at a time, but when rotating by, say, 45 degrees, the input and output images do not line up. We'd have to either hopscotch through the input or hopscotch through the output.
Paeth's clever approach to the rescue! The idea is to replace Rotation with a 3-step process: Shear, Transpose by 90 degrees, Shear again. (Never mind that the Transpose has a similar problem to the hopscotching just mentioned!) This is called Paeth's Two-Shear algorithm.
Much superior, depending on the rotation angle, may be Paeth's Three-Shear algorithm, which has 5 steps: Shear, Transpose by 90 degrees, Shear, Transpose by 90 degrees, Shear. No, I'm too lazy to go into all the details!
Anyway, I did some experiments. I started with a particular image, rotated it a few dozens of times by some arbitrary angles
which summed to a multiple of 360 degrees, and then compared the result with the original image. (I did this three times, using Paeth 2-Shear, Paeth 3-shear and the pristine Rotate.) Paeth 2-shear didn't do well, but with the other techniques the resultant image was very close to the original, despite that it had gone through dozens of interpolations. (It did get a bit blurry with the standard cubic parameter of 0.5, but not with 0.6.) Fun!
I'm pretty sure nobody here cares about, or even reads, these posts of mine; but it's fun for me to recollect and write about some of the things I worked with almost three decades ago!