Imagine a pinhole camera with a piece of graph paper on the back instead of film. Point the camera at a scene, grab a box of crayons, and start coloring in each cell of the graph paper with the color you see the most of in that cell before the scene changes. You've just created a very low-tech, slow, and inaccurate digital camera. Congrats. Now you have the basics of how a pixel is formed. Sensors are laid out in a grid, light enters for a given amount of time (exposure), and the value of each sensor becomes a pixel. Of course it's more complex, but for our discussion this is enough. So a pixel is the mean of all the light that struck its area.
Semantics (or being really pedantic):
time-varying signal - information transmitted over time
signal sample - a unit of information from a signal, taken over a given amount of time
space-varying signal - nonsense; used to describe an N-dimensional sample set, designed to confuse people and drive me nuts.
So where did this all go wrong? At some point, it seems, someone confounded an image made of pixels with a signal made of samples. An image is not a signal, unless it's being transmitted as such. If you're dealing with transmission errors, then reconstruction filters, Fourier analysis, and all sorts of signal processing on the signal make perfect sense. That's not to say you can't get some cool effects using those techniques on an image, but then you are only applying an effect to an image, nothing else. (By now, I've made some enemies.) A lot of research has gone into signal processing techniques applied to images, some of it really interesting, but that's not what I'm talking about. I have no beef with those people; they do some great work.
My beef is with the image resizing folks, where resizing is treated as if an image were a signal. A signal cannot be 2D, because time is a single dimension, last time I checked anyway. So the standard trick is to break the image into two sets of signals, horizontal and vertical, and apply the filters first in one direction and then in the other.
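To make the approach being criticized concrete, here is a minimal sketch of that two-pass "separable" filtering: a 1D kernel is convolved along every row, then along every column of the result. The function names and the use of NumPy are my own illustration, not anything from the original text.

```python
import numpy as np

def filter_1d(line, kernel):
    # convolve one row or column with a 1D kernel, keeping the same length
    return np.convolve(line, kernel, mode="same")

def separable_filter(img, kernel):
    # pass 1: treat each row as a "horizontal signal" and filter it
    tmp = np.apply_along_axis(filter_1d, 1, img, kernel)
    # pass 2: treat each column as a "vertical signal" and filter that
    return np.apply_along_axis(filter_1d, 0, tmp, kernel)
```

Running a normalized kernel like `[0.25, 0.5, 0.25]` through both passes is equivalent to one 2D filtering step, which is exactly why the signal-processing crowd likes this decomposition, and exactly the point where pixels-as-areas have quietly been replaced by samples-on-a-grid.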
Yeah, we've solved our time problem, except pixels are gone from the equation. OK, so by now I've lost everyone. Let's get some examples to help clear up the picture (bad pun, bad).
Say you want to reduce the resolution of an image by half in each direction (also known as a quarter of the total size). Take the graph paper from the example above, place a new sheet of graph paper over it, put the pair on a window or overhead projector, grab the crayons again, and color in each set of 2x2 cells as if it were a single cell, in the same fashion as when you made the first image. You've just downsized your first image by hand. Wasn't that fun...
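The graph-paper exercise above can be sketched in a few lines of code. This is a minimal version, assuming a grayscale image stored as a float array with even dimensions; the function name is my own.

```python
import numpy as np

def downscale_2x(img):
    # average each 2x2 block of pixels into one pixel: the new pixel is
    # the mean of all the "light" inside its (now larger) boundary,
    # just like coloring over four cells of graph paper at once
    h, w = img.shape
    assert h % 2 == 0 and w % 2 == 0
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```

For example, a 2x2 image with values 1, 3, 5, 7 collapses to a single pixel with value 4, the average of the light that fell inside the new pixel's boundary.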
During image reduction, each pixel is the average of all the light inside its boundary, or the integral over the pixel area in the limit of infinite sample counts. This also means that during up-scaling, the pixels in the up-scaled image must contribute to the original pixel they overlap in the same fashion as during down-scaling. The reason this must be true is that energy will be either lost from or added to the image if we don't, and since we're only talking about resizing, the energy should remain constant. Any change in energy can also be considered a loss of information. Remember: a pixel is the average of all the energy contained in its boundary; it is not a point sample.

Grab any modern paint or photo editing program, scale an image up by 2x, and then scale it back down to the original size: you will not have the same image you started with. Technically there is one true up-sizing algorithm, and it's called nearest neighbor. It is perfectly reasonable that the reverse, scaling down and then back up, will not give you the same image, because you are averaging when you down-scale. But if you're losing information during up-scaling, you are not up-scaling, you are doing something else.

Interpolation is often used to achieve this effect, but it can't even be called interpolation, because a pixel is not a point sample, it's a little square. In fact, since the pixel is an average, its value may never actually appear anywhere within the pixel. Remember that interpolation passes through the data points, and since that value may not even exist, it's pointless to try to force the original pixel value to exist in the new image. What actually needs to happen is a more generalized form of curve fitting. To be more accurate, what we really want is surface fitting, but let's take it one step at a time.
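The energy-conservation argument above can be checked directly: nearest-neighbor up-scaling replicates each pixel into a 2x2 block, so averaging those blocks back down recovers the original image exactly. This is a small sketch under the same assumptions as before (grayscale float array); the function names are my own.

```python
import numpy as np

def upscale_2x_nearest(img):
    # nearest neighbor: replicate each source pixel into a 2x2 block, so
    # the average (the "energy") inside each original pixel boundary is
    # exactly preserved
    return img.repeat(2, axis=0).repeat(2, axis=1)

def downscale_2x(img):
    # average each 2x2 block back into one pixel
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

img = np.array([[10.0, 200.0], [90.0, 40.0]])
round_trip = downscale_2x(upscale_2x_nearest(img))
# the up-then-down round trip is exact: no energy was lost or added
assert np.allclose(round_trip, img)
```

Any up-scaler that smooths or interpolates across pixel boundaries would generally fail this round trip, which is the author's point: it is doing something other than pure up-scaling.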