Digital post-processing of images for video sequences and movies: noise reduction. Why is it needed? Why is it complicated to achieve?
An image is a collection of coloured pixels located at spatial coordinates (x, y). Every pixel is composed of three values that are its projection on the components (R, G, B).
R is the red component.
G is the green component.
B is the blue component.
Mixing the three primary colours in the proportions (R, G, B) (they are superimposed, and therefore added) gives the impression of a single overall colour.
The relation between the three values R, G and B and the resulting colour is not totally intuitive, except in obvious cases (e.g. when R is very close to the maximum value while G and B are near zero, the colour is red).
When R, G and B are equal, the pixel is gray (from black, at coordinates (0, 0, 0), to white, at coordinates (max, max, max)).
The axis R = G = B is dotted in the drawing below : we draw in red the “triangle” that intercepts the sphere of centre (R, G, B) = (0, 0, 0) and radius ρ.
ρ is called the luminance of the pixel.
The farther the point is from the dotted line, the more coloured the pixel is: the angle θ determines the hue of the pixel, and the distance r from the axis determines the intensity of the colour effect. r is called the saturation.
Those coordinates are called H, S, L :
Hue = H
Saturation = S
Luminance = L
Note that several definitions of distance can be used: r can be measured in a straight line, or as an arc length along the sphere of radius ρ in direction θ, and so on.
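The geometric description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rgb_to_polar` is hypothetical, r is taken as the straight-line distance to the axis R = G = B, and the origin of the angle θ (pure red at 0 degrees) is an arbitrary choice that the text does not specify.

```python
import math

def rgb_to_polar(r, g, b):
    """Return (theta_deg, r_dist, rho) for an RGB triple.

    rho       : distance from the origin (the luminance of the text)
    r_dist    : straight-line distance from the gray axis R = G = B (saturation)
    theta_deg : angle around the gray axis in degrees, or None on the axis
    """
    rho = math.sqrt(r * r + g * g + b * b)
    # Coordinates in the plane perpendicular to the gray axis (1, 1, 1).
    # Assumption of this sketch: the first basis vector points towards red.
    x = (2 * r - g - b) / math.sqrt(6)   # along e1 = (2, -1, -1) / sqrt(6)
    y = (g - b) / math.sqrt(2)           # along e2 = (0, 1, -1) / sqrt(2)
    r_dist = math.hypot(x, y)
    if r_dist < 1e-12:
        return None, 0.0, rho            # gray pixel: the angle is undefined
    theta_deg = math.degrees(math.atan2(y, x)) % 360.0
    return theta_deg, r_dist, rho
```

With this convention, pure red, green and blue fall at angles of about 0, 120 and 240 degrees, and any gray pixel returns `None` for the angle.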
Moreover, the sensitivity of the human eye to a variation Δθ of the angle depends on the value of θ. This is why we perceive a rainbow as separate bands of colour, although it is a continuous variation of colours.
For all these reasons, there are many transformations of the coordinate system (H, S, L), intended to simplify the use of the coordinates and to match as closely as possible the colours perceived by the human brain.
But these systems are only transformations of systems (R, G, B) or (H, S, L) described above.
The coordinates R, G, B have the disadvantage of being difficult to interpret (it is difficult, by reading the three coordinates, to deduce the seen colour, except in trivial cases), but they have the advantage of being always defined.
Conversely, the coordinates H, S, L (and their variants) have the advantage of being easy to interpret, but they are not always defined:
– When a pixel is almost black (L near zero), the angle (H) can take any value for exactly the same colour impression. For absolute black (L = 0), the angle is formally undefined.
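The instability of the hue near black can be observed with the `colorsys` module of the Python standard library. Note the assumption: `colorsys` uses its own HLS variant (based on the maximum and minimum channel), not the geometric definition given above, but the effect is the same. The helper name `hue_degrees` is hypothetical.

```python
import colorsys

def hue_degrees(r, g, b):
    """Hue in degrees, from the standard-library HLS conversion."""
    h, _lightness, _saturation = colorsys.rgb_to_hls(r, g, b)
    return h * 360.0

# Three almost-black pixels that differ by one thousandth of full scale.
near_black = [(0.002, 0.001, 0.001),
              (0.001, 0.002, 0.001),
              (0.001, 0.001, 0.002)]
hues = [hue_degrees(*p) for p in near_black]

# For absolute black the library simply returns 0 by convention.
black_hue = hue_degrees(0.0, 0.0, 0.0)
```

The three pixels look identically black, yet their hues come out roughly 120 degrees apart (close to 0, 120 and 240). A noise of one thousandth of full scale is therefore enough to swing H across the whole colour circle.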
Improving Luminance contrast
Improving the contrast means increasing the distance between two or more classes of luminance. We draw below a cross-section of the cone of colours:
We define (arbitrary) thresholds that delimit classes of luminance. For example, we can define a high threshold above which a pixel is considered “white”, and a low threshold below which it is considered “black”.
This configuration is often encountered: we want the blacks blacker and the whites whiter, while the intermediate gray levels remain unchanged. Such an operation is represented as follows:
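One possible form of such an operation can be sketched as follows, with luminance normalised to the range 0 to 1. This is an assumption, not the curve of the original figure: the function name `stretch_extremes`, the thresholds `low` and `high`, and the exponent `gamma` are all hypothetical choices.

```python
def stretch_extremes(lum, low=0.25, high=0.75, gamma=2.0):
    """Darken values below `low`, brighten values above `high`,
    and leave the intermediate gray levels untouched.
    `lum` is expected in [0, 1]; gamma > 1 strengthens the effect."""
    if lum < low:
        # Power curve that stays continuous at lum == low.
        return low * (lum / low) ** gamma
    if lum > high:
        # Mirror image of the dark branch, continuous at lum == high.
        return 1.0 - (1.0 - high) * ((1.0 - lum) / (1.0 - high)) ** gamma
    return lum
```

With the default values, a dark level of 0.1 is pushed down to about 0.04, a bright level of 0.9 is pushed up to about 0.96, and a mid-gray of 0.5 is returned unchanged. Just below `low` and just above `high` the slope of the curve is greater than 1, which is where small differences in the input are stretched.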
What happens in the presence of noise in the image (e.g. the electronic noise of a digital camera sensor)?
The diagram above clearly shows that the noise is amplified when the contrast is increased.
Even if the noise was barely noticeable in the initial images, it may well be clearly visible in the post-processed images.
Thus, the initial noise level of the images limits how far the contrast can be enhanced.
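The amplification can be checked numerically on a simplified case. The sketch below assumes a plain linear contrast stretch around mid-gray (not the thresholded operation described earlier) and Gaussian noise on a flat gray area; the names `linear_contrast`, `gain` and `pivot` are hypothetical.

```python
import random
import statistics

def linear_contrast(lum, gain=3.0, pivot=0.5):
    """Stretch luminance around `pivot` by `gain`, clipped to [0, 1]."""
    return min(1.0, max(0.0, pivot + gain * (lum - pivot)))

rng = random.Random(0)                     # fixed seed, for repeatability
# A flat mid-gray area with a small sensor-like noise (standard deviation 0.01).
flat_gray = [0.5 + rng.gauss(0.0, 0.01) for _ in range(1000)]
enhanced = [linear_contrast(v) for v in flat_gray]

noise_before = statistics.pstdev(flat_gray)
noise_after = statistics.pstdev(enhanced)
ratio = noise_after / noise_before         # close to the contrast gain
```

As long as no value is clipped, the standard deviation of the noise is multiplied by the contrast gain: a noise at 1 % of full scale becomes a noise at about 3 % after a gain of 3.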
Improving colour contrast
You can increase the colour contrast by saturating the colours (increasing S).
If you consider the triangle R, G, B, for a given luminance, this is as follows:
This amplifies two kinds of noise:
– Saturation noise: gray pixels appear within the saturated colour.
– Colour noise: pixels of other colours appear within the saturated colour.
Again, it is clear that the noise of the initial image is the factor that limits how far colours can be post-processed. Even if the noise was barely noticeable in the initial images, it may well be clearly visible in the post-processed images.
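The same effect can be sketched for colours. The code below assumes a very simple saturation boost that moves each pixel away from its own gray value by a factor `k`; taking the mean of the three channels as that gray value is an assumption of this sketch, and the names `saturate` and `chroma_distance` are hypothetical.

```python
def _clip(v):
    """Keep a channel value inside [0, 1]."""
    return min(1.0, max(0.0, v))

def saturate(r, g, b, k=2.0):
    """Move a pixel away from its gray value by factor k (k > 1 saturates)."""
    gray = (r + g + b) / 3.0      # assumed gray reference: the channel mean
    return tuple(_clip(gray + k * (c - gray)) for c in (r, g, b))

def chroma_distance(p, q):
    """Distance between two pixels once their gray values are removed."""
    pm, qm = sum(p) / 3.0, sum(q) / 3.0
    return sum(((a - pm) - (b - qm)) ** 2 for a, b in zip(p, q)) ** 0.5

clean = (0.6, 0.4, 0.3)
noisy = (0.58, 0.42, 0.3)         # same patch with a small colour error
before = chroma_distance(clean, noisy)
after = chroma_distance(saturate(*clean), saturate(*noisy))
```

In this example the colour difference between the clean pixel and the noisy one is doubled by a saturation factor of 2, while a perfectly gray pixel is left unchanged.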
Perception of noise in the image by the human brain
NEXYAD has been involved for many years in human factor analysis, and especially in human perception. Noise in an image is perceived in two main ways:
– On a still image, noise is detected because of the contrast between the “clean” pixels and the noisy pixels.
Human vision takes contrast into account in a very complex way: our sensitivity to contrast depends on the overall lighting level, the size of objects, and colours.
Analysis of the CSF (Contrast Sensitivity Function) of human vision shows that flat areas (e.g. a door, a smooth wall, a vehicle body, …) are the areas where spatial high-frequency noise is most detectable.
– On a video sequence, we also detect noise with motion detectors that we have in the periphery of the eye.
The noise is random (e.g. the electronic noise of a digital camera), and each image carries an independent realization of this noise, so noisy pixels change hue, saturation and luminance from one image to the next.
The apparent movement of the noise allows us to detect it even at very low contrast, too low for it to be detected on still images taken from the same sequence.
Noise reduction in images
We can reduce the noise by acting on the two factors above:
– reduce the contrast of the noise in each image (approach called spatial noise reduction)
– reduce the amplitude of the noise temporal variations (approach called temporal noise reduction)
The removal of spatial noise tends to erase the contours of objects as well (and hence to blur the overall image).
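The blurring can be seen on a one-dimensional example. The sketch below assumes the simplest possible spatial filter, a moving average over neighbouring pixels; the function name `box_blur` and the `radius` parameter are hypothetical.

```python
def box_blur(row, radius=1):
    """Replace each sample by the mean of itself and its neighbours
    (the window is shortened at the borders)."""
    out = []
    for i in range(len(row)):
        window = row[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out

edge = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]   # a sharp contour
blurred = box_blur(edge)
```

The sharp jump from 0 to 1 is spread over several pixels (about 0, 1/3, 2/3, 1): the averaging that attenuates the noise cannot tell it apart from a contour.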
Similarly, the elimination of temporal noise generates “ghost” images, due to the group delay of the signal processing systems.
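The ghost effect can be illustrated on a single pixel followed over time. The sketch below assumes a first-order recursive average, one common form of temporal filter; the names `temporal_filter` and `alpha` are hypothetical.

```python
def temporal_filter(frames, alpha=0.5):
    """First-order recursive average of one pixel over successive frames:
    y[t] = alpha * x[t] + (1 - alpha) * y[t - 1]."""
    out = []
    state = frames[0]             # start from the first observed value
    for x in frames:
        state = alpha * x + (1.0 - alpha) * state
        out.append(state)
    return out

# A bright object covers the pixel for three frames, then leaves.
pixel = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
filtered = temporal_filter(pixel)
```

After the object has left, the filtered pixel does not drop to 0 at once but decays through 0.5, 0.25 and 0.125: this fading trail is the ghost. A smaller `alpha` averages more frames, so it removes more noise but leaves a longer trail.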
The reduction of noise in videos is therefore a complex matter that requires nonlinear mathematical methods, whose application must be constrained by a psycho-sensory model of human vision (to distinguish what is perceived by our brain from what is not, and to prioritize the image-cleaning tasks). A preliminary analysis of the structure of the images is usually required (using methods such as wavelet transform analysis).
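To give an idea of what a nonlinear method brings, here is a sketch of one of the simplest nonlinear filters, the median filter. It is only an illustration chosen for this example, not the method referred to above: it uses neither a perceptual model nor a wavelet analysis. The name `median_filter` is hypothetical.

```python
import statistics

def median_filter(row, radius=1):
    """Replace each sample by the median of itself and its neighbours
    (the window is shortened at the borders)."""
    out = []
    for i in range(len(row)):
        window = row[max(0, i - radius): i + radius + 1]
        out.append(statistics.median(window))
    return out

# A sharp contour (between index 4 and 5) with two isolated noisy pixels.
noisy_edge = [0.0, 0.0, 0.9, 0.0, 0.0, 1.0, 1.0, 0.1, 1.0, 1.0]
cleaned = median_filter(noisy_edge)
```

On this example the two isolated noisy pixels are removed while the contour stays perfectly sharp, which a linear average cannot do. On real images the median has its own limits (it erases fine details such as thin lines), which is one reason why more elaborate methods are needed.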