Video noise reduction device and method
The invention relates to a device for noise reduction of a video signal, comprising filtering means for filtering the video signal and delivering a filtered signal, the filtering means being a local regression filtering means.
The invention also relates to a method for noise reduction of a video signal.
Such noise reduction devices are well known in the art. For filtering a video signal they basically process pixels composing the video signal one by one, each one being called the current reference pixel, and consider a certain range of pixels around the current reference pixel to compute an estimation of said current reference pixel. This principle is also called the principle of the sliding window.
The estimation of the current reference pixel is usually based on a weighted average of the feature value of the surrounding pixels, the weights depending on the relative position to the current reference pixel. This technique is called a local regression. Depending on the weighting coefficients, this may also be seen as a local regression of a certain order. The feature value of a pixel may be its luminance, its chrominance or any measure depending on its intensity or color composition.
The techniques of local regression are well known in the field of data analysis (W.S. Cleveland and S.J. Devlin, "Locally weighted regression: An approach to regression analysis by local fitting", Journal of the American Statistical Association, 83, 596-610) and artificial intelligence (C.G. Atkeson, A.W. Moore, S. Schaal, "Locally Weighted Learning", Artificial Intelligence Review, 11:11-73, 1997).
When all weighting coefficients are equal, this is a linear local regression, which minimizes the least squares criterion and which is also called the least square local regression. It is one of the simplest ways to perform the computation of the current reference pixel estimation because it simply takes an average of the feature values of the surrounding pixels, the current reference pixel being included or not in said surrounding pixels.
The linear local regression filter can be implemented for instance on integrated circuits in a simple manner, but tends to blur picture details, or in other words high frequency picture content. Indeed the linear local regression filter amounts to averaging the feature values of the surrounding pixels, and, consequently, the filtering degrades the image quality by smoothing or blurring details.
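The blurring effect described above can be illustrated with a minimal sketch in Python. It is only an illustration under simplifying assumptions: the signal is one-dimensional, the window is truncated at the borders, and the function name `box_average` is hypothetical.

```python
def box_average(signal, radius):
    """Equal-weight local average over a sliding window.

    The window is truncated at the borders of the signal (an assumption
    made here for simplicity).
    """
    out = []
    for i in range(len(signal)):
        lo = max(0, i - radius)
        hi = min(len(signal), i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


# A sharp edge between two flat regions.
step = [10, 10, 10, 10, 90, 90, 90, 90]
smoothed = box_average(step, radius=1)
```

In this sketch the flat parts of `step` are left unchanged, while the two samples on either side of the edge are pulled towards each other, which is the smoothing of details referred to above.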
In order to tackle this problem, one approach is well known in the art: the use of robust local regression filters. A local regression filter is called "robust" if, when filtering the noise on a current pixel by performing local regression, it can handle a certain amount of picture details and still return a filtered image without excessively blurring these picture details. A filter is not robust when it is influenced by the presence of picture details. Many of these filters are described in the literature for application in the field of data analysis, and use for example methods such as the least absolute deviation method, the iteratively reweighted least squares method (P.W. Holland and R.E. Welsch, "Robust regression using iteratively reweighted least-squares", Communications in Statistics: Theory and Methods, A6, 813-827, 1977), the least median of squares method (P.J. Rousseeuw, "Least Median of Squares Regression", Journal of the American Statistical Association, 79, 871-880, 1984) and the least trimmed squares method.
Though such robust local regression filters generally work well for providing an estimation of a pixel in an image containing details, they are relatively unpractical because of their high implementation cost, for instance on integrated circuits.
It is an object of the invention to provide a device for noise reduction of a video signal, which substantially preserves picture details in the image and which presents a reduced implementation cost, for instance on an integrated circuit. It is a further object of the invention to provide a device for noise reduction of a video signal which produces an image substantially preserved from the presence of artifacts in regions of the image with soft gradients of the feature value.
To this end, the noise reduction device according to the invention is characterized in that it comprises means for comparing the video signal and the filtered signal, and for delivering a control signal depending on said comparison, means for combining the filtered signal and outputs of at least one additional filtering means, each of which takes as input the video signal, and for delivering an output signal, the combination depending on the control signal, and in that one of the at least one additional filtering means is a discriminating averaging filtering means.
When in operation, the behavior of a device according to the invention is such that in picture regions presenting few details or no details at all, i.e. in regions with small or no variations of the feature value and/or of the gradient of the feature value, these regions being commonly called "flat regions", the comparison between the video signal and the filtered signal is such that the control signal has a value indicating that the region is flat. The control signal then transfers this indication to the combining means, which combines the filtered signal and the outputs of the other filtering means in such a way as to produce the best output signal with regard to the value of the control signal. Since the local regression filter is suitable for handling such flat regions, the output signal depends mainly on the filtered signal in flat regions of the image.
In picture regions presenting a certain amount of picture details, the behavior of the device according to the invention is such that, when in operation, the comparison between the video signal and the filtered signal is such that the control signal has a value indicating that the region contains picture details, e.g. sharp edges. The control signal then transfers this indication to the combining means, which combines the filtered signal and the outputs of the other filtering means in such a way as to produce the best output signal with regard to the value of the control signal. Since the discriminating averaging filtering means is appropriate for treating such regions with a certain amount of picture details, the output signal depends mainly on the output signal of the discriminating averaging filtering means in these regions of the image.
In intermediate types of regions between the two types of picture regions mentioned above, the device according to the invention applies the same mechanism, save that the combining means operates in such a way that the outputs of several filters contribute in a more balanced manner to producing the best output signal with regard to the value of the control signal.
A device according to the invention is thus such that its output signal substantially preserves picture details, since the presence of said picture details is monitored and since, in the presence of said picture details, one or several suitable filtering means, in particular the discriminating averaging filtering means, counterbalances the poor performance of the local regression filtering means in the presence of such picture details. At the same time the device according to the invention still lives up to expectations with a simple local regression filtering means and simple additional filtering means, i.e. the discriminating averaging filtering means and the potential other filtering means, so that it leads to low implementation cost.
The control signal may advantageously be the sum of the squares of the differences between the video signal and the filtered signal in the sliding window around the current reference pixel. The advantage is twofold: On the one hand, when a certain amount of picture details are present in the image, the control signal, which can be called in this case the residual, becomes very large and, when picture details are substantially absent from the image, the control signal becomes very small. This thus provides a control signal having a good sensitivity to the quality of the local regression filtering means. On the other hand, the implementation cost of such a comparison is reduced, for instance on integrated circuits. The control signal may advantageously be the sum of the magnitudes, or absolute values, of the differences between the video signal and the filtered signal in the sliding window around the current reference pixel. The implementation cost of such a comparison is reduced, for instance on integrated circuits, while maintaining a good sensitivity to the quality of the local regression filtering means.
Furthermore, the output signal may also advantageously be a weighted sum of the filtered signal and the outputs of the at least one additional filtering means. The advantage is the simplicity of implementation.
Still furthermore, the device according to the invention may also advantageously contain a single additional filtering means, being the discriminating averaging filtering means. The advantage is also the simplicity of implementation.
These and further aspects of the invention will be explained in greater detail by way of example and with reference to the accompanying drawings in which:
Fig. 1 is a schematic description of an embodiment of a noise reduction device according to the invention;
Fig. 2 is a schematic description of a detail of an embodiment of the noise reduction device according to the invention wherein the discriminating averaging filter is a three-dimensional SWAN filter;
Fig. 3a is a schematic view of a 15-pixel window around the current reference pixel in the current field, used in the three-dimensional SWAN filter of Fig. 2;
Fig. 3b is a schematic view of a 9-pixel window in the previous field, used in the three-dimensional SWAN filter of Fig. 2;
Fig. 4 is a schematic description of another embodiment of the noise reduction device according to the invention with an advantageous amplifier applied to the control signal;
Fig. 5 is a schematic description of yet another embodiment of the noise reduction device according to the invention with a lookup table implementing a nonlinear function applied to the control signal;
Fig. 6 is a schematic description of a further embodiment of the noise reduction device according to the invention with the discriminating averaging filter participating in the computation of the filtered signal.
The figures are not drawn to scale. In general, like reference numerals refer to like parts.
According to a first aspect, the invention concerns a noise reduction device. Fig. 1 shows a schematic description of an embodiment of the noise reduction device, which constitutes an example of the invention. The device takes as input a video signal (2), which may have the form of a sliding window representing a certain number of pixels. The video signal (2) may also be a flow of discrete values, each of which represents a pixel, accompanied then by one or several means for transforming this continuous flow of values into windows of pixels, this or these transforming means being either independent from the components of the device or integrated into them. Usually the transforming means combine buffers or delay elements applied to the values in order to obtain an appropriate sliding window. The video signal (2) may also be an analog signal, accompanied by means to make the signal discrete, said means being integrated in one way or another into the present device, and by means for obtaining a window of discrete values. These means for making the signal discrete are well known in the art. For instance, the video signal input (2) may be a standard definition 8- or 9-bit wide video stream, which is sampled at a 13.5 MHz clock frequency.
In brief, it will be clear for the person skilled in the art that the video signal (2) may be either discrete (digital) or analog, accompanied by the adequate transformation means.
The video signal (2) is the input of a local regression filter (4) which produces a filtered signal (6). When the local regression is done according to the least squares criterion, i.e. by minimizing the quadratic error between the video signal (2) and the filtered signal (6) in the window around the current reference pixel (44), one obtains a simple two-dimensional filter: the linear local regression. The linear local regression amounts to computing the coefficients $a$, $b$ and $c$ that define the surface of the linear local regression, the coefficients being the best according to the least squares criterion, in the function $z = a \cdot x + b \cdot y + c$, where $x$ and $y$ are the pixel coordinates and $z$ the estimation of the feature value. When the current reference pixel (44) is assigned the coordinates (0,0), the estimation $z$ becomes equal to $c$, which is nothing more than the average when the current reference pixel (44) is the center of a symmetric sliding window, i.e. a sliding window with the same number of pixels around the current reference pixel (44), both on the vertical and horizontal axis. Higher polynomial regression models such as the quadratic local regression may be used as well.
The filtered signal (6) is then introduced along with the video signal (2) into a comparing means (8) in order to produce a control signal (10). The comparing means (8) plays the role of the control block of the device. It tries to detect how valuable the local regression estimation of the pixel is. When there is a certain amount of details in the pixel neighborhood, i.e. within the sliding window, the output of the comparing means (8), i.e. the control signal (10), indicates a bad match between the filtered signal (6) and the video signal (2), forcing the combining means (16), or fader, to produce an output signal (18) which depends much more on the so-called additional filtered signal (14), defined as the output of an edge-preserving algorithm, in particular in the device according to the invention a discriminating averaging filter (12). The device may be viewed as an improved local regression filtering system which locally discriminates between flat regions and regions with picture details to decide whether the local regression filtering method is suitable in the region of the current reference pixel (44) and to use another filtering method when necessary.
The comparing means (8) performs a comparison between the video signal (2) and the filtered signal (6) in order to produce the control signal (10), which is a figure of merit as explained above. The comparing means (8) may be based on the residuals, which are the differences between the pixel values and their approximation obtained by the regression method and comprised in the filtered signal (6). Referring to the same notations as above, this may be written as $r_i = z_i - \hat{z}_i = z_i - (a \cdot x_i + b \cdot y_i + c)$, where $i$ refers to a particular pixel of the sliding window around the current reference pixel (44), of which the residual $r_i$ is calculated, and $\hat{z}_i$ is the estimation of its feature value. It is important to note that the estimation values $\hat{z}_i$ of all the pixels around the current reference pixel (44), including said current reference pixel (44), are taken as input of the comparing means (8), which computes the residuals in respect of each of these estimation values. The filtered signal (6) should thus be understood as this set of values, the combining means (16) taking into account only the estimation value of the current reference pixel (44) while the comparing means (8) takes all of the values $\hat{z}_i$ into account. Another way of explaining how it works is to say that the filtered signal (6) comprises the coefficients $a$, $b$ and $c$, but the combining means (16) only needs the coefficient $c$ while the comparing means (8) needs the three coefficients $a$, $b$ and $c$.
For instance, when the criterion is to minimize the sum of the squares of the residuals in a neighborhood of 5 x 3 pixels, the coefficients $a$, $b$ and $c$ may be $a = \frac{1}{30} \sum_i z_i x_i$, $b = \frac{1}{10} \sum_i z_i y_i$ and $c = \frac{1}{15} \sum_i z_i$, where $z_i = z(x_i, y_i)$ is the noisy feature value on the pixel coordinate $(x_i, y_i)$, and $x_i = -2, -1, 0, 1, 2$ and $y_i = -1, 0, 1$. In other words: $(x_i, y_i) = (-2,-1), (-2,0), (-2,1), (-1,-1), (-1,0), \ldots, (2,0), (2,1)$.
Furthermore, the control signal (10) may be the sum of the squares of the magnitudes of the residuals, which makes the figure of merit. Mathematically, this may be written as $\sum_i r_i^2$.
In another embodiment of the invention, the control signal (10) may be the sum of the magnitudes of the residuals, i.e. the absolute values of the residuals, which makes the figure of merit. Mathematically, this may be written as $\sum_i |r_i|$.
It will be clear for the person skilled in the art that the control signal (10) may be any function of the residuals.
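The computation of the plane coefficients, of the residuals and of the two figures of merit may be sketched as follows for the 5 x 3 neighborhood. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, and the divisors 30, 10 and 15 follow from the least squares solution on this symmetric window, where the sum of the squared x coordinates is 30, the sum of the squared y coordinates is 10 and the number of pixels is 15.

```python
X_COORDS = (-2, -1, 0, 1, 2)
Y_COORDS = (-1, 0, 1)


def fit_plane_5x3(window):
    """Least squares plane z = a*x + b*y + c on a 5 x 3 window.

    `window` holds 3 rows (y = -1, 0, 1) of 5 feature values (x = -2..2).
    """
    sum_zx = sum_zy = sum_z = 0.0
    for row, y in zip(window, Y_COORDS):
        for z, x in zip(row, X_COORDS):
            sum_zx += z * x
            sum_zy += z * y
            sum_z += z
    return sum_zx / 30.0, sum_zy / 10.0, sum_z / 15.0


def residuals(window, a, b, c):
    """Differences between each pixel value and its plane estimate."""
    return [z - (a * x + b * y + c)
            for row, y in zip(window, Y_COORDS)
            for z, x in zip(row, X_COORDS)]


def control_signals(window):
    """Return (sum of squared residuals, sum of absolute residuals)."""
    a, b, c = fit_plane_5x3(window)
    r = residuals(window, a, b, c)
    return sum(v * v for v in r), sum(abs(v) for v in r)


# A soft gradient is fitted exactly by the plane; a sharp edge is not.
ramp = [[2 * x + 3 * y + 50 for x in X_COORDS] for y in Y_COORDS]
edge = [[10, 10, 10, 90, 90] for _ in Y_COORDS]
ramp_sq, ramp_abs = control_signals(ramp)
edge_sq, edge_abs = control_signals(edge)
```

On the `ramp` window both figures of merit are essentially zero, whereas on the `edge` window they are large, which is the behavior expected from the control signal (10).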
Still furthermore, in this example of the invention described on Fig. 1, the output signal (18) may be a weighted sum of the filtered signal (6) and the additional filtered signal (14). In other words the output signal (18) is a linear combination of the filtered signal (6) and the additional filtered signal (14).
The output signal (18) may also be a weighted sum of the filtered signal (6) and the outputs of the at least one additional filtered means.
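The weighted sum performed by the combining means (16) may be sketched as below. The mixing factor `k` and the way it is clamped to the range 0 to 1 are assumptions made for illustration; the text only states that the combination depends on the control signal (10).

```python
def fade(filtered, additional, k):
    """Linear combination of the two filter outputs.

    k = 0 returns the local regression output only, k = 1 returns the
    additional (edge-preserving) output only. k is assumed to be derived
    from the control signal and is clamped to [0, 1].
    """
    k = min(1.0, max(0.0, k))
    return (1.0 - k) * filtered + k * additional
```

With this convention a small control signal (flat region) would map to a `k` close to 0 and a large control signal (detailed region) to a `k` close to 1.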
In another embodiment of the invention, the discriminating averaging filter (12) is a spatial weighted averaging noise filter, also abbreviated as SWAN. The SWAN may be spatial, either one-dimensional or two-dimensional, or spatiotemporal, i.e. three-dimensional. The three-dimensional SWAN algorithm is very powerful in the presence of picture details. The core functionality of a discriminating averaging filter, which is the heart of the SWAN filter, is to assign to each pixel of the neighborhood, i.e. to the pixels around the current reference pixel (44) selected for the filtering or in other words to the pixels of the sliding window, a weighting coefficient based on the magnitude, or absolute value, of the difference of feature value of said each pixel with respect to the current reference pixel (44). This approach averages the pixels belonging to the same structure in a picture without being influenced by pixels belonging to a different structure. This ensures good noise reduction in large homogeneous areas while preserving the edges and fine picture details. However, in areas with soft gradients of the feature value, the SWAN leads to an unnatural "plastic" image. This is why the SWAN is preferably not used for flat regions, but only for regions with picture details where it works well.
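The weighting principle of a discriminating averaging filter may be sketched as follows. The stepped weighting function and its threshold are hypothetical choices made for illustration; the text only requires that the weight depends on the magnitude of the difference with the current reference pixel.

```python
def weight(diff, threshold):
    """Hypothetical stepped weight: full, half or no contribution."""
    d = abs(diff)
    if d <= threshold:
        return 1.0
    if d <= 2 * threshold:
        return 0.5
    return 0.0


def discriminating_average(neighbors, reference, threshold):
    """Normalized weighted average of the neighborhood pixels.

    Pixels whose feature value is far from the reference value are
    assumed to belong to a different picture structure and are ignored.
    """
    num = den = 0.0
    for p in neighbors:
        w = weight(p - reference, threshold)
        num += w * p
        den += w
    # If no neighbor is similar enough, keep the reference value.
    return num / den if den else reference


# Neighborhood straddling an edge between a dark and a bright structure.
window = [10, 12, 11, 90, 92]
dark = discriminating_average(window, reference=11, threshold=4)
bright = discriminating_average(window, reference=90, threshold=4)
```

In this sketch a reference pixel on the dark side is averaged only with dark pixels and a reference pixel on the bright side only with bright pixels, so the edge is not smeared.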
In yet another embodiment of the invention, the discriminating averaging filter (12) is a three-dimensional SWAN, also called Spatial and temporal Weighted Averaging Noise filter, which combines a spatial filter and a temporal recursive filter. It is used to reduce white Gaussian noise added to luma and chroma components of the video data, the luma component representing the lightness and the chroma component representing the color disregarding lightness. In this embodiment of the invention a three-dimensional neighborhood or window is defined around each pixel of the video data which has to be filtered.
An example of this embodiment of the invention which comprises a discriminating averaging filter (12) being a three-dimensional SWAN is schematically shown in Fig. 2 and is described also with reference to Fig. 3a and 3b. In order to filter a given pixel, called the current reference pixel (44), the video signal (2) is delayed twice in order to introduce the whole streaming video input at three different time moments into a means for selecting neighborhood pixels, called spatial pixels (28), these spatial pixels (28) being one of the two inputs of a basic discriminating averaging filter (30), and the resulting signal of this filter, i.e. the additional filtered signal (14), being further used in a recursive loop to produce temporal pixels (42) being the second input of the basic discriminating averaging filter (30).
In more detail, the video signal (2) is introduced into a means for selecting the spatial pixels (20), and into a line memory component (22). The output of said line memory component (22) is called the first delayed video signal (24), further introduced into the means for selecting spatial pixels (20). The first delayed video signal (24) is also the input of a second line memory component (22). Similarly, the output of this second line memory component (22) is called the second delayed video signal (26) and is introduced into said means for selecting spatial pixels (20). The line memory components (22) all introduce the same delay, this delay being equal to the number of pixels on a line. Consequently, three video streams, each of them representing the video signal (2) at a different time moment, are introduced in the means for selecting spatial pixels (20), these spatial pixels (28) being represented in Fig. 3a wherein 5 pixels are selected per line. The 3 lines are the previous line of current field (50), the current line of current field (48) and the next line of current field (46). The selected spatial pixels (28) are one of the inputs of the basic discriminating averaging filter (30), the output of which is the additional filtered signal (14) as explained above, used as well as the input of a field delay component (32). This component introduces a delay equal to the number of pixels contained in one image or field, and produces a signal, called recursive signal (34). This signal is subjected to two delay operations in series, producing respectively the first delayed recursive signal (38) and the second delayed recursive signal (40), these signals respectively corresponding to the recursive signal (34) at two other time moments. The recursive signal (34), the first delayed recursive signal (38) and the second delayed recursive signal (40) are introduced in a means for selecting temporal pixels (36) which produces the temporal pixels (42), which constitute the second input of the basic discriminating averaging filter (30). Fig. 3b shows 9 temporal pixels (42), 3 pixels of 3 different lines of the previous field, these 3 lines being the previous line of previous field (58), the current line of previous field (56) and the next line of previous field (54). The second pixel of the current line of previous field (56) is the reference pixel in previous field (52), i.e. the pixel having the same coordinates as the current reference pixel (44) but corresponding to the preceding image.
It will be clear for the person skilled in the art that buffering means or delay components may be used inside both selecting means, the means for selecting spatial pixels (20) and the means for selecting temporal pixels (36), in order to perform the selection of pixels from the input pixel lines. These buffers may be arranged according to a layout similar to the one according to which the line memory components (22) as described above are arranged.
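The arrangement of two line memories in series may be sketched in software as two fixed-length delay lines. This is only an illustrative model of the hardware: the class name `DelayLine`, the generator `three_line_taps` and the zero initial content of the memories are assumptions.

```python
from collections import deque


class DelayLine:
    """Fixed delay of `length` samples, modelling one line memory."""

    def __init__(self, length, fill=0):
        self._buf = deque([fill] * length, maxlen=length)

    def push(self, sample):
        """Store `sample` and return the sample stored `length` pushes ago."""
        oldest = self._buf[0]
        self._buf.append(sample)  # maxlen drops the oldest entry
        return oldest


def three_line_taps(stream, line_length):
    """Yield (undelayed, once delayed, twice delayed) samples.

    The three values are vertically aligned pixels of three consecutive
    lines once both memories have filled up.
    """
    first_memory = DelayLine(line_length)
    second_memory = DelayLine(line_length)
    for sample in stream:
        first = first_memory.push(sample)
        second = second_memory.push(first)
        yield sample, first, second


# Three lines of four pixels each, streamed pixel by pixel.
pixels = [100 + n for n in range(12)]
taps = list(three_line_taps(pixels, line_length=4))
```

Once two full lines have been streamed in, each element of `taps` holds one pixel from each of three consecutive lines at the same horizontal position; selecting several horizontally adjacent pixels per line would require further short delays of the same kind.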
In this particular embodiment of the invention including a three-dimensional SWAN filter, the basic discriminating averaging filter (30) works according to the same principle as described above, i.e. to each selected pixel one weighting coefficient is assigned, based on the magnitude, or absolute value, of the difference between the selected pixel and the current reference pixel (44). Then the sum of the weighted pixels is calculated and normalized.
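The recursive use of the previously filtered field may be sketched as follows. The sketch is deliberately simplified and rests on several assumptions: each field is reduced to a single line of pixels instead of the 15-pixel and 9-pixel windows of Fig. 3a and 3b, the weighting function is a hard threshold, and all function names are hypothetical.

```python
def hard_weight(diff, threshold):
    """Assumed weighting: 1 for similar pixels, 0 otherwise."""
    return 1.0 if abs(diff) <= threshold else 0.0


def filter_field(current, previous_filtered, threshold, radius=1):
    """Filter one field using spatial and, if available, temporal pixels."""
    out = []
    n = len(current)
    for i, ref in enumerate(current):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        candidates = list(current[lo:hi])              # spatial pixels
        if previous_filtered is not None:
            candidates += previous_filtered[lo:hi]     # temporal pixels
        num = den = 0.0
        for p in candidates:
            w = hard_weight(p - ref, threshold)
            num += w * p
            den += w
        # den >= 1 because the reference pixel itself has weight 1.
        out.append(num / den)
    return out


def filter_sequence(fields, threshold):
    """Feed each filtered field back as temporal input for the next one."""
    previous = None
    result = []
    for field in fields:
        previous = filter_field(field, previous, threshold)
        result.append(previous)
    return result


# Two noisy fields showing the same edge.
fields = [[10, 10, 90, 90], [12, 8, 92, 88]]
filtered = filter_sequence(fields, threshold=5)
```

In this example the noise of the second field is averaged out using both its own neighbors and the filtered pixels of the first field, while the edge between the dark and bright pixels is kept.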
Fig. 4 is a schematic description of another embodiment of the noise reduction device according to the invention, working in the same way as described with reference to Fig. 1, but with an advantageous first amplifier (60) applied to the control signal (10) and producing an amplified control signal (62), the combining means (16) now depending on the amplified control signal (62) and no longer on the control signal (10). This first amplifier (60) provides a gain for the control signal (10), which is, in a particular embodiment of the device according to the invention, the sum of the squares, or of the magnitudes, of the residuals. In this way, more weight can be given either to the local regression filter or to the edge-preserving filter operation. In addition, the gain of the first amplifier (60) may be tuned with respect to the signal to noise ratio.
Fig. 5 is a schematic description of yet another embodiment of the noise reduction device according to the invention, working in the same way as described with reference to Fig. 1, but with a lookup table (64) implementing a nonlinear function applied to the control signal (10) and producing a corrected control signal (66), the combining means (16) now depending on the corrected control signal (66) and no longer on the control signal (10).
This nonlinear function and the associated lookup table (64) may be advantageously optimized to get the best balance between the additional filtered signal (14) and the filtered signal (6), or more generally to get the best balance between the outputs of the at least one filtering means and the filtered signal (6).
In another embodiment of the noise reduction device according to the invention, the mechanisms described with reference to both Fig. 4 and 5 are combined, the control signal being firstly subjected to the amplifying operation of the first amplifier (60) and secondly subjected to the nonlinear function operation provided by the lookup table (64). This helps to maximize the overall picture quality, i.e. the noise, the sharpness and the lack of unnatural undesirable effects or artifacts, given some signal to noise ratio.
It will be clear for the person skilled in the art that the lookup table is nothing more than an easy way for implementing a non-linear function and furthermore that the first amplifier (60) and/or the lookup table (64) may be replaced by any means for implementing a function, such as a non-linear function. Still furthermore the function may be programmed on said means for implementing a function.
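The chain formed by a gain followed by a lookup table may be sketched as below. The table size, the smoothstep curve stored in the table and the function names are hypothetical; the text does not specify the nonlinear function, only that it may be optimized and programmed.

```python
def build_lut(size=256):
    """Hypothetical nonlinear curve mapping a table index to [0, 1].

    A smoothstep curve is used here purely as an example of a
    monotonic nonlinear function.
    """
    lut = []
    for i in range(size):
        t = i / (size - 1)
        lut.append(t * t * (3.0 - 2.0 * t))
    return lut


def mixing_factor(control, gain, lut):
    """Amplify the control signal, then look up the corrected value."""
    index = int(control * gain)
    index = max(0, min(len(lut) - 1, index))  # saturate at the table ends
    return lut[index]


lut = build_lut()
flat_region = mixing_factor(0, gain=1.0, lut=lut)
detailed_region = mixing_factor(10_000, gain=1.0, lut=lut)
```

Raising `gain` makes the corrected control value saturate earlier, so a given residual level shifts the balance further towards the edge-preserving filter; lowering it favors the local regression filter.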
Fig. 6 is a schematic description of a further embodiment of the noise reduction device according to the invention, wherein there are two additional filters, one being the discriminating averaging filter (12) and the second being a high-pass filter (76), and wherein the output of the high-pass filter (76), called the high-pass filtered signal (78), modifies the filtered signal (6). The high-pass filtered signal (78) is amplified by a second amplifier (68) to produce an amplified high-pass filtered signal (70), which is added in an adder (72) to the filtered signal (6) or some part of it to produce the composite filtered signal (74). This advantageously improves the sharpness and the subjective noise impression. The degree of amplification of the second amplifier (68) controls how large the contribution of the high-pass filtered signal (78) to the composite filtered signal (74) is.
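The addition of an amplified high-pass signal to the filtered signal may be sketched as follows. The three-tap high-pass kernel, the one-dimensional line and the function names are assumptions made for illustration; the text does not specify the high-pass filter (76).

```python
def high_pass_3tap(line, i):
    """Assumed high-pass: centre pixel minus the mean of its neighbours.

    Border pixels reuse themselves as the missing neighbour.
    """
    left = line[max(0, i - 1)]
    right = line[min(len(line) - 1, i + 1)]
    return line[i] - 0.5 * (left + right)


def composite(filtered, line, i, gain):
    """Filtered value plus the amplified high-pass value at position i."""
    return filtered + gain * high_pass_3tap(line, i)


line = [10, 10, 50, 10, 10]
detail = high_pass_3tap(line, 2)
restored = composite(20, line, 2, gain=0.25)
```

A `gain` of zero leaves the filtered signal unchanged, while larger values reinject more of the high-frequency content that the averaging removed.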
The components (72) and (16) as referenced on Fig. 6 may be seen altogether as the means for combining of the device according to the invention. Another way of considering these components is to say that the adder (72) may be integrated into the combining means (16). More generally, it will be clear for the person skilled in the art that the components and means of the device according to the invention should be considered broadly, with or without changes to these components and means and/or addition of further components or means, for instance inherently to the implementation on integrated circuit or for optimization, parameterizing or tuning of said components and means and their operation.
In the embodiment of the device according to the invention as described in Fig. 6, the high-pass filter (76) may be integrated in the discriminating averaging filter (12), the high-pass filtered signal (78) being an additional output of the discriminating averaging filter (12). This advantageously reduces the implementation cost.
It will be clear for the person skilled in the art that all the filters described above may be purely spatial, either in one dimension or two dimensions, purely temporal, or spatiotemporal, i.e. three-dimensional. Nevertheless, the local regression filter (4) is essentially spatial. It will be clear as well for the person skilled in the art that the term "filter" must be understood in the broadest meaning of "any means for filtering".
According to a second aspect, the invention concerns a method for reducing the noise of a video signal (2), comprising a step of filtering the video signal (2) and delivering a filtered signal (6). The method according to the invention further comprises a step of comparing the video signal (2) and the filtered signal (6) and delivering a control signal (10), depending on said comparison. The filtered signal (6) is then combined depending on the control signal (10) with one or several signals, each of them being the output of an additional step of filtering applied to the video signal (2). Another characteristic of the method according to the invention is that at least one additional filtering step is a discriminating averaging filtering step.
The discriminating averaging filtering step is based on the following method: for a given pixel, the filtered pixel is a weighted average of feature values of surrounding pixels, each weight associated with one of the surrounding pixels depending, inter alia, on the magnitude, or absolute value, of the difference between the feature value of the given pixel and the feature value of said surrounding pixel.
In a second embodiment of the method according to the second aspect of the invention, the control signal (10) is the sum of the squares of the residues, i.e. the differences between a certain number of pixels of the video signal (2), said pixels being in the sliding window around the current reference pixel (44), and the corresponding pixel of the filtered signal (6). The control signal (10) may also be the sum of the magnitudes of the residues. In a third embodiment of the method according to the second aspect of the invention, the output signal (18) is a linear combination, i.e. a weighted sum of the filtered signal (6) and the outputs of the at least one additional filtering means, the weights depending on the control signal (10).
In short, the invention may be described as follows: Device and method for reducing the noise of a video signal (2) wherein the video signal (2) is filtered by a local regression filter (4) and by an edge-preserving filter, for example a discriminating averaging filter (12), and wherein the outputs of both filters are combined in order to produce an output signal (18), wherein the combination (16) depends on how valuable the local regression filtering (4) is, in such a way as to apply some optimal filtering means according to the local characteristics of the picture, in particular the presence or not of picture details.