WO2024079264A1

WO2024079264A1 - Wiener-filter-based signal restoration with learned signal-to-noise ratio estimate

Info

Publication number: WO2024079264A1
Application number: PCT/EP2023/078344
Authority: WO
Inventors: Johannes Meyer
Original assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
Priority date: 2022-10-14
Filing date: 2023-10-12
Publication date: 2024-04-18
Also published as: DE102022210839A1

Abstract

The disclosure relates to a method for Wiener-filter-based signal restoration, comprising the following method steps: receiving a signal (g); estimating a signal-to-noise ratio for a Wiener-filter-based restoration algorithm (v) by a processing algorithm (φ) obtained by means of a machine learning processing, depending on a spectral power density calculated for the received signal; and generating a restored signal (ŝ) from the received signal (g) and from the signal-to-noise ratio estimated for the Wiener-filter-based restoration algorithm (v) by means of the Wiener-filter-based restoration algorithm (v) in order to improve the filter-based signal restoration, in particular the result of a Wiener-filter-based signal restoration.

Description

Fraunhofer Society 239PCT 1678 GD

Wiener filter-based signal restoration with learned signal-to-noise ratio estimation

The present disclosure relates to a method and an apparatus for Wiener filter-based signal restoration, in which a signal is received, a signal-to-noise ratio of the signal is estimated for use in a Wiener filter-based restoration algorithm, and the original signal is then restored from the received signal by means of the Wiener filter-based restoration algorithm, taking into account the estimated signal-to-noise ratio, i.e. a restored signal is generated that is as similar as possible to the original signal. In general, signals transmitted on signal or reception paths are degraded, i.e. an original signal is corrupted on the one hand by non-ideal transmission to a corresponding receiving sensor, mathematically represented by a non-ideal mapping function, and on the other hand by external interference, mathematically represented by an interference signal. As a result, the observed or received signal always deviates from the original signal. Usually, a restoration filter function is therefore applied to the observed signal and a restored signal is generated. The restored signal is only an estimate of the original signal, since various assumptions must be made when choosing the restoration function and a perfect restoration is therefore not achieved; in further use, however, the restored signal is treated as if it were the original signal.

If, for example, images are captured using a camera system, depending on the situation, physically caused image deterioration can occur. Some image deteriorations can be formulated as linear, shift-invariant systems and thus completely described using their impulse response. Examples of this are blurred images, image errors caused by suboptimal optics, motion blur and the like. From a system theory perspective, the captured image as an observed signal then corresponds to a convolution of the undisturbed image, the original signal, with the impulse response of the existing image deterioration, the non-ideal imaging function. In such cases, depending on the severity of the image deterioration and the image noise present as an additional interference signal, it is possible to a certain extent to use image recovery or restoration methods to calculate an image as a restored signal that is very close to the original image. In theory, this task can be optimally solved using the so-called Wiener filter. In practice, however, the Wiener filter has the crucial disadvantage that the signal-to-noise ratio required for filtering with the Wiener filter is not known and can basically only be estimated. As a result, the filter result of the Wiener filter is usually unsatisfactory and is generally post-processed to achieve a better result.

In the article "A Data Driven Approach to A Priori SNR Estimation" by Suhadi S. et al., published in 2011 in the IEEE Transactions on Audio, Speech, and Language Processing 19, on pages 186 to 195, the Wiener filter is used to improve signals in speech processing. Two convolutional neural networks are trained that can detect areas with and without speech in the time signal. Assuming that the noise in both areas is similar, the signal-to-noise ratio, or SNR for short, can be estimated by calculating the corresponding signal components. However, this approach is not applicable to image signals, for example, because the Wiener filter is described in the spatial frequency domain and not with respect to individual pixels or image regions.

In the article "An Iterative SNR Estimation Algorithm for Wiener Deconvolution of Self-Similar Images Distorted by Camera Shake Blurring" by Marcelo A. P. et al., published in 2008 in the Proceedings of the 8th Conference on Signal, Speech and Image Processing on pages 97 to 100, an initial estimate of the SNR is first used to restore the input image using the Wiener filter. The resulting image is compared as a restored image with the input image in terms of the similarity of the gradients in the x and y directions in order to then adjust the SNR accordingly. This is followed by the next iteration.

In the article "SNR-Aware Convolutional Neural Network Modelling for Speech Enhancement" by Fu S.-W. et al., published in 2016 in Interspeech on pages 3768 to 3772, a speech signal is processed by a convolutional neural network in order to estimate the SNR for each time period considered. However, only an average value for the SNR is estimated here and not separate SNR values for all available frequencies, as is required for the Wiener filter.

The task is therefore to improve filter-based signal restoration, in particular the result of Wiener filter-based signal restoration.

This task is solved by the subject-matter of the independent patent claims. Advantageous embodiments emerge from the dependent patent claims, the description and the figures.

The approach presented below is based on the usual signal model for signal restoration, as is known, for example, from image restoration. An original signal s is transformed by a non-ideal mapping function h, and the transformed signal is additionally distorted by a disturbance n, thus yielding the observed or received signal g. Applying a restoration function v to the observed or received signal g yields a restored signal ŝ. The signals s, g, ŝ as well as the functions h and v and the disturbance n can depend on a location x, as is typically the case with image signals, or, in other areas of application, on a frequency f and the like. With this nomenclature, the Wiener filter for the case of an image signal results in the following in the frequency domain:

$$V(f) = \frac{H^{*}(f)}{|H(f)|^{2} + \dfrac{1}{\mathrm{SNR}(f)}}$$

Here, $H(f) = \mathcal{F}\{h(x)\}$ describes the transfer function of the image degradation, i.e. the Fourier transform of the impulse response h(x) acting as the non-ideal mapping function, and $H^{*}(f)$ denotes its complex conjugate. In order to be able to use the Wiener filter, the expression $\mathrm{SNR}(f) = S_{ss}(f)/S_{nn}(f)$ must be determined or estimated as accurately as possible. Here, $S_{ss}(f)$ denotes the unknown, and thus to be estimated, spectral power density of the undisturbed original signal s, and $S_{nn}(f)$ denotes the unknown, and thus to be estimated, spectral power density of the disturbance n, for example noise.
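As an illustration only, the filter can be sketched in a few lines of NumPy. The sketch assumes the standard Wiener deconvolution form V(f) = H*(f) / (|H(f)|² + 1/SNR(f)), a circular (DFT-based) convolution model, and an impulse response whose origin sits at index [0, 0]; the function names `wiener_filter` and `restore` are hypothetical and do not come from the disclosure.

```python
import numpy as np

def wiener_filter(H, snr, eps=1e-12):
    """Frequency response V(f) = conj(H) / (|H|^2 + 1/SNR), per frequency bin."""
    snr = np.maximum(snr, eps)  # guard against division by zero for SNR = 0
    return np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)

def restore(g, h, snr):
    """Restore image g degraded by impulse response h, given SNR(f) on the DFT grid of g.

    Assumption: the degradation is modeled as a circular convolution, so the
    transfer function is the DFT of h zero-padded to the shape of g.
    """
    H = np.fft.fft2(h, s=g.shape)   # transfer function H(f)
    G = np.fft.fft2(g)              # spectrum of the received signal
    return np.real(np.fft.ifft2(wiener_filter(H, snr) * G))
```

For very large SNR values the filter approaches the inverse filter 1/H(f), and for very small SNR values it suppresses the corresponding frequency, which is why the quality of the SNR estimate dominates the restoration result.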

One aspect of the approach presented relates to a method for Wiener filter-based signal restoration, also referred to as data signal restoration, with the method steps of receiving a signal, the observed signal g, estimating the signal-to-noise ratio for restoring the original signal s underlying the received signal g in the form of a restored signal ŝ, and generating the restored signal ŝ from the received signal g and the estimated SNR. The method steps are carried out by a signal processing unit, which can contain, for example, a microprocessor and corresponding further electronic elements. The signal belongs to a respective signal type, so it can be, for example, an image signal, in particular a single- or multi-channel image signal, and/or an audio signal, and/or a digital data transmission signal, or the signal can comprise one or more signals of the corresponding signal type "image signal" and/or "audio signal" and/or "data transmission signal". Accordingly, the received signal can be generated and/or received by an image sensor unit and/or an audio sensor unit and/or a data transmission unit. The signal is received on a respective reception path, whereby the received or observed signal is formed by a falsification of the original signal by or on the reception path. The falsification can occur due to the nature of the reception path itself, which is then described by the non-ideal mapping function h, or due to additional interference, which is described by the disturbance n.

The signal-to-noise ratio is estimated for a Wiener filter-based restoration algorithm by a processing algorithm obtained by means of a machine learning method. The processing algorithm obtained by means of the machine learning method can be or comprise a neural network, in particular a deep neural network with two or more, preferably three or more hidden layers. However, other machine learning methods such as pixel-by-pixel support vector regression can also be used. The estimation is carried out as a function of a spectral power density $S_{gg}$ calculated for the received signal.

The restored signal ŝ is generated from the received, i.e. observed, signal g and the signal-to-noise ratio SNR estimated for the Wiener filter-based restoration algorithm v by means of the Wiener filter-based restoration algorithm v. The signal-to-noise ratio SNR estimated by the processing algorithm obtained by means of machine learning forms the basis of the Wiener filter of the Wiener filter-based restoration algorithm v. In contrast to known methods in which the result of a Wiener filter-based restoration algorithm is subsequently optimized, the method presented here directly addresses the weakness of the Wiener filter, namely the signal-to-noise ratio, which is often difficult to estimate correctly in practice. As a result, the theoretical optimality of the Wiener filter also comes into full effect in practical applications: various experiments have shown that the approach presented here typically restores signals with a quality that exceeds the performance of known approaches in common quality metrics by 10%, i.e. 10 percentage points.

Accordingly, in an advantageous embodiment, the method also includes training the processing algorithm obtained by means of the machine learning method with a large number of training signal-data pairs. These training signal-data pairs each comprise or contain a spectral power density calculated for a received training signal of the same signal type as the signal later received in the application and a training signal-to-noise ratio calculated as a function of an original training signal and a predetermined noise training signal. The training described here and below can also be carried out independently of the signal restoration itself, i.e. spatially and/or temporally separated from the actual Wiener filter-based signal restoration. This has the advantage that the processing algorithm obtained by means of the machine learning method can quickly estimate an SNR in practice, since only the observed signal is required to estimate the respective SNR. Since very large existing databases of signals such as images, audio signals, and other signals and corresponding non-ideal mapping functions such as impulse responses of reception paths can be used for training, such training is also suitable for practical use.
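The training step is a regression from a spectral power density to a signal-to-noise ratio. The following sketch only illustrates that data flow and deliberately swaps in a much simpler model: instead of a neural network, it fits a single affine map between the logarithmized quantities by least squares. The function name `fit_affine_phi` and the pair layout (input array, target array) are assumptions made for this example.

```python
import numpy as np

def fit_affine_phi(pairs):
    """Fit log SNR ~ a * log S_gg + b over all frequency bins of all training pairs.

    pairs: iterable of (log_S_gg, log_snr) arrays of equal shape.
    This affine model is a stand-in for the learned processing algorithm,
    not the neural network described in the text.
    """
    x = np.concatenate([log_S_gg.ravel() for log_S_gg, _ in pairs])
    y = np.concatenate([log_snr.ravel() for _, log_snr in pairs])
    a, b = np.polyfit(x, y, deg=1)          # least-squares line through (x, y)
    return lambda log_S_gg: a * log_S_gg + b
```

Because the fitted callable needs only the spectral power density of the observed signal, it can be evaluated at restoration time without access to the original signal or the noise, which is the property the paragraph above relies on.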

In an advantageous embodiment, it is provided that the spectral power density calculated for the received signal during the estimation is a logarithmic power density, i.e. the calculated spectral power density is logarithmized after the calculation and before further processing. Likewise, the spectral power density calculated during training for the received training signal is a logarithmic power density, and the training signal-to-noise ratio calculated as a function of the original training signal and the specified noise training signal is a logarithmic training signal-to-noise ratio, i.e. the SNR is also logarithmized after calculation and before further processing. Then, before the restored signal is generated, the signal-to-noise ratio estimated for the Wiener filter-based restoration algorithm is exponentiated in order to compensate for the logarithmization. This has the advantage that the machine learning method converges better, especially when it is a neural network, in particular a deep neural network, since, especially in the case of image data, the convergence behavior of the machine learning methods mentioned is impaired when the spectral power density estimated via the squared magnitude of the discrete Fourier transform is used directly.
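A minimal sketch of the two transformations, assuming a two-dimensional image signal, the natural logarithm, and a small constant `eps` to avoid taking the logarithm of zero (the constant and both function names are assumptions of this example, not part of the disclosure):

```python
import numpy as np

def log_power_density(g, eps=1e-12):
    """Logarithmized estimate of the spectral power density, log |DFT{g}|^2."""
    return np.log(np.abs(np.fft.fft2(g)) ** 2 + eps)

def snr_from_log(log_snr):
    """Exponentiate an estimated log SNR before it is used in the Wiener filter."""
    return np.exp(log_snr)
```

For typical images the squared DFT magnitude at the zero frequency is several orders of magnitude larger than at high frequencies, so the logarithm compresses the value range seen by the learning method considerably.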

In a further advantageous embodiment, it is provided that the respective received training signal is calculated as a function of the respective associated original training signal, i.e. the original training signal of the same pair, and a respective impulse response training signal. This means that with access to the different databases, the amount of training data can be increased again in a relevant way and thus the performance of the processing algorithm can be increased. In addition, the respective received training signal can also depend on the specified noise training signal.

In another advantageous embodiment, it is provided that the (non-logarithmic) training signal-to-noise ratio calculated as a function of the original training signal and the predetermined noise training signal comprises the quotient of the spectral power density calculated for the original training signal with the spectral power density calculated for the predetermined noise training signal, in particular is proportional to this quotient or is the quotient. The SNR is therefore calculated with the quotient or as the quotient of the respective spectral power densities estimated or calculated. This leads to good recovery results, especially in combination with the calculation method of the received training signal with the associated spectral power density described in the last paragraph.
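The quotient can be sketched as follows, assuming that each spectral power density is estimated as the squared magnitude of the two-dimensional DFT and that a small constant `eps` protects against division by zero (both the constant and the name `training_snr` are assumptions of this example):

```python
import numpy as np

def training_snr(s, n, eps=1e-12):
    """Per-frequency training SNR as the quotient S_ss(f) / S_nn(f)."""
    S_ss = np.abs(np.fft.fft2(s)) ** 2   # spectral power density of the original signal
    S_nn = np.abs(np.fft.fft2(n)) ** 2   # spectral power density of the noise signal
    return S_ss / (S_nn + eps)
```

Doubling the amplitude of the original training signal quadruples this quotient at every frequency, which matches the interpretation of the SNR as a ratio of powers.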

A further aspect relates to a signal processing unit for Wiener filter-based signal restoration, which is designed to carry out a method according to one of the described embodiments, i.e. the Wiener filter-based signal restoration and/or the training of the processing algorithm obtained by means of machine learning methods described for this purpose.

Advantages and advantageous embodiments of the signal processing unit correspond to advantages and advantageous embodiments of the respective methods.

The features and combinations of features mentioned above in the description, including in the introductory part, as well as the features and combinations of features mentioned below in the description of the figures and/or shown in the figures alone can be used not only in the combination specified in each case, but also in other combinations without departing from the scope of the invention. Thus, embodiments are also to be regarded as encompassed and disclosed by the invention that are not explicitly shown and explained in the figures, but which emerge and can be produced by separate combinations of features from the explained embodiments. Embodiments and combinations of features are also to be regarded as disclosed that do not have all the features of an originally formulated independent claim. In addition, embodiments and combinations of features, in particular through the embodiments set out above, which go beyond or deviate from the combinations of features set out in the references to the claims are to be regarded as disclosed.

In the figures:

Fig. 1 shows a signal path for a reception path with subsequent restoration according to a known signal model; and

Fig. 2 shows a schematic overview of an exemplary training procedure for a processing algorithm obtained by means of machine learning.

In the figures, identical or functionally identical elements are provided with the same reference symbols.

Fig. 1 shows a well-known signal model for signal restoration. An original signal s is shaped on the reception path by the specific properties of that path, which is modeled by a non-ideal mapping function h that is applied to the original signal s, for example by convolution. The signal is additionally distorted by an external disturbance n, resulting in a signal g, which is then observed or received. This observed or received signal g is transformed by a restoration filter function v, which can also be referred to as a restoration algorithm v, so that the restoration result is a restored signal ŝ, an estimate of the original signal s. With image signals as exemplary signals, and thus signals or functions s, h, n, g, v, ŝ dependent on a location x, the Wiener filter in the frequency domain results in the formula already presented:

$$V(f) = \frac{H^{*}(f)}{|H(f)|^{2} + \dfrac{1}{\mathrm{SNR}(f)}}$$

The decisive factor for the quality of the restoration result is the most accurate possible determination of the signal-to-noise ratio $\mathrm{SNR} = S_{ss}/S_{nn}$, where in the present example the respective terms SNR, $S_{ss}$ and $S_{nn}$ depend on the frequency f, which is linked to the location vector x via the Fourier transformation.

Fig. 2 schematically shows an exemplary embodiment of a method for training a processing algorithm obtained by means of machine learning, here a neural network φ. The neural network φ is trained in such a way that, based on an estimate $\hat{S}_{gg}$ of the spectral power density of the received signal g, for example of an observed image g(x), the desired SNR is estimated as $\widehat{\mathrm{SNR}}(f)$. Thus, $\phi(\hat{S}_{gg}) = \widehat{\mathrm{SNR}}$ applies. Here, $\hat{S}_{gg}(f) = |\mathcal{F}\{g(x)\}|^{2}$, with the Fourier transformation $\mathcal{F}\{\cdot\}$ and the estimate $\widehat{\mathrm{SNR}}$ of the signal-to-noise ratio. In Fig. 2, the Fourier transformation $\mathcal{F}\{\cdot\}$ is chosen as an example as a discrete Fourier transformation DFT{·}.

In the example shown, an original signal s, in this case an image s(x), is selected from a first database D1 to train the neural network φ. A corresponding impulse response as a non-ideal mapping function h, here h(x), is selected from a second database D2, which can be any signal degradation database. The original signal s is convolved with the impulse response as a non-ideal mapping function h in order to simulate the signal degradation during training. For image data, for example, the image degradation databases from the article "Understanding and Evaluating Blind Deconvolution Algorithms" by Levin A. et al., published in 2009 in the IEEE Conference on Computer Vision and Pattern Recognition on pages 1964 to 1971, or from the article "Edge-Based Blur Kernel Estimation using Patch Priors" by Libin Sun et al., published in 2013 in the IEEE International Conference on Computational Photography on pages 1 to 8, can be used. The disturbance n, for example simulated as normally distributed noise n(x), is added to the convolution result. The result is a simulated received signal g, here g(x). From this simulated received signal g, the logarithm of the squared magnitude of the discrete Fourier transform DFT{·} is calculated, which represents the input signal $\log S_{gg}$ for the neural network φ.

In addition, the logarithmic quotient

$$\log \frac{S_{ss}(f)}{S_{nn}(f)},$$

i.e. the logarithmized signal-to-noise ratio log SNR, which is later estimated during signal restoration by the processing algorithm, here the neural network φ, is calculated. It forms the reference (target) value for the training of the neural network φ.
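The generation of one training pair described for Fig. 2 can be sketched as follows. The sketch assumes a circular convolution carried out via the DFT, zero-mean normally distributed noise with standard deviation `noise_sigma`, the natural logarithm, and a small constant `eps`; these choices and the name `make_training_pair` are assumptions of this example and are not specified in the disclosure.

```python
import numpy as np

def make_training_pair(s, h, noise_sigma, rng, eps=1e-12):
    """Simulate g = h * s + n and return (log S_gg, log SNR) on the DFT grid of s."""
    S = np.fft.fft2(s)
    H = np.fft.fft2(h, s=s.shape)                  # impulse response zero-padded to image size
    blurred = np.real(np.fft.ifft2(H * S))         # circular convolution of s with h
    n = rng.normal(0.0, noise_sigma, size=s.shape) # normally distributed disturbance
    g = blurred + n                                # simulated received signal
    log_S_gg = np.log(np.abs(np.fft.fft2(g)) ** 2 + eps)   # network input
    log_snr = (np.log(np.abs(S) ** 2 + eps)
               - np.log(np.abs(np.fft.fft2(n)) ** 2 + eps))  # training target
    return log_S_gg, log_snr
```

A call such as `make_training_pair(image, kernel, 0.05, np.random.default_rng(0))` would yield one input and target array of the same shape as `image`; repeating it over many images, impulse responses and noise realizations produces the training set.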

Using the logarithms $\log \hat{S}_{gg}$ and log SNR to train the neural network φ instead of $\hat{S}_{gg}$ and SNR serves to reduce the dynamic range of the resulting values. Accordingly, after the evaluation of φ, the value output by the neural network φ must be exponentiated, and the sought-after signal-to-noise ratio obtained by the processing algorithm is then given by

$$\widehat{\mathrm{SNR}}(f) = \exp\!\big(\phi(\log \hat{S}_{gg}(f))\big).$$
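Put together, the restoration at application time could look like the sketch below. It assumes the natural logarithm, a circular convolution model and the standard Wiener form V(f) = H*(f) / (|H(f)|² + 1/SNR(f)). The argument `phi` is any callable mapping a log power density array to a log SNR array of the same shape; `phi_placeholder` is a purely hypothetical stand-in that returns a constant SNR of 100 and is not a trained network.

```python
import numpy as np

def restore_with_learned_snr(g, h, phi, eps=1e-12):
    """Estimate SNR(f) = exp(phi(log S_gg)) and apply the Wiener filter to g."""
    G = np.fft.fft2(g)
    log_S_gg = np.log(np.abs(G) ** 2 + eps)      # input of the processing algorithm
    snr = np.exp(phi(log_S_gg))                  # exponentiate the estimated log SNR
    H = np.fft.fft2(h, s=g.shape)                # transfer function of the degradation
    V = np.conj(H) / (np.abs(H) ** 2 + 1.0 / np.maximum(snr, eps))
    return np.real(np.fft.ifft2(V * G))

def phi_placeholder(log_S_gg):
    """Hypothetical stand-in for the trained network: constant SNR of 100."""
    return np.full_like(log_S_gg, np.log(100.0))
```

With an identity impulse response and the placeholder, every frequency is scaled by 1 / (1 + 1/100), so the output is the input divided by 1.01; a trained φ would instead supply a separate SNR value for each frequency.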

The impulse response h of the signal degradation required for the restoration, or its Fourier transform H, the transfer function, can be determined using other existing methods. If the signals are image data, for example, motion blur can be estimated using the data from an acceleration sensor or a gyroscope of the recording device, for example a smartphone.

Claims


1. Method for Wiener filter-based signal recovery, with the following steps:

- receiving a signal (g);

- estimating a signal-to-noise ratio for a Wiener filter-based recovery algorithm (v) by a processing algorithm (φ) obtained by means of a machine learning method, depending on a spectral power density calculated for the received signal;

- generating a restored signal (ŝ) from the received signal (g) and the signal-to-noise ratio estimated for the Wiener filter-based recovery algorithm (v) by means of the Wiener filter-based recovery algorithm (v).

2. Method according to claim 1, characterized in that the signal is or comprises an image signal and/or an audio signal and/or a digital data transmission signal.

3. Method according to one of the preceding claims, characterized in that the signal is generated by an image sensor unit and/or by an audio sensor unit and/or a data transmission unit.

4. Method according to one of the preceding claims, characterized in that the processing algorithm (φ) obtained by means of the machine learning method comprises a neural network, in particular a deep neural network.

5. Method according to one of the preceding claims, characterized by

- training the processing algorithm (φ) obtained by means of the machine learning method with a plurality of training signal-data pairs, each of which comprises a spectral power density calculated for a received training signal (g) and a training signal-to-noise ratio calculated as a function of an original training signal (s) and a predetermined noise training signal (n).

6. Method according to claim 5, characterized in that

- the spectral power density calculated in the estimation for the received signal (g) is a logarithmic power density, and

- the spectral power density calculated during training for the received training signal (g) is a logarithmic power density and the training signal-to-noise ratio calculated as a function of the original training signal (s) and the predetermined noise training signal (n) is a logarithmic training signal-to-noise ratio, where

- before generating the restored signal (ŝ), the signal-to-noise ratio estimated for the Wiener filter-based recovery algorithm (v) is exponentiated.

7. Method according to claim 5 or 6, characterized in that the respective received training signal (g) is calculated as a function of the respectively associated original training signal (s) and a respective impulse response training signal (h).

8. Method according to claim 5 or 6 or 7, characterized in that the training signal-to-noise ratio calculated as a function of the original training signal (s) and the predetermined noise training signal (n) comprises the quotient of the spectral power density calculated for the original training signal (s) with the spectral power density calculated for the predetermined noise training signal (n), in particular is proportional to this or is the quotient.

9. Method for training the processing algorithm (φ) obtained by machine learning for a Wiener filter-based recovery algorithm (v) according to one of the preceding claims.

10. Signal processing unit for Wiener filter-based signal restoration, which is designed to carry out a method according to one of the preceding claims.