CN112750447B

CN112750447B - Method for removing wind noise

Info

Publication number: CN112750447B
Application number: CN202011504399.9A
Authority: CN
Inventors: 关海欣; 梁家恩
Original assignee: Unisound Intelligent Technology Co Ltd; Xiamen Yunzhixin Intelligent Technology Co Ltd
Current assignee: Unisound Intelligent Technology Co Ltd; Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority date: 2020-12-17
Filing date: 2020-12-17
Publication date: 2023-01-24
Anticipated expiration: 2040-12-17
Also published as: CN112750447A

Abstract

The invention relates to a method for removing wind noise. On the one hand, frame-level channel energy selection is used to pick, from the channels, the signal with less noise interference, which avoids the compromise introduced by beamforming; on the other hand, the low-frequency noise component is corrected using the difference-to-sum ratio PR, so that noise interference is removed more accurately.

Description

Method for removing wind noise

Technical Field

The invention relates to the technical field of speech recognition, and in particular to a method for removing wind noise.

Background

In outdoor applications of voice products, wind noise arises easily. Its energy is often very strong and can submerge the spectral structure of speech, greatly reducing listening quality and intelligibility.

Disclosure of Invention

The invention provides a method for removing wind noise, which addresses the technical problem of wind noise in two-microphone (2-mic) setups.

The technical scheme for solving the technical problems is as follows:

according to an aspect of an embodiment of the present invention, there is provided a method of removing wind noise, including:

firstly, transforming a first input signal Y1(t) and a second input signal Y2(t) to the frequency domain as Y1(l, k) and Y2(l, k) through a framing/windowing/FFT module;

secondly, calculating, through a frame-level signal selection module, the minimum of the two signal spectra and the phase angle of the first signal to obtain the selected signal;

also in the second step, calculating the difference-to-sum ratio using a PR(l, k) module;

thirdly, estimating steady-state noise V(l, k) through a steady-state noise tracking module;

fourthly, selectively correcting noise through a noise correction module;

fifthly, determining a final noise estimate based on the PR(l, k) value;

and sixthly, applying Wiener filtering through a filter to obtain the audio signal with the wind noise removed.

Preferably, the minimum of the two signal spectra is calculated as: |Ys(l, k)| = min(|Y1(l, k)|, |Y2(l, k)|).

Preferably, the phase angle of the first signal is: θ(l, k) = angle(Y1(l, k)).

Preferably, the selected signal is: Ys(l, k) = |Ys(l, k)| · exp(i · θ(l, k)), where i is the imaginary unit.

Preferably, the fourth step comprises: when the PR(l, k) value is lower than a specific value, noise correction is not performed; when the PR(l, k) value is higher than the specific value, noise correction is performed.

Preferably, the specific value is 0.1667.

Preferably, the fifth step comprises: when a first condition is met, re-estimating the noise and obtaining a final noise estimate; when the first condition is not met, noise correction is not performed.

Preferably, the first condition is that the frequency bin lies in the band below 3 kHz and PR(l, k) is greater than the signal deviation value 0.1667.

Preferably, the noise re-estimation is implemented as:

the final noise estimate is implemented as:

Preferably, the Wiener filtering is implemented as:

where phi_ss(l, k) is estimated using the decision-directed (DD) method.

Therefore, on the one hand, frame-level channel energy selection picks, from the channels, the signal with less noise interference, which avoids the compromise introduced by beamforming; on the other hand, correcting the low-frequency noise component using PR removes noise interference more accurately.

Drawings

Fig. 1 is a schematic flow chart of a method for removing wind noise according to an embodiment of the present invention.

Fig. 2 is a waveform diagram of real outdoor 2-mic data from an electric vehicle according to an embodiment of the invention.

Fig. 3 is a waveform diagram of the data of Fig. 2 after processing by the method for removing wind noise according to an embodiment of the invention.

Detailed Description

The principles and features of the invention are described below with reference to the drawings. The examples are given only to illustrate the invention and are not to be construed as limiting its scope.

Wind noise suppression methods are classified as single-channel or multi-channel, and multi-channel methods generally use beamforming followed by cascaded post-filtering. In practice there are usually two problems. The first is that, unlike indoors, where the signal energy received by the microphones is roughly equal, the outdoor environment is extremely unbalanced and unstable, so the noise energy received by the two mics sometimes differs greatly. For example, one mic may have good sound quality while the other carries extremely strong noise; in that case the mixed signal obtained by beamforming is worse than the better-quality channel of the original signals. The second is that a commonly used form of the post-filter is

where

is the a priori SNR, and λ(l, k) = 1 gives the standard Wiener filter. Since wind noise is a fast-varying non-stationary signal, noise tracking algorithms usually cannot track it effectively, and a compensated form has been proposed

where PR(l, k) = Φdiff(l, k)/Φsum(l, k),

Ydiff(l, k) = Yi(l, k) - Yj(l, k), Ysum(l, k) = Yi(l, k) + Yj(l, k),

Φdiff(l, k) = E{|Ydiff(l, k)|²}, Φsum(l, k) = E{|Ysum(l, k)|²}.

Wind noise is mainly distributed at low frequencies and is only weakly coherent between channels, so the difference-to-sum energy ratio of a wind noise signal is high. Speech is strongly correlated between channels at low frequencies, so its difference-to-sum energy ratio is low. PR can therefore serve as a compensation term for wind noise post-filtering, removing more noise when wind is present. However, PR is only correlated with wind noise and does not give an accurate estimate of the wind noise proportion; it is only an empirical value. The ρ in the formula is also set empirically, so performance is not stable across environments.
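The contrast between coherent and incoherent signals can be illustrated with a small numerical sketch. It assumes synthetic complex Gaussian samples as stand-ins for one frequency bin observed over many frames, replaces the expectation E{·} with a sample mean, and uses a hypothetical 0.9 gain mismatch for the coherent case; none of these choices come from the patent.

```python
import numpy as np

def difference_sum_ratio(y1, y2):
    """Ratio of mean |y1 - y2|^2 to mean |y1 + y2|^2 (sample means stand in for E{.})."""
    phi_diff = np.mean(np.abs(y1 - y2) ** 2)
    phi_sum = np.mean(np.abs(y1 + y2) ** 2)
    return phi_diff / phi_sum

rng = np.random.default_rng(0)
n = 20000  # number of synthetic observations (assumption)

# Coherent case (speech-like): both mics see the same component,
# up to a small hypothetical gain mismatch of 0.9.
source = rng.standard_normal(n) + 1j * rng.standard_normal(n)
pr_coherent = difference_sum_ratio(source, 0.9 * source)

# Incoherent case (wind-like): the two mics see independent components.
wind1 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
wind2 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
pr_incoherent = difference_sum_ratio(wind1, wind2)

print(f"coherent PR   = {pr_coherent:.4f}")
print(f"incoherent PR = {pr_incoherent:.4f}")
```

In the coherent case the ratio reduces to (1 - 0.9)² / (1 + 0.9)², which is close to 0, while for independent components of equal power it settles near 1.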

To solve the above problems, a method for removing wind noise according to an embodiment of the invention is shown in Fig. 1. The specific steps are as follows:

firstly, converting the two input signals Y1(t) and Y2(t) into the frequency domain as Y1(l, k) and Y2(l, k) through a framing/windowing/FFT module, where l is the frame index and k is the frequency bin;

secondly, calculating, through a frame-level signal selection module, the minimum of the two signal spectra |Ys(l, k)| = min(|Y1(l, k)|, |Y2(l, k)|) and the phase angle of the first signal θ(l, k) = angle(Y1(l, k)), to obtain the selected signal Ys(l, k) = |Ys(l, k)| · exp(i · θ(l, k)), where i is the imaginary unit.
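The first two steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the frame length of 512, hop of 256, and Hann window are assumptions, and the function names `stft` and `select_min_magnitude` are hypothetical.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Frame, window (Hann), and FFT a 1-D signal; returns shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])
    return np.fft.rfft(frames * window, axis=1)

def select_min_magnitude(Y1, Y2):
    """|Ys| = min(|Y1|, |Y2|) for each (l, k), combined with the phase of channel 1."""
    magnitude = np.minimum(np.abs(Y1), np.abs(Y2))
    theta = np.angle(Y1)
    return magnitude * np.exp(1j * theta)

# Synthetic two-channel input: the second channel is made noisier on purpose.
rng = np.random.default_rng(0)
y1 = rng.standard_normal(4096)
y2 = 3.0 * rng.standard_normal(4096)

Y1, Y2 = stft(y1), stft(y2)
Ys = select_min_magnitude(Y1, Y2)
print(Ys.shape)
```

Because the minimum is taken per time-frequency bin, the selected spectrum never exceeds either channel in magnitude, whereas a beamformer output mixes both channels.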

Also in the second step, the difference-to-sum ratio is calculated using the PR(l, k) module, where PR(l, k) = Φdiff(l, k)/Φsum(l, k).
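A possible per-bin estimator for PR(l, k) is sketched below. The text defines Φdiff and Φsum through an expectation but does not say how it is approximated; first-order recursive averaging over frames with a smoothing factor `alpha` is assumed here, and `alpha`, `eps`, and the function name are hypothetical.

```python
import numpy as np

def power_ratio(Y1, Y2, alpha=0.8, eps=1e-12):
    """PR(l, k) = Phi_diff / Phi_sum for spectra of shape (frames, bins).

    E{.} is approximated by recursive averaging over frames (an assumption).
    """
    p_diff = np.abs(Y1 - Y2) ** 2
    p_sum = np.abs(Y1 + Y2) ** 2
    phi_diff = np.empty_like(p_diff)
    phi_sum = np.empty_like(p_sum)
    phi_diff[0], phi_sum[0] = p_diff[0], p_sum[0]
    for l in range(1, p_diff.shape[0]):
        phi_diff[l] = alpha * phi_diff[l - 1] + (1.0 - alpha) * p_diff[l]
        phi_sum[l] = alpha * phi_sum[l - 1] + (1.0 - alpha) * p_sum[l]
    # eps guards against division by zero in silent bins.
    return phi_diff / (phi_sum + eps)

rng = np.random.default_rng(1)
Y = rng.standard_normal((10, 8)) + 1j * rng.standard_normal((10, 8))
print(power_ratio(Y, Y).max())  # identical channels: the difference vanishes
```

Identical channels give PR = 0, and a bin present in only one channel gives PR close to 1.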

And thirdly, estimating the steady-state noise V (l, k) through a steady-state noise tracking module.

This module can use the method described in Rainer Martin, "Noise power spectral density estimation based on optimal smoothing and minimum statistics," IEEE Trans. Speech and Audio Processing, 9(5): 504-512, July 2001, which is not described in detail here.
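For orientation only, the idea behind minimum-statistics tracking can be sketched in a heavily simplified form: smooth the power spectrum over time, then take the minimum over a sliding window of past frames. This is not Martin's algorithm (it omits the optimal time-varying smoothing and the bias compensation), and `alpha`, `window`, and the function name are assumptions.

```python
import numpy as np

def track_noise_minimum(power, alpha=0.85, window=50):
    """Simplified minimum tracking on a power spectrogram of shape (frames, bins)."""
    smoothed = np.empty_like(power)
    smoothed[0] = power[0]
    for l in range(1, power.shape[0]):
        smoothed[l] = alpha * smoothed[l - 1] + (1.0 - alpha) * power[l]

    noise = np.empty_like(power)
    for l in range(power.shape[0]):
        start = max(0, l - window + 1)
        # The minimum over recent frames follows the noise floor and ignores short bursts.
        noise[l] = smoothed[start : l + 1].min(axis=0)
    return noise

# A flat noise floor of 1.0 with a loud 10-frame burst.
power = np.ones((100, 4))
power[40:50] = 100.0
V = track_noise_minimum(power)
print(V[45])
```

During the burst the estimate stays at the floor, which is the behaviour that makes this kind of tracker suitable for steady-state noise but too slow for fast-varying wind noise.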

And fourthly, selectively correcting the noise through a noise correction module.

Microphones usually cannot be perfectly matched, i.e. the difference of the signals is not 0. The energy deviation of silicon (MEMS) microphones is generally within 1 dB, and that of electret microphones within 3 dB. The PR(l, k) value calculated for electret microphones is at most 0.1667, and no noise correction should be made when PR(l, k) is lower than this specific value, so as to avoid damaging the speech.

In some embodiments, the particular value may be set to 0.2.
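The text does not show how 0.1667 is obtained. As a sanity check only, the sketch below evaluates the difference-to-sum ratio for two in-phase copies of the same signal, x and g·x, for a few gain factors g. The in-phase model and the function names are assumptions, not the authors' derivation.

```python
def amplitude_ratio_from_db(level_db):
    """Convert a level difference in dB to an amplitude ratio."""
    return 10.0 ** (level_db / 20.0)

def difference_sum_ratio(gain):
    """|x - g*x| / |x + g*x| = |1 - g| / (1 + g) for in-phase copies x and g*x."""
    return abs(1.0 - gain) / (1.0 + gain)

cases = [
    ("1 dB", amplitude_ratio_from_db(1.0)),
    ("3 dB", amplitude_ratio_from_db(3.0)),
    ("factor 1.4", 1.4),
]
for label, gain in cases:
    amp = difference_sum_ratio(gain)
    # The power-domain ratio is the square of the amplitude-domain ratio.
    print(f"{label:>10}: amplitude ratio {amp:.4f}, power ratio {amp ** 2:.4f}")
```

Under this model an amplitude factor of 1.4 (about 2.9 dB) gives 0.4 / 2.4 = 1/6, which matches the quoted 0.1667 when the ratio is taken on amplitudes; the power-domain ratio for the same mismatch is smaller. Whether the threshold was derived this way is not stated in the text.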

In the fifth step, the final noise estimate is determined based on the PR(l, k) value.

The PR(l, k) value may exceed 1; when it does, it is clamped to 1. In addition, since wind noise is concentrated below 3 kHz, the spectrum above 3 kHz needs no noise correction. In summary, in the band below 3 kHz, when PR(l, k) is greater than the signal deviation value 0.1667, the noise is re-estimated

Final noise estimation
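The gating conditions of the fifth step can be sketched as follows. Only the conditions stated in the text are implemented (clamping PR to 1, the 3 kHz limit, and the 0.1667 threshold); the re-estimation formula itself is not available in this text, so the sketch stops at identifying the bins where it would apply. The function name, sample rate, and FFT size are assumptions.

```python
import numpy as np

PR_THRESHOLD = 0.1667   # signal deviation value given in the text
WIND_BAND_HZ = 3000.0   # wind noise is stated to be concentrated below 3 kHz

def correction_mask(pr, sample_rate, n_fft):
    """Return (mask, pr_clipped) for PR of shape (frames, n_fft // 2 + 1).

    mask[l, k] is True where the noise estimate would be re-estimated.
    """
    pr_clipped = np.minimum(pr, 1.0)                      # clamp PR to at most 1
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)   # centre frequency of each bin
    low_band = freqs < WIND_BAND_HZ
    mask = low_band[np.newaxis, :] & (pr_clipped > PR_THRESHOLD)
    return mask, pr_clipped

# Example with a hypothetical 16 kHz sample rate and 512-point FFT.
pr = np.full((5, 257), 2.0)
mask, pr_clipped = correction_mask(pr, sample_rate=16000, n_fft=512)
print(mask.sum(axis=1))
```

Bins where the mask is False keep the steady-state estimate V(l, k) unchanged.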

In the sixth step, Wiener filtering is applied by the filter:

where phi_ss(l, k) is estimated using the decision-directed (DD) method.
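A sketch of this last step is given below under explicit assumptions, since the filter expression is not available in this text. It assumes phi_ss denotes the clean-speech power spectrum, uses the textbook Wiener gain phi_ss / (phi_ss + phi_nn) with phi_nn taken as the final noise estimate, and uses a common form of the decision-directed recursion with a hypothetical smoothing factor `beta`.

```python
import numpy as np

def wiener_dd(Ys, noise_psd, beta=0.98, eps=1e-12):
    """Apply a Wiener gain frame by frame; Ys and noise_psd have shape (frames, bins)."""
    n_frames, n_bins = Ys.shape
    out = np.zeros_like(Ys)
    prev_clean_power = np.zeros(n_bins)
    for l in range(n_frames):
        noisy_power = np.abs(Ys[l]) ** 2
        # Instantaneous speech power estimate (power subtraction, floored at 0).
        ml_speech = np.maximum(noisy_power - noise_psd[l], 0.0)
        # Decision-directed estimate: blend the previous frame's output power
        # with the instantaneous estimate of the current frame.
        phi_ss = beta * prev_clean_power + (1.0 - beta) * ml_speech
        gain = phi_ss / (phi_ss + noise_psd[l] + eps)
        out[l] = gain * Ys[l]
        prev_clean_power = np.abs(out[l]) ** 2
    return out

rng = np.random.default_rng(2)
Ys = rng.standard_normal((20, 16)) + 1j * rng.standard_normal((20, 16))
passed = wiener_dd(Ys, np.zeros((20, 16)))         # no noise: signal passes through
suppressed = wiener_dd(Ys, np.full((20, 16), 1e6)) # dominant noise: signal is removed
print(np.abs(passed - Ys).max(), np.abs(suppressed).max())
```

The time-domain output would then be obtained by an inverse FFT with overlap-add, which is omitted here.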

This scheme does not consider delay alignment between the mics. When the mic spacing is small, the error caused by a small delay deviation at low frequencies is not large; if the spacing is large, the error increases, which raises the PR value of the signal and causes noise overestimation. Therefore, in some embodiments, a delay estimation module may be added in a preceding stage to align the signals between the mics.
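The text does not specify the delay estimation method. One widely used option is GCC-PHAT (generalized cross-correlation with phase transform), sketched below as an illustration rather than as the method of the invention. The function name, the integer-sample resolution, and the `max_lag` parameter are assumptions.

```python
import numpy as np

def gcc_phat_delay(x1, x2, max_lag):
    """Estimate the integer delay d (in samples) such that x2[n] ~ x1[n - d]."""
    n = len(x1) + len(x2)                  # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12         # phase transform: keep phase only
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that index i corresponds to lag i - max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[: max_lag + 1]))
    return int(np.argmax(cc)) - max_lag

rng = np.random.default_rng(3)
x1 = rng.standard_normal(1024)
x2 = np.concatenate((np.zeros(3), x1[:-3]))   # x1 delayed by 3 samples
print(gcc_phat_delay(x1, x2, max_lag=16))
```

Once the delay is known, one channel can be shifted by that many samples before the framing/windowing/FFT module so that the difference signal is not inflated by misalignment.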

Therefore, on the one hand, frame-level channel energy selection picks, from the channels, the signal with less noise interference, which avoids the compromise introduced by beamforming; on the other hand, correcting the low-frequency noise component using PR removes noise interference more accurately. A comparison of Fig. 2 and Fig. 3 shows that the method for removing wind noise of the invention removes noise interference more accurately.

The method for removing wind noise provided by the embodiments of the invention can be implemented as a software functional module, can be sold or used as an independent product, and can be stored in a computer-readable storage medium. On this understanding, the technical solution of the invention may be embodied as a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.

While the invention has been described with reference to specific embodiments, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method of removing wind noise, comprising:

firstly, converting a first input signal Y1(t) and a second input signal Y2(t) into the frequency domain as Y1(l, k) and Y2(l, k) through a framing/windowing/FFT module, where l is the frame index and k is the frequency bin;

secondly, calculating, through a frame-level signal selection module, the minimum of the two signal spectra: |Ys(l, k)| = min(|Y1(l, k)|, |Y2(l, k)|), and the phase angle of the first signal: θ(l, k) = angle(Y1(l, k)), to obtain the selected signal: Ys(l, k) = |Ys(l, k)| · exp(i · θ(l, k)), where i is the imaginary unit;

thirdly, calculating the difference-to-sum ratio using a PR(l, k) module, where PR(l, k) = Φdiff(l, k)/Φsum(l, k), Ydiff(l, k) = Yi(l, k) - Yj(l, k), Ysum(l, k) = Yi(l, k) + Yj(l, k), Φdiff(l, k) = E{|Ydiff(l, k)|²}, Φsum(l, k) = E{|Ysum(l, k)|²};

fourthly, estimating steady-state noise V(l, k) through a steady-state noise tracking module;

fifthly, selectively correcting noise through a noise correction module;

sixthly, determining a final noise estimate based on the PR(l, k) value;

and seventhly, applying Wiener filtering through a filter to obtain the audio signal with the wind noise removed.

2. The method of removing wind noise according to claim 1,

the fifth step includes:

when the PR(l, k) value is lower than a specific value, noise correction is not performed;

when the PR(l, k) value is higher than the specific value, noise correction is performed.

3. The method of removing wind noise according to claim 2,

the particular value is 0.1667.

4. The method of removing wind noise according to claim 1,

the sixth step comprises, when the first condition is satisfied, reestimating the noise and obtaining a final noise estimate;

when the first condition is not satisfied, noise correction is not performed.

5. The method of removing wind noise according to claim 4,

the first condition is that the frequency bin lies in the band below 3 kHz and PR(l, k) is greater than the signal deviation value 0.1667.

6. The method of removing wind noise according to claim 4,

the noise re-estimation is implemented as:

where l is the frame index and k is the frequency bin;

the final noise estimate is implemented as:

7. The method of removing wind noise according to any one of claims 1 to 6,

the Wiener filtering is implemented as:

where l is the frame index, k is the frequency bin, and phi_ss(l, k) is estimated using the decision-directed (DD) method.