CN110875054B

CN110875054B - Far-field noise suppression method, device and system

Info

Publication number: CN110875054B
Application number: CN201811012141.XA
Authority: CN
Inventors: 余涛; 银鞍
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2018-08-31
Filing date: 2018-08-31
Publication date: 2023-07-25
Anticipated expiration: 2038-08-31
Also published as: CN110875054A

Abstract

The application discloses a far-field noise suppression method, device and system. The far-field noise suppression method comprises the following steps: obtaining the energy of a voice beam and the energy of a noise beam; comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-path noise energy; obtaining, according to the two-path noise energy and the energy of the voice beam, a first parameter for eliminating non-stationary noise; and filtering the voice beam according to the first parameter to obtain first enhanced voice data. By combining the noise beam with the voice beam, the method estimates the non-stationary noise in the voice signal more accurately and eliminates it, so that the accuracy of far-field voice recognition can be improved in a complex environment.

Description

Far-field noise suppression method, device and system

Technical Field

The application relates to the field of far-field voice signal processing, in particular to a far-field noise suppression method. The application also relates to a far-field noise suppression device and a far-field noise suppression system.

Background

With the continuous development of artificial intelligence technology, far-field voice recognition is becoming increasingly important as a key technology for human-machine interaction: people want machines to understand spoken instructions so that the machines can be controlled by voice. Although voice recognition technology has developed rapidly over the past decades, far-field voice recognition still depends strongly on the environment, and its accuracy is severely degraded by heavy environmental noise. As voice communication devices are used in more and more scenarios (e.g., malls, stations, streets), the types of noise accompanying the voice signal also increase, including not only many stationary noise signals but also many non-stationary noise signals. Noise estimation methods must therefore adapt better and estimate the noise in the voice signal accurately, so that the noise in the far-field voice signal can be suppressed.

To address these problems, the prior art generally estimates the noise in a voice signal with a single-path noise estimation method, which eliminates the noise signal to a certain extent during voice recognition and improves recognition accuracy. However, single-path noise estimation has the following drawback: because it relies on only one target voice beam, the noise estimate is limited and the noise cannot be eliminated well. In complex environmental noise in particular, the noise estimate is unreliable and inaccurate, so the accuracy of far-field voice recognition is low.

Disclosure of Invention

The application provides a far-field noise suppression method, which aims to solve the problems of low reliability of noise estimation and inaccuracy of noise estimation results in complex environmental noise in the prior art. The present application additionally provides a far-field noise suppression device and system.

The far-field noise suppression method provided by the application comprises the following steps:

acquiring a voice beam acquired by a first microphone and a noise beam acquired by a second microphone;

according to the voice beam and the noise beam, obtaining the energy of the voice beam and the energy of the noise beam;

comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-path noise energy, wherein the two-path noise energy is the energy of non-stationary noise;

obtaining a first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam;

and filtering the voice beam according to the first parameter to obtain first enhanced voice data.

Optionally, the far-field noise suppression method further includes:

carrying out single-path noise estimation on the voice beam to obtain single-path noise energy, wherein the single-path noise energy is the energy of stationary noise;

performing comprehensive noise analysis according to the two-path noise energy, the single-path noise energy and the energy of the voice beam to obtain comprehensive noise energy;

obtaining a second parameter for eliminating non-stationary noise and stationary noise according to the integrated noise energy and the energy of the voice beam;

and filtering the voice beam according to the second parameter to obtain second enhanced voice data.

Optionally, the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data;

the second parameter is a second wiener filter coefficient for suppressing stationary noise data and non-stationary noise data.

Optionally, the obtaining the energy of the voice beam and the energy of the noise beam specifically includes: obtaining the energy of the voice beam at the first time point and the energy of the noise beam at the first time point;

the comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain the double-path noise energy specifically comprises:

obtaining the probability of occurrence of the voice beam at the first time point according to the comparison result of the energy of the voice beam at the first time point and the energy of the noise beam at the first time point;

and obtaining the two-path noise energy according to the probability of occurrence of the voice beam at the first time point.

Optionally, the acquiring the voice beam acquired by the first microphone and the noise beam acquired by the second microphone specifically includes:

acquiring position information of the first microphone and the second microphone and a distance value between the first microphone and the second microphone;

and acquiring a voice beam acquired by the first microphone and a noise beam acquired by the second microphone according to the position information and the distance value.

Optionally, the far-field noise suppression method further includes:

obtaining the occurrence probability of the voice wave beam at the first time point;

and determining a smoothing factor of the first microphone at the first time point according to the probability of occurrence of the voice wave beam at the first time point.

Optionally, the obtaining the two-path noise energy according to the probability of occurrence of the voice beam at the first time point specifically includes:

determining a smoothing factor of the first microphone at the first time point according to the probability of occurrence of the voice beam at the first time point;

and determining the two-path noise energy according to the smoothing factor at the first time point and the energy of the voice beam at the first time point.

Optionally, the performing noise comprehensive analysis according to the two-way noise energy, the one-way noise energy and the energy of the voice beam to obtain comprehensive noise energy specifically includes:

obtaining a noise comprehensive analysis algorithm:

wherein the two noise terms are the single-path noise energy and the two-path noise energy, respectively, α is the energy of the voice beam, and J is the comprehensive noise energy;

and inputting the two-path noise energy, the one-path noise energy and the energy of the voice beam into the noise comprehensive analysis algorithm to perform noise comprehensive analysis, so as to obtain comprehensive noise energy.

Optionally, the obtaining a second parameter for eliminating non-stationary noise and stationary noise according to the integrated noise energy and the energy of the voice beam specifically includes:

obtaining the ratio of the speech beam energy to the sum of the comprehensive noise energy and the speech beam energy;

determining the second parameter for eliminating non-stationary noise and stationary noise according to the ratio;

wherein the ratio is: g = speech beam energy / (speech beam energy + comprehensive noise energy), where g is the second parameter.

Optionally, the performing single-path noise estimation on the voice beam to obtain single-path noise energy specifically includes:

performing a preliminary estimation on the voice beam to obtain the minimum value of the voice beam;

obtaining the probability of occurrence of the voice wave beam according to the minimum value of the voice wave beam;

and estimating the energy of the single-path noise by using recursive average according to the occurrence probability of the voice wave beam.

Correspondingly, the application also provides a far-field noise suppression device, which comprises:

the acquisition unit is used for acquiring the voice beam acquired by the first microphone and the noise beam acquired by the second microphone;

the first obtaining unit is used for obtaining the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam;

the first analysis unit is used for comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-path noise energy, wherein the two-path noise energy is the energy of non-stationary noise;

a second obtaining unit configured to obtain a first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam;

and the first processing unit is used for carrying out filtering processing on the voice wave beam according to the first parameter to obtain first enhanced voice data.

Optionally, the far-field noise suppression device further includes:

the third obtaining unit is used for carrying out single-path noise estimation on the voice beam to obtain single-path noise energy, wherein the single-path noise energy is the energy of stationary noise;

the second analysis unit is used for carrying out comprehensive noise analysis according to the two-path noise energy, the single-path noise energy and the energy of the voice beam to obtain comprehensive noise energy;

a fourth obtaining unit for obtaining a second parameter for eliminating non-stationary noise and stationary noise according to the integrated noise energy and the energy of the voice beam;

and the second processing unit is used for carrying out filtering processing on the voice wave beam according to the second parameter to obtain second enhanced voice data.

Optionally, the far-field noise suppression device further includes:

obtaining a noise comprehensive analysis algorithm:

Correspondingly, the application also provides a far-field noise suppression system, which comprises: the far-field noise suppression device according to any one of the above technical solutions.

Correspondingly, the application also provides electronic equipment, which comprises:

a processor; and

a memory for storing a program of a far-field noise suppression method, the apparatus, after powering on and running the program of the far-field noise suppression method by the processor, performing the steps of:

Accordingly, the present application further provides a storage device storing a program of the far-field noise suppression method, the program being executed by a processor to perform the steps of:

Compared with the prior art, the application has the following advantages:

The application provides a far-field noise suppression method, in particular a far-field noise suppression method for use in a complex environment. The method comprises: obtaining a voice beam collected by a first microphone and a noise beam collected by a second microphone; obtaining the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam; comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-path noise energy, wherein the two-path noise energy is the energy of non-stationary noise; obtaining a first parameter for eliminating non-stationary noise according to the two-path noise energy and the energy of the voice beam; and filtering the voice beam according to the first parameter to obtain first enhanced voice data. By combining the noise beam with the voice beam, the method estimates the non-stationary noise in the voice signal more accurately and eliminates it, so that the accuracy of far-field voice recognition can be improved in a complex environment.

Drawings

FIG. 1 is a flow chart of an embodiment of a far field noise suppression method of the present application;

FIG. 2 is a schematic diagram of an embodiment of a far field noise suppression device of the present application;

FIG. 3 is a schematic diagram of an embodiment of a far field noise suppression electronic device of the present application;

FIG. 4 is a flowchart of an embodiment of the far field noise suppression system of the present application;

FIG. 5 is a flowchart of an embodiment of a method of suppressing stationary noise and non-stationary noise according to the present application.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The application can, however, be implemented in many ways other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; the application is therefore not limited to the specific embodiments disclosed below.

The far-field noise suppression method of the present application is described in detail below by way of an embodiment, with each step of the method described in turn. Please refer to FIG. 1, which is a flowchart of an embodiment of the far-field noise suppression method of the present application.

Step S101: acquiring a voice beam acquired by the first microphone and a noise beam acquired by the second microphone.

In far-field voice recognition, voice data is acquired through a multi-microphone array and then further processed so that the far-field voice data can be recognized accurately. In practice, however, background noise interferes with acquisition, and the acquired voice signal is often mixed with a large amount of non-stationary noise, so that it cannot meet the use requirement. This embodiment therefore provides a specific speech-enhancement method for eliminating noise data in a voice signal. To suppress non-stationary noise effectively in a complex background-noise environment, a first microphone collects a voice beam and, on that basis, a second microphone collects a noise beam; two-path noise estimation is then performed so as to track and eliminate the non-stationary noise. It should be noted that, in this embodiment, the classical MVDR/LCMV beamforming method is used to generate the voice beam and the noise beam, where the voice beam coefficients are obtained from the following constraint:

$$ w_{\text{speech}} = \arg\min_{w} \left( w^{H} R_{NN}\, w \right) \quad \text{s.t.} \quad w^{H} a(\theta) = 1 $$

the noise beam is obtained from coefficients of the noise beam, the noise beam coefficients resulting from the following constraints:

where R_NN is the noise correlation matrix and a(θ) is the steering vector corresponding to the target speech direction θ.
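By way of illustration only, the voice beam constraint (minimize w^H R_NN w subject to w^H a(θ) = 1) has the well-known closed-form MVDR solution w = R_NN^{-1} a(θ) / (a(θ)^H R_NN^{-1} a(θ)). The following minimal sketch evaluates it for a two-microphone array at a single frequency. The function names, the far-field line-array steering model, the speed of sound and the example noise matrix are assumptions made for the sketch and are not taken from the embodiment.

```python
import cmath
import math


def steering_vector(theta_rad, mic_distance_m, freq_hz, speed_of_sound=343.0):
    """Hypothetical far-field steering vector a(theta) for a two-microphone line array."""
    delay = mic_distance_m * math.cos(theta_rad) / speed_of_sound
    return [1.0 + 0.0j, cmath.exp(-2j * math.pi * freq_hz * delay)]


def mvdr_weights_2mic(r_nn, a):
    """Closed-form MVDR weights w = R^-1 a / (a^H R^-1 a) for a 2x2 matrix R."""
    (r11, r12), (r21, r22) = r_nn
    det = r11 * r22 - r12 * r21
    # x = R^-1 a, using the explicit inverse of a 2x2 matrix.
    x0 = (r22 * a[0] - r12 * a[1]) / det
    x1 = (-r21 * a[0] + r11 * a[1]) / det
    # denom = a^H x, the normalisation that enforces w^H a = 1.
    denom = a[0].conjugate() * x0 + a[1].conjugate() * x1
    return [x0 / denom, x1 / denom]


def beam_response(w, a):
    """Response w^H a of the beamformer toward the direction described by a."""
    return sum(wi.conjugate() * ai for wi, ai in zip(w, a))


# Example values (assumed): 5 cm spacing, 1 kHz, target at 60 degrees.
a_target = steering_vector(math.radians(60.0), 0.05, 1000.0)
r_nn_example = [[1.0, 0.3 + 0.1j], [0.3 - 0.1j, 1.0]]
w_speech = mvdr_weights_2mic(r_nn_example, a_target)
```

With these weights the response toward the target direction equals 1, which is the distortionless constraint stated above; the noise beam coefficients would follow from their own constraint, which is not sketched here.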

Step S102: obtaining the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam.

In this embodiment, the two-path noise estimation method mainly proceeds as follows: the existence probability of the voice beam at a frequency point in the current frame is judged from the energy difference between channels at the same frame and the same frequency point; a smoothing factor is determined according to that existence probability; and noise estimation is carried out in combination with the noisy voice spectrum information collected by the microphone. When the frequency point is determined to contain no voice, the noise estimate is updated in real time, and the energy value of the current frame is used as the noise estimate of the current frame; when the frequency point is determined to contain voice, the noise estimate is taken from the noise estimate of the previous frame. The energy of the voice beam and the energy of the noise beam obtained in this embodiment are specifically the energy of the voice beam at the first time point and the energy of the noise beam at the first time point. After these are obtained, they are compared and analyzed to obtain the probability of occurrence of the voice beam at the first time point, and the two-path noise energy is then obtained according to that probability.
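As a purely illustrative sketch of how the per-frame, per-frequency-point energy of a beam might be computed, the following code takes one time-domain frame and returns the squared magnitude of each frequency bin. The embodiment does not specify the transform, frame length or windowing; the naive discrete Fourier transform and the function name used here are assumptions.

```python
import cmath
import math


def frame_bin_energies(frame):
    """Return |X(k)|^2 for bins k = 0 .. N/2 of one real-valued frame (naive DFT)."""
    n = len(frame)
    energies = []
    for k in range(n // 2 + 1):
        # Direct evaluation of the DFT sum for bin k.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        energies.append(abs(acc) ** 2)
    return energies


# Example frame (assumed): a cosine that falls exactly on bin 2 of a 16-sample frame.
example_frame = [math.cos(2.0 * math.pi * 2 * i / 16) for i in range(16)]
example_energies = frame_bin_energies(example_frame)
```

Applying the same function to the voice beam and to the noise beam gives, for each frame and frequency point, the two energies that are compared in the next step.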

Step S103: comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-path noise energy, wherein the two-path noise energy is the energy of non-stationary noise.

In this embodiment, the specific process of comparing and analyzing the energy of the voice beam and the energy of the noise beam is: assuming that the moment of the first time point is t, the probability of occurrence of the voice beam at the first time point is obtained according to the following formula:

where atan is the arctangent function, the two energy terms are the energy of the voice beam at the first time point and the energy of the noise beam at the first time point, respectively, and P(t) is the probability of occurrence of the voice beam at the first time point.

Further, the two-way noise energy is obtained according to the following formula:

where γ is the smoothing factor, the two noise terms are the single-path noise energy and the two-path noise energy, respectively, and t is the first time point.

The energy of the two-way noise in this embodiment is the energy of the non-stationary noise.
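For illustration only, the following sketch shows one possible shape of the two-path estimate described in this step: a probability of occurrence of the voice beam derived from an arctangent of the energy comparison, a smoothing factor derived from that probability, and a recursive update driven by the energy of the voice beam. The exact expressions of the embodiment are not reproduced in this text, so the ratio-based atan mapping, the linear smoothing rule, the value gamma_min and all function names below are assumptions.

```python
import math


def speech_presence_probability(speech_energy, noise_energy, eps=1e-12):
    """ASSUMED mapping: scale atan of the energy ratio into the range [0, 1)."""
    ratio = speech_energy / (noise_energy + eps)
    return (2.0 / math.pi) * math.atan(ratio)


def smoothing_factor(p_speech, gamma_min=0.8):
    """ASSUMED rule: gamma approaches 1 (estimate frozen) as voice becomes likely."""
    return gamma_min + (1.0 - gamma_min) * p_speech


def update_two_path_noise(prev_noise, speech_energy, noise_energy):
    """One recursive update of the two-path noise energy for a single frequency point."""
    p = speech_presence_probability(speech_energy, noise_energy)
    gamma = smoothing_factor(p)
    # High gamma keeps the previous estimate; low gamma follows the current frame.
    return gamma * prev_noise + (1.0 - gamma) * speech_energy
```

Under these assumptions, a frame in which the voice beam is much stronger than the noise beam leaves the estimate almost unchanged, whereas a frame dominated by the noise beam pulls the estimate toward the current energy of the voice beam.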

Step S104: obtaining a first parameter for eliminating non-stationary noise according to the two-path noise energy and the energy of the voice beam.

In this embodiment, the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data. Obtaining the first parameter for eliminating non-stationary noise according to the two-path noise energy and the energy of the voice beam specifically comprises: obtaining the solving formula of the Wiener filter coefficient, g = speech beam energy / (speech beam energy + noise energy), where g is the first Wiener filter coefficient; and inputting the two-path noise energy and the speech beam energy into this formula, thereby obtaining the first Wiener filter coefficient for suppressing non-stationary noise data.

Step S105: filtering the voice beam according to the first parameter to obtain first enhanced voice data.

In this embodiment, the voice beam is filtered according to the first Wiener filter coefficient. Specifically, the Wiener filtering formula s = g × y is obtained, where g is the first Wiener filter coefficient, y is the voice data before the non-stationary noise is removed, and s is the voice data after the non-stationary noise is removed. Inputting the first Wiener filter coefficient into this formula yields the first enhanced voice data, in which the non-stationary noise data is suppressed.
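Steps S104 and S105 can be sketched as follows, using the formulas g = speech beam energy / (speech beam energy + noise energy) and s = g × y as stated in this embodiment. The function names, the small constant eps that guards against division by zero, and the per-frequency-point list layout are assumptions of the sketch.

```python
def wiener_gain(speech_beam_energy, noise_energy, eps=1e-12):
    """Gain g = speech beam energy / (speech beam energy + noise energy)."""
    return speech_beam_energy / (speech_beam_energy + noise_energy + eps)


def apply_gains(spectrum, gains):
    """Filtering s = g * y, applied independently to each frequency point."""
    return [g * y for g, y in zip(gains, spectrum)]


# Example values (assumed): three frequency points of one frame.
beam_energies = [3.0, 1.0, 0.0]
two_path_noise = [1.0, 1.0, 1.0]
noisy_spectrum = [2.0 + 0.0j, 1.0 - 1.0j, 0.5j]
gains = [wiener_gain(e, n) for e, n in zip(beam_energies, two_path_noise)]
enhanced = apply_gains(noisy_spectrum, gains)
```

The same two functions would serve for the second parameter by passing the comprehensive noise energy in place of the two-path noise energy.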

In addition, the embodiment of the present application further provides a preferred implementation manner, and please refer to fig. 5, which is a flowchart of an embodiment of a method for suppressing stationary noise and non-stationary noise in the present application.

On the basis of suppressing non-stationary noise in the above embodiment, the preferred embodiment further performs single-path noise estimation on the single-path voice beam to obtain single-path noise energy. Wherein the single-path noise energy is the energy of stationary noise. And carrying out noise comprehensive analysis according to the obtained two-way noise energy, the single-way noise energy and the energy of the voice beam to obtain comprehensive noise energy, so as to obtain second parameters for eliminating non-stationary noise and stationary noise according to the comprehensive noise energy, and carrying out filtering processing on the voice beam to be identified through the second parameters to obtain second enhanced voice data. Wherein the second parameter is a second wiener filter coefficient for suppressing stationary noise data and non-stationary noise data.

It should be noted that performing single-path noise estimation on the voice beam to obtain the single-path noise energy specifically includes: carrying out a preliminary estimation on the voice beam by a minimum tracking method to obtain the minimum value of the voice beam; obtaining the probability of occurrence of the voice beam from that minimum value; and estimating the single-path noise energy by recursive averaging according to the probability of occurrence of the voice beam.
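A minimal sketch of such a single-path estimator is given below for one frequency point: it tracks the minimum energy over a sliding window, makes a voice-presence decision by comparing the current energy with that minimum, and updates the noise estimate by recursive averaging. The window length, the threshold, the hard 0/1 decision, the smoothing constant and the function name are assumptions and do not come from the embodiment.

```python
def estimate_single_path_noise(energies, window=8, ratio_threshold=5.0, alpha=0.9):
    """Return the running stationary-noise estimate for a sequence of frame energies."""
    noise = energies[0]
    estimates = []
    for t, energy in enumerate(energies):
        # Minimum tracking over the most recent `window` frames.
        start = max(0, t - window + 1)
        local_min = min(energies[start:t + 1])
        # ASSUMED hard decision: voice is present if the frame is far above the minimum.
        p_speech = 1.0 if energy > ratio_threshold * local_min else 0.0
        # Recursive averaging; the estimate is frozen while voice is present.
        alpha_t = alpha + (1.0 - alpha) * p_speech
        noise = alpha_t * noise + (1.0 - alpha_t) * energy
        estimates.append(noise)
    return estimates


# Example (assumed): a short loud burst on top of a constant noise floor.
burst = [1.0] * 10 + [50.0] * 3 + [1.0] * 10
burst_estimates = estimate_single_path_noise(burst)
```

In this example the burst is classified as voice and does not leak into the estimate, while a slow change of the noise floor would be followed by the recursive average.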

The comprehensive noise analysis performed according to the two-path noise energy, the single-path noise energy and the energy of the voice beam specifically comprises: obtaining a noise comprehensive analysis algorithm whose noise terms are the single-path noise energy and the two-path noise energy; and inputting the two-path noise energy, the single-path noise energy and the energy of the voice beam into the noise comprehensive analysis algorithm to obtain the comprehensive noise energy. Non-stationary noise and stationary noise are thereby eliminated at the same time, which further improves the accuracy of voice recognition.

Corresponding to the far-field noise suppression method, the present application further provides a far-field noise suppression device to which the method can be applied. Please refer to FIG. 2, which is a schematic diagram of an embodiment of the far-field noise suppression device of the present application. Since the device embodiment is similar to the method embodiment, it is described relatively briefly; for relevant points, refer to the method embodiment section. The device embodiment described below is merely illustrative.

An acquisition unit 201, configured to acquire a voice beam acquired by the first microphone and a noise beam acquired by the second microphone.

This embodiment provides a specific speech-enhancement method for eliminating noise data in a voice signal. To suppress non-stationary noise effectively in a complex background-noise environment, a first microphone collects a voice beam and, on that basis, a second microphone collects a noise beam; two-path noise estimation is then performed so as to track and eliminate the non-stationary noise. It should be noted that, in this embodiment, the classical MVDR/LCMV beamforming method is used to generate the voice beam and the noise beam, where the voice beam coefficients are obtained from the following constraint:

$$ w_{\text{speech}} = \arg\min_{w} \left( w^{H} R_{NN}\, w \right) \quad \text{s.t.} \quad w^{H} a(\theta) = 1 $$

A first obtaining unit 202, configured to obtain the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam.

In this embodiment, the obtaining the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam specifically includes obtaining the energy of the voice beam at the first time point and the energy of the noise beam at the first time point, comparing and analyzing the energy of the voice beam at the first time point and the energy of the noise beam at the first time point to obtain the probability of occurrence of the voice beam at the first time point, and further obtaining the two-way noise energy according to the probability of occurrence of the voice beam at the first time point.

The first analysis unit 203 is configured to perform a comparative analysis on the energy of the voice beam and the energy of the noise beam, so as to obtain two-path noise energy, where the two-path noise energy is energy of non-stationary noise.

In this embodiment, the specific process of performing the comparative analysis on the energy of the voice beam and the energy of the noise beam is to obtain the probability of occurrence of the voice beam at the first time point according to the following formula, assuming that the time of the first time point is t:

where atan is the arctangent function, the two energy terms are the energy of the voice beam at the first time point and the energy of the noise beam at the first time point, respectively, and P(t) is the probability of occurrence of the voice beam at the first time point.

A second obtaining unit 204, configured to obtain a first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam.

In this embodiment, the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data. Specifically, the solving formula of the Wiener filter coefficient is obtained: g = speech beam energy / (speech beam energy + noise energy), where g is the first Wiener filter coefficient; the two-path noise energy and the speech beam energy are then input into this formula, thereby obtaining the first Wiener filter coefficient for suppressing non-stationary noise data.

The first processing unit 205 is configured to perform filtering processing on the voice beam according to the first parameter, so as to obtain first enhanced voice data.

Corresponding to the far-field noise suppression device, the present application further provides a far-field noise suppression system to which the device can be applied. Please refer to FIG. 4, which is a flowchart of an embodiment of the far-field noise suppression system of the present application. Since the system embodiment is similar to the device embodiment, it is described relatively briefly; for relevant points, refer to the device embodiment section. The system embodiment described below is merely illustrative.

In this embodiment, the far-field noise suppression system suppresses far-field noise data as follows. First, dereverberation is performed on the multi-channel far-field voice data. The data of each channel is then selected and sent to an endpoint detection module to estimate the start and stop times of the noise data and the voice data, and the dereverberated multi-channel data is sent to a sound source localization module to obtain the speaker position information. Next, beamforming is performed on the dereverberated multi-channel data together with the endpoint detection information and the angle information to obtain two audio streams, a voice beam and a noise beam, where the first microphone collects the voice beam and the second microphone collects the noise beam. Finally, noise estimation is performed on the two audio streams so that the noise data is suppressed and clean processed voice is obtained.

Corresponding to the far-field noise suppression method provided above, the embodiment of the present application further provides an electronic device for far-field noise suppression, including a processor and a memory, wherein the memory is used for storing a program of the far-field noise suppression method, and after the device is powered on and the program of the far-field noise suppression method is run by the processor, the following steps are executed:

It should be noted that, for the detailed description of an electronic device provided in the embodiments of the present application, reference may be made to the related description of a far-field noise suppression method provided in the embodiments of the present application, which is not repeated here.

Corresponding to the far-field noise suppression method provided above, the embodiment of the present application further provides a storage device for far-field noise suppression, including: the far-field noise suppression method described in the above embodiment.

In the storage device, a program of a far-field noise suppression method is stored, which is executed by a processor, performing the steps of:

It should be noted that, for the detailed description of a storage device provided in the embodiments of the present application, reference may be made to the related description of a far-field noise suppression method provided in the embodiments of the present application, which is not repeated here.

While the preferred embodiment has been described, it is not intended to limit the invention thereto, and any person skilled in the art may make variations and modifications without departing from the spirit and scope of the present invention, so that the scope of the present invention shall be defined by the claims of the present application.

Claims

1. A method of far-field noise suppression, comprising:

2. The far-field noise suppression method according to claim 1, further comprising:

3. The method of far-field noise suppression according to claim 2, wherein the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data;

4. The far-field noise suppression method according to claim 1, wherein the obtaining the energy of the speech beam and the energy of the noise beam specifically comprises: obtaining the energy of the voice beam at the first time point and the energy of the noise beam at the first time point;

5. The method of claim 1, wherein the acquiring the voice beam acquired by the first microphone and the noise beam acquired by the second microphone specifically comprises:

6. The far-field noise suppression method according to claim 4, further comprising:

7. The method of far-field noise suppression according to claim 6, wherein the obtaining the two-path noise energy according to the probability of occurrence of the voice beam at the first time point specifically includes:

8. The far-field noise suppression method according to claim 2, wherein the performing noise comprehensive analysis according to the two-way noise energy, the one-way noise energy and the energy of the voice beam to obtain comprehensive noise energy specifically includes:

obtaining a noise comprehensive analysis algorithm:

9. The far-field noise suppression method according to claim 2, wherein the obtaining the second parameter for eliminating the non-stationary noise and stationary noise based on the integrated noise energy and the energy of the voice beam specifically includes:

10. The far-field noise suppression method according to claim 2, wherein the performing single-path noise estimation on the voice beam to obtain single-path noise energy specifically comprises:

11. A far-field noise suppression device, comprising:

12. The far-field noise suppression device of claim 11, further comprising:

13. The far-field noise suppression device according to claim 12, wherein the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data;

14. The far-field noise suppression device according to claim 11, wherein the obtaining the energy of the speech beam and the energy of the noise beam comprises: obtaining the energy of the voice beam at the first time point and the energy of the noise beam at the first time point;

15. The far-field noise suppression device according to claim 11, wherein the acquiring the speech beam acquired by the first microphone and the noise beam acquired by the second microphone specifically comprises:

16. The far-field noise suppression device of claim 14, further comprising:

17. The far-field noise suppression device according to claim 16, wherein the obtaining the two-path noise energy according to the probability of occurrence of the voice beam at the first time point specifically comprises:

18. The far-field noise suppression device according to claim 12, wherein the noise synthesis analysis is performed according to two-way noise energy, one-way noise energy and the energy of the voice beam to obtain synthesized noise energy, and specifically comprises:

obtaining a noise comprehensive analysis algorithm:

19. The far-field noise suppression device according to claim 12, wherein the obtaining a second parameter for canceling non-stationary noise and stationary noise from the integrated noise energy and the energy of the speech beam comprises:

20. The far-field noise suppression apparatus according to claim 12, wherein the performing a one-way noise estimation on the voice beam to obtain one-way noise energy specifically comprises:

21. A far field noise suppression system, comprising: far field noise suppression apparatus according to any one of the preceding claims 11-20.

22. An electronic device, comprising:

a processor; and

23. A storage device storing a program of a far-field noise suppression method, the program being executed by a processor to perform the steps of: