CN110875054A - Far-field noise suppression method, device and system

Info

Publication number: CN110875054A
Application number: CN201811012141.XA
Authority: CN
Inventors: 余涛; 银鞍
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2018-08-31
Filing date: 2018-08-31
Publication date: 2020-03-10
Anticipated expiration: 2038-08-31
Also published as: CN110875054B

Abstract

The application discloses a far-field noise suppression method, device and system. The far-field noise suppression method comprises: obtaining the energy of a voice beam and the energy of a noise beam from the voice beam collected by a first microphone and the noise beam collected by a second microphone; comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-way noise energy; obtaining a first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam; and filtering the voice beam according to the first parameter to obtain first enhanced voice data. By combining the noise beam with the voice beam, the method accurately estimates the non-stationary noise in the voice signal and eliminates it, so that the accuracy of far-field voice recognition can be improved in a complex environment.

Description

Far-field noise suppression method, device and system

Technical Field

The application relates to the field of far-field speech signal processing, in particular to a far-field noise suppression method. The application also relates to a far-field noise suppression device and a far-field noise suppression system.

Background

With the continuous development of artificial intelligence technology, far-field speech recognition, a key technology of human-computer interaction, is becoming increasingly important: people want machines to understand spoken instructions so that the machines can be controlled by voice. Although speech recognition technology has developed rapidly over the past decades, far-field speech recognition still depends strongly on the environment, and large amounts of environmental noise seriously reduce its accuracy. As the application scenarios of voice communication devices multiply (e.g., shopping malls, stations, and streets), the types of noise signals accompanying voice signals also increase, and they include large amounts of both stationary and non-stationary noise. Noise estimation methods for speech signals must therefore adapt well and estimate the noise in the speech signal accurately, so that the noise in far-field speech signals can be suppressed.

To address these problems, the prior art usually estimates the noise in a speech signal with a single-path noise estimation method, which eliminates noise in the speech recognition process to a certain extent and improves the accuracy of speech recognition. However, single-path noise estimation has the following drawback: because it relies on only one target voice beam, the noise estimate is limited and the noise cannot be eliminated well. In complex environmental noise in particular, the noise estimate is unreliable and inaccurate, so the accuracy of far-field voice recognition is low.

Disclosure of Invention

The application provides a far-field noise suppression method, which aims to solve the problems that in complex environmental noise, the reliability of noise estimation is low and the noise estimation result is inaccurate in the prior art. The present application additionally provides a far-field noise suppression apparatus and system.

The application provides a far-field noise suppression method, which comprises the following steps:

acquiring a voice beam acquired by a first microphone and a noise beam acquired by a second microphone;

obtaining the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam;

comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-way noise energy, wherein the two-way noise energy is the energy of non-stationary noise;

obtaining a first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam;

and filtering the voice beam according to the first parameter to obtain first enhanced voice data.

Optionally, the far-field noise suppression method further includes:

performing single-path noise estimation on the voice beam to obtain single-path noise energy, wherein the single-path noise energy is the energy of stationary noise;

performing noise comprehensive analysis according to the two-way noise energy, the single-path noise energy and the energy of the voice beam to obtain comprehensive noise energy;

obtaining a second parameter for eliminating non-stationary noise and stationary noise according to the comprehensive noise energy and the energy of the voice beam;

and filtering the voice beam according to the second parameter to obtain second enhanced voice data.

Optionally, the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data;

the second parameter is a second Wiener filter coefficient for suppressing stationary noise data and non-stationary noise data.

Optionally, the obtaining the energy of the voice beam and the energy of the noise beam specifically includes: obtaining the energy of a voice beam at a first time point and the energy of a noise beam at the first time point;

the comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-way noise energy specifically includes:

obtaining the probability of the voice beam appearing at the first time point according to the comparison result of the energy of the voice beam at the first time point and the energy of the noise beam at the first time point;

and obtaining two-way noise energy according to the probability of occurrence of the voice beam at the first time point.

Optionally, the acquiring a voice beam collected by the first microphone and a noise beam collected by the second microphone specifically includes:

acquiring position information of the first microphone and the second microphone and a distance value between the first microphone and the second microphone;

and acquiring the voice beam acquired by the first microphone and the noise beam acquired by the second microphone according to the position information and the distance value.

Optionally, the far-field noise suppression method further includes:

obtaining the probability of occurrence of the voice beam at the first time point;

and determining a smoothing factor of the first microphone at the first time point according to the probability of the voice beam at the first time point.

Optionally, the obtaining of two-way noise energy according to the probability of occurrence of the voice beam at the first time point specifically includes:

determining a smoothing factor of the first microphone at the first time point according to the probability of occurrence of the voice beam at the first time point;

and determining the two-way noise energy according to the smoothing factor at the first time point and the energy of the voice beam at the first time point.

Optionally, the performing noise comprehensive analysis according to the two-way noise energy, the single-path noise energy, and the energy of the voice beam to obtain comprehensive noise energy specifically includes:

obtaining a noise comprehensive analysis algorithm whose terms include the single-path noise energy, the energy of the voice beam α, and the comprehensive noise energy J;

and inputting the two-way noise energy, the single-path noise energy and the energy of the voice beam into the noise comprehensive analysis algorithm for noise comprehensive analysis to obtain the comprehensive noise energy.

Optionally, the obtaining a second parameter for eliminating non-stationary noise and stationary noise according to the comprehensive noise energy and the energy of the voice beam specifically includes:

obtaining the ratio of the voice beam energy to the sum of the voice beam energy and the comprehensive noise energy;

determining the second parameter for eliminating non-stationary noise and stationary noise according to the ratio;

wherein the ratio is: g = voice beam energy / (voice beam energy + comprehensive noise energy), and g is the second parameter.
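As a minimal sketch of this ratio, the Python function below computes the gain from two non-negative energies. The function name `wiener_gain` and the convention of returning 0 for a completely silent input are illustrative assumptions, not part of the application.

```python
def wiener_gain(voice_energy: float, noise_energy: float) -> float:
    """Return g = voice_energy / (voice_energy + noise_energy)."""
    if voice_energy < 0.0 or noise_energy < 0.0:
        raise ValueError("energies must be non-negative")
    total = voice_energy + noise_energy
    if total == 0.0:
        # Assumed convention: a bin with no energy at all gets zero gain.
        return 0.0
    return voice_energy / total
```

The gain tends toward 1 when the noise energy is negligible and toward 0 when the noise dominates, which is the behaviour expected of a suppression coefficient.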

Optionally, the performing single-path noise estimation on the voice beam to obtain single-path noise energy specifically includes:

performing preliminary estimation on the voice beam to obtain the minimum value of the voice beam;

obtaining the probability of occurrence of the voice beam according to the minimum value of the voice beam;

and estimating the single-path noise energy by recursive averaging according to the probability of occurrence of the voice beam.

Correspondingly, the present application also provides a far-field noise suppression device, including:

the acquiring unit is used for acquiring a voice beam acquired by the first microphone and a noise beam acquired by the second microphone;

a first obtaining unit, configured to obtain, based on the voice beam and the noise beam, an energy of the voice beam and an energy of the noise beam;

the first analysis unit is used for carrying out comparison analysis on the energy of the voice beam and the energy of the noise beam to obtain two-way noise energy, and the two-way noise energy is the energy of non-stationary noise;

a second obtaining unit, configured to obtain a first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam;

and the first processing unit is used for carrying out filtering processing on the voice beam according to the first parameter to obtain first enhanced voice data.

Optionally, the far-field noise suppression device further includes:

a third obtaining unit, configured to perform single-path noise estimation on the voice beam to obtain single-path noise energy, where the single-path noise energy is energy of stationary noise;

the second analysis unit is used for carrying out noise comprehensive analysis according to the two-way noise energy, the single-path noise energy and the energy of the voice beam to obtain comprehensive noise energy;

a fourth obtaining unit, configured to obtain a second parameter for eliminating non-stationary noise and stationary noise according to the comprehensive noise energy and the energy of the voice beam;

and the second processing unit is used for carrying out filtering processing on the voice beam according to the second parameter to obtain second enhanced voice data.

Optionally, the far-field noise suppression device further includes:

obtaining a noise comprehensive analysis algorithm whose terms include the single-path noise energy, the energy of the voice beam α, and the comprehensive noise energy J;

Correspondingly, the present application also provides a far-field noise suppression system, including: the far-field noise suppression device according to any one of the above technical solutions.

Correspondingly, the present application also provides an electronic device, comprising:

a processor; and

a memory for storing a program of a far-field noise suppression method, the apparatus performing the following steps after being powered on and running the program of the far-field noise suppression method by the processor:

Accordingly, the present application also provides a storage device storing a program of a far-field noise suppression method, the program being executed by a processor and performing the steps of:

Compared with the prior art, the method has the following advantages:

The application provides a far-field noise suppression method, in particular a far-field noise suppression method for complex environments. The method obtains a voice beam collected by a first microphone and a noise beam collected by a second microphone; obtains the energy of the voice beam and the energy of the noise beam from the two beams; compares and analyzes the two energies to obtain two-way noise energy, which is the energy of non-stationary noise; obtains a first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam; and filters the voice beam according to the first parameter to obtain first enhanced voice data. By combining the noise beam with the voice beam, the method accurately estimates the non-stationary noise in the voice signal and eliminates it, so that the accuracy of far-field voice recognition can be improved in a complex environment.

Drawings

FIG. 1 is a flow chart of an embodiment of a far-field noise suppression method of the present application;

FIG. 2 is a schematic diagram of an embodiment of the far-field noise suppression apparatus of the present application;

FIG. 3 is a schematic diagram of an embodiment of far-field noise suppression electronics of the present application;

FIG. 4 is a flowchart of the operation of an embodiment of the far field noise suppression system of the present application;

FIG. 5 is a flowchart of an embodiment of a method for suppressing stationary noise and non-stationary noise according to the present application.

Detailed Description

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the application can be implemented in many ways other than those described herein, and those skilled in the art can make similar extensions without departing from its spirit; the application is therefore not limited to the specific implementations disclosed below.

An embodiment of the far-field noise suppression method of the present application is described in detail below, with each step of the method explained separately. Please refer to FIG. 1, which is a flowchart of an embodiment of the far-field noise suppression method of the present application.

Step S101: a voice beam acquired by a first microphone and a noise beam acquired by a second microphone are acquired.

In far-field speech recognition, speech data is acquired through a multi-microphone array and then processed further so that the far-field speech can be recognized accurately. In practice, however, background noise interferes with acquisition, and the collected voice signal is often mixed with a large amount of non-stationary noise, so it cannot meet the requirements of use. This embodiment therefore provides a specific method for eliminating noise data in a speech signal by speech enhancement. To suppress non-stationary noise effectively in a complex background-noise environment, a voice beam is first collected by a first microphone; on the basis of the voice beam, a second microphone is then used to collect a noise beam for two-way noise estimation, so that the non-stationary noise can be tracked and eliminated. It should be noted that, in this embodiment, the voice beam and the noise beam are generated by the classical MVDR/LCMV beamforming methods, where the voice beam coefficients are generated from the following constraint:

$$w_{speech} = \arg\min_{w} \left( w^{H} R_{NN} w \right) \quad \text{s.t.} \quad w^{H} a(\theta) = 1$$

The noise beam is obtained from the noise beam coefficients, which are generated from a corresponding set of constraints.

Here R_NN is the noise covariance matrix, and a(θ) is the steering vector corresponding to the target voice direction θ.
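For illustration, the constrained minimization for the voice beam has the well-known closed-form solution w = R_NN^{-1} a(θ) / (a(θ)^H R_NN^{-1} a(θ)). The sketch below evaluates it for a two-microphone array at a single frequency bin using only the Python standard library. The far-field line-array steering model, the function names, and the example parameter values are assumptions made for the sketch; the application does not specify them.

```python
import cmath
import math
from typing import List


def steering_vector(theta_rad: float, spacing_m: float, freq_hz: float,
                    speed_of_sound: float = 343.0) -> List[complex]:
    """Assumed far-field steering vector for a two-microphone line array."""
    delay = spacing_m * math.cos(theta_rad) / speed_of_sound
    return [1 + 0j, cmath.exp(-2j * math.pi * freq_hz * delay)]


def invert_2x2(m: List[List[complex]]) -> List[List[complex]]:
    """Invert a 2x2 complex matrix with the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("noise covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def mvdr_weights(r_nn: List[List[complex]], a: List[complex]) -> List[complex]:
    """Closed-form MVDR weights w = R^-1 a / (a^H R^-1 a)."""
    r_inv = invert_2x2(r_nn)
    r_inv_a = [sum(r_inv[i][j] * a[j] for j in range(2)) for i in range(2)]
    denom = sum(a[i].conjugate() * r_inv_a[i] for i in range(2))
    return [x / denom for x in r_inv_a]


def beam_output(w: List[complex], x: List[complex]) -> complex:
    """Beamformer output y = w^H x for one frequency bin."""
    return sum(w[i].conjugate() * x[i] for i in range(2))
```

By construction the weights pass a signal from the target direction with unit gain (w^H a(θ) = 1) while minimizing the output noise power w^H R_NN w.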

Step S102: obtaining the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam.

In this embodiment, the two-way noise estimation method mainly works as follows: the probability that a voice beam is present at a frequency point in the current frame is judged from the energy difference between channels at the same frame and the same frequency point; a smoothing factor is determined from that probability; and the noise is estimated by combining the smoothing factor with the noisy voice spectrum collected by the microphone. When no voice is present at the frequency point, the noise estimate is updated in real time and the energy of the current frame is taken as the noise estimate of the current frame; when voice is present at the frequency point, the noise estimate of the previous frame is kept. The energy of the voice beam and the energy of the noise beam obtained in this step are, specifically, the energy of the voice beam at a first time point and the energy of the noise beam at the first time point. After these two energies are obtained, they are compared and analyzed to obtain the probability of occurrence of the voice beam at the first time point, and the two-way noise energy is then obtained according to that probability.

Step S103: comparing and analyzing the energy of the voice beam and the energy of the noise beam to obtain two-way noise energy, wherein the two-way noise energy is the energy of non-stationary noise.

In this embodiment, the energy of the voice beam and the energy of the noise beam are compared and analyzed as follows: assuming that the first time point is time t, the probability of occurrence of a voice beam at the first time point is obtained according to the following formula:

wherein atan is the arctangent function, the inputs of the formula are the energy of the voice beam at the first time point and the energy of the noise beam at the first time point, and p(t) is the probability of occurrence of a voice beam at the first time point.
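The formula itself is not reproduced in this text. Purely as an illustration of how an arctangent can turn an energy comparison into a value between 0 and 1, the sketch below uses the hypothetical mapping p = (2/π)·atan(E_voice / E_noise). This mapping, the function name, and the `eps` guard are assumptions and should not be read as the formula of the application.

```python
import math


def speech_presence_probability(voice_energy: float, noise_energy: float,
                                eps: float = 1e-12) -> float:
    """Hypothetical atan-based mapping of an energy ratio into [0, 1)."""
    if voice_energy < 0.0 or noise_energy < 0.0:
        raise ValueError("energies must be non-negative")
    # eps is an assumed guard against division by zero.
    ratio = voice_energy / (noise_energy + eps)
    # atan maps [0, inf) onto [0, pi/2); scaling by 2/pi gives [0, 1).
    return (2.0 / math.pi) * math.atan(ratio)
```

Under this assumed mapping, equal beam energies give a probability of about 0.5, and the probability rises toward 1 as the voice beam energy exceeds the noise beam energy.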

Further, the two-way noise energy is obtained according to the following formula:

wherein γ is a smoothing factor, one term of the formula is the single-path noise energy, and t is the first time point.

It should be noted that the two-way noise energy described in this embodiment is the energy of non-stationary noise.
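The frame-by-frame behaviour described under Step S102 (take the current frame energy as the noise estimate when no voice is present, keep the previous estimate when voice is present) can be sketched as a recursive average. The choice of using the speech probability directly as the smoothing factor γ is an assumption made for this sketch, as is the function name; the application states only that γ is determined from the probability.

```python
def update_two_way_noise(prev_noise: float, frame_energy: float,
                         speech_prob: float) -> float:
    """One recursive-averaging step of the noise estimate for one frequency bin."""
    if not 0.0 <= speech_prob <= 1.0:
        raise ValueError("speech_prob must lie in [0, 1]")
    # Assumption: the smoothing factor equals the speech presence probability.
    gamma = speech_prob
    # gamma = 1 keeps the previous estimate; gamma = 0 adopts the current frame.
    return gamma * prev_noise + (1.0 - gamma) * frame_energy


# Usage: track the noise estimate over a few frames of (energy, probability).
noise = 1.0
for energy, prob in [(1.2, 0.0), (8.0, 1.0), (1.1, 0.0)]:
    noise = update_two_way_noise(noise, energy, prob)
```

With this choice the loud middle frame, flagged as voice, leaves the estimate untouched, while the two voice-free frames overwrite it with their own energy.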

Step S104: obtaining a first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam.

In this embodiment, obtaining the first parameter for eliminating non-stationary noise according to the two-way noise energy and the energy of the voice beam includes: obtaining a formula for solving the Wiener filter coefficient, in which g is the resulting first Wiener filter coefficient; and inputting the two-way noise energy and the energy of the voice beam into that formula to obtain the first Wiener filter coefficient for suppressing non-stationary noise data.

Step S105: filtering the voice beam according to the first parameter to obtain first enhanced voice data.

In this embodiment, the voice beam is filtered according to the first Wiener filter coefficient. Specifically, a Wiener filtering formula is obtained in which g is the first Wiener filter coefficient, y is the voice data before the non-stationary noise is eliminated, and s is the voice data after the non-stationary noise is eliminated. The first Wiener filter coefficient and the voice beam are input into this formula to obtain the first enhanced voice data, in which the non-stationary noise data is suppressed.
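The filtering formula is not reproduced in this text. A common form of Wiener filtering in the frequency domain multiplies each bin of the noisy spectrum y by its gain g to give s, and the sketch below assumes that form. The function name and the list-of-complex representation of a spectrum are illustrative choices.

```python
from typing import List


def apply_wiener_gain(gains: List[float], spectrum: List[complex]) -> List[complex]:
    """Assumed per-bin filtering s[k] = g[k] * y[k] for one frame."""
    if len(gains) != len(spectrum):
        raise ValueError("gains and spectrum must have the same length")
    return [g * y for g, y in zip(gains, spectrum)]
```

Because each gain is a real value between 0 and 1, the filter attenuates the magnitude of a bin and leaves its phase unchanged.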

In addition, an embodiment of the present application provides a preferred implementation. Please refer to FIG. 5, which is a flowchart of an embodiment of a method for suppressing stationary noise and non-stationary noise according to the present application.

Building on the suppression of non-stationary noise in the above embodiment, the preferred embodiment further performs single-path noise estimation on the voice beam to obtain single-path noise energy, where the single-path noise energy is the energy of stationary noise. Noise comprehensive analysis is then performed according to the two-way noise energy, the single-path noise energy and the energy of the voice beam to obtain comprehensive noise energy; a second parameter for eliminating non-stationary noise and stationary noise is obtained according to the comprehensive noise energy; and the voice beam to be recognized is filtered according to the second parameter to obtain second enhanced voice data. The second parameter is a second Wiener filter coefficient for suppressing stationary noise data and non-stationary noise data.

It should be noted that performing single-path noise estimation on the voice beam to obtain single-path noise energy specifically includes: performing a preliminary estimation on the voice beam with a minimum tracking method to obtain the minimum value of the voice beam; obtaining the probability of occurrence of the voice beam from that minimum value; and estimating the single-path noise energy by recursive averaging according to the probability of occurrence of the voice beam.
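The three stages (minimum tracking, speech probability, recursive averaging) can be sketched for a single frequency bin as follows. This is a simplified illustration in the spirit of minima-controlled recursive averaging, not the estimator of the application: the sliding-window minimum, the threshold decision, and all parameter values and names are assumptions.

```python
from collections import deque


class SinglePathNoiseEstimator:
    """Simplified single-bin noise tracker (illustrative parameter values)."""

    def __init__(self, window: int = 50, ratio_threshold: float = 5.0,
                 prob_smooth: float = 0.2, noise_smooth: float = 0.9) -> None:
        self.history = deque(maxlen=window)  # recent frame energies
        self.ratio_threshold = ratio_threshold
        self.prob_smooth = prob_smooth
        self.noise_smooth = noise_smooth
        self.speech_prob = 0.0
        self.noise = None

    def update(self, frame_energy: float) -> float:
        # 1) Minimum tracking: lowest energy seen in the sliding window.
        self.history.append(frame_energy)
        minimum = max(min(self.history), 1e-12)
        # 2) Speech probability: smooth a hard decision on energy / minimum.
        present = 1.0 if frame_energy > self.ratio_threshold * minimum else 0.0
        self.speech_prob = (self.prob_smooth * self.speech_prob
                            + (1.0 - self.prob_smooth) * present)
        # 3) Recursive averaging: update slows down as speech becomes likely.
        if self.noise is None:
            self.noise = frame_energy
        else:
            alpha = self.noise_smooth + (1.0 - self.noise_smooth) * self.speech_prob
            self.noise = alpha * self.noise + (1.0 - alpha) * frame_energy
        return self.noise
```

With these assumed settings, a steady background is tracked closely, while a sudden loud frame raises the speech probability and therefore moves the noise estimate only slightly.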

The noise comprehensive analysis according to the two-way noise energy, the single-path noise energy and the energy of the voice beam specifically comprises obtaining a noise comprehensive analysis algorithm whose terms are the single-path noise energy, the two-way noise energy, the energy of the voice beam α, and the comprehensive noise energy J. The two-way noise energy, the single-path noise energy and the energy of the voice beam are input into the noise comprehensive analysis algorithm to obtain the comprehensive noise energy, so that non-stationary noise and stationary noise are eliminated simultaneously and the accuracy of voice recognition is further improved.

Corresponding to the far-field noise suppression method, the present application also provides a far-field noise suppression device to which the method can be applied. Please refer to FIG. 2, which is a schematic diagram of an embodiment of the far-field noise suppression device of the present application. Since the device embodiment is similar to the method embodiment, it is described briefly; for related points, refer to the description of the method embodiment. The following description of the device embodiment is only illustrative.

An acquiring unit 201 is configured to acquire a voice beam acquired by the first microphone and a noise beam acquired by the second microphone.

This embodiment provides a specific method for eliminating noise data in a speech signal by speech enhancement. To suppress non-stationary noise effectively in a complex background-noise environment, a voice beam is first collected by a first microphone; on the basis of the voice beam, a second microphone is then used to collect a noise beam for two-way noise estimation, so that the non-stationary noise can be tracked and eliminated. It should be noted that, in this embodiment, the voice beam and the noise beam are generated by the classical MVDR/LCMV beamforming methods, where the voice beam coefficients are generated from the following constraint:

$$w_{speech} = \arg\min_{w} \left( w^{H} R_{NN} w \right) \quad \text{s.t.} \quad w^{H} a(\theta) = 1$$

A first obtaining unit 202, configured to obtain the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam.

In this embodiment, the obtaining of the energy of the voice beam and the energy of the noise beam according to the voice beam and the noise beam specifically includes obtaining the energy of the voice beam at a first time point and the energy of the noise beam at the first time point, then performing comparative analysis on the energy of the voice beam at the first time point and the energy of the noise beam at the first time point, obtaining a probability of occurrence of the voice beam at the first time point, and further obtaining the two-way noise energy according to the probability of occurrence of the voice beam at the first time point.

The first analysis unit 203 is configured to perform a comparative analysis on the energy of the voice beam and the energy of the noise beam to obtain two-way noise energy, where the two-way noise energy is the energy of non-stationary noise.

In this embodiment, a specific process of performing a comparative analysis on the energy of the voice beam and the energy of the noise beam is to assume that a time of a first time point is t, and obtain a probability of occurrence of the voice beam at the first time point according to the following formula:

wherein atan is the arctangent function and the formula takes as input the energy of the voice beam at the first time point.

In the formula for the two-way noise energy, γ is a smoothing factor, one term is the single-path noise energy, and t is the first time point.

A second obtaining unit 204, configured to obtain a first parameter for removing non-stationary noise according to the two-way noise energy and the energy of the voice beam.

In this embodiment, the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data. It is obtained from the two-way noise energy and the energy of the voice beam as follows: a formula for solving the Wiener filter coefficient is obtained, in which g is the resulting first Wiener filter coefficient, and the two-way noise energy and the energy of the voice beam are input into that formula to obtain the first Wiener filter coefficient for suppressing non-stationary noise data.

A first processing unit 205, configured to perform filtering processing on the voice beam according to the first parameter, so as to obtain first enhanced voice data.

Corresponding to the far-field noise suppression device, the present application also provides a far-field noise suppression system to which the device can be applied. Please refer to FIG. 4, which is a flowchart of the operation of an embodiment of the far-field noise suppression system of the present application. Since the system embodiment is similar to the device embodiment, it is described briefly; for related points, refer to the description of the device embodiment. The following description of the system embodiment is only illustrative.

In this embodiment, to suppress far-field noise data, the far-field noise suppression system first performs dereverberation on the multi-channel far-field voice data. Data of each channel is selected and sent to an endpoint detection module to estimate the start and stop times of the noise data and the voice data, and the dereverberated multi-channel data is sent to a sound source localization module to obtain the speaker's position information. The dereverberated multi-channel data is then beamformed together with the endpoint detection information and the angle information to obtain two audio streams, a voice beam and a noise beam: the first microphone collects the voice beam and the second microphone collects the noise beam. Noise estimation is performed on the two audio streams so that the noise data is suppressed, and "clean" voice is obtained after processing.
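The data flow of this system can be summarized in a small orchestration sketch. Every stage is passed in as a callable, because the application does not specify the modules' interfaces; the class name, field names, and argument shapes below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence, Tuple


@dataclass
class FarFieldPipeline:
    """Hypothetical wiring of the stages named in the system embodiment."""
    dereverberate: Callable[[Sequence[Any]], Sequence[Any]]
    detect_endpoints: Callable[[Sequence[Any]], Any]
    locate_source: Callable[[Sequence[Any]], Any]
    beamform: Callable[[Sequence[Any], Any, Any], Tuple[Any, Any]]
    suppress_noise: Callable[[Any, Any], Any]

    def run(self, channels: Sequence[Any]) -> Any:
        dry = self.dereverberate(channels)        # multi-channel dereverberation
        endpoints = self.detect_endpoints(dry)    # start/stop of voice and noise
        angle = self.locate_source(dry)           # speaker position information
        voice_beam, noise_beam = self.beamform(dry, endpoints, angle)
        return self.suppress_noise(voice_beam, noise_beam)  # "clean" voice
```

Keeping the stages as injected callables makes the ordering explicit while leaving each module's implementation open.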

Corresponding to the far-field noise suppression method provided above, an embodiment of the present application further provides an electronic device for far-field noise suppression, including a processor and a memory. The memory stores a program of the far-field noise suppression method, and after the device is powered on and the processor runs the program, the following steps are executed:

It should be noted that, for a detailed description of an electronic device provided in the embodiments of the present application, reference may be made to the related description of a far-field noise suppression method provided in the embodiments of the present application, and details are not repeated here.

Corresponding to the far-field noise suppression method provided above, an embodiment of the present application further provides a far-field noise suppression storage device that implements the far-field noise suppression method according to the above embodiment.

The storage device stores a program of a far-field noise suppression method, which is executed by a processor and performs the steps of:

It should be noted that, for the detailed description of a memory device provided in the embodiments of the present application, reference may be made to the related description of a far-field noise suppression method provided in the embodiments of the present application, and details are not repeated here.

Although the present application has been described with reference to preferred embodiments, they are not intended to limit it. Those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application; the scope of protection should therefore be determined by the claims that follow.

Claims

1. A far-field noise suppression method, comprising:

2. The far-field noise suppression method according to claim 1, further comprising:

3. The far-field noise suppression method according to claim 2, wherein the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data;

4. The far-field noise suppression method according to claim 1, wherein the obtaining of the energy of the voice beam and the energy of the noise beam specifically comprises: obtaining the energy of a voice beam at a first time point and the energy of a noise beam at the first time point;

5. The far-field noise suppression method according to claim 1, wherein the acquiring the voice beam collected by the first microphone and the noise beam collected by the second microphone specifically comprises:

6. The far-field noise suppression method according to claim 4, further comprising:

7. The far-field noise suppression method according to claim 6, wherein obtaining the two-way noise energy according to the probability of occurrence of the voice beam at the first time point specifically comprises:

8. The far-field noise suppression method according to claim 2, wherein the performing noise comprehensive analysis according to the two-way noise energy, the single-path noise energy, and the energy of the voice beam to obtain comprehensive noise energy specifically comprises:

obtaining a noise comprehensive analysis algorithm whose terms include the single-path noise energy, the energy of the voice beam α, and the comprehensive noise energy J;

9. The far-field noise suppression method according to claim 2, wherein the obtaining of the second parameter for eliminating non-stationary noise and stationary noise according to the comprehensive noise energy and the energy of the voice beam specifically comprises:

10. The far-field noise suppression method according to claim 2, wherein the performing single-path noise estimation on the voice beam to obtain single-path noise energy specifically comprises:

11. A far-field noise suppression apparatus, comprising:

12. The far-field noise suppression apparatus according to claim 11, further comprising:

13. The far-field noise suppression apparatus according to claim 12, wherein the first parameter is a first Wiener filter coefficient for suppressing non-stationary noise data;

14. The far-field noise suppression device according to claim 11, wherein the obtaining of the energy of the voice beam and the energy of the noise beam specifically comprises: obtaining the energy of a voice beam at a first time point and the energy of a noise beam at the first time point;

15. The far-field noise suppression device according to claim 11, wherein the acquiring of the voice beam collected by the first microphone and the noise beam collected by the second microphone specifically comprises:

16. The far-field noise suppression apparatus according to claim 14, further comprising:

17. The far-field noise suppression device according to claim 16, wherein the obtaining of the two-way noise energy according to the probability of occurrence of the voice beam at the first time point specifically comprises:

18. The far-field noise suppression device according to claim 12, wherein the performing noise comprehensive analysis according to the two-way noise energy, the single-path noise energy, and the energy of the voice beam to obtain comprehensive noise energy specifically comprises:

obtaining a noise comprehensive analysis algorithm whose terms include the single-path noise energy, the energy of the voice beam α, and the comprehensive noise energy J;

19. The far-field noise suppression apparatus according to claim 12, wherein the obtaining of the second parameter for eliminating non-stationary noise and stationary noise according to the comprehensive noise energy and the energy of the voice beam specifically comprises:

20. The far-field noise suppression device according to claim 12, wherein the performing single-path noise estimation on the voice beam to obtain single-path noise energy specifically comprises:

21. A far-field noise suppression system, comprising: the far-field noise suppression device of any one of the preceding claims 11-20.

22. An electronic device, comprising:

a processor; and

23. A storage device storing a program of a far-field noise suppression method, the program being executed by a processor to perform the steps of: