CN110661510A - Beam former forming method, beam forming device and electronic equipment - Google Patents

Info

Publication number
CN110661510A
Authority
CN
China
Prior art keywords
signal
target
noise
beam former
suppression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910991943.8A
Other languages
Chinese (zh)
Other versions
CN110661510B (en
Inventor
李楠
雷欣
李志飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mobvoi Innovation Technology Co Ltd
Original Assignee
Chumen Wenwen Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chumen Wenwen Information Technology Co Ltd filed Critical Chumen Wenwen Information Technology Co Ltd
Priority to CN201910991943.8A priority Critical patent/CN110661510B/en
Publication of CN110661510A publication Critical patent/CN110661510A/en
Application granted granted Critical
Publication of CN110661510B publication Critical patent/CN110661510B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03HIMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
    • H03H21/00Adaptive networks

Abstract

The invention discloses a beamformer forming method, a beamforming device and electronic equipment. In the beamformer forming method, a white noise signal in the pre-suppression direction of the beamformer is acquired; a reference signal and a desired signal of a relative dual-microphone array are determined according to the positional relationship between the white noise source and the microphones; then, based on the reference signal and the desired signal, an adaptive filter algorithm is used to suppress the white noise signal in the pre-suppression direction of the beamformer, yielding the target beamformer. In the adaptive beamforming embodiment, a target beamformer group filters the audio signals output by the dual-microphone array, and a spatial-domain signal-to-noise ratio algorithm is combined with a stationary noise reduction algorithm to enhance the speech signal in the target direction. This reduces the complexity of the adaptive beamforming method and achieves good directional sound pickup while preserving the practicality and robustness of the algorithm.

Description

Beam former forming method, beam forming device and electronic equipment
Technical Field
The present invention relates to the field of noise reduction technologies, and in particular to a beamformer forming method, an adaptive beamforming method and apparatus, and an electronic device.
Background
Beamforming is a method of using an array of sensors to achieve spatially directed reception of signals. In the field of acoustic signal processing, the sensor is typically a microphone and the target signal is an acoustic wave.
The beamforming method can be divided into fixed beamforming and adaptive beamforming. The existing design method for fixed beam forming generally relies on a multi-microphone array with high unit consistency, and when the array units are few or the unit consistency is difficult to ensure due to the inherent problems of the acoustic structure of the equipment, the effect of the fixed beam forming method is obviously reduced, so that the directional sound receiving capability is insufficient. The existing adaptive beam forming method often has the problem of insufficient robustness in practical application, and some improved schemes can improve the robustness to a certain extent, but cannot avoid bringing extremely high algorithm complexity, so that the algorithm practicability is greatly reduced.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method, an apparatus and an electronic device for forming a beamformer, which are capable of suppressing signals in a pre-suppression direction of the beamformer while preserving signals in a non-suppression direction of the beamformer.
To achieve the above object, according to a first aspect of embodiments of the present invention, there is provided a method for forming a beamformer, the method including: acquiring a white noise signal in a pre-suppression direction of a beam former; determining a reference signal and an expected signal of a relative double-microphone array according to the position relation of a white noise sound source and a microphone; and based on the reference signal and the expected signal, utilizing an adaptive filter algorithm to suppress a white noise signal in the pre-suppression direction of the beam former to obtain a target beam former.
Optionally, the determining the reference signal and the desired signal of the relative dual microphone array according to the position relationship between the white noise source and the microphones includes: taking the audio signal output by the microphone with the distance from the white noise sound source smaller than a first distance threshold value as a reference signal of the relative double-microphone array; taking the audio signal output by the microphone which is more than a second distance threshold value away from the white noise sound source as an expected signal of the relative double-microphone array; wherein the second distance threshold is greater than or equal to the first distance threshold.
Optionally, the method further includes: filtering the reference signal by using the target beam former to obtain a filtered reference signal; and subtracting the filtered reference signal from the expected signal to obtain a signal in the non-pre-suppression direction of the beam former.
In order to achieve the above object, according to a second aspect of the embodiments of the present invention, a method for forming a self-adaptive beam is further provided, where the method can achieve good directional sound receiving performance of the self-adaptive beam forming method on the premise of ensuring good practicability and robustness of an algorithm, and greatly improve a signal-to-noise ratio of a target speech signal relative to a noise signal.
An adaptive beamforming method, comprising: acquiring an audio signal output by each microphone in a relative double-microphone array; based on the audio signals, performing signal suppression in each target beam former pre-suppression direction of the target beam former group by using an adaptive filter algorithm to obtain signals in non-suppression directions of N beam formers; wherein the target beam former group comprises a combination of a first beam former with a pre-suppression direction as a target voice enhancement direction and N-1 second beam formers with pre-suppression directions as non-target voice enhancement directions; accordingly, the N beamformer non-suppressed direction signals include noise signals corresponding to the first beamformer and N-1 target speech signals corresponding to the N-1 second beamformers, respectively.
Optionally, the method further includes: selecting a signal with the minimum amplitude from the N-1 target voice signals as a pre-noise reduction target voice signal; performing noise reduction processing on the pre-noise-reduced target speech signal by using a stationary noise reduction algorithm to obtain a noise-reduced target speech signal; and applying an updating gain factor to the target voice signal after the noise reduction to obtain a target voice enhancement signal.
Optionally, before applying the updated gain factor to the target speech signal after noise reduction to obtain the target speech enhancement signal, the method further includes: calculating a space domain signal-to-noise ratio according to the pre-denoising target speech signal and the noise signal; and smoothing the gain factor and the space domain signal-to-noise ratio to suppress a noise signal to obtain an updated gain factor.
To achieve the above object, according to a third aspect of the embodiments of the present invention, there is also provided a beam former forming apparatus, including: the acquisition module is used for acquiring a white noise signal in the pre-suppression direction of the beam former; the determining module is used for determining a reference signal and an expected signal of a relative double-microphone array according to the position relation of a white noise sound source and the microphones; and the target beam former module is used for suppressing the white noise signal in the pre-suppression direction of the beam former by using an adaptive filter algorithm based on the reference signal and the expected signal to obtain the target beam former.
To achieve the above object, according to a fourth aspect of the embodiments of the present invention, there is provided an adaptive beamforming apparatus, including: the acquisition module is used for acquiring the audio signal output by each microphone in the relative double-microphone array; the suppression module is used for performing signal suppression in the pre-suppression direction of each target beam former of the target beam former group by using an adaptive filter algorithm based on the audio signal to obtain signals in the non-suppression directions of the N beam formers; wherein the target beam former group comprises a combination of a first beam former with a pre-suppression direction as a target voice enhancement direction and N-1 second beam formers with pre-suppression directions as non-target voice enhancement directions; accordingly, the N beamformer non-suppressed direction signals include noise signals corresponding to the first beamformer and N-1 target speech signals corresponding to the N-1 second beamformers, respectively.
To achieve the above object, according to a fifth aspect of the embodiments of the present invention, there is also provided an electronic apparatus, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the beamformer forming method of the first aspect or the adaptive beamforming method of the second aspect.
To achieve the above object, according to a sixth aspect of the embodiments of the present invention, there is also provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the beamformer forming method according to the first aspect or the adaptive beamforming method according to the second aspect.
Based on the above technical scheme, the embodiment of the invention designs a target beamformer group using the beamformer forming method, filters the audio signals output by the relative dual-microphone array with the target beamformer group, and enhances the speech signal in the target direction by combining a spatial-domain signal-to-noise ratio algorithm with a stationary noise reduction algorithm. This reduces the complexity of the adaptive beamforming method, achieves good directional sound pickup while preserving the practicality and robustness of the algorithm, and greatly improves the signal-to-noise ratio of the target speech signal relative to the noise signal.
Further effects of the above optional implementations are described below in connection with specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein: in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Fig. 1 is a flow chart of a beamformer forming method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the beamformer forming method of Fig. 1;
Fig. 3 is a flow chart of an adaptive beamforming method according to another embodiment of the present invention;
Fig. 4 is a flow chart of an adaptive beamforming method according to yet another embodiment of the present invention;
Fig. 5 is a schematic diagram of the adaptive beamforming method of Fig. 4;
Fig. 6 is a schematic diagram of a beamformer forming apparatus according to an embodiment of the present invention;
Fig. 7 is a schematic diagram of an adaptive beamforming apparatus according to another embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a flowchart of a method for forming a beam former according to an embodiment of the present invention, the method including:
S101: acquiring a white noise signal in a pre-suppression direction of a beamformer;
Specifically, a white noise signal that is played by a directional active loudspeaker from the direction the beamformer is to suppress is acquired. The beamformer here may be a spatial filter, a Wiener filter, or the like.
S102: determining a reference signal and an expected signal of a relative double-microphone array according to the position relation of a white noise sound source and a microphone;
Illustratively, the audio signal output by the microphone whose distance from the white noise source is less than a first distance threshold is used as the reference signal of the relative dual-microphone array; the audio signal output by the microphone whose distance from the white noise source is greater than a second distance threshold is used as the desired signal of the relative dual-microphone array; wherein the second distance threshold is greater than or equal to the first distance threshold.
It should be noted that, in different application scenarios, the setting values of the first distance threshold and the second distance threshold are different, but the second distance threshold is always greater than or equal to the first distance threshold.
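The threshold rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pick_reference_and_desired`, the microphone labels and the distance values are hypothetical and do not come from the patent text.

```python
from typing import Dict, List, Tuple


def pick_reference_and_desired(
    distances: Dict[str, float],
    signals: Dict[str, List[float]],
    first_threshold: float,
    second_threshold: float,
) -> Tuple[List[float], List[float]]:
    """Return (reference, desired) signals for a two-microphone array.

    The microphone closer to the white noise source than first_threshold
    supplies the reference signal; the microphone farther away than
    second_threshold supplies the desired signal.
    """
    if second_threshold < first_threshold:
        raise ValueError("second threshold must be >= first threshold")
    near = [mic for mic, dist in distances.items() if dist < first_threshold]
    far = [mic for mic, dist in distances.items() if dist > second_threshold]
    if len(near) != 1 or len(far) != 1:
        raise ValueError("expected exactly one near and one far microphone")
    return signals[near[0]], signals[far[0]]


# Hypothetical distances (in metres) from the loudspeaker to each microphone.
reference, desired = pick_reference_and_desired(
    distances={"mic1": 0.10, "mic2": 0.25},
    signals={"mic1": [1.0, 2.0], "mic2": [0.5, 1.0]},
    first_threshold=0.15,
    second_threshold=0.20,
)
```

Raising an error when the thresholds do not separate the two microphones is a design choice of this sketch, not something the source prescribes.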
S103: and based on the reference signal and the expected signal, utilizing an adaptive filter algorithm to suppress a white noise signal in the pre-suppression direction of the beam former to obtain a target beam former.
Illustratively, based on the reference signal and the desired signal, an adaptive filter algorithm is used for adaptive system identification to suppress a white noise signal in the pre-suppression direction of the beamformer, resulting in a vector estimate of beamformer coefficients; generating a target beamformer based on the vector estimates of the beamformer coefficients.
Specifically, based on the reference signal and the expected signal, performing adaptive system identification by using an adaptive filter algorithm to suppress a white noise signal in a pre-suppression direction of the beam former, and obtaining a transfer function between the reference signal and the expected signal; performing mathematical fitting on the transfer function to obtain a beam former coefficient; obtaining an error signal of a beam former based on the reference signal, the expected signal and the beam former coefficient; calculating by using an adaptive filter algorithm based on the reference signal, the beamformer coefficient and the error signal to obtain a vector estimation value of the beamformer coefficient at any moment; generating a target beamformer based on the vector estimates of the beamformer coefficients.
It should be noted that the adaptive filter algorithm includes, but is not limited to, the following algorithms: the Normalized Least Mean Square (NLMS) algorithm, the Affine Projection (AP) algorithm, the Recursive Least Squares (RLS) algorithm, and the like.
It should be understood that, in various embodiments of the present invention, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an optional embodiment, the method further comprises: filtering the reference signal by using the target beam former to obtain a filtered reference signal; and subtracting the filtered reference signal from the expected signal to obtain a signal in the non-pre-suppression direction of the beam former.
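As a rough time-domain sketch of this optional step, the target beamformer can be treated as an FIR filter applied to the reference signal, with the result subtracted from the desired signal. The helper names `fir_filter` and `non_suppressed_output` and the coefficient values are assumptions made for illustration only.

```python
from typing import List, Sequence


def fir_filter(w: Sequence[float], x: Sequence[float]) -> List[float]:
    """Compute y(n) = sum_k w[k] * x(n - k), taking x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, wk in enumerate(w):
            if n - k >= 0:
                acc += wk * x[n - k]
        y.append(acc)
    return y


def non_suppressed_output(
    w: Sequence[float], reference: Sequence[float], desired: Sequence[float]
) -> List[float]:
    """Subtract the filtered reference signal from the desired signal."""
    filtered = fir_filter(w, reference)
    return [d - y for d, y in zip(desired, filtered)]


# Hypothetical beamformer coefficients and signals.
w_target = [0.5, 0.25]
ref = [1.0, 2.0, 3.0]
# A signal from the pre-suppression direction: the desired channel equals the
# filtered reference, so the output is cancelled.
cancelled = non_suppressed_output(w_target, ref, fir_filter(w_target, ref))
# A signal that does not match the identified path is not cancelled.
kept = non_suppressed_output(w_target, ref, [1.0, 1.0, 1.0])
```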
The embodiment of the invention obtains a white noise signal, determines a reference signal and an expected signal of a relative double-microphone array according to the position relation of the white noise sound source and the microphones, and then suppresses the white noise signal in the pre-suppression direction of the beam former by using an adaptive filter algorithm to obtain a target beam former; it is achieved thereby that signals in the non-suppressed directions of the beamformer are preserved while the beamformer pre-suppressed direction signals are suppressed by the target beamformer.
Fig. 2 is a schematic diagram of the beamformer forming method shown in Fig. 1. Taking a spatial filter as an example, the reference signal is denoted x(n) and the desired signal d(n). Adaptive system identification is performed using the Normalized Least Mean Square (NLMS) algorithm to obtain the transfer function between the reference signal x(n) and the desired signal d(n). After the adaptive filter converges, the vector estimate w(n) of the spatial filter coefficients at time n is recorded:
$$\mathbf{w}(n) = [w_0(n), w_1(n), \ldots, w_{M-1}(n)]^T$$
where $w_0(n), w_1(n), \ldots, w_{M-1}(n)$ are the coefficients of the spatial filter and M is the length of the spatial filter;
the error signal e(n) obtained in the filter identification process is given by:
$$e(n) = d(n) - \mathbf{w}^T(n)\,\mathbf{x}(n),$$
where $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-M+1)]^T$. In the above formulas and symbols, n denotes the time index, i.e. the reference signal or spatial filter coefficient at time n;
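the iterative update formula for the weight coefficient vector estimate of the spatial filter, in the standard NLMS form, is:
$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\,\frac{e(n)\,\mathbf{x}(n)}{\mathbf{x}^T(n)\,\mathbf{x}(n)}$$
where μ is the filter iteration step size, typically a positive real number close to zero, e.g., 0.1.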
Specifically, the white noise signal played by the active loudspeaker serves as the noise source in the pre-suppression direction of the spatial filter, and adaptive system identification with the Normalized Least Mean Square (NLMS) algorithm is a process of minimizing the output error signal. Because white noise is a broadband signal, once the spatial filter identified by the adaptive system has converged on the white noise signal, any other signal arriving from that direction is suppressed as well. Meanwhile, because only the broadband noise signal in the pre-suppression direction is present during adaptive system identification, the resulting spatial filter does not suppress signals from other directions. The spatial filter thus attenuates signals in its pre-suppression direction while preserving signals in its non-suppression directions.
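The identification process described above can be sketched with a plain-Python NLMS loop. This is a minimal sketch under stated assumptions: the function name `nlms_identify`, the small regularization constant `eps` (added only to avoid division by zero) and the simulated acoustic path `true_path` are not from the patent text.

```python
import random
from typing import List, Sequence, Tuple


def nlms_identify(
    x: Sequence[float],
    d: Sequence[float],
    num_taps: int,
    mu: float = 0.1,
    eps: float = 1e-8,  # assumption: tiny regularizer, not in the source
) -> Tuple[List[float], List[float]]:
    """Estimate the filter mapping reference x(n) to desired d(n) with NLMS."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # holds [x(n), x(n-1), ..., x(n-M+1)]
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))
        e = dn - y  # e(n) = d(n) - w^T(n) x(n)
        norm = sum(xi * xi for xi in buf) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        errors.append(e)
    return w, errors


# Simulated calibration: white noise at the reference microphone, and a
# hypothetical 4-tap path from the reference to the desired microphone.
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(5000)]
true_path = [0.6, -0.3, 0.1, 0.05]
desired_sig = [
    sum(true_path[k] * white[n - k] for k in range(len(true_path)) if n - k >= 0)
    for n in range(len(white))
]
w_est, err = nlms_identify(white, desired_sig, num_taps=4, mu=0.1)
```

In this noise-free simulation the estimate `w_est` approaches `true_path` and the error decays toward zero; with real recordings the residual error would depend on measurement noise and filter length.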
It should be appreciated that, since the adaptive filter algorithm eventually converges to the Wiener solution, a beamforming method based on a Wiener filter is equally applicable to the method of the present embodiment. Likewise, since the method of this embodiment aims to eliminate the difference between the two microphone signals for a signal arriving from the beamformer pre-suppression direction, a beamforming method based on the delay-and-sum algorithm is also applicable to the method of this embodiment.
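The convergence statement can be made concrete with the standard Wiener-Hopf result, given here as textbook background rather than as part of the patent text. The symbols $\mathbf{R}$, $\mathbf{p}$, $J$ and $\sigma_x^2$ are introduced only for this illustration, and the signals are assumed to be real-valued and jointly stationary.

```latex
\begin{aligned}
\mathbf{R} &= E\!\left[\mathbf{x}(n)\,\mathbf{x}^T(n)\right], \qquad
\mathbf{p} = E\!\left[d(n)\,\mathbf{x}(n)\right], \\
J(\mathbf{w}) &= E\!\left[e^2(n)\right]
  = E\!\left[d^2(n)\right] - 2\,\mathbf{w}^T\mathbf{p} + \mathbf{w}^T\mathbf{R}\,\mathbf{w}, \\
\nabla_{\mathbf{w}} J &= -2\,\mathbf{p} + 2\,\mathbf{R}\,\mathbf{w} = \mathbf{0}
  \;\;\Longrightarrow\;\; \mathbf{w}_{\mathrm{opt}} = \mathbf{R}^{-1}\mathbf{p}.
\end{aligned}
```

For a white reference signal with variance $\sigma_x^2$, $\mathbf{R} = \sigma_x^2\,\mathbf{I}$, so $\mathbf{w}_{\mathrm{opt}} = \mathbf{p}/\sigma_x^2$; this well-conditioned $\mathbf{R}$ is one reason white noise is a convenient calibration signal.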
Fig. 3 is a flowchart of an adaptive beamforming method according to another embodiment of the present invention, the method includes:
S301: acquiring an audio signal output by each microphone in a relative dual-microphone array;
S302: based on the audio signals, performing signal suppression in the pre-suppression direction of each spatial filter in a spatial filter bank using an adaptive filter algorithm, to obtain signals in the non-suppression directions of N spatial filters; the spatial filter bank comprises a combination of a first spatial filter whose pre-suppression direction is the target speech enhancement direction and N-1 second spatial filters whose pre-suppression directions are non-target speech enhancement directions; correspondingly, the N spatial filter non-suppression direction signals comprise a noise signal corresponding to the first spatial filter and N-1 target speech signals corresponding to the N-1 second spatial filters respectively.
It should be noted that the spatial filter bank of the present embodiment may also be another target beamformer group, such as a Wiener filter bank.
It should be understood that, in various embodiments of the present invention, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
According to the embodiment of the invention, the audio signals output by the relative dual-microphone array are filtered by the spatial filter bank, so that signals in the suppression directions of the spatial filter bank are suppressed and signals in the non-suppression directions are output.
Fig. 4 is a flowchart illustrating an adaptive beamforming method according to another embodiment of the present invention. Fig. 5 is a schematic diagram of the adaptive beamforming method shown in fig. 4. The following describes an adaptive beamforming method according to this embodiment with reference to fig. 4 and 5, where the method includes:
S401: acquiring an audio signal output by each microphone in a relative dual-microphone array;
Illustratively, the audio signals $X_1(z)$ and $X_2(z)$ output by the two oppositely positioned microphones are obtained respectively.
S402: based on the audio signals, performing signal suppression in the pre-suppression direction of each spatial filter in the spatial filter bank using an adaptive filter algorithm, to obtain a noise signal and N-1 target speech signals.
Illustratively, N spatial filters are combined to obtain a spatial filter bank, where the spatial filter bank includes a combination of a first spatial filter whose pre-suppression direction is the target speech enhancement direction and N-1 second spatial filters whose pre-suppression direction is the non-target speech enhancement direction.
The spatial filter bank designed by the beamformer forming method of Fig. 1 filters the audio signals output by the relative dual-microphone array and outputs the signals in the non-suppression directions of the N spatial filters. The noise signal is denoted $E_{tar}(z)$, and the N-1 target speech signals are denoted $E_{nontar1}(z), E_{nontar2}(z), \ldots, E_{nontar(N-1)}(z)$. The first spatial filter coefficient is $W_{tar}(z)$, and the N-1 second spatial filter coefficients are $W_{nontar1}(z), W_{nontar2}(z), \ldots, W_{nontar(N-1)}(z)$, where z is the z-transform variable in which the audio signals and the spatial filter transfer functions are expressed.
The noise signal and the N-1 target speech signals are obtained based on the following formulas:
$$E_{tar}(z) = X_2(z) - W_{tar}(z)\,X_1(z)$$
$$E_{nontar1}(z) = X_1(z) - W_{nontar1}(z)\,X_2(z)$$
$$E_{nontar2}(z) = X_1(z) - W_{nontar2}(z)\,X_2(z)$$
$$\ldots$$
$$E_{nontar(N-1)}(z) = X_1(z) - W_{nontar(N-1)}(z)\,X_2(z).$$
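The formulas above can be sketched in the time domain, where multiplication by a filter in the z-domain becomes FIR filtering of the sample sequence. This is an illustrative sketch: the names `_fir` and `filter_bank_outputs` and the one-tap coefficients are assumptions, and real filters would come from the calibration procedure of Fig. 1.

```python
from typing import List, Sequence, Tuple


def _fir(w: Sequence[float], x: Sequence[float]) -> List[float]:
    """Compute y(n) = sum_k w[k] * x(n - k), taking x(n) = 0 for n < 0."""
    return [
        sum(wk * x[n - k] for k, wk in enumerate(w) if n - k >= 0)
        for n in range(len(x))
    ]


def filter_bank_outputs(
    x1: Sequence[float],
    x2: Sequence[float],
    w_tar: Sequence[float],
    w_nontar: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Return (e_tar, [e_nontar_1, ..., e_nontar_{N-1}]).

    e_tar      = x2 - w_tar * x1       (target direction suppressed: noise)
    e_nontar_k = x1 - w_nontar_k * x2  (k-th non-target direction suppressed)
    """
    e_tar = [b - y for b, y in zip(x2, _fir(w_tar, x1))]
    e_nontar = [[a - y for a, y in zip(x1, _fir(w, x2))] for w in w_nontar]
    return e_tar, e_nontar


# Hypothetical one-tap filters and short test signals.
noise_est, speech_candidates = filter_bank_outputs(
    x1=[1.0, 0.0, 0.0],
    x2=[0.5, 0.0, 0.0],
    w_tar=[0.5],
    w_nontar=[[2.0], [1.0]],
)
```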
S403: selecting the signal with the minimum amplitude from the N-1 target speech signals as the pre-noise-reduction target speech signal;
Illustratively, the pre-noise-reduction target speech signal $E_{nontar}(z)$ is determined based on the following formula:
$$E_{nontar}(z) = \min\{E_{nontar1}(z), E_{nontar2}(z), \ldots, E_{nontar(N-1)}(z)\}$$
$E_{nontar}(z)$ is the pre-noise-reduction target speech signal obtained after the main noise component in the non-target speech enhancement direction has been eliminated.
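One possible reading of the minimum-amplitude selection is sketched below. The source does not say how amplitude is measured, so this sketch assumes the comparison is made per frame on the summed absolute sample values of equal-length frames; the function name `select_min_amplitude` is hypothetical.

```python
from typing import List, Sequence


def select_min_amplitude(candidates: Sequence[Sequence[float]]) -> List[float]:
    """Return the candidate frame with the smallest summed absolute amplitude.

    Assumption: all candidate frames have the same length, so comparing the
    sums is equivalent to comparing mean absolute amplitudes.
    """
    if not candidates:
        raise ValueError("at least one candidate signal is required")
    return list(min(candidates, key=lambda frame: sum(abs(v) for v in frame)))


# Three hypothetical candidate frames from the N-1 second spatial filters.
chosen = select_min_amplitude([[0.5, -0.5], [0.1, 0.2], [1.0, 0.0]])
```

The intuition is that the candidate with the least residual energy is the one whose filter cancelled the dominant interferer most effectively.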
S404: performing noise reduction on the pre-noise-reduction target speech signal using a stationary noise reduction algorithm to obtain a noise-reduced target speech signal;
For example, the stationary noise reduction algorithm is applied to the pre-noise-reduction target speech signal to eliminate its stationary noise component. The noise-reduced target speech signal obtained by this processing is $E_{ns}(z)$.
It should be understood that the stationary noise reduction algorithm of the present embodiment includes, but is not limited to, a noise suppression algorithm based on spectral subtraction, a noise suppression algorithm based on Wiener filtering, a noise reduction method based on noise estimation by the improved minima controlled recursive averaging (IMCRA) algorithm, and the like.
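Of the listed options, spectral subtraction is the simplest to illustrate. The sketch below works on magnitude spectra that are assumed to be already available (for example, from a short-time Fourier transform that is not shown); the function names, the spectral floor and the numeric values are assumptions for illustration only.

```python
from typing import List, Sequence


def estimate_noise(noise_frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the magnitude spectra of frames assumed to contain only noise."""
    count = len(noise_frames)
    return [sum(bins) / count for bins in zip(*noise_frames)]


def spectral_subtract(
    mag: Sequence[float], noise_mag: Sequence[float], floor: float = 0.05
) -> List[float]:
    """Subtract the noise magnitude per bin, keeping at least floor * mag."""
    return [max(m - n, floor * m) for m, n in zip(mag, noise_mag)]


# Hypothetical three-bin magnitude spectra.
noise_profile = estimate_noise([[0.2, 0.4, 0.5], [0.3, 0.6, 0.5]])
cleaned = spectral_subtract([1.0, 0.2, 0.5], noise_profile, floor=0.1)
```

The floor keeps a small fraction of each bin instead of clamping to zero, a common way to limit the "musical noise" artifacts of plain subtraction.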
S405: calculating a spatial-domain signal-to-noise ratio according to the pre-noise-reduction target speech signal and the noise signal;
Illustratively, the spatial-domain signal-to-noise ratio $SNR_{est}$ is calculated based on the following formula:
[Formula image BDA0002238566730000101: expression for $SNR_{est}$, not reproduced in the text]
S406: smoothing the gain factor with the spatial-domain signal-to-noise ratio to suppress the noise signal, obtaining an updated gain factor;
Illustratively, the updated gain factor Gain′ is calculated based on the following formula:
$$Gain' = smoothfactor \times Gain + (1 - smoothfactor) \times SNR_{est}$$
where smoothfactor is a smoothing factor, generally a positive real number less than but close to 1, such as 0.9, and the gain factor Gain is initialized to 1.
It should be understood that smoothing prevents the gain factor from jittering frequently and with large amplitude, which would make the speech sound unnatural. Through smoothing, a higher gain is applied to signals with a higher spatial-domain signal-to-noise ratio and a lower gain to signals with a lower spatial-domain signal-to-noise ratio, thereby suppressing the noise signal.
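The smoothing recursion can be sketched as follows. Because the expression for the spatial-domain signal-to-noise ratio is not reproduced in the text, this sketch takes the per-frame values as given inputs; the function name `update_gain` and the sample values are hypothetical.

```python
from typing import List


def update_gain(gain: float, snr_est: float, smooth_factor: float = 0.9) -> float:
    """Gain' = smooth_factor * Gain + (1 - smooth_factor) * SNR_est."""
    if not 0.0 < smooth_factor < 1.0:
        raise ValueError("smooth_factor must lie strictly between 0 and 1")
    return smooth_factor * gain + (1.0 - smooth_factor) * snr_est


# Gain is initialized to 1; the per-frame SNR estimates are hypothetical.
gain = 1.0
gain_track: List[float] = []
for snr in [0.0, 0.0, 0.0]:
    gain = update_gain(gain, snr)
    gain_track.append(gain)
```

With three consecutive low-SNR frames the gain decays gradually (about 0.9, 0.81, 0.729) rather than dropping at once, which is the jitter-avoiding behaviour the passage describes.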
S407: applying the updated gain factor to the noise-reduced target speech signal to obtain a target speech enhancement signal.
Illustratively, the target speech enhancement signal $E_{out}(z)$ is determined based on the following formula:
$$E_{out}(z) = E_{ns}(z) \times Gain'$$
It should be understood that noise reduction must be performed on the pre-noise-reduction target speech signal $E_{nontar}(z)$ before the updated gain factor is applied; otherwise, fluctuations of the updated gain factor would destroy the stationarity of the stationary noise and weaken the stationary noise reduction algorithm. After noise reduction is finished, the updated gain factor is applied to $E_{ns}(z)$ to obtain the target speech enhancement signal $E_{out}(z)$ with the stationary noise component removed. Performing stationary noise reduction first and then applying the updated gain factor to the noise-reduced target speech signal therefore suppresses both the stationary noise in the target speech direction and the non-stationary noise in the non-target speech directions, while keeping the interference between the two stages low.
It should also be understood that, in various embodiments of the present invention, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
The embodiment of the invention uses an adaptive filter algorithm to perform signal suppression in the pre-suppression direction of each spatial filter in a spatial filter bank, obtaining signals in the non-suppression directions of N spatial filters, and then combines a spatial-domain signal-to-noise ratio algorithm with a stationary noise reduction algorithm to enhance the speech signal in the target direction. This yields an adaptive beamforming method for a dual-microphone array with low memory occupation and low computational complexity. The method not only has good practicality and robustness, but can also improve the signal-to-noise ratio of the target speech signal relative to the noise signal by more than 20 dB in a low signal-to-noise ratio environment.
Fig. 6 is a schematic diagram of a beamformer forming apparatus according to an embodiment of the present invention. The apparatus 600 comprises: an obtaining module 601, configured to obtain a white noise signal in a pre-suppression direction of a beamformer; a determining module 602, configured to determine a reference signal and a desired signal of a relative dual-microphone array according to the positional relationship between a white noise source and the microphones; and a target beamformer module 603, configured to suppress, using an adaptive filter algorithm, the white noise signal in the pre-suppression direction of the beamformer based on the reference signal and the desired signal, so as to obtain a target beamformer.
In an alternative embodiment, the determining module includes: determining a reference signal unit, wherein an audio signal output by a microphone with the distance from the white noise sound source smaller than a first distance threshold value is used as a reference signal of the relative double-microphone array; determining an expected signal unit, and taking the audio signal output by the microphone which is more than a second distance threshold value away from the white noise sound source as an expected signal of the relative double-microphone array; wherein the second distance threshold is greater than or equal to the first distance threshold.
In an alternative embodiment, the forming device further comprises: the filtering processing module is used for performing filtering processing on the reference signal by using the target beam former to obtain a filtered reference signal; and the suppression module is used for subtracting the filtered reference signal from the expected signal to obtain a signal in the non-pre-suppression direction of the beam former.
Fig. 7 is a schematic diagram of an adaptive beamforming apparatus according to another embodiment of the present invention. The apparatus 700 comprises: an obtaining module 701, configured to obtain an audio signal output by each microphone in a relative dual-microphone array; a suppression module 702, configured to perform signal suppression in each spatial filter pre-suppression direction in the spatial filter bank by using an adaptive filter algorithm based on the audio signal, so as to obtain signals in non-suppression directions of N spatial filters; the spatial filter bank comprises a combination of a first spatial filter with a pre-suppression direction as a target voice enhancement direction and N-1 second spatial filters with a pre-suppression direction as a non-target voice enhancement direction; correspondingly, the N spatial filter non-suppression direction signals comprise noise signals corresponding to the first spatial filter and N-1 target voice signals corresponding to N-1 second spatial filters respectively.
In an optional embodiment, the apparatus further comprises: a selection module, configured to select the signal with the minimum amplitude from the N-1 target voice signals as a pre-noise-reduction target voice signal; a noise reduction module, configured to perform noise reduction on the pre-noise-reduction target voice signal by using a stationary noise reduction algorithm to obtain a noise-reduced target voice signal; and a gain factor application module, configured to apply an updated gain factor to the noise-reduced target voice signal to obtain a target voice enhancement signal.
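The selection and gain steps can be sketched as follows. Two points are assumptions rather than details from the text: "amplitude" is measured here as the mean absolute value of a frame, and the stationary noise reduction algorithm is left as a pluggable function that defaults to a pass-through.

```python
def mean_abs(frame):
    """Mean absolute amplitude of a frame (assumed amplitude measure)."""
    return sum(abs(s) for s in frame) / len(frame)


def select_pre_denoise(target_frames):
    """Pick the candidate target voice frame with the smallest amplitude."""
    return min(target_frames, key=mean_abs)


def enhance(target_frames, gain, denoise=lambda frame: list(frame)):
    """Select, denoise (placeholder by default), then apply the gain factor."""
    chosen = select_pre_denoise(target_frames)
    denoised = denoise(chosen)
    return [gain * s for s in denoised]
```

Choosing the minimum-amplitude output is consistent with keeping the candidate in which the non-target directions were suppressed most effectively; a real stationary noise reduction routine would replace the default `denoise` argument.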
In an optional embodiment, the apparatus further comprises: a signal-to-noise ratio calculating module, configured to calculate a spatial-domain signal-to-noise ratio from the pre-noise-reduction target voice signal and the noise signal; and a smoothing processing module, configured to smooth the gain factor with the spatial-domain signal-to-noise ratio so as to suppress the noise signal and obtain the updated gain factor.
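One possible reading of this step is sketched below. The formulas are assumptions chosen for illustration, not taken from the text: the spatial-domain signal-to-noise ratio is computed as a frame energy ratio, it is mapped to an instantaneous gain with a Wiener-style rule, and the gain is smoothed recursively with a hypothetical factor `alpha`.

```python
def spatial_snr(target_frame, noise_frame, eps=1e-12):
    """Energy ratio between the target voice frame and the noise frame."""
    target_energy = sum(s * s for s in target_frame)
    noise_energy = sum(s * s for s in noise_frame)
    return target_energy / (noise_energy + eps)


def update_gain(previous_gain, snr, alpha=0.9):
    """Recursively smooth the gain toward a Wiener-style value snr / (1 + snr)."""
    instantaneous = snr / (1.0 + snr)
    return alpha * previous_gain + (1.0 - alpha) * instantaneous
```

With this mapping, frames dominated by noise pull the gain toward zero while frames dominated by target speech pull it toward one, and `alpha` controls how quickly the gain follows the frame-by-frame estimate.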
The device can execute the beam former forming method or the adaptive beam forming method provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects for executing the beam former forming method or the adaptive beam forming method. For technical details that are not described in detail in this embodiment, reference may be made to the beam former forming method or the adaptive beam forming method provided by the embodiments of the present invention.
Referring now to FIG. 8, a block diagram of a computer system suitable for implementing a terminal device or server of an embodiment is shown. The terminal device shown in fig. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in fig. 8, the computer system 900 includes a Central Processing Unit (CPU) 901 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the system 900 are also stored. The CPU 901, ROM 902, and RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904. The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. A drive 910 is also connected to the I/O interface 905 as necessary. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 910 as necessary, so that a computer program read out therefrom is installed into the storage section 908 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 909, and/or installed from the removable medium 911. The above-described functions defined in the system of the present invention are executed when the computer program is executed by a Central Processing Unit (CPU) 901.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a sending module, an obtaining module, a determining module, and a first processing module. The names of these modules do not in some cases constitute a limitation on the unit itself, and for example, the sending module may also be described as a "module that sends a picture acquisition request to a connected server".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the following steps: S101: acquiring a white noise signal in a pre-suppression direction of a beam former; S102: determining a reference signal and an expected signal of a relative double-microphone array according to the positional relationship between a white noise sound source and the microphones; S103: based on the reference signal and the expected signal, suppressing the white noise signal in the pre-suppression direction of the beam former by using an adaptive filter algorithm to obtain a target beam former.
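Steps S101 to S103 can be illustrated with a short sketch. The text only specifies "an adaptive filter algorithm"; the normalized least mean squares (NLMS) update used here is one common choice and is an assumption, as are the function name `train_nlms` and its parameters `num_taps`, `step`, and `eps`.

```python
def train_nlms(reference, expected, num_taps=8, step=0.5, eps=1e-8):
    """Adapt FIR weights so the filtered reference cancels the expected signal.

    reference, expected: equal-length sample sequences recorded while white
    noise is played from the pre-suppression direction. The returned weights
    play the role of the target beam former in this sketch.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    for x, d in zip(reference, expected):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # filtered reference
        e = d - y  # residual in the pre-suppression direction
        norm = sum(h * h for h in history) + eps
        weights = [w + step * e * h / norm for w, h in zip(weights, history)]
    return weights
```

As a usage example, if the expected signal is the reference white noise delayed by two samples and scaled by 0.8, the trained weights approach a single tap of 0.8 at index 2, so that subtracting the filtered reference from the expected signal leaves almost nothing from that direction.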
In the embodiment of the invention, a white noise signal in the pre-suppression direction of the beam former is obtained; a reference signal and an expected signal of the relative double-microphone array are determined according to the positional relationship between the white noise sound source and the microphones; and the white noise signal in the pre-suppression direction of the beam former is then suppressed by using an adaptive filter algorithm to obtain the target beam former. In this way, signals in the pre-suppression direction of the beam former are suppressed while signals in the non-suppression directions are retained. In another embodiment of the invention, the spatial filter bank designed by the beam former forming method is used to filter the audio signals output by the relative double-microphone array, and a spatial-domain signal-to-noise ratio algorithm and a stationary noise reduction algorithm are combined to enhance the voice signal in the target direction. The complexity of the adaptive beam forming method is thereby reduced, good directional sound pickup performance can be achieved while the practicability and robustness of the algorithm are preserved, and the signal-to-noise ratio of the target voice signal relative to the noise signal is greatly improved.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
The above description is only an exemplary embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can easily conceive of within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A method of beamformer formation, comprising:
acquiring a white noise signal in a pre-suppression direction of a beam former;
determining a reference signal and an expected signal of a relative double-microphone array according to the position relation of a white noise sound source and a microphone;
and based on the reference signal and the expected signal, utilizing an adaptive filter algorithm to suppress a white noise signal in the pre-suppression direction of the beam former to obtain a target beam former.
2. The method of claim 1, wherein determining the reference signal and the expected signal of the relative double-microphone array according to the position relation of the white noise sound source and the microphones comprises:
taking the audio signal output by the microphone with the distance from the white noise sound source smaller than a first distance threshold value as a reference signal of the relative double-microphone array;
taking the audio signal output by the microphone which is more than a second distance threshold value away from the white noise sound source as an expected signal of the relative double-microphone array;
wherein the second distance threshold is greater than or equal to the first distance threshold.
3. The method of claim 1, further comprising:
filtering the reference signal by using the target beam former to obtain a filtered reference signal;
and subtracting the filtered reference signal from the expected signal to obtain a signal in the non-pre-suppression direction of the beam former.
4. An adaptive beamforming method, comprising:
acquiring an audio signal output by each microphone in a relative double-microphone array;
based on the audio signals, performing signal suppression in each target beam former pre-suppression direction of the target beam former group by using an adaptive filter algorithm to obtain signals in non-suppression directions of N beam formers;
wherein the target beam former group comprises a combination of a first beam former with a pre-suppression direction as a target voice enhancement direction and N-1 second beam formers with pre-suppression directions as non-target voice enhancement directions; accordingly, the N beamformer non-suppressed direction signals include noise signals corresponding to the first beamformer and N-1 target speech signals corresponding to the N-1 second beamformers, respectively.
5. The method of claim 4, further comprising:
selecting a signal with the minimum amplitude from the N-1 target voice signals as a pre-noise reduction target voice signal;
performing noise reduction processing on the pre-noise-reduced target speech signal by using a stationary noise reduction algorithm to obtain a noise-reduced target speech signal;
and applying an updated gain factor to the noise-reduced target voice signal to obtain a target voice enhancement signal.
6. The method of claim 5, wherein before applying the updated gain factor to the noise-reduced target speech signal to obtain the target speech enhancement signal, further comprising:
calculating a space domain signal-to-noise ratio according to the pre-denoising target speech signal and the noise signal;
and smoothing the gain factor and the space domain signal-to-noise ratio to suppress a noise signal to obtain an updated gain factor.
7. A beamformer forming apparatus, comprising:
the acquisition module is used for acquiring a white noise signal in the pre-suppression direction of the beam former;
the determining module is used for determining a reference signal and an expected signal of a relative double-microphone array according to the position relation of a white noise sound source and the microphones;
and the target beam former module is used for suppressing the white noise signal in the pre-suppression direction of the beam former by using an adaptive filter algorithm based on the reference signal and the expected signal to obtain the target beam former.
8. An adaptive beamforming apparatus, comprising:
the acquisition module is used for acquiring the audio signal output by each microphone in the relative double-microphone array;
the suppression module is used for performing signal suppression in the pre-suppression direction of each target beam former of the target beam former group by using an adaptive filter algorithm based on the audio signal to obtain signals in the non-suppression directions of the N beam formers; wherein the target beam former group comprises a combination of a first beam former with a pre-suppression direction as a target voice enhancement direction and N-1 second beam formers with pre-suppression directions as non-target voice enhancement directions; accordingly, the N beamformer non-suppressed direction signals include noise signals corresponding to the first beamformer and N-1 target speech signals corresponding to the N-1 second beamformers, respectively.
9. The apparatus of claim 8, further comprising:
the selection module is used for selecting a signal with the minimum amplitude from the N-1 target voice signals as a pre-noise reduction target voice signal;
the noise reduction module is used for carrying out noise reduction processing on the pre-noise-reduced target speech signal by utilizing a stationary noise reduction algorithm to obtain a noise-reduced target speech signal;
and the gain factor application module is used for applying an updated gain factor to the noise-reduced target voice signal to obtain a target voice enhancement signal.
10. An electronic device, comprising: one or more processors; a storage device to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-3 or the method of any of claims 4-6.
CN201910991943.8A 2019-10-18 2019-10-18 Beam former forming method, beam forming device and electronic equipment Active CN110661510B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910991943.8A CN110661510B (en) 2019-10-18 2019-10-18 Beam former forming method, beam forming device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110661510A true CN110661510A (en) 2020-01-07
CN110661510B CN110661510B (en) 2021-05-11

Family

ID=69041492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910991943.8A Active CN110661510B (en) 2019-10-18 2019-10-18 Beam former forming method, beam forming device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110661510B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111402912A (en) * 2020-02-18 2020-07-10 云知声智能科技股份有限公司 Voice signal noise reduction method and device

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1914949A (en) * 2003-12-24 2007-02-14 诺基亚公司 Method for adjusting adaptation control of adaptive interference canceller
CN101466055A (en) * 2008-12-31 2009-06-24 瑞声声学科技(常州)有限公司 Minitype microphone array device and beam forming method thereof
CN101763858A (en) * 2009-10-19 2010-06-30 瑞声声学科技(深圳)有限公司 Method for processing double-microphone signal
CN101964934A (en) * 2010-06-08 2011-02-02 浙江大学 Binary microphone microarray voice beam forming method
CN101976565A (en) * 2010-07-09 2011-02-16 瑞声声学科技(深圳)有限公司 Dual-microphone-based speech enhancement device and method
US10089998B1 (en) * 2018-01-15 2018-10-02 Advanced Micro Devices, Inc. Method and apparatus for processing audio signals in a multi-microphone system
US10229698B1 (en) * 2017-06-21 2019-03-12 Amazon Technologies, Inc. Playback reference signal-assisted multi-microphone interference canceler
CN109616136A (en) * 2018-12-21 2019-04-12 出门问问信息科技有限公司 A kind of Adaptive beamformer method, apparatus and system
CN110085247A (en) * 2019-05-06 2019-08-02 上海互问信息科技有限公司 A kind of dual microphone noise-reduction method for complicated noise

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Henry Cox et al.: "Robust Adaptive Beamforming", IEEE Transactions on Acoustics, Speech, and Signal Processing *
查代奉: "Research on New Signal Processing Methods Based on Stable-Distribution White Noise", Wanfang Dissertations (万方学位论文) *

Also Published As

Publication number Publication date
CN110661510B (en) 2021-05-11

Similar Documents

Publication Publication Date Title
CN109102822B (en) Filtering method and device based on fixed beam forming
US20190320261A1 (en) Adaptive beamforming
EP3833041A1 (en) Earphone signal processing method and system, and earphone
CN111128210B (en) Method and system for audio signal processing with acoustic echo cancellation
JP4973655B2 (en) Adaptive array control device, method, program, and adaptive array processing device, method, program using the same
CN110660404B (en) Voice communication and interactive application system and method based on null filtering preprocessing
CN111078185A (en) Method and equipment for recording sound
CN109215672B (en) Method, device and equipment for processing sound information
CN111681665A (en) Omnidirectional noise reduction method, equipment and storage medium
Spriet et al. Stochastic gradient-based implementation of spatially preprocessed speech distortion weighted multichannel Wiener filtering for noise reduction in hearing aids
WO2007123048A1 (en) Adaptive array control device, method, and program, and its applied adaptive array processing device, method, and program
CN110661510B (en) Beam former forming method, beam forming device and electronic equipment
CN113050035B (en) Two-dimensional directional pickup method and device
KR102517939B1 (en) Capturing far-field sound
CN111755021B (en) Voice enhancement method and device based on binary microphone array
CN112997249B (en) Voice processing method, device, storage medium and electronic equipment
US11640830B2 (en) Multi-microphone signal enhancement
CN113491137B (en) Flexible differential microphone array with fractional order
CN113838472A (en) Voice noise reduction method and device
US11120814B2 (en) Multi-microphone signal enhancement
KR102649227B1 (en) Double-microphone array echo eliminating method, device and electronic equipment
CN113053408B (en) Sound source separation method and device
CN110211601B (en) Method, device and system for acquiring parameter matrix of spatial filter
CN113077809B (en) Echo cancellation method, device, equipment and storage medium
CN115512713A (en) Echo cancellation method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210326

Address after: 210000 8th floor, building D11, Hongfeng science and Technology Park, Nanjing Economic and Technological Development Zone, Jiangsu Province

Applicant after: New Technology Co.,Ltd.

Address before: 100044 1001, 10th floor, office building a, 19 Zhongguancun Street, Haidian District, Beijing

Applicant before: Mobvoi Information Technology Co.,Ltd.

GR01 Patent grant