CN102664023A - Method for optimizing speech enhancement of microphone array - Google Patents

Method for optimizing speech enhancement of microphone array

Info

Publication number
CN102664023A
Authority
CN
China
Prior art keywords
signal
output
microphone array
voice
array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012101277578A
Other languages
Chinese (zh)
Inventor
王辉
张玲华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN2012101277578A priority Critical patent/CN102664023A/en
Publication of CN102664023A publication Critical patent/CN102664023A/en
Pending legal-status Critical Current

Abstract

The invention discloses a method for optimizing microphone array speech enhancement. It belongs to the technical field of speech signal processing and relates to speech enhancement technology, in particular microphone array speech enhancement. The method uses a generalized sidelobe canceller (GSC) structure. To address the speech leakage that the GSC suffers when the estimated direction of arrival is wrong, the blocking matrix is adjusted adaptively based on the correlation between the GSC output and the blocking matrix output, so that the blocking matrix steers toward the direction of the target speech. This reduces target speech leakage through the blocking matrix and makes the system more robust.

Description

An optimization method for microphone array speech enhancement
Technical field
The present invention relates to speech enhancement technology, in particular to microphone array speech enhancement, and belongs to the field of speech signal processing.
Background art
Speech enhancement has long been a research focus in speech signal processing, and microphone array processing offers a new approach to it. A microphone array provides information about the signal not only in the time and frequency domains but also in the spatial domain, so signals arriving from different directions can be processed jointly in space, time and frequency. Building on antenna array algorithms and combining them with single-channel speech processing methods, the array acts as a spatial filter: it locates the sound source and extracts the source signal while suppressing interfering signals.
The goal of speech enhancement is to reduce or even eliminate the noise in the received signal without damaging the structure of the target speech, thereby improving speech clarity.
Microphone array speech enhancement can be divided into a sound source localization stage and a speech enhancement stage. In the localization stage, the system obtains the spatial direction of the speaker. In the enhancement stage, it uses that direction information and array signal processing methods to extract the signal from the source direction and suppress interference from other directions.
Microphone array speech enhancement builds on array processing techniques. Extensive research has so far produced three mainstream classes of algorithm: fixed beamforming, adaptive beamforming, and beamforming with a post-filter. Among these, adaptive beamforming with the GSC (Generalized Sidelobe Canceller) structure is widely used because it combines low computational cost with high performance. The most common problem with array processing, however, is that an error in the estimated target direction easily causes target signal leakage, which seriously degrades enhancement performance. In the GSC structure, the key module is the BM (Blocking Matrix), which uses the estimated direction to filter out the signal from the target direction. Optimizing the microphone array speech enhancement algorithm therefore centers on optimizing the blocking matrix.
Summary of the invention
The object of the present invention is to provide an optimization method for microphone array speech enhancement that improves the adaptability of the blocking matrix, reduces the speech leaked through the blocking matrix, and improves the robustness of speech enhancement.
The technical solution that achieves this object is an optimization method for microphone array speech enhancement with the following steps:
Step 1: preliminary processing. After pre-emphasis, framing and windowing of the input array speech signals, a time delay estimation method is used to obtain the direction of the sound source, and that direction is used to obtain the steering vector of the signal.
Step 2: build the GSC structure with the microphone array, starting with the fixed beamforming algorithm. Unlike the conventional GSC structure, the FBF is split into two parts: signal alignment and beamforming. Signal alignment comes first and uses the direction information from the preliminary processing: with the steering vector obtained above, the direction-dependent delays of the microphone array signals are compensated so that the signal appears to arrive from the array normal. In theory, the signal is then incident on the array from 0°. The aligned signals are split into two paths: one continues through the fixed beamforming process (summation and averaging), and the other enters the blocking matrix module, which blocks the target signal.
Step 3: implement the blocking matrix module. Because the signals were aligned in step 2, the signal direction is theoretically 0°. For a uniform linear array, the blocking matrix takes the following form:
[Equation image not reproduced in this text.]
where B_0 is the blocking matrix, θ_0 is the blocking direction (the estimated signal direction), d is the element spacing, λ is the acoustic wavelength, and M is the number of input signals. Whatever the actual direction of arrival, the initial θ_0 is 0. The signal that passes through the blocking matrix is fed to the MC module.
Step 4: implement the MC (Multiple-input Canceller) module. In theory, subtracting the BM output from the FBF (Fixed Beamformer) output yields the clean target speech. Because speech leaks when the direction estimate is wrong, the MC output is not yet taken as the final output.
Step 5: extract the MC output and use the correlation between the MC output and the BM output. A large correlation value indicates leaked speech. A threshold is set on the correlation value; when it is exceeded, θ_0 = 0 is taken as the initial parameter, an adjustment step size is set, and the parameter is adjusted by multiplicative updates in the direction that reduces the correlation value until the value falls below the threshold. Only then is speech output from the MC module.
Compared with the prior art, the present invention has the following advantages: it weakens the influence of direction estimation errors on microphone array speech enhancement and improves the robustness of the adaptive beamformer. The direction to which the blocking matrix points converges to the true direction, which reduces target speech leakage, improves the output signal-to-noise ratio and the clarity of the output speech, and overcomes the GSC beamformer's excessive dependence on the target direction estimate.
The present invention is described in further detail below with reference to the accompanying drawing.
Description of drawings
Fig. 1 is a schematic diagram of the GSC-structure microphone array speech enhancement algorithm of the present invention.
Embodiment
With reference to Fig. 1, the steps of the GSC-structure microphone array speech enhancement optimization method of the present invention are as follows:
Step 1: pre-processing. The M input speech signals [symbol image not reproduced] undergo pre-emphasis, framing and windowing. The DOA (Direction Of Arrival) of the signal is then estimated from the phase shifts between the sampled signals of the microphone elements, which arise because the signal reaches each element at a different time. The detailed process is as follows:
(1) Pre-process the speech signal: the pre-emphasis coefficient is 0.96, the sampling rate is 16 kHz, each frame contains 512 samples, the frame shift is 256 samples, and a Hamming window is then applied;
(2) Compare the correlation between the frame signals of two array elements, calculate the phase shift and time delay between the two adjacent signals, and estimate the DOA of the signal. Using the estimated DOA, apply angle compensation to the array signals so that the DOA becomes the array normal direction; this is FBF step 1 in Fig. 1.
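The pre-processing and delay estimation described in step 1 can be sketched roughly as follows. This is a minimal illustration rather than the patented implementation: the function names, the time-domain cross-correlation peak search, the far-field model, the speed of sound `c = 343.0` m/s and any spacing `d` passed in are assumptions that do not come from the patent text. Only the pre-emphasis coefficient 0.96, the 512-sample frame, the 256-sample frame shift and the Hamming window are taken from the description.

```python
import numpy as np

def preprocess(x, alpha=0.96, frame_len=512, hop=256):
    """Pre-emphasize, frame and Hamming-window one microphone channel."""
    x = np.asarray(x, dtype=float)
    # Pre-emphasis: y[n] = x[n] - alpha * x[n-1], with y[0] = x[0].
    y = np.concatenate(([x[0]], x[1:] - alpha * x[:-1]))
    n_frames = 1 + (len(y) - frame_len) // hop
    # Row i of idx holds the sample indices of frame i.
    idx = hop * np.arange(n_frames)[:, None] + np.arange(frame_len)[None, :]
    return y[idx] * np.hamming(frame_len)

def estimate_delay(a, b):
    """Lag in samples by which b trails a (peak of the cross-correlation)."""
    c = np.correlate(b, a, mode="full")
    return int(np.argmax(c)) - (len(a) - 1)

def doa_from_delay(lag, fs, d, c=343.0):
    """Far-field DOA in degrees for two adjacent elements spaced d metres apart.

    Assumed model (not from the patent): delay = d * sin(theta) / c.
    """
    s = np.clip(c * lag / (fs * d), -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))
```

At 16 kHz an integer-sample lag gives only a coarse angle for closely spaced elements, so a practical system would likely interpolate the correlation peak or work with phase in the frequency domain.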
Step 2: implement FBF step 2. The signals are summed and averaged to obtain the fixed-beamformed signal. To keep the outputs of the upper and lower paths aligned, in practice a delay Q must be applied to the FBF output. At the same time, the output of FBF step 1 is fed to the lower path. Implement the blocking matrix module and determine the initial blocking matrix, whose structure is as follows:
[Equation image not reproduced in this text.]
where B_0 is the blocking matrix, θ_0 is the blocking direction, d is the element spacing, λ is the acoustic wavelength at the sampling frequency, the value of d satisfies a constraint given in an equation image that is not reproduced here, and M is the number of input signals. Whatever the actual direction of arrival, the initial θ_0 is 0. The signal that passes through the blocking matrix is fed to the MC module.
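Because the blocking matrix itself survives only as an image placeholder, the following sketch assumes a common steered pairwise-difference form for a uniform linear array; it is not necessarily the matrix used in the patent. The narrowband complex signal model, the function names and the sign convention of the steering vector are also assumptions made for illustration. The parameters `d`, `lam` and `M` correspond to the element spacing, wavelength and number of inputs named in the text.

```python
import numpy as np

def phase_step(theta_deg, d, lam):
    """Inter-element phase shift for a plane wave arriving from theta_deg."""
    return 2.0 * np.pi * d * np.sin(np.radians(theta_deg)) / lam

def steering_vector(theta_deg, M, d, lam):
    """Assumed narrowband steering vector of an M-element uniform linear array."""
    return np.exp(-1j * phase_step(theta_deg, d, lam) * np.arange(M))

def align(x, theta_deg, d, lam):
    """FBF step 1: compensate the estimated DOA so the signal looks like 0 deg."""
    a = steering_vector(theta_deg, x.shape[0], d, lam)
    return x * np.conj(a)[:, None]

def fixed_beamformer(aligned):
    """FBF step 2: sum the aligned channels and average."""
    return aligned.mean(axis=0)

def blocking_matrix(theta0_deg, M, d, lam):
    """Assumed (M-1) x M blocking matrix with a null toward theta0_deg.

    Row i computes x[i] - exp(j*phi) * x[i+1], which cancels a plane wave
    from theta0_deg; with theta0_deg = 0 it reduces to pairwise differences.
    """
    B = np.zeros((M - 1, M), dtype=complex)
    r = np.arange(M - 1)
    B[r, r] = 1.0
    B[r, r + 1] = -np.exp(1j * phase_step(theta0_deg, d, lam))
    return B
```

With perfectly aligned input, `blocking_matrix(0.0, M, d, lam) @ aligned` contains no target signal. If the alignment angle is wrong, the residual direction is no longer 0° and part of the target passes through, which is the leakage the method sets out to reduce.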
Step 3: implement the MC module. The BM outputs are combined in a weighted sum, where the weights [symbol image not reproduced] are the adaptive filter coefficients, typically set to 1. In theory, subtracting the BM output from the FBF output yields the clean target speech. Because speech leaks when the direction estimate is wrong, the MC output is not yet taken as the final output; adaptive filtering is also applied in the MC module to reduce speech leakage.
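The text does not say which adaptive algorithm the MC module uses. As one possible illustration, the sketch below assumes a single-tap normalized LMS (NLMS) update per BM channel on real-valued signals. The function name, the step size `mu` and the regularizer `eps` are hypothetical; only the weighted sum of BM outputs, the subtraction from the FBF output and the initial coefficient value of 1 follow the description.

```python
import numpy as np

def multi_input_canceller(fbf, bm, mu=0.1, eps=1e-8):
    """Subtract an adaptively weighted sum of BM outputs from the FBF output.

    fbf: (N,) real FBF output (already delayed to match the BM path).
    bm:  (K, N) real blocking-matrix outputs.
    Returns the canceller output (N,) and the final weights (K,).
    NLMS is an assumed choice, not confirmed by the patent text.
    """
    K, N = bm.shape
    w = np.ones(K)            # initial coefficients; the text cites 1 as typical
    out = np.empty(N)
    for n in range(N):
        u = bm[:, n]
        e = fbf[n] - w @ u    # FBF output minus weighted sum of BM outputs
        w = w + mu * e * u / (u @ u + eps)   # normalized LMS weight update
        out[n] = e
    return out, w
```

If target speech leaks into `bm`, this kind of canceller will also try to remove it from the output, which is why the method keeps the MC output provisional until the leakage check passes.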
Step 4: extract the MC output and use the correlation between the MC output and the BM output. A large correlation value indicates leaked speech. A threshold is set on the correlation value; when it is exceeded, θ_0 = 0 is taken as the initial parameter, an adjustment step size μ is set, and the parameter is adjusted by multiplicative updates in the direction that reduces the correlation value until the value falls below the threshold. Only then is speech output from the MC module.
When a direction estimation error occurs in GSC speech enhancement, the blocking matrix cannot completely block the target speech, so part of the speech passes through it. In the subsequent MC module, the target speech in the FBF output is then partly cancelled by the target speech leaked through the BM module, and target speech is lost. The microphone array is a uniform linear array, so the elements of the blocking matrix can be parameterized by the direction of arrival of the signal. The correlation between the GSC output speech and the target speech leaked through the BM is calculated and compared against a threshold, which serves as the criterion for starting the adjustment of the blocking matrix parameter. Because of factors such as room reverberation, the noise that passes through the BM module has some correlation with the target speech, so the threshold must be set neither too low nor too high. When the correlation value exceeds the threshold, the blocking matrix parameter adjustment algorithm starts: the initially estimated direction of arrival is the initial parameter, an adjustment step size is set, and the parameter is adjusted by multiplicative updates in the direction that reduces the correlation value until the value falls below the threshold. The direction to which the blocking matrix points thus approaches the target direction, which reduces or even eliminates speech leakage through the blocking matrix.
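The threshold-triggered adjustment described above could look roughly like the sketch below. Several parts are assumptions rather than details from the patent: the blocking matrix uses an assumed steered pairwise-difference form, the signals are narrowband and complex, the fixed-beamformer output stands in for the GSC output to keep the example short, and the search rule (try θ ± step, keep the better trial, multiply the step by `grow` on success and by `shrink` on failure) is only one reading of the multiplicative adjustment. All function and parameter names are hypothetical.

```python
import numpy as np

def blocking_matrix(theta_deg, M, d_over_lam):
    """Assumed (M-1) x M steered pairwise-difference blocking matrix."""
    phi = 2.0 * np.pi * d_over_lam * np.sin(np.radians(theta_deg))
    B = np.zeros((M - 1, M), dtype=complex)
    r = np.arange(M - 1)
    B[r, r] = 1.0
    B[r, r + 1] = -np.exp(1j * phi)
    return B

def leakage_correlation(y, z):
    """Mean normalized correlation magnitude between y (N,) and rows of z (K, N)."""
    num = np.abs(z @ np.conj(y))
    den = np.linalg.norm(z, axis=1) * np.linalg.norm(y) + 1e-12
    return float(np.mean(num / den))

def adapt_blocking_direction(x, d_over_lam, threshold=0.3, step=1.0,
                             grow=2.0, shrink=0.5, max_iter=50):
    """Adjust the blocking direction until the leakage correlation is small.

    x: (M, N) aligned array signals. Returns (theta_deg, correlation).
    """
    M = x.shape[0]
    y = x.mean(axis=0)   # FBF output, used here as a stand-in for the GSC output

    def rho_at(theta):
        return leakage_correlation(y, blocking_matrix(theta, M, d_over_lam) @ x)

    theta, rho = 0.0, rho_at(0.0)          # initial blocking direction is 0
    for _ in range(max_iter):
        if rho < threshold:                # below threshold: stop adjusting
            break
        trials = [float(np.clip(theta + step, -90.0, 90.0)),
                  float(np.clip(theta - step, -90.0, 90.0))]
        rhos = [rho_at(t) for t in trials]
        best = int(np.argmin(rhos))
        if rhos[best] < rho:               # move only if the correlation drops
            theta, rho = trials[best], rhos[best]
            step *= grow
        else:
            step *= shrink
    return theta, rho
```

In this sketch the threshold trades accuracy against robustness much as the description says: too low and ordinary noise correlation keeps the search running, too high and noticeable leakage is accepted.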

Claims (3)

1. An optimization method for microphone array speech enhancement, characterized by comprising the following steps:
Step 1: preliminary processing. After pre-emphasis, framing and windowing of the input array speech signals, a time delay estimation method is used to obtain the direction of the sound source, and that direction is used to obtain the steering vector of the signal;
Step 2: build the GSC structure with the microphone array, starting with the fixed beamforming algorithm. Unlike the conventional GSC structure, the FBF is split into two parts: signal alignment and beamforming. Signal alignment comes first and uses the direction information from the preliminary processing: with the steering vector obtained in step 1, the direction-dependent delays of the microphone array signals are compensated so that the signal appears to arrive from the array normal. In theory, the signal is then incident on the array from 0°. The aligned signals are split into two paths: one continues through the fixed beamforming process (summation and averaging), and the other enters the blocking matrix module, which blocks the target signal;
Step 3: implement the blocking matrix module. Because the signals were aligned in step 2, the signal direction is theoretically 0°. For a uniform linear array, the blocking matrix takes the following form:
[Equation image not reproduced in this text.]
where B_0 is the blocking matrix, θ_0 is the blocking direction, d is the element spacing, λ is the acoustic wavelength, and M is the number of input signals. Whatever the actual direction of arrival, the initial θ_0 is 0. The signal that passes through the blocking matrix is fed to the MC module;
Step 4: implement the MC module. In theory, subtracting the BM output from the FBF output yields the clean target speech. Because speech leaks when the direction estimate is wrong, the MC output is not yet taken as the final output;
Step 5: extract the MC output and use the correlation between the MC output and the BM output. A large correlation value indicates leaked speech. A threshold is set on the correlation value; when it is exceeded, θ_0 = 0 is taken as the initial parameter, an adjustment step size is set, and the parameter is adjusted by multiplicative updates in the direction that reduces the correlation value until the value falls below the threshold. Only then is speech output from the MC module.
2. The optimization method for microphone array speech enhancement according to claim 1, characterized in that the preliminary processing proceeds as follows:
Step 1: pre-process the speech signal: the pre-emphasis coefficient is 0.96, the sampling rate is 16 kHz, each frame contains 512 samples, the frame shift is 50%, and a Hamming window is then applied;
Step 2: receive the signals with the microphone array, estimate the signal direction, and generate the signal steering vector.
3. The optimization method for microphone array speech enhancement according to claim 1, characterized in that the GSC model is built as follows:
Step 1: split the FBF process into two steps. First carry out the preliminary processing: using the signal steering vector obtained, apply alignment compensation so that the signals received by the array appear to arrive from the array normal. Then split the aligned signals into two paths: one path is fed to the BM module, and the other continues through the fixed beamforming process (summation and averaging) to give the FBF output;
Step 2: with the blocking matrix configured, take the aligned signals as input and multiply them by the matrix, so that the blocking matrix blocks the signal from the estimated target direction. The output consists of the signals from all directions other than the target direction, and the M-1 output signals are combined into one signal;
Step 3: implement the MC module. Subtract the BM output from the FBF output, that is, subtract the path that contains only interference from the path that contains both the target signal and interference, and finally output the target signal. An adaptive filter is used in the MC to further reduce the target speech present in it.
CN2012101277578A 2012-04-26 2012-04-26 Method for optimizing speech enhancement of microphone array Pending CN102664023A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012101277578A CN102664023A (en) 2012-04-26 2012-04-26 Method for optimizing speech enhancement of microphone array

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012101277578A CN102664023A (en) 2012-04-26 2012-04-26 Method for optimizing speech enhancement of microphone array

Publications (1)

Publication Number Publication Date
CN102664023A true CN102664023A (en) 2012-09-12

Family

ID=46773488

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012101277578A Pending CN102664023A (en) 2012-04-26 2012-04-26 Method for optimizing speech enhancement of microphone array

Country Status (1)

Country Link
CN (1) CN102664023A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102938254A (en) * 2012-10-24 2013-02-20 中国科学技术大学 Voice signal enhancement system and method
CN104715758A (en) * 2015-02-06 2015-06-17 哈尔滨工业大学深圳研究生院 Branched processing array type speech positioning and enhancement method
CN105430587A (en) * 2014-09-17 2016-03-23 奥迪康有限公司 A Hearing Device Comprising A Gsc Beamformer
CN106409306A (en) * 2016-09-19 2017-02-15 宁波高新区敦和科技有限公司 Intelligent system obtaining human voice and obtaining method based on the system
CN107301869A (en) * 2017-08-17 2017-10-27 珠海全志科技股份有限公司 Microphone array sound pick-up method, processor and its storage medium
CN107369456A (en) * 2017-07-05 2017-11-21 南京邮电大学 Noise cancellation method based on generalized sidelobe canceller in digital deaf-aid
CN108538320A (en) * 2018-03-30 2018-09-14 广东欧珀移动通信有限公司 Recording control method and device, readable storage medium storing program for executing, terminal
CN109389991A (en) * 2018-10-24 2019-02-26 中国科学院上海微系统与信息技术研究所 A kind of signal enhancing method based on microphone array
CN111866665A (en) * 2020-07-22 2020-10-30 海尔优家智能科技(北京)有限公司 Microphone array beam forming method and device
WO2022012206A1 (en) * 2020-07-17 2022-01-20 腾讯科技(深圳)有限公司 Audio signal processing method, device, equipment, and storage medium
CN116320947A (en) * 2023-05-17 2023-06-23 杭州爱听科技有限公司 Frequency domain double-channel voice enhancement method applied to hearing aid

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1753084A (en) * 2004-09-23 2006-03-29 哈曼贝克自动系统股份有限公司 Multi-channel adaptive speech signal processing with noise reduction
JP2007147732A (en) * 2005-11-24 2007-06-14 Japan Advanced Institute Of Science & Technology Hokuriku Noise reduction system and noise reduction method
US20090086578A1 (en) * 2006-04-20 2009-04-02 Nec Corporation Adaptive array control device, method and program, and adaptive array processing device, method and program using the same
CN101369427A (en) * 2007-08-13 2009-02-18 哈曼贝克自动系统股份有限公司 Noise reduction by combined beamforming and post-filtering

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HUI WANG: "A Chinese speech compensation method for the leakage based on generalized sidelobe canceller in hearing aids", 《IEEE WCSP 2011》 *
何成林 杜利民 马昕: "麦克风阵列语音增强的研究", 《计算机工程与应用》 *
高杰 胡广书 张辉: "基于GSC结构的多麦克风数字助听器的自适应波束形成算法", 《北京生物医学工程》 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102938254B (en) * 2012-10-24 2014-12-10 中国科学技术大学 Voice signal enhancement system and method
CN102938254A (en) * 2012-10-24 2013-02-20 中国科学技术大学 Voice signal enhancement system and method
CN105430587A (en) * 2014-09-17 2016-03-23 奥迪康有限公司 A Hearing Device Comprising A Gsc Beamformer
CN105430587B (en) * 2014-09-17 2020-04-14 奥迪康有限公司 Hearing device comprising a GSC beamformer
CN104715758A (en) * 2015-02-06 2015-06-17 哈尔滨工业大学深圳研究生院 Branched processing array type speech positioning and enhancement method
CN106409306A (en) * 2016-09-19 2017-02-15 宁波高新区敦和科技有限公司 Intelligent system obtaining human voice and obtaining method based on the system
CN107369456A (en) * 2017-07-05 2017-11-21 南京邮电大学 Noise cancellation method based on generalized sidelobe canceller in digital deaf-aid
CN107301869A (en) * 2017-08-17 2017-10-27 珠海全志科技股份有限公司 Microphone array sound pick-up method, processor and its storage medium
CN108538320B (en) * 2018-03-30 2020-09-11 Oppo广东移动通信有限公司 Recording control method and device, readable storage medium and terminal
CN108538320A (en) * 2018-03-30 2018-09-14 广东欧珀移动通信有限公司 Recording control method and device, readable storage medium storing program for executing, terminal
CN109389991A (en) * 2018-10-24 2019-02-26 中国科学院上海微系统与信息技术研究所 A kind of signal enhancing method based on microphone array
WO2022012206A1 (en) * 2020-07-17 2022-01-20 腾讯科技(深圳)有限公司 Audio signal processing method, device, equipment, and storage medium
CN111866665A (en) * 2020-07-22 2020-10-30 海尔优家智能科技(北京)有限公司 Microphone array beam forming method and device
CN111866665B (en) * 2020-07-22 2022-01-28 海尔优家智能科技(北京)有限公司 Microphone array beam forming method and device
CN116320947A (en) * 2023-05-17 2023-06-23 杭州爱听科技有限公司 Frequency domain double-channel voice enhancement method applied to hearing aid
CN116320947B (en) * 2023-05-17 2023-09-01 杭州爱听科技有限公司 Frequency domain double-channel voice enhancement method applied to hearing aid

Similar Documents

Publication Publication Date Title
CN102664023A (en) Method for optimizing speech enhancement of microphone array
CN102969002B (en) Microphone array speech enhancement device capable of suppressing mobile noise
CN105355210B (en) Preprocessing method and device for far-field speech recognition
CN102509552B (en) Method for enhancing microphone array voice based on combined inhibition
CN102938254B (en) Voice signal enhancement system and method
CN101533091B (en) Space-time two-dimensional narrow band barrage jamming method
CN110085247B (en) Double-microphone noise reduction method for complex noise environment
US8370140B2 (en) Method of filtering non-steady lateral noise for a multi-microphone audio device, in particular a “hands-free” telephone device for a motor vehicle
CN103561185B (en) A kind of echo cancel method of sparse path
CN102324237A (en) Microphone array voice wave beam formation method, speech signal processing device and system
CN103197300B (en) Real-time processing method for cancellation of direct wave and clutter of external radiation source radar based on graphic processing unit (GPU)
WO2015196760A1 (en) Microphone array speech detection method and device
CN101900601B (en) Method for identifying direct sound in complex multi-path underwater sound environment
CN105679329A (en) Microphone array voice enhancing device adaptable to strong background noise
CN103777214B (en) Non-stationary suppression jamming signal inhibition method in satellite navigation system
CN110850445B (en) Pulse interference suppression method based on space-time sampling covariance inversion
CN111273237B (en) Strong interference suppression method based on spatial matrix filtering and interference cancellation
CN110632555B (en) TDOA (time difference of arrival) direct positioning method based on matrix eigenvalue disturbance
WO2019112467A1 (en) Method and apparatus for acoustic echo cancellation
CN105022268A (en) Linear constraint virtual antenna beam forming method
CN104345306A (en) Target wave arrival angle estimation method based on Khatri-Rao subspace
WO2007123048A1 (en) Adaptive array control device, method, and program, and its applied adaptive array processing device, method, and program
CN104865584A (en) Method for realizing space-frequency adaptive navigation anti-interference algorithm
CN104715758A (en) Branched processing array type speech positioning and enhancement method
CN106373588A (en) Adaptive microphone array calibration method based on variable step NLMS algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20120912