CN105702262A - Headset double-microphone voice enhancement method - Google Patents

Headset double-microphone voice enhancement method

Info

Publication number
CN105702262A
CN105702262A (Application CN201410702133.3A)
Authority
CN
China
Prior art keywords
voice
probability
signal
microphone
enhancement method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410702133.3A
Other languages
Chinese (zh)
Inventor
金剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Aviation Electric Co Ltd
Original Assignee
Shanghai Aviation Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Aviation Electric Co Ltd filed Critical Shanghai Aviation Electric Co Ltd
Priority to CN201410702133.3A
Publication of CN105702262A
Legal status: Pending

Landscapes

  • Soundproofing, Sound Blocking, And Sound Damping (AREA)

Abstract

The invention discloses a headset double-microphone voice enhancement method. The method mainly comprises the steps of pre-processing, FFT transformation, endpoint detection (VAD), microphone amplitude-frequency consistency calibration, speech signal-to-noise ratio calculation, speech presence probability calculation, MMSE speech enhancement processing, inverse FFT transformation, and time-domain signal reconstruction. The method provides headset communication and recording equipment, such as a headphone fitted with microphones, with an effective means of speech enhancement that improves speech clarity; because two microphones are used for processing, the result is better than the speech enhancement effect of a common single-microphone headset.

Description

A head-mounted dual-microphone speech enhancement method
Technical field
The invention belongs to the field of speech enhancement and specifically relates to a head-mounted dual-microphone speech enhancement method, mainly used to improve the clarity of calls and recordings in noisy environments.
Background technology
Speech enhancement processes the speech signal collected by a sensor to eliminate the noise in it and improve speech intelligibility; this function is widely used in devices such as mobile phones, computers, and voice recorders. Speech enhancement methods generally fall into single-microphone schemes and microphone-array schemes. A single-microphone scheme uses one microphone to collect the signal and applies a speech enhancement algorithm; it works well against stationary noise such as white noise, but poorly against non-stationary noise such as café noise. A microphone-array scheme places several microphones at different positions to collect the signal; compared with a single-microphone scheme it additionally captures spatial information about the sound sources, and digital signal processing algorithms then perform the enhancement, giving good suppression of both stationary and non-stationary noise.
In practice, the most common microphone-array speech enhancement configuration uses two microphones, most widely in smartphones. A smartphone typically places one microphone at the bottom of the handset and another at the top, and during a hand-held call exploits the difference between the two microphone signals to perform speech enhancement. Headsets with microphones, used for example for computer recording or for calls in a cockpit, generally rely on a single microphone; in noisy environments the limitations of a single microphone make it hard to obtain clear speech, so a speech enhancement function for head-mounted communication and recording equipment is particularly important.
Object of the invention
The object of the invention is to propose a head-mounted dual-microphone speech enhancement method that gives head-mounted communication and recording equipment, such as a headphone fitted with microphones, an effective means of speech enhancement and improves speech clarity.
To achieve this object, the technical scheme of the invention is as follows: a head-mounted dual-microphone speech enhancement method, wherein the second microphone of the two channels is close to the target sound source and the first channel is farther from the target sound source, characterized in that the enhancement method comprises the following steps: A. pre-processing, in which both channel signals are pre-emphasized, suppressing the low-frequency components and emphasizing the high-frequency components; B. FFT transformation of both channel signals; C. endpoint detection, determining speech segments and non-speech segments; D. calibration of the consistency of the amplitude-frequency responses of the two microphones according to the decision of the previous module; E. speech signal-to-noise ratio calculation, computing the a priori and a posteriori speech SNR; F. speech presence probability calculation, in which the a priori speech SNR is smoothed over time frames, the smoothing over frequency is divided into local and global smoothing, the local sub-band, global sub-band, and frame speech presence probabilities are then computed, and the speech absence probability is computed from the local sub-band, global sub-band, and frame speech presence probabilities; G. MMSE speech enhancement processing, computing the conditional speech presence probability and then the gain, and applying this gain to the power spectrum of the second microphone's speech signal; H. inverse FFT transformation, inverse-transforming the enhanced speech power spectrum to obtain the enhanced time-domain frame signals; I. time-domain signal reconstruction, overlap-adding the time-domain frame signals to obtain the enhanced speech signal.
The method uses two microphones for processing, and its effect is better than that of common head-mounted single-microphone speech enhancement.
Brief description of the drawings
Fig. 1 is a flow chart of the present invention.
Detailed description of the invention
The head-mounted dual-microphone speech enhancement method proposed by the present invention is mainly intended for close-talking scenarios. The method uses two microphone channels to collect the signal, with the microphones fixed on the headphone frame. To make full use of the different spatial information each sound source presents to the microphones, one of the two microphones is placed as close as possible to the target sound source and the other farther away, while maintaining a certain up-down symmetry between them.
The processing in the proposed method mainly includes: pre-processing, FFT transformation, endpoint detection (VAD), microphone amplitude-frequency consistency calibration, speech signal-to-noise ratio calculation, speech presence probability calculation, MMSE speech enhancement processing, inverse FFT transformation, and time-domain signal reconstruction. Of the two microphone channels, the second channel is relatively close to the target sound source, the target sound source generally being the speaker's face. The first channel is positioned on the crown of the headphone and the second channel is beside the face. The first channel is used to assist in computing the second channel's signal-to-noise ratio, speech presence probability, and so on; therefore only the second channel needs to be enhanced subsequently.
A. Pre-processing
Pre-processing comprises three sub-modules: framing, windowing, and pre-emphasis. Let the two digital signals produced by A/D conversion be X1(n) and X2(n), where X1(n) denotes the signal collected by the microphone placed at the top of the headphone and X2(n) the signal collected by the microphone near the face. The frame length L is 400 samples and the frame shift is 160 samples. A Hamming window is used for windowing: w(n) = 0.54 - 0.46·cos(2πn/(L-1)), 0 ≤ n ≤ L-1.
Both channel signals are then pre-emphasized. Pre-emphasis mainly suppresses the low-frequency components appropriately and emphasizes the high-frequency components, so that the speech signal becomes flatter and is less susceptible during calculation to the effects of the computer's limited word length. The pre-emphasis expression is: X(m, n) = p(m, n) - α·p(m, n-1).
Here α = 0.9375. The two channel signals after framing, windowing, and pre-emphasis are written X1(m, n) and X2(m, n), denoting the n-th sample of the m-th frame.
The processing order is as follows. First, x1(n) and x2(n) are framed to obtain the per-frame signals a1(m, n) and a2(m, n) for the two channels, where m is the frame index and n the sample index. Then, for each channel, windowing gives p1(m, n) = a1(m, n)·w(n), n = 0 to L-1, and pre-emphasis gives X1(m, n) = p1(m, n) - α·p1(m, n-1). That is: framing first, then windowing, then pre-emphasis, yielding the pre-processed data X1(m, n) and X2(m, n) referred to above. w(n) denotes the window coefficients; windowing is a point-wise multiplication rather than a summation, producing the L-point sequence p(m, n), which then passes through pre-emphasis to give X(m, n). The formulas here only illustrate the windowing and pre-emphasis operations themselves, and the variable names are not meant to correspond across formulas; after pre-processing the signals are defined as X1(m, n) and X2(m, n).
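For illustration, a minimal sketch of this pre-processing step in Python/NumPy is given below, using the values stated above (frame length L = 400, frame shift 160, Hamming window, α = 0.9375); the function and variable names are illustrative only.

```python
# Step A sketch: framing, Hamming windowing, and pre-emphasis for one channel.
import numpy as np

FRAME_LEN = 400     # L, samples per frame
FRAME_SHIFT = 160   # frame shift in samples
ALPHA = 0.9375      # pre-emphasis coefficient

def preprocess(x):
    """Split x into frames, apply a Hamming window, then pre-emphasis."""
    w = np.hamming(FRAME_LEN)                 # w(n) = 0.54 - 0.46*cos(2*pi*n/(L-1))
    n_frames = 1 + (len(x) - FRAME_LEN) // FRAME_SHIFT
    frames = np.empty((n_frames, FRAME_LEN))
    for m in range(n_frames):
        a = x[m * FRAME_SHIFT : m * FRAME_SHIFT + FRAME_LEN]   # a(m, n): framing
        p = a * w                                               # p(m, n): windowing
        frames[m, 0] = p[0]
        frames[m, 1:] = p[1:] - ALPHA * p[:-1]                  # X(m, n) = p(m, n) - alpha*p(m, n-1)
    return frames

# Usage: X1_t = preprocess(x1); X2_t = preprocess(x2)   # both channels
```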
B. FFT transformation
Both channel signals are transformed by an FFT with N_FFT = 512 points; each 400-sample frame is zero-padded to 512 points before the transform. Let the two channel signals after the FFT be X1(m, k) and X2(m, k), denoting the k-th frequency bin of the m-th frame.
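A short sketch of this step, under the assumption that each 400-sample frame is zero-padded to the 512-point transform length:

```python
# Step B sketch: 512-point FFT of each pre-processed frame (rfft zero-pads
# frames shorter than N_FFT automatically).
import numpy as np

N_FFT = 512

def to_spectrum(frames):
    """Return X(m, k): the k-th frequency bin of frame m (one-sided spectrum)."""
    return np.fft.rfft(frames, n=N_FFT, axis=1)    # shape: (n_frames, N_FFT//2 + 1)

# Usage: X1 = to_spectrum(X1_t); X2 = to_spectrum(X2_t)
```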
C. Endpoint detection (VAD)
Endpoint detection determines speech segments and non-speech segments; its decision is used by several subsequent modules. The decision mainly exploits the marked difference between the energies of the two microphone signals when the target talker is speaking. Let the VAD decision of the m-th frame be denoted VAD(m).
For the first 8 frames, VAD(m) keeps its initialization value and processing jumps directly to step D.
Otherwise, VAD(m) is decided from the energy difference between the two channels, using the total-energy amplitude-frequency calibration quantity of the two microphones computed in the previous frame.
In bandE(m, k), increasing k corresponds to increasing frequency, each k denoting one frequency-domain sub-band; bandE(m, k) is therefore the energy of sub-band k in frame m.
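The decision formula itself is not reproduced above, so the following is only an assumed sketch of the idea: when the talker speaks, the near-mouth channel (microphone 2) carries noticeably more energy than the crown channel (microphone 1), so a calibrated energy ratio is compared against a threshold. The threshold value and the exact use of the previous frame's calibration quantity are assumptions, not the patent's rule.

```python
# Step C sketch: energy-difference VAD between the two channels (assumed rule).
import numpy as np

VAD_TH_DB = 3.0     # assumed decision threshold in dB

def vad(X1_frame, X2_frame, g_total_db, frame_idx):
    """Return 1 (speech) or 0 (non-speech) for one frame of spectra.
    g_total_db: previous frame's total-energy calibration quantity in dB."""
    if frame_idx < 8:
        return 0                                   # first 8 frames: initialization value
    e1 = np.sum(np.abs(X1_frame) ** 2)
    e2 = np.sum(np.abs(X2_frame) ** 2)
    diff_db = 10.0 * np.log10(e2 / (e1 + 1e-12) + 1e-12) - g_total_db
    return 1 if diff_db > VAD_TH_DB else 0
```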
D. Microphone amplitude-frequency consistency calibration
This module calibrates the consistency of the amplitude-frequency responses of the two microphones according to the VAD decision of the previous module. There are two calibration quantities: a total-energy calibration quantity and a sub-band calibration quantity. The total-energy calibration quantity of the m-th frame measures the difference, in dB, between the total energy of microphone 2 and that of microphone 1.
The sub-band calibration quantity of the k-th sub-band of the m-th frame measures the per-sub-band energy difference, in dB, remaining after the total-energy difference has been removed. The calibration process is as follows:
First, the total noise power of the current frame is updated for both channels:
The sub-band noise energy of the current frame is updated for both channels:
The total-energy calibration quantity is:
where
The sub-band calibration quantity is:
where
After the calibration quantities have been computed, the power spectra of the two microphone channels can be calibrated with them. Denoting the calibrated sub-band power spectra of the two channels accordingly, we have:
In the formula above, the calibration factor is computed as follows:
where th2 is a threshold set on the ratio of the two channel signals.
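The calibration formulas are likewise not reproduced above; the sketch below only illustrates the stated idea of tracking the dB difference between the two channels' total and per-band energies on noise-only frames and then scaling channel 1's power spectrum toward channel 2. The smoothing factor and update rule are assumptions.

```python
# Step D sketch: amplitude-frequency consistency calibration (assumed update rule).
import numpy as np

ALPHA_CAL = 0.95    # assumed smoothing factor for the calibration quantities

class Calibrator:
    def __init__(self, n_bins):
        self.g_total_db = 0.0                 # total-energy calibration quantity (dB)
        self.g_band_db = np.zeros(n_bins)     # per-sub-band calibration quantity (dB)

    def update(self, P1, P2, is_speech):
        """P1, P2: power spectra |X1|^2, |X2|^2 of the current frame."""
        if not is_speech:                     # update only on noise-only frames (assumption)
            total_db = 10 * np.log10(np.sum(P2) / (np.sum(P1) + 1e-12) + 1e-12)
            band_db = 10 * np.log10(P2 / (P1 + 1e-12) + 1e-12) - total_db
            self.g_total_db = ALPHA_CAL * self.g_total_db + (1 - ALPHA_CAL) * total_db
            self.g_band_db = ALPHA_CAL * self.g_band_db + (1 - ALPHA_CAL) * band_db

    def apply(self, P1):
        """Return channel 1's power spectrum calibrated toward channel 2."""
        return P1 * 10 ** ((self.g_total_db + self.g_band_db) / 10.0)
```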
E. Speech signal-to-noise ratio calculation
The a priori and a posteriori speech signal-to-noise ratios are computed as follows:
The initial value is 0.
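The SNR formulas are not reproduced above. A common choice, sketched here purely as an assumption, is the decision-directed estimator: the a posteriori SNR is γ = |X2|²/λ_noise and the a priori SNR ξ combines the previous frame's enhanced spectrum with max(γ − 1, 0).

```python
# Step E sketch: decision-directed a priori / a posteriori SNR (assumed form).
import numpy as np

ALPHA_DD = 0.98     # assumed decision-directed smoothing factor

def snr_estimate(P2, lambda_noise, prev_enhanced, prev_lambda_noise):
    """Return (a priori SNR xi, a posteriori SNR gamma) per frequency bin.
    P2: current power spectrum of channel 2; lambda_noise: noise PSD estimate."""
    gamma = P2 / (lambda_noise + 1e-12)
    xi = ALPHA_DD * prev_enhanced / (prev_lambda_noise + 1e-12) \
         + (1 - ALPHA_DD) * np.maximum(gamma - 1.0, 0.0)
    return xi, gamma

# On the first frame the a priori SNR is initialized to 0, as stated above.
```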
F. Speech presence probability calculation
The a priori speech SNR is first smoothed over time frames:
The time-smoothed a priori SNR is then smoothed over frequency, divided into local and global smoothing with window lengths W1 = 3 and W2 = 15 respectively; Hamming windows of the corresponding lengths are used.
The local and global sub-band speech probabilities are then computed as follows:
The frame speech presence probability calculation requires a reference quantity, computed as follows:
After the reference quantity has been computed, the frame speech presence probability is calculated as follows:
1) If … and …, then:
At the same time, the current frame's reference value is recorded as the peak of the reference quantity, namely:
2) If … and …, then:
3) If …:
The speech absence probability is then computed from the local sub-band speech probability, the global sub-band speech probability, and the frame speech probability:
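The probability formulas are not reproduced above; the sketch below follows the OM-LSA/IMCRA structure the description outlines (time smoothing, local and global frequency smoothing with Hamming windows of lengths 3 and 15, and combination of the local, global, and frame probabilities into a speech absence probability). The mapping from smoothed SNR to probability and its thresholds are assumptions.

```python
# Step F sketch: speech absence probability from local/global/frame probabilities.
import numpy as np

XI_MIN_DB, XI_MAX_DB = 0.0, 10.0     # assumed mapping range in dB

def _prob_from_xi(xi_smoothed):
    """Map a smoothed a priori SNR to a probability in [0, 1] (assumed mapping)."""
    xi_db = 10 * np.log10(xi_smoothed + 1e-12)
    return np.clip((xi_db - XI_MIN_DB) / (XI_MAX_DB - XI_MIN_DB), 0.0, 1.0)

def speech_absence_prob(xi, xi_prev_smoothed, alpha_t=0.7):
    """Return (speech absence probability q, time-smoothed SNR) per frequency bin."""
    xi_t = alpha_t * xi_prev_smoothed + (1 - alpha_t) * xi          # smoothing over time frames
    w_loc = np.hamming(3);  w_loc /= w_loc.sum()                    # local window, W1 = 3
    w_glb = np.hamming(15); w_glb /= w_glb.sum()                    # global window, W2 = 15
    xi_loc = np.convolve(xi_t, w_loc, mode="same")                  # local frequency smoothing
    xi_glb = np.convolve(xi_t, w_glb, mode="same")                  # global frequency smoothing
    p_loc, p_glb = _prob_from_xi(xi_loc), _prob_from_xi(xi_glb)
    p_frame = _prob_from_xi(np.mean(xi_t) * np.ones_like(xi_t))     # frame-level probability
    q = 1.0 - p_loc * p_glb * p_frame                               # speech absence probability
    return np.clip(q, 0.0, 1.0), xi_t
```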
G. MMSE speech enhancement processing
The conditional speech presence probability is computed as follows:
where
The preliminary gain is then computed:
The final OM-LSA gain is computed as follows:
The gain above is applied to the power spectrum of the speech signal from microphone 2:
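The gain formulas are not reproduced above; the sketch below uses the standard OM-LSA form (a conditional speech presence probability p, an MMSE log-spectral-amplitude gain, and a final gain G_H1^p · G_min^(1−p)) as an assumed stand-in, applied to the second channel's spectrum.

```python
# Step G sketch: standard OM-LSA gain (assumed form, not the patent's exact formulas).
import numpy as np
from scipy.special import exp1          # exponential integral E1

G_MIN = 10 ** (-25 / 20)                # assumed gain floor (-25 dB)

def omlsa_gain(xi, gamma, q):
    """xi: a priori SNR, gamma: a posteriori SNR, q: speech absence probability."""
    v = gamma * xi / (1.0 + xi)
    # conditional speech presence probability p
    p = 1.0 / (1.0 + (q / np.maximum(1.0 - q, 1e-6)) * (1.0 + xi) * np.exp(-v))
    # preliminary MMSE log-spectral-amplitude gain
    g_h1 = (xi / (1.0 + xi)) * np.exp(0.5 * exp1(np.maximum(v, 1e-12)))
    # final OM-LSA gain
    return np.power(g_h1, p) * np.power(G_MIN, 1.0 - p)

# Usage: enhanced spectrum of channel 2:  S2 = omlsa_gain(xi, gamma, q) * X2
```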
H. Inverse FFT transformation
The enhanced speech power spectrum is inverse-transformed to obtain the enhanced time-domain frame signals:
I. Time-domain signal reconstruction
The time-domain frame signals are overlap-added to obtain the enhanced speech signal.
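A minimal sketch of steps H and I: an inverse FFT of each enhanced frame followed by overlap-add with the 160-sample frame shift used above.

```python
# Steps H and I sketch: inverse FFT and overlap-add reconstruction.
import numpy as np

FRAME_SHIFT = 160
N_FFT = 512

def reconstruct(enhanced_spectra):
    """enhanced_spectra: (n_frames, N_FFT//2 + 1) one-sided enhanced spectra."""
    n_frames = enhanced_spectra.shape[0]
    out = np.zeros((n_frames - 1) * FRAME_SHIFT + N_FFT)
    for m in range(n_frames):
        frame = np.fft.irfft(enhanced_spectra[m], n=N_FFT)        # time-domain frame
        out[m * FRAME_SHIFT : m * FRAME_SHIFT + N_FFT] += frame   # overlap-add
    return out
```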
Every subscripted α in this description denotes a smoothing factor, used to prevent parameters from changing abruptly when they are computed; such factors are generally numbers between 0 and 1.

Claims (1)

1. A head-mounted dual-microphone speech enhancement method, wherein the first microphone of the two channels is farther from the target sound source and the second microphone is close to the target sound source, characterized in that the enhancement method comprises the following steps: A. pre-processing, in which both channel signals are pre-emphasized, suppressing the low-frequency components and emphasizing the high-frequency components; B. FFT transformation of both channel signals; C. endpoint detection, determining speech segments and non-speech segments; D. calibration of the consistency of the amplitude-frequency responses of the two microphones according to the decision of the previous module; E. speech signal-to-noise ratio calculation, computing the a priori and a posteriori speech SNR; F. speech presence probability calculation, in which the a priori speech SNR is smoothed over time frames, the smoothing over frequency is divided into local and global smoothing, the local sub-band, global sub-band, and frame speech presence probabilities are then computed, and the speech absence probability is computed from the local sub-band, global sub-band, and frame speech presence probabilities; G. MMSE speech enhancement processing, computing the conditional speech presence probability and then the gain, and applying this gain to the power spectrum of the second microphone's speech signal; H. inverse FFT transformation, inverse-transforming the enhanced speech power spectrum to obtain the enhanced time-domain frame signals; I. time-domain signal reconstruction, overlap-adding the time-domain frame signals to obtain the enhanced speech signal.
CN201410702133.3A 2014-11-28 2014-11-28 Headset double-microphone voice enhancement method Pending CN105702262A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410702133.3A CN105702262A (en) 2014-11-28 2014-11-28 Headset double-microphone voice enhancement method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410702133.3A CN105702262A (en) 2014-11-28 2014-11-28 Headset double-microphone voice enhancement method

Publications (1)

Publication Number Publication Date
CN105702262A true CN105702262A (en) 2016-06-22

Family

ID=56295674

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410702133.3A Pending CN105702262A (en) 2014-11-28 2014-11-28 Headset double-microphone voice enhancement method

Country Status (1)

Country Link
CN (1) CN105702262A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1967658A (en) * 2005-11-14 2007-05-23 北京大学科技开发部 Small scale microphone array speech enhancement system and method
CN101079266A (en) * 2006-05-23 2007-11-28 中兴通讯股份有限公司 Method for realizing background noise suppressing based on multiple statistics model and minimum mean square error
CN101763858A (en) * 2009-10-19 2010-06-30 瑞声声学科技(深圳)有限公司 Method for processing double-microphone signal
CN103098132A (en) * 2010-08-25 2013-05-08 旭化成株式会社 Sound source separator device, sound source separator method, and program
US20120263317A1 (en) * 2011-04-13 2012-10-18 Qualcomm Incorporated Systems, methods, apparatus, and computer readable media for equalization
CN103456310A (en) * 2013-08-28 2013-12-18 大连理工大学 Transient noise suppression method based on spectrum estimation

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108022595A (en) * 2016-10-28 2018-05-11 电信科学技术研究院 A kind of voice signal noise-reduction method and user terminal
WO2018086444A1 (en) * 2016-11-10 2018-05-17 电信科学技术研究院 Method for estimating signal-to-noise ratio for noise suppression, and user terminal
CN108074582A (en) * 2016-11-10 2018-05-25 电信科学技术研究院 A kind of noise suppressed signal-noise ratio estimation method and user terminal
CN106535045A (en) * 2016-11-30 2017-03-22 中航华东光电(上海)有限公司 Audio enhancement processing module for laryngophone
CN106658329A (en) * 2016-12-02 2017-05-10 歌尔科技有限公司 Method and apparatus for calibrating microphones of electronic device, and electronic device
CN106658329B (en) * 2016-12-02 2019-06-07 歌尔科技有限公司 Calibration method, device and electronic equipment for electronic equipment microphone
CN108428456A (en) * 2018-03-29 2018-08-21 浙江凯池电子科技有限公司 Voice de-noising algorithm
CN109246517A (en) * 2018-10-12 2019-01-18 歌尔科技有限公司 A kind of noise reduction microphone bearing calibration, wireless headset and the charging box of wireless headset
CN112233657A (en) * 2020-10-14 2021-01-15 河海大学 Speech enhancement method based on low-frequency syllable recognition
CN112233657B (en) * 2020-10-14 2024-05-28 河海大学 Speech enhancement method based on low-frequency syllable recognition

Similar Documents

Publication Publication Date Title
CN105702262A (en) Headset double-microphone voice enhancement method
JP7011075B2 (en) Target voice acquisition method and device based on microphone array
CN109767783B (en) Voice enhancement method, device, equipment and storage medium
CN108735213B (en) Voice enhancement method and system based on phase compensation
US20140025374A1 (en) Speech enhancement to improve speech intelligibility and automatic speech recognition
CN109817209B (en) Intelligent voice interaction system based on double-microphone array
CN108172231B (en) Dereverberation method and system based on Kalman filtering
US9558755B1 (en) Noise suppression assisted automatic speech recognition
US11631421B2 (en) Apparatuses and methods for enhanced speech recognition in variable environments
CN111418010A (en) Multi-microphone noise reduction method and device and terminal equipment
CN106875938B (en) Improved nonlinear self-adaptive voice endpoint detection method
JP5153886B2 (en) Noise suppression device and speech decoding device
JPH0916194A (en) Noise reduction for voice signal
KR20130108063A (en) Multi-microphone robust noise suppression
EP2463856B1 (en) Method to reduce artifacts in algorithms with fast-varying gain
WO2011041738A2 (en) Suppressing noise in an audio signal
CN106887239A (en) For the enhanced blind source separation algorithm of the mixture of height correlation
US11373667B2 (en) Real-time single-channel speech enhancement in noisy and time-varying environments
CN105280193B (en) Priori signal-to-noise ratio estimation method based on MMSE error criterion
CN111667844A (en) Microphone array-based low-operand speech enhancement device
JP2021505933A (en) Voice enhancement of audio signals with modified generalized eigenvalue beamformer
CN103700375A (en) Voice noise-reducing method and voice noise-reducing device
CN114041185A (en) Method and apparatus for determining a depth filter
CN112530451A (en) Speech enhancement method based on denoising autoencoder
CN107680610A (en) A kind of speech-enhancement system and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160622

RJ01 Rejection of invention patent application after publication