CN103594093A - Method for enhancing voice based on signal to noise ratio soft masking - Google Patents

Method for enhancing voice based on signal to noise ratio soft masking

Info

Publication number
CN103594093A
Authority
CN
China
Prior art keywords
noise
signal
noise ratio
voice
power spectrum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210290074.4A
Other languages
Chinese (zh)
Inventor
王景芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan International Economics University
Original Assignee
王景芳
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 王景芳
Priority to CN201210290074.4A
Publication of CN103594093A
Legal status: Pending

Landscapes

  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a speech enhancement method based on signal-to-noise ratio (SNR) soft masking. The method comprises: establishing a noise power spectrum update with sub-band time-varying coefficients, in which different thresholds are used at different frequency points to update the smoothed spectrum, emphasizing speech and suppressing noise; determining the a posteriori SNR from the noise power spectrum; iteratively computing the a priori SNR of the current frame from the a posteriori SNR and the a priori SNR of the previous frame; and deciding, from the value of the a priori SNR, whether each frequency point lies in the masking region or the target-signal region. The masking value is computed from the probability distribution obtained by hypothesis testing. The method exploits the correlation between adjacent frames to extract information and thereby realizes a smoothed iterative estimate for speech spectrum enhancement. It provides an SNR soft-masking speech enhancement algorithm for non-stationary noise and strong background noise. A fast noise-tracking algorithm smoothly updates the non-stationary noise frame by frame and estimates the noise spectrum well. The algorithm effectively suppresses background noise and improves speech quality and intelligibility after denoising.

Description

Speech enhancement method based on signal-to-noise ratio soft masking
Technical field
The invention belongs to the field of speech processing technology, and in particular relates to a speech enhancement method based on signal-to-noise ratio soft masking.
Background art
Speech is a basic means of human communication. The variety of human social activities and behaviors has brought many new problems to speech signal research, and at the same time the development of speech processing technology is constantly changing daily life. For example, speech coding lets people hear distant voices over limited communication bandwidth, and the more recent development of wideband speech coding makes speech in communication more natural and more intelligible, which reduces misunderstandings. Breakthroughs in large-vocabulary continuous speech recognition have given people new voice input and interaction modes: users can free their hands and dictate directly, instructing machines or having them understand spoken language, which greatly improves working efficiency.
Speech processing technologies used in daily life, such as speech coding and speech recognition, inevitably face interference from all kinds of background noise. Noise greatly degrades the performance of these applications, or makes them so hard to tolerate that users abandon them. Environmental noise, such as background conversation at the scene, machine vibration noise in a car cabin, engine sound at high speed, and reverberation from indoor walls, contaminates the original speech signal. Background noise and its characteristics are especially harmful to parametric speech processing techniques that rely on properties of human speech, because they break the assumed parameter models and auditory properties. Existing speech recognition systems work well in noise-free environments, but their recognition performance drops sharply in noisy environments. Under noise interference, the differences between the speech features used by the recognition system are weakened, so recognition errors increase.
With mobile communication now widespread, mobile technology has brought people untethered and convenient voice communication, but it has also taken voice communication into application environments full of complex noise; and speech coding in mobile phones inevitably incurs more coding errors in noisy environments.
Speech enhancement arose to reduce or remove the adverse effects of additive noise. Speech enhancement is typically used as a front-end processing module in practical speech processing systems. It filters the noisy speech to approximately recover the clean speech signal, so that downstream processing does not face the noisy speech directly. This strengthens the robustness of the speech system, and highly robust speech enhancement effectively widens the range of settings in which speech processing systems can be used.
  
Summary of the invention
(1) Technical problem to be solved
In view of this, the main purpose of the present invention is to propose a speech enhancement method based on signal-to-noise ratio soft masking. For listeners, the aim is to improve speech quality and intelligibility and to reduce listening fatigue; for speech processing systems (recognizers, vocoders, mobile phones), the aim is to improve the recognition rate and noise immunity of the system. The key scientific problems to be solved are: enhancing noisy speech, raising the signal-to-noise ratio, reducing the distortion and loss of speech information after enhancement, and remaining applicable to as many noise environments as possible. Specifically, these include the noise power spectrum update, the a priori SNR computation, deciding between the masking domain and the target-signal domain, and computing the masking value.
(2) Technical solution
To achieve the above object, the invention provides a speech enhancement method based on signal-to-noise ratio soft masking, the method comprising:
1) Noise power spectrum update with sub-band time-varying coefficients. Let |Y(l,k)|^2 be the power spectrum of frame l of the noisy speech, where k is the frequency index. The estimated noise power spectrum of frame l is [formula image 001], P(l,k) is the smoothed power spectrum of the speech, [formula image 002] is the smoothing factor, [formula image 003], and [formula image 004] is the minimum of the noisy speech power spectrum.
If [formula image 005], then [formula image 006]; otherwise [formula image 007];
[formula image 008],
If [formula image 009], then [formula image 010]; otherwise [formula image 011];
[formula image 012],
Different thresholds are used at different frequency points to update the smoothed spectrum, which emphasizes speech and suppresses noise;
[formula image 013];
[formula image 014];
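The formulas of step 1 are given only as figures, so the following is a minimal sketch of a generic minimum-tracking noise estimator with a speech-presence-dependent smoothing coefficient, in the style of the noise estimation work listed under the non-patent citations. It is not taken from the patent: the function name, the state layout, every constant (`eta`, `gamma`, `beta`, `alpha_p`, `alpha_d`) and the per-bin `thresholds` are assumptions made for illustration.

```python
def update_noise_spectrum(power, state, thresholds,
                          eta=0.7, gamma=0.998, beta=0.8,
                          alpha_p=0.2, alpha_d=0.85):
    """Update the noise power spectrum estimate for one frame.

    power      : list of |Y(l,k)|^2 values for the current frame
    state      : dict of lists 'P', 'P_min', 'p', 'D' from the previous frame
    thresholds : one decision threshold per frequency bin (assumed values)
    Returns a new state dict; 'D' holds the updated noise estimate.
    """
    new_state = {"P": [], "P_min": [], "p": [], "D": []}
    for k, y2 in enumerate(power):
        p_prev = state["P"][k]
        p_min_prev = state["P_min"][k]
        # Recursively smoothed power spectrum of the noisy speech.
        smoothed = eta * p_prev + (1.0 - eta) * y2
        # Track the running minimum of the smoothed spectrum.
        if p_min_prev < smoothed:
            p_min = (gamma * p_min_prev
                     + (1.0 - gamma) / (1.0 - beta) * (smoothed - beta * p_prev))
        else:
            p_min = smoothed
        # Hard speech-presence decision with a per-bin threshold.
        present = 1.0 if smoothed / max(p_min, 1e-12) > thresholds[k] else 0.0
        # Smoothed speech-presence probability.
        prob = alpha_p * state["p"][k] + (1.0 - alpha_p) * present
        # Time-varying coefficient: near 1 (slow update) when speech is likely.
        alpha_s = alpha_d + (1.0 - alpha_d) * prob
        noise = alpha_s * state["D"][k] + (1.0 - alpha_s) * y2
        new_state["P"].append(smoothed)
        new_state["P_min"].append(p_min)
        new_state["p"].append(prob)
        new_state["D"].append(noise)
    return new_state
```

In this sketch a bin whose power jumps far above its tracked minimum is treated as speech, so its noise estimate moves only slightly, while a bin that stays near its minimum keeps tracking the input.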
2) Each frequency point is assigned to the masking domain or the target-signal domain according to the value of the a priori signal-to-noise ratio;
Time domain: [formula image 015], where y(n) is the noisy speech signal and x(n) and d(n) are the clean speech and the noise signal, respectively. The short-time Fourier transform of y(n):
[formula image 016],
In polar form:
[formula image 017],
where [formula image 018] and [formula image 019] denote the phase and magnitude at the k-th frequency point, respectively;
By independence:
[formula image 020]
Instantaneous a priori signal-to-noise ratio:
[formula image 021],
[formula image 022];
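As a rough illustration of step 2, the sketch below labels each frequency bin from an SNR estimate. It assumes the common approximation that, for independent speech and noise, the instantaneous a priori SNR is the a posteriori SNR minus one, and it assumes a decision threshold of 1 (0 dB). The function name, the threshold and the labels are hypothetical and are not taken from the patent figures.

```python
def classify_bins(noisy_power, noise_power, threshold=1.0):
    """Label each bin 'target' or 'masker' from an a priori SNR estimate.

    noisy_power : list of |Y(l,k)|^2 values
    noise_power : list of estimated noise power values for the same bins
    threshold   : assumed decision point on the a priori SNR (1.0 = 0 dB)
    """
    labels = []
    for y2, d2 in zip(noisy_power, noise_power):
        posterior = y2 / max(d2, 1e-12)      # a posteriori SNR
        prior = max(posterior - 1.0, 0.0)    # instantaneous a priori SNR estimate
        labels.append("target" if prior >= threshold else "masker")
    return labels
```

A hard label of this kind corresponds to a binary decision per bin; the soft masking described next replaces it with a probability.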
3) The masking value is computed from the probability distribution obtained by hypothesis testing:
Ideal binary mask (IdBM):
The gain function G_k can be regarded as a random variable, because it depends on the instantaneous signal-to-noise ratio
[formula image 024]
. In binary masking, G is a Bernoulli random variable taking the value 0 or 1, and its parameter p is the hypothesis probability. G_k is hard to estimate directly because it depends on an accurate estimate of the instantaneous SNR. Its expectation, however, can be obtained more reliably. Taking the expectation gives the following weighted average of the expected magnitude-squared spectrum over the two hypotheses:
[formula image 025]
Here P(H_1) is the probability that hypothesis H_1 is true, E(G_k|H_1) is the expected gain function given that H_1 is true (the target signal dominates), and E(G_k|H_0) is the expected gain function given that H_0 is true (the masker dominates). E(G_k|H_1) = 1 and E(G_k|H_0) = 0. In practice, using a small nonzero value for E(G_k|H_0) gives better quality, and the enhanced speech contains a small amount of residual noise. In this work, G_f = -20 dB is used in place of E(G_k|H_0) to keep the residual noise as low as possible;
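The weighted average over the two hypotheses can be sketched as an expected gain: the probability of H_1 times a gain of 1, plus the probability of H_0 times the floor G_f. The sketch assumes that the -20 dB figure applies to amplitude (a factor of 0.1); the text does not say whether it is an amplitude or a power ratio. The function names are hypothetical.

```python
def soft_mask_gain(p_target, floor_db=-20.0):
    """Expected gain E[G] = P(H1) * 1 + P(H0) * G_f for one frequency bin.

    p_target : probability that the target signal dominates (hypothesis H1)
    floor_db : gain used under H0, in dB (assumed to be an amplitude ratio)
    """
    floor = 10.0 ** (floor_db / 20.0)
    return p_target * 1.0 + (1.0 - p_target) * floor


def apply_soft_mask(magnitudes, p_targets, floor_db=-20.0):
    """Scale each spectral magnitude by its expected gain."""
    return [m * soft_mask_gain(p, floor_db) for m, p in zip(magnitudes, p_targets)]
```

With this form a bin that is certainly target-dominated passes unchanged, and a bin that is certainly masker-dominated is attenuated to the floor instead of being zeroed.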
Assume that the real and imaginary parts of the discrete Fourier transform (DFT) coefficients are independent Gaussian random variables of equal variance. The real and imaginary parts of the speech DFT coefficient,
[formula image 026]
, are each normally distributed,
[formula image 027]
,
[formula image 028]
each follow the
[formula image 029]
distribution, and their sum
[formula image 030]
follows the
[formula image 031]
distribution, with density function:
[formula image 032]
,
[formula image 033]
,
[formula image 034]
is exponentially distributed:
[formula image 035]
Likewise,
[formula image 036]
is exponentially distributed:
[formula image 037]
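One consequence of the exponential model can be checked numerically. If the speech and noise magnitude-squared values are independent and exponentially distributed with means `speech_var` and `noise_var`, the probability that the speech term is at least as large as the noise term is `speech_var / (speech_var + noise_var)`. The sketch below compares that closed form with a direct numerical integration. It illustrates the distributional assumption only; it is not the conditional probability obtained from Bayes' rule in the text, and all names are hypothetical.

```python
import math


def prob_target_dominates(speech_var, noise_var):
    """Closed form of P(X2 >= D2) for independent exponential X2 and D2."""
    return speech_var / (speech_var + noise_var)


def prob_by_integration(speech_var, noise_var, steps=50000, upper=60.0):
    """Midpoint-rule check of P(X2 >= D2) = integral of P(X2 >= t) * f_D2(t) dt."""
    h = upper * noise_var / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        survival = math.exp(-t / speech_var)            # P(X2 >= t)
        density = math.exp(-t / noise_var) / noise_var  # density of D2 at t
        total += survival * density * h
    return total
```

For example, with a speech variance three times the noise variance, both functions give a value close to 0.75.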
By Bayes' rule:
[formula image 038]
where: when
[formula image 039]
,
[formula image 040]
If
[formula image 041]
, then
[formula image 042]
, so
[formula image 043]
is always positive;
[formula image 044]
[formula image 045]
, where
[formula image 046]
[formula image 047]
4) A priori signal-to-noise ratio update:
[formula image 048]
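The abstract describes the a priori SNR of the current frame as an iterative combination of the a posteriori SNR and the a priori SNR of the previous frame. A minimal sketch of one such recursion is given below, assuming a first-order smoothing with weight `a` and a small lower bound `floor`; the exact update, its weight and the bound are not recoverable from the text, so all three are assumptions.

```python
def update_prior_snr(prev_prior, noisy_power, noise_power, a=0.98, floor=1e-3):
    """Smooth the a priori SNR across frames, one value per frequency bin.

    prev_prior  : a priori SNR values of the previous frame
    noisy_power : |Y(l,k)|^2 values of the current frame
    noise_power : estimated noise power of the current frame
    a           : assumed smoothing weight given to the previous frame
    floor       : assumed lower bound that keeps the estimate positive
    """
    updated = []
    for xi_prev, y2, d2 in zip(prev_prior, noisy_power, noise_power):
        posterior = y2 / max(d2, 1e-12)        # a posteriori SNR
        instant = max(posterior - 1.0, 0.0)    # instantaneous estimate
        xi = a * xi_prev + (1.0 - a) * instant
        updated.append(max(xi, floor))
    return updated
```

A weight close to 1 leans on the previous frame and gives a smooth trajectory; a smaller weight reacts faster to speech onsets at the cost of more fluctuation.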
Preferably, the parameters are initialized as follows: the noisy speech signal is split into frames of length N = [0.25fs] points, where fs is the signal sampling frequency, with a frame shift of N/2; the initial noise spectrum is determined from the first several frames, which contain no speech.
Preferably, the parameters of the noise power spectrum update with sub-band time-varying coefficients are:
[formula image 049]
.
Preferably, the parameters used to assign each frequency point to the masking domain or the target-signal domain according to the a priori SNR are:
[formula image 050]
.
Preferably, the a priori SNR update parameter: .
Preferably, the implementation procedure of the invention is shown in Fig. 1, and the speech enhancement process is illustrated in Fig. 2.
Preferably, the noisy speech signal is processed frame by frame in real time, as shown in Fig. 3.
(3) Beneficial effects
1. The speech enhancement method based on signal-to-noise ratio soft masking provided by the invention suppresses noise effectively, markedly improves speech recognition system performance and intelligibility, and is robust across different noise environments and SNR conditions. The algorithm runs in real time, so it satisfies both effectiveness and real-time requirements;
2. Advantages and characteristics of the speech enhancement method based on signal-to-noise ratio soft masking provided by the invention:
1) it realizes real-time updating of the noise power spectrum with sub-band time-varying coefficients;
2) it proposes the principle of SNR soft masking;
3) it makes full use of the correlation between adjacent frames to extract information, realizing a smoothed iterative estimate of the a priori SNR;
4) its algorithmic complexity is low, so it can meet real-time requirements;
3. For non-stationary environmental noise, the method provided by the invention proposes a speech enhancement algorithm from the perspective of SNR soft masking. A fast noise-tracking algorithm smoothly updates the non-stationary noise frame by frame and estimates the noise spectrum well. The method offers a new approach to denoising under strong background noise and to detecting weak signals.
Brief description of the drawings
Fig. 1 is a flowchart of the speech enhancement method based on SNR soft masking provided by the invention;
Fig. 2 is a schematic diagram of the noisy speech enhancement process provided by the invention;
Fig. 3 is a schematic diagram of short-time spectrum speech enhancement of noisy speech provided by the invention;
Fig. 4 is a simulation of the noise power spectrum update with sub-band time-varying coefficients provided by the invention;
Panel (a) is the English utterance "The birch canoe slid on the smooth planks". The speech and the speech with babble noise are both taken from the AURORA database. Panel (b) has SNR = 5 dB and panel (c) has SNR = 0 dB. Panels (b1) and (c1) show the true babble noise power spectrum at frequency f = 250 Hz together with the tracked estimate.
Fig. 5 compares the results before and after enhancement in a simulation with babble-noise speech provided by the invention.
The left panels are time-domain signals and the right panels are spectrograms. The noise is babble noise from the Noisex-92 database. Panel (a) is the original speech ('language, sound, increasing, strong'); panel (b1) is the enhancement result for panel (b) (SNR = 5 dB); panel (c1) is the enhancement result for panel (c) (SNR = 0 dB).
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
The core of the invention is: a noise power spectrum update with sub-band time-varying coefficients; the principle of SNR soft masking; and full use of the correlation between adjacent frames to extract information, realizing a smoothed iterative estimate of the a priori SNR, so as to achieve speech enhancement.
As shown in Fig. 1, which is a flowchart of the speech enhancement method based on SNR soft masking provided by the invention, the method comprises the following steps:
Step 101: parameter initialization: the noisy speech signal is split into frames of length N = [0.25fs] points, where fs is the signal sampling frequency, with a frame shift of N/2; the initial noise spectrum is set;
Step 102: framing: take the noisy speech of frame m and compute its Fourier transform;
Step 103: compute the noise power spectrum update with sub-band time-varying coefficients for frame m;
Step 104: smoothed iteration of the a priori SNR for frame m;
Step 105: compute the SNR soft-masking factor for frame m, and obtain the time-domain enhanced speech signal by the inverse Fourier transform:
[formula image 053]
;
Step 106: process the next frame in real time by returning to step 102.
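The framing in steps 101 and 102 can be sketched as follows. The text gives the frame length both as N = [0.25fs] points and, in the experiment, as 25 ms; the sketch takes the duration in milliseconds as a parameter with 25 ms as an assumed default, and uses a hop of half a frame as stated. The function names are hypothetical.

```python
def frame_length(fs_hz, frame_ms=25.0):
    """Number of samples per frame for a given sampling rate and duration."""
    return int(round(fs_hz * frame_ms / 1000.0))


def split_into_frames(signal, frame_len):
    """Split a sequence into frames of frame_len samples with a hop of frame_len // 2.

    Trailing samples that do not fill a whole frame are dropped.
    """
    hop = frame_len // 2
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames
```

Each frame returned by `split_into_frames` would then pass through steps 102 to 105 in turn, which matches the frame-by-frame loop of step 106.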
The noise power spectrum update with sub-band time-varying coefficients in step 103 is computed as follows:
Let |Y(l,k)|^2 be the power spectrum of frame l of the noisy speech, where k is the frequency index. The estimated noise power spectrum of frame l is [formula image 001], P(l,k) is the smoothed power spectrum of the speech,
[formula image 002]
is the smoothing factor,
[formula image 003]
, and
[formula image 004]
is the minimum of the noisy speech power spectrum;
If
[formula image 005]
, then
[formula image 006]
; otherwise
[formula image 007]
;
[formula image 008]
If
[formula image 009]
, then
[formula image 010]
; otherwise
[formula image 011]
;
[formula image 012]
[formula image 013]
[formula image 014]
The smoothed iteration of the a priori SNR in step 104 comprises:
[formula image 048]
[formula image 051]
  
The computation of the SNR soft-masking factor in step 105 comprises:
[formula image 015]
where y(n) is the noisy speech signal and x(n) and d(n) are the clean speech and the noise signal, respectively. The short-time Fourier transform of y(n):
[formula image 016]
,
In polar form:
[formula image 017]
,
where [formula image 018] and
[formula image 019]
denote the phase and magnitude at the k-th frequency point, respectively.
By independence:
[formula image 020]
Instantaneous a priori signal-to-noise ratio:
[formula image 021]
, .
The masking value is computed from the probability distribution obtained by hypothesis testing:
, where
[formula image 046]
[formula image 047]
Fig. 1 gives the flowchart of the speech enhancement method based on SNR soft masking; Figs. 2 and 3 further illustrate the signal flow of the speech enhancement process; and Fig. 4 shows the simulation experiment for the noise power spectrum update with sub-band time-varying coefficients.
  
The speech enhancement method based on a posteriori SNR soft masking provided by the invention is further described below with a specific embodiment. In the experiment, the background noise is the babble noise from the Noisex-92 database, with sampling frequency fs = 19.98 kHz. Using the same sampling frequency fs, the utterance 'language, sound, end, point' was recorded in an environment with computer noise and room noise; see Fig. 1(a). For framing, the frame length is 25 ms, i.e. M = [0.25fs] points, the frame shift is
[formula image 055]
, and the first N = 20 frames are taken as the initial noise frames;
The performance of the algorithm was analyzed objectively from several aspects, including the speech waveform, the spectrogram, and the SNR improvement. The signal-to-noise ratio
[formula image 056]
is used to analyze the denoising effect of the algorithm quantitatively; for the comparison of results before and after enhancement of babble-noise speech, see the figures.
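The SNR formula used in the evaluation is given only as a figure. A common definition, assumed here, is ten times the base-10 logarithm of the clean-signal energy divided by the energy of the difference between the clean and the processed signal. The sketch below implements that assumed definition; the function name is hypothetical.

```python
import math


def snr_db(clean, processed):
    """SNR in dB between a clean reference and a processed signal.

    Both arguments are equal-length sequences of samples. The definition
    (signal energy over error energy) is an assumption, not the patent formula.
    """
    signal_energy = sum(x * x for x in clean)
    error_energy = sum((x - y) ** 2 for x, y in zip(clean, processed))
    if error_energy == 0.0:
        return float("inf")  # identical signals: no residual error
    return 10.0 * math.log10(signal_energy / error_energy)
```

Computing this value once for the noisy input and once for the enhanced output, against the same clean reference, gives the SNR improvement.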
The specific embodiments described above further explain the objects, technical solutions and beneficial effects of the invention. It should be understood that the foregoing is only a specific embodiment of the invention and does not limit the invention; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (4)

1. A speech enhancement method based on signal-to-noise ratio soft masking, characterized in that the method comprises:
To address the practical problem that speech signals are hard to extract under non-stationary noise and strong background noise, the algorithm designs a noise power spectrum update with sub-band time-varying coefficients; assigns each frequency point to the masking domain or the target-signal domain according to the value of the a priori SNR; and computes the masking value from the probability distribution obtained by hypothesis testing, together with specific embodiments.
2. The speech enhancement method based on signal-to-noise ratio soft masking according to claim 1, characterized in that, in the noise power spectrum update with sub-band time-varying coefficients, |Y(l,k)|^2 is the power spectrum of frame l of the noisy speech, k is the frequency index, the estimated noise power spectrum of frame l is
[formula image 001]
, P(l,k) is the smoothed power spectrum of the speech,
[formula image 002]
is the smoothing factor, [formula image 003],
[formula image 004]
is the minimum of the noisy speech power spectrum; if
[formula image 005]
, then
[formula image 006]
; otherwise
[formula image 007]
;
[formula image 008]
if
[formula image 009]
, then
[formula image 010]
; otherwise
[formula image 011]
;
[formula image 012]
; different thresholds are used at different frequency points to update the smoothed spectrum, which emphasizes speech and suppresses noise;
[formula image 013]
;
[formula image 014]
.
3. The speech enhancement method based on signal-to-noise ratio soft masking according to claim 1, characterized in that each frequency point is assigned to the masking domain or the target-signal domain according to the value of the a priori SNR; time domain:
[formula image 015]
,
where y(n) is the noisy speech signal and x(n) and d(n) are the clean speech and the noise signal, respectively; the short-time Fourier transform of y(n):
[formula image 016]
; in polar form:
[formula image 017]
,
where [formula image 018] and
[formula image 019]
denote the phase and magnitude at the k-th frequency point, respectively;
By independence:
[formula image 020]
Instantaneous a priori signal-to-noise ratio: [formula image 021],
[formula image 022]
.
4. The speech enhancement method based on signal-to-noise ratio soft masking according to claim 1, characterized in that the masking value is computed from the probability distribution obtained by hypothesis testing:
[formula image 023]
, where
[formula image 025]
CN201210290074.4A 2012-08-15 2012-08-15 Method for enhancing voice based on signal to noise ratio soft masking Pending CN103594093A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210290074.4A CN103594093A (en) 2012-08-15 2012-08-15 Method for enhancing voice based on signal to noise ratio soft masking

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210290074.4A CN103594093A (en) 2012-08-15 2012-08-15 Method for enhancing voice based on signal to noise ratio soft masking

Publications (1)

Publication Number Publication Date
CN103594093A true CN103594093A (en) 2014-02-19

Family

ID=50084199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210290074.4A Pending CN103594093A (en) 2012-08-15 2012-08-15 Method for enhancing voice based on signal to noise ratio soft masking

Country Status (1)

Country Link
CN (1) CN103594093A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104240717A (en) * 2014-09-17 2014-12-24 河海大学常州校区 Voice enhancement method based on combination of sparse code and ideal binary system mask
CN105023572A (en) * 2014-04-16 2015-11-04 王景芳 Noised voice end point robustness detection method
CN106098077A (en) * 2016-07-28 2016-11-09 浙江诺尔康神经电子科技股份有限公司 Artificial cochlea's speech processing system of a kind of band noise reduction and method
TWI581254B (en) * 2015-06-30 2017-05-01 芋頭科技(杭州)有限公司 Environmental noise elimination system and application method thereof
CN108701468A (en) * 2016-02-16 2018-10-23 日本电信电话株式会社 Mask estimation device, mask estimation method and mask estimation program
CN109961799A (en) * 2019-01-31 2019-07-02 杭州惠耳听力技术设备有限公司 A kind of hearing aid multicenter voice enhancing algorithm based on Iterative Wiener Filtering
CN110085246A (en) * 2019-03-26 2019-08-02 北京捷通华声科技股份有限公司 Sound enhancement method, device, equipment and storage medium
CN111370017A (en) * 2020-03-18 2020-07-03 苏宁云计算有限公司 Voice enhancement method, device and system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
SUNDARRAJAN R, PHILIPOS C L, YI HU: ""A noise estimation algorithm with rapid adaptation for highly non-stationary environments"", 《IN PROCEEDINGS OF IEEE》 *
SUNDARRAJAN R: ""Noise estimation algorithms for highly non-stationary environments"", 《DALLAS: UNIVERSITY OF TEXAS》 *
SUNDARRAJAN RANGACHARI,PHILIPOS C.LOIZOU: ""A noise-estimation algorithm for highly non-stationary environments"", 《SPEECH COMMUNICATION》 *
YANG LU,PHILIPOS C.,LOIZOU: ""Estimators of the Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty"", 《IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING》 *
马义德,邱秀清,陈昱莅,刘映杰,朱敬锋: ""改进的基于听觉掩蔽特性的语音增强"", 《电子科技大学学报》 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105023572A (en) * 2014-04-16 2015-11-04 王景芳 Noised voice end point robustness detection method
CN104240717A (en) * 2014-09-17 2014-12-24 河海大学常州校区 Voice enhancement method based on combination of sparse code and ideal binary system mask
TWI581254B (en) * 2015-06-30 2017-05-01 芋頭科技(杭州)有限公司 Environmental noise elimination system and application method thereof
CN108701468A (en) * 2016-02-16 2018-10-23 日本电信电话株式会社 Mask estimation device, mask estimation method and mask estimation program
CN108701468B (en) * 2016-02-16 2023-06-02 日本电信电话株式会社 Mask estimation device, mask estimation method, and recording medium
CN106098077A (en) * 2016-07-28 2016-11-09 浙江诺尔康神经电子科技股份有限公司 Artificial cochlea's speech processing system of a kind of band noise reduction and method
CN109961799A (en) * 2019-01-31 2019-07-02 杭州惠耳听力技术设备有限公司 A kind of hearing aid multicenter voice enhancing algorithm based on Iterative Wiener Filtering
CN110085246A (en) * 2019-03-26 2019-08-02 北京捷通华声科技股份有限公司 Sound enhancement method, device, equipment and storage medium
CN111370017A (en) * 2020-03-18 2020-07-03 苏宁云计算有限公司 Voice enhancement method, device and system
CN111370017B (en) * 2020-03-18 2023-04-14 苏宁云计算有限公司 Voice enhancement method, device and system

Similar Documents

Publication Publication Date Title
CN103594093A (en) Method for enhancing voice based on signal to noise ratio soft masking
CN108831499B (en) Speech enhancement method using speech existence probability
CN106486131B (en) A kind of method and device of speech de-noising
CN100543842C (en) Realize the method that ground unrest suppresses based on multiple statistics model and least mean-square error
CN103594094B (en) Adaptive spectra subtraction real-time voice strengthens
WO2020107269A1 (en) Self-adaptive speech enhancement method, and electronic device
CN102982801B (en) Phonetic feature extracting method for robust voice recognition
CN103559888A (en) Speech enhancement method based on non-negative low-rank and sparse matrix decomposition principle
CN101154383B (en) Method and device for noise suppression, phonetic feature extraction, speech recognition and training voice model
CN106340292A (en) Voice enhancement method based on continuous noise estimation
CN103544961B (en) Audio signal processing method and device
CN104464728A (en) Speech enhancement method based on Gaussian mixture model (GMM) noise estimation
CN107785028A (en) Voice de-noising method and device based on signal autocorrelation
CN106373559A (en) Robustness feature extraction method based on logarithmic spectrum noise-to-signal weighting
Wang et al. Joint noise and mask aware training for DNN-based speech enhancement with sub-band features
Elshamy et al. An iterative speech model-based a priori SNR estimator
Alam et al. Robust feature extraction for speech recognition by enhancing auditory spectrum
CN107045874A (en) A kind of Non-linear Speech Enhancement Method based on correlation
Gupta et al. Speech enhancement using MMSE estimation and spectral subtraction methods
Ghanbari et al. Improved multi-band spectral subtraction method for speech enhancement
CN102637438B (en) Voice filtering method
CN101533642B (en) Method for processing voice signal and device
Hassani et al. Speech enhancement based on spectral subtraction in wavelet domain
CN115966218A (en) Bone conduction assisted air conduction voice processing method, device, medium and equipment
Tupitsin et al. Two-step noise reduction based on soft mask for robust speaker identification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
ASS Succession or assignment of patent right

Owner name: HUNAN INTERNATIONAL ECONOMICS UNIVERSITY

Free format text: FORMER OWNER: WANG JINGFANG

Effective date: 20140618

C41 Transfer of patent application or patent right or utility model
C53 Correction of patent of invention or patent application
CB03 Change of inventor or designer information

Inventor after: Pan Lifeng

Inventor after: Wang Jingfang

Inventor before: Wang Jingfang

COR Change of bibliographic data

Free format text: CORRECT: INVENTOR; FROM: WANG JINGFANG TO: PAN LIFENG WANG JINGFANG

TA01 Transfer of patent application right

Effective date of registration: 20140618

Address after: School of Information Science and Engineering, Hunan International Economics University, No. 822 Fenglin 3rd Road, Yuelu District, Changsha, Hunan Province, 410205

Applicant after: Hunan International Economics University

Address before: 410205, No. 17, No. 402, Yangming Mountain Villa, Changsha, Hunan, Yuelu District

Applicant before: Wang Jingfang

C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140219