CN101853661A - Noise spectrum estimation and voice mobility detection method based on unsupervised learning - Google Patents
Noise spectrum estimation and voice mobility detection method based on unsupervised learning
- Publication number
- CN101853661A
- Authority
- CN
- China
- Prior art keywords
- lambda
- frame
- voice
- alpha
- kappa
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The present invention relates to a noise power spectrum estimation and voice activity detection method based on unsupervised learning, comprising the following steps: 1) for the log-magnitude feature of the speech signal at each frequency bin, a GMM is established; 2) for a segment of speech data, an M-frame buffer is set up, the first M frames of the input signal are stored in the buffer, and the log-magnitude spectra of the M buffered frames are extracted and substituted into the GMM of step 1) for initialization, yielding the initialized model $\lambda_{0,k}$; 3) after the initialized model $\lambda_{0,k}$ is obtained, starting from frame M+1, the GMM is updated frame by frame by incremental learning, recursively yielding the updated models in turn, from which the noise value and the probability of speech presence at the k-th frequency bin of the i-th frame are obtained. The invention is a tightly coupled solution for noise power spectrum estimation and voice activity detection, and can improve the adaptability of speech application systems to noisy environments. It does not depend on the "noise start" assumption, and it also provides a description of voice activity in the two-dimensional time-frequency space.
Description
Technical field
The present invention relates to the field of speech processing, and specifically to a noise power spectrum estimation and voice activity detection method based on unsupervised learning. Voice activity detection is an algorithm that decides whether speech is present along the time dimension; it can answer with a hard "yes" or "no", or describe the presence of speech by a speech presence probability.
Background technology
Most speech application systems must cope with interference from ambient noise. Many methods have been proposed to remove the effect of noise on speech systems, and nearly all of them depend on voice activity detection and noise power spectrum estimation. These two modules are closely related, and their accuracy directly affects the overall noise robustness of the system. Traditional solutions have the following problems:
1. In typical noise-robust algorithms, voice activity detection and noise power spectrum estimation are loosely coupled in a cascade: voice activity is computed first, and the noise power spectrum is then estimated from it. The sensitivity of the voice activity detector to speech therefore directly affects the accuracy of the noise power spectrum estimate. A detector that is too sensitive tends to cause the noise power spectrum to be underestimated; one that is too insensitive tends to cause it to be overestimated. Traditional schemes therefore often need to tune the sensitivity of the speech detector to the noise environment, which limits how well the system adapts to different noise environments.
2. Traditional solutions are based on semi-supervised learning. In the initial stage, a system generally makes the "noise start" assumption, i.e. that a sentence always begins with a segment of non-speech signal. This segment can be regarded as a manually labeled background-noise sample, and an initial noise model is built from these labeled samples; this is supervised learning. Its drawback is that the assumption is hard to satisfy in some applications: if a sentence starts with speech, initialization of the noise model fails, and both speech detection and noise power spectrum estimation become inaccurate. In the subsequent stage, after the initial noise model has been built, most traditional solutions update the model with the results of detection and estimation. This decision-directed learning is a form of unsupervised learning: the output of the estimator/detector is fed back to update the model. However, incorrect results are easily fed back into the model, which lowers the accuracy of the model and in turn lowers the accuracy of estimation and detection. Errors thus accumulate over time, and system performance gradually degrades. Supervised learning in the initial stage plus unsupervised learning in the subsequent stage forms a semi-supervised learning process, and the problems in both stages are caused by this semi-supervised approach.
3. Most previous voice activity detectors describe voice activity only along the time dimension and not along the frequency dimension, so finer-grained processing of the noise is not possible.
Summary of the invention
To address the shortcomings of previous voice activity detectors and noise power spectrum estimators, the present invention proposes a tightly coupled solution in which voice activity detection and noise power spectrum estimation are unified under a single unsupervised learning framework, thereby improving the adaptability of speech application systems to noisy environments. In addition, the invention does not rely on the "noise start" assumption and is therefore more practical than traditional methods; it also provides a description of voice activity in the time-frequency space, which supports finer-grained processing of the noise.
To achieve the above object, the invention provides a noise power spectrum estimation and voice activity detection method based on unsupervised learning which, as shown in Fig. 2, comprises the following steps:
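1) For the log-magnitude feature of the speech signal at each frequency bin, a GMM is established, with the following mathematical expression:
$$p(x_{i,k} \mid \lambda_{i,k}) = \sum_{h \in \{0,1\}} w_{i,k}^{(h)} \, p(x_{i,k} \mid h, \lambda_{i,k})$$
where the Gaussian component is expressed as:
$$p(x_{i,k} \mid h, \lambda_{i,k}) = \frac{1}{\sqrt{2\pi \kappa_{i,k}^{(h)}}} \exp\!\left( -\frac{\left(x_{i,k} - \mu_{i,k}^{(h)}\right)^2}{2 \kappa_{i,k}^{(h)}} \right)$$
Here $x_{i,k}$ denotes the log-magnitude spectrum at the k-th frequency bin of the i-th frame; $h \in \{0,1\}$ indexes the Gaussian component; $w_{i,k}^{(h)}$ denotes the weight coefficient of the GMM; $\mu_{i,k}^{(h)}$ and $\kappa_{i,k}^{(h)}$ denote the mean and variance respectively, where h=1 denotes the speech component and h=0 the noise component; and $\lambda_{i,k} = \{ w_{i,k}^{(h)}, \mu_{i,k}^{(h)}, \kappa_{i,k}^{(h)} \mid h \in \{0,1\} \}$ denotes the parameter set of the Gaussian mixture model;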
2) For a segment of speech data, an M-frame buffer is set up, the first M frames of the input signal are stored in the buffer, and the log-magnitude spectra of the M buffered frames are extracted and substituted into the GMM of step 1) for initialization, yielding the initialized model $\lambda_{0,k}$; the initialization uses a constrained EM algorithm;
3) After the initialized model $\lambda_{0,k}$ is obtained, starting from frame M+1, the GMM is updated frame by frame by incremental learning, recursively yielding $\lambda_{i,k}$ in turn and giving the noise value $\mu_{i,k}^{(0)}$ (the mean of the noise component) and the probability of speech presence at the k-th frequency bin of the i-th frame:
$$p(h=1 \mid x_{i,k}, \lambda_{i,k}) = \frac{w_{i,k}^{(1)} \, p(x_{i,k} \mid h=1, \lambda_{i,k})}{\sum_{h' \in \{0,1\}} w_{i,k}^{(h')} \, p(x_{i,k} \mid h', \lambda_{i,k})}$$
where i = 1, 2, 3, …
The incremental learning method of the GMM comprises a recursive weight coefficient, a recursive mean, and a recursive variance;
where α is a smoothing factor.
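The initialization in step 2) can be illustrated with an EM fit of a two-component one-dimensional GMM to the buffered log-magnitude values of one frequency bin. This is a minimal sketch under stated assumptions, not the patent's procedure: the initial split around the global mean stands in for the clustering step, and the three guards used as "constraints" (a minimum weight `min_weight`, a variance floor `var_floor`, and the requirement that the speech mean stay above the noise mean) are hypothetical. All function and parameter names are illustrative.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def init_gmm_em(samples, eps=0.1, min_weight=0.05, var_floor=1e-3, max_iter=100):
    """Fit a two-component GMM; returns (weights, means, variances), index 0 = noise."""
    n = len(samples)
    # Crude two-class split around the global mean (stand-in for a clustering step).
    centre = sum(samples) / n
    classes = [[s for s in samples if s <= centre], [s for s in samples if s > centre]]
    if not classes[1]:
        raise ValueError("samples must not all be equal")
    weights = [len(c) / n for c in classes]
    means = [sum(c) / len(c) for c in classes]
    variances = [max(sum((s - m) ** 2 for s in c) / len(c), var_floor)
                 for c, m in zip(classes, means)]
    old_ll = -math.inf
    for _ in range(max_iter):
        # E-step: component posteriors and total log-likelihood of the current model.
        post, ll = [], 0.0
        for s in samples:
            joint = [weights[h] * gaussian_pdf(s, means[h], variances[h]) for h in (0, 1)]
            total = max(joint[0] + joint[1], 1e-300)
            ll += math.log(total)
            post.append([j / total for j in joint])
        if ll - old_ll < eps:  # likelihood no longer improves noticeably
            break
        old_ll = ll
        # M-step: re-estimate weights, means, and variances.
        mass = [sum(p[h] for p in post) for h in (0, 1)]
        new_w = [m / n for m in mass]
        if min(new_w) < min_weight:  # assumed constraint: keep the previous model
            break
        new_mu = [sum(p[h] * s for p, s in zip(post, samples)) / mass[h] for h in (0, 1)]
        new_var = [max(sum(p[h] * (s - new_mu[h]) ** 2 for p, s in zip(post, samples)) / mass[h],
                       var_floor) for h in (0, 1)]
        if new_mu[1] < new_mu[0]:  # assumed constraint: speech mean above noise mean
            break
        weights, means, variances = new_w, new_mu, new_var
    return weights, means, variances
```

Because the split is made without any labeled noise-only segment, the sketch also works when the buffer begins with speech, which is the situation the "noise start" assumption cannot handle.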
Compared with prior art, the present invention has following technique effect:
The invention tightly couples voice activity detection and noise power spectrum estimation, and can improve the adaptability of speech application systems to noisy environments. In addition, it does not rely on the "noise start" assumption and is therefore more practical. It also provides a description of voice activity in the two-dimensional time-frequency space, which supports finer-grained processing of the noise.
Description of drawings
Fig. 1 shows the time-domain waveform and spectrogram of a speech segment corrupted by noise;
Part (a) is the spectrogram of a speech segment corrupted by white noise at a signal-to-noise ratio of 0 dB; part (b) is the speech presence probability map, in which the gray level indicates the probability that speech is present. Comparing (a) and (b) shows that the presence probability output by the method accurately describes the structure of the spectrogram.
Fig. 2 is a flow chart of the noise power spectrum estimation and voice activity detection method based on unsupervised learning according to the present invention.
Embodiment
The present invention proposes a noise power spectrum estimation and voice activity detection method based on an unsupervised learning framework. The defining feature of this framework is that the models of noise and speech are built in an unsupervised manner: neither model initialization nor model updating relies on manually labeled information. Specifically, it has the following characteristics:
● The initialization stage does not rely on the noise-start assumption, so the invention is applicable more widely than typical solutions.
● The update process needs no feedback of detection results, so the problem of error accumulation is alleviated to some extent.
● Voice activity information and noise power spectrum information are provided simultaneously and are tightly coupled, so the system can be tuned through only a few parameters. In a loosely coupled system, by contrast, the voice activity module and the noise estimation module each have their own tuning parameters; with more parameters, the system is harder to tune.
● Voice activity is described as two-dimensional time-frequency information, whereas other voice activity detection algorithms describe the presence of speech only along the time dimension.
In one embodiment, the unsupervised learning framework is carried by a two-component Gaussian mixture model (GMM). One component represents the distribution of speech energy and the other the distribution of noise energy. The frequency band is divided into 8 subbands according to the Mel scale, the energy envelope is extracted on each subband, and a corresponding GMM is established for it. The GMM is first initialized with the EM algorithm and then updated progressively by incremental learning. From the GMM, the voice activity on the subband and the noise power spectrum are each derived.
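The subband split described above can be sketched as follows. This is a minimal illustration under assumptions the patent does not state: the common HTK-style mel mapping mel = 2595·log10(1 + f/700), a 16 kHz sampling rate, and rectangular non-overlapping bands. The function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """HTK-style mel mapping (an assumption; the patent does not give the formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(sample_rate=16000, n_bands=8):
    """Return n_bands + 1 band edges in Hz, equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    return [mel_to_hz(top * b / n_bands) for b in range(n_bands + 1)]

def subband_log_energy(magnitude, sample_rate=16000, n_bands=8):
    """Log energy (dB) per mel subband for one frame.

    `magnitude` holds |Y| for bins 0..N/2 of one frame (at least two bins).
    """
    edges = mel_band_edges(sample_rate, n_bands)
    n_bins = len(magnitude)
    energy = [0.0] * n_bands
    for k, mag in enumerate(magnitude):
        f = k * (sample_rate / 2.0) / (n_bins - 1)  # centre frequency of bin k
        band = 0
        while band < n_bands - 1 and f >= edges[band + 1]:
            band += 1
        energy[band] += mag * mag
    # Small offset avoids log10(0) for an empty or silent band.
    return [10.0 * math.log10(e + 1e-12) for e in energy]
```

Each of the 8 values returned per frame would then feed its own two-component GMM, so the speech presence probability is obtained per subband rather than per frame.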
The invention fits the spectral envelope of speech with a GMM subject to constraints.
During fitting, the mean, weight, and variance of the GMM are each constrained. Both in the EM algorithm and in the incremental learning process, it is required that
and
The incremental learning method of the GMM specifically comprises the computation of a recursive weight coefficient, a recursive mean, and a recursive variance.
1) Recursive weight coefficient:
where α is a smoothing factor less than but close to 1, for example α = 0.99.
2) Recursive mean:
or
where $\alpha_\mu$ is a smoothing factor less than but close to 1, for example $\alpha_\mu = 0.99$.
3) Recursive variance:
or
or
where $\alpha_\kappa$ is a smoothing factor less than but close to 1, for example $\alpha_\kappa = 0.99$.
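The recursion formulas themselves are not reproduced in this text. As an illustration only, one widely used exponentially smoothed form of such updates is given below; it is an assumption and not necessarily the patent's formula. Here $w$, $\mu$, and $\kappa$ denote the weight, mean, and variance of component $h \in \{0,1\}$, $x_{i+1,k}$ is the new log-magnitude observation, and $\gamma^{(h)}$ is the posterior probability of component $h$ under the current model.

```latex
\begin{aligned}
\gamma^{(h)} &= p(h \mid x_{i+1,k}, \lambda_{i,k}) \\
w_{i+1,k}^{(h)} &= \alpha \, w_{i,k}^{(h)} + (1-\alpha)\, \gamma^{(h)} \\
\mu_{i+1,k}^{(h)} &= \mu_{i,k}^{(h)} + (1-\alpha_\mu)\, \gamma^{(h)} \left( x_{i+1,k} - \mu_{i,k}^{(h)} \right) \\
\kappa_{i+1,k}^{(h)} &= \kappa_{i,k}^{(h)} + (1-\alpha_\kappa)\, \gamma^{(h)} \left( \left( x_{i+1,k} - \mu_{i,k}^{(h)} \right)^2 - \kappa_{i,k}^{(h)} \right)
\end{aligned}
```

With smoothing factors close to 1, such as 0.99, each frame moves the parameters only slightly, and a component that explains the observation poorly (small $\gamma^{(h)}$) is left almost unchanged.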
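The invention is described further below with reference to a preferred embodiment.
The principle of the invention is as follows:
For the log-magnitude feature of the speech signal at each frequency bin, a Gaussian mixture model (GMM) is established; the model changes over time and with the input signal. Its mathematical expression is:
$$p(x_{i,k} \mid \lambda_{i,k}) = \sum_{h \in \{0,1\}} w_{i,k}^{(h)} \, p(x_{i,k} \mid h, \lambda_{i,k})$$
where the Gaussian component is expressed as:
$$p(x_{i,k} \mid h, \lambda_{i,k}) = \frac{1}{\sqrt{2\pi \kappa_{i,k}^{(h)}}} \exp\!\left( -\frac{\left(x_{i,k} - \mu_{i,k}^{(h)}\right)^2}{2 \kappa_{i,k}^{(h)}} \right)$$
Here $x_{i,k}$ denotes the log-magnitude spectrum at the k-th frequency bin of the i-th frame; $h \in \{0,1\}$ indexes the Gaussian component; $w_{i,k}^{(h)}$ denotes the weight coefficient of the GMM; and $\mu_{i,k}^{(h)}$ and $\kappa_{i,k}^{(h)}$ denote the mean and variance respectively, where h=1 denotes the speech component and h=0 the noise component. $\lambda_{i,k} = \{ w_{i,k}^{(h)}, \mu_{i,k}^{(h)}, \kappa_{i,k}^{(h)} \mid h \in \{0,1\} \}$ denotes the parameter set of the Gaussian mixture model.
In this model, the mean of the noise component, $\mu_{i,k}^{(0)}$, is the noise value to be estimated. The probability of speech presence at the k-th frequency bin of the i-th frame can also be derived:
$$p(h=1 \mid x_{i,k}, \lambda_{i,k}) = \frac{w_{i,k}^{(1)} \, p(x_{i,k} \mid h=1, \lambda_{i,k})}{\sum_{h' \in \{0,1\}} w_{i,k}^{(h')} \, p(x_{i,k} \mid h', \lambda_{i,k})}$$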
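The speech presence probability is the posterior of the speech component under the two-component mixture, obtained by Bayes' rule. A minimal sketch for a single time-frequency point follows; the function names and the numeric values in the usage note are illustrative and not taken from the patent.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def speech_presence_probability(x, weights, means, variances):
    """Posterior p(h=1 | x) for a two-component GMM; index 0 = noise, 1 = speech."""
    joint = [weights[h] * gaussian_pdf(x, means[h], variances[h]) for h in (0, 1)]
    total = joint[0] + joint[1]
    if total == 0.0:
        # Both densities underflowed; fall back to the prior weight of speech.
        return weights[1]
    return joint[1] / total
```

For example, with a noise component centred at 20 dB, a speech component centred at 50 dB, and equal variances, an observation of 50 dB gives a probability close to 1, while an observation exactly midway at 35 dB gives back the prior speech weight.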
Based on the above principle, according to one embodiment of the invention and as shown in Fig. 2, the noise power spectrum estimation and voice activity detection method comprises the following steps:
Step 100: Set up an M-frame buffer, store the first M frames of the input signal in the buffer, and extract the magnitude spectrum of each of the M buffered frames. The magnitude spectrum of a frame is extracted as follows:
First, the digitized audio signal of the frame is preprocessed (depending on the system, this may include windowing, pre-emphasis, and so on). Let each frame have F samples; it is first zero-padded to N samples (where N ≥ F, N = 2^j, and j is an integer with j ≥ 8), and an N-point discrete Fourier transform is applied to obtain the discrete spectrum
$$Y_{i,k} = \sum_{n=0}^{N-1} y_{i,n} \, e^{-2\pi\sqrt{-1}\, nk/N}$$
where $y_{i,n}$ denotes the n-th sample of the i-th buffered frame and $Y_{i,k}$ denotes the k-th Fourier transform value of the i-th buffered frame (k = 0, 1, …, N−1). Its magnitude can then be computed as
$$|Y_{i,k}| = \sqrt{\operatorname{Re}(Y_{i,k})^2 + \operatorname{Im}(Y_{i,k})^2}$$
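Step 100 can be sketched with the standard library alone. The example below is illustrative: the Hamming window is one possible choice for the "windowing" the text mentions, the small magnitude floor that avoids log(0) is an added safeguard, and the function names are hypothetical. The conversion to decibels uses the 20·log10|Y| form given later in the description. A practical system would normally call an optimized FFT library instead of this recursive routine.

```python
import cmath
import math

def fft_radix2(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def frame_log_magnitude(frame, n_fft=256, floor=1e-10):
    """Window a frame of F >= 2 samples, zero-pad to n_fft, return 20*log10|Y_k| for all k."""
    f = len(frame)
    if f > n_fft or n_fft & (n_fft - 1):
        raise ValueError("n_fft must be a power of two and at least the frame length")
    # Hamming window (an assumed choice of preprocessing).
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (f - 1)) for n in range(f)]
    padded = [s * w for s, w in zip(frame, window)] + [0.0] * (n_fft - f)
    spectrum = fft_radix2(padded)
    return [20.0 * math.log10(max(abs(y), floor)) for y in spectrum]
```

Requiring N to be a power of two, as the text does, is what allows a radix-2 FFT of this kind to be used.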
Step 200: Initialization of the GMM. At each frequency bin k, a two-component Gaussian mixture model $\lambda_{i,k}$ is initialized, where the subscript i denotes time and $\lambda_{i=0,k}$ denotes the initialized model. Initialization uses a constrained EM algorithm; at a given frequency bin k, the specific steps are as follows:
Step 201: Using a clustering method (for example unsupervised LBG clustering, or fuzzy clustering), divide the M+1 samples into two classes:
and
where $M_0 + M_1 - 1 = M$; the class with the larger mean is denoted by the superscript (1) and the other class by the superscript (0). The means of the two classes are
The mean of the class with the smaller energy is
where
The variances of the two classes are, respectively:
The initial weight coefficients of the two classes are:
Compute the likelihood of the new model,
In the following iteration, the old model parameter set is denoted $\lambda'_{0,k}$, and the new model parameters are:
Before the iteration begins,
$L'_k$ is set to a negative number of very large magnitude, for example $L'_k = -10000$. The iteration then begins.
Step 202: Compute the probabilities that noise and speech are present,
Step 204: If
then stop the iteration and set $\lambda_{0,k} = \lambda'_{0,k}$, where υ is a number greater than 0 and close to 0, for example υ = 0.05.
Step 207: Compute the new variances,
Step 209: Compute the likelihood of the new model
Step 210: If the condition
is satisfied, terminate the iteration, where ε is a very small number, for example ε = 0.1. Otherwise, the iteration jumps to step 202.
Step 300: Progressive updating of the GMM. After the initialized model $\lambda_{0,k}$ is established, starting from frame M+1, the GMM is updated frame by frame by incremental learning. The iteration can be stated as: at each frequency bin k, given $\lambda_{i,k}$ and the current observation $x_{i+1,k}$, infer $\lambda_{i+1,k}$. A Fourier transform of frame i+1 gives $Y_{i+1,k}$, where 0 ≤ k < N. At each frequency bin k, the log-magnitude spectrum $x_{i+1,k} = 20\log_{10}|Y_{i+1,k}|$ is computed. For the k-th frequency bin, the specific iteration steps are as follows:
h ∈ {0,1}.
Step 302: Compute the new weight coefficients:
where α is a smoothing factor less than but close to 1, for example α = 0.99.
Step 305: Constrain the new means:
From the above sub-steps, all parameters in $\lambda_{i+1,k}$ are obtained, and with them the corresponding speech presence probability $p(h \mid x_{i+1,k}, \lambda_{i,k})$ and the power spectrum estimate of the noise signal.
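The frame-by-frame update of Step 300 can be sketched as a single function applied to each new observation at one frequency bin. The update rules used here are an assumed, commonly used exponentially smoothed form and are not taken from the patent, whose formulas are not reproduced in this text; the mean-ordering guard and the variance floor `var_floor` are likewise hypothetical stand-ins for the constraint step. Only the smoothing factor values of 0.99 come from the description.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def update_gmm(x, weights, means, variances,
               alpha=0.99, alpha_mu=0.99, alpha_kappa=0.99, var_floor=1e-3):
    """One recursive update from observation x (index 0 = noise, 1 = speech).

    Returns (weights, means, variances, p_speech), where p_speech is the
    posterior of the speech component under the model before the update.
    """
    joint = [weights[h] * gaussian_pdf(x, means[h], variances[h]) for h in (0, 1)]
    total = joint[0] + joint[1]
    post = [j / total for j in joint] if total > 0.0 else list(weights)
    # Assumed smoothing rules: each parameter moves a small step towards
    # the value suggested by the new observation, scaled by the posterior.
    new_w = [alpha * weights[h] + (1.0 - alpha) * post[h] for h in (0, 1)]
    new_mu = [means[h] + (1.0 - alpha_mu) * post[h] * (x - means[h]) for h in (0, 1)]
    new_var = [max(variances[h]
                   + (1.0 - alpha_kappa) * post[h] * ((x - means[h]) ** 2 - variances[h]),
                   var_floor) for h in (0, 1)]
    # Hypothetical constraint: the speech mean may not fall below the noise mean.
    if new_mu[1] < new_mu[0]:
        new_mu[1] = new_mu[0]
    return new_w, new_mu, new_var, post[1]
```

In use, the function would be called once per frame and per frequency bin, with the returned noise mean `means[0]` read out as the running noise estimate and `p_speech` as the time-frequency speech presence probability. No hard speech/non-speech decision is fed back into the model, which is the property the description contrasts with decision-directed updating.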
The noise power spectrum estimation performance of the algorithm of the above embodiment was evaluated. Speech from the TIMIT database (8 sentences each from male and female speakers) was mixed with white Gaussian noise, F16 cockpit noise, and babble noise from the NOISEX-92 noise database at signal-to-noise ratios such as 0, 5, and 10 dB. The evaluation metric is the segmental spectral error SegErr, defined by the following formula:
where D(k, l) denotes the actual noise magnitude spectrum and
$\hat{D}(k, l)$ denotes the estimated noise magnitude spectrum. The smaller the SegErr value, the closer the estimate is to the true value and the more accurate the estimation. The algorithm was compared with three current mainstream noise power spectrum estimation algorithms: MS denotes the minimum statistics algorithm, MCRA the minima-controlled recursive averaging algorithm, IMCRA the improved minima-controlled recursive averaging algorithm, and TV-GMM the algorithm of the present invention. Table 1 lists the SegErr results.
Table 1
As the table above shows, the algorithm proposed by the invention has a clear advantage over the three current mainstream algorithms.
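The defining formula of the error measure is not reproduced in this text. As a hedged illustration only, the sketch below computes a relative estimation error averaged over frames, a form common in the noise estimation literature; whether it matches the patent's SegErr definition is an assumption. `actual[l][k]` and `estimated[l][k]` stand for D(k, l) and its estimate at frame l and frequency bin k, and the function name is hypothetical.

```python
def segmental_relative_error(actual, estimated):
    """Mean over frames l of sum_k (D - D_hat)^2 / sum_k D^2 (assumed form)."""
    if not actual or len(actual) != len(estimated):
        raise ValueError("need the same non-zero number of frames in both inputs")
    total = 0.0
    for d_frame, e_frame in zip(actual, estimated):
        num = sum((d - e) ** 2 for d, e in zip(d_frame, e_frame))
        den = sum(d * d for d in d_frame)
        if den == 0.0:
            raise ValueError("actual noise spectrum must not be all zero in a frame")
        total += num / den
    return total / len(actual)
```

Under this form a perfect estimate scores 0, and larger values indicate a larger deviation from the true noise spectrum, consistent with the statement that smaller SegErr is better.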
Claims (2)
1. A noise power spectrum estimation and voice activity detection method based on unsupervised learning, comprising the following steps:
1) For the log-magnitude feature of the speech signal at each frequency bin, a GMM is established, with the following mathematical expression:
$$p(x_{i,k} \mid \lambda_{i,k}) = \sum_{h \in \{0,1\}} w_{i,k}^{(h)} \, p(x_{i,k} \mid h, \lambda_{i,k})$$
where the Gaussian component is expressed as:
$$p(x_{i,k} \mid h, \lambda_{i,k}) = \frac{1}{\sqrt{2\pi \kappa_{i,k}^{(h)}}} \exp\!\left( -\frac{\left(x_{i,k} - \mu_{i,k}^{(h)}\right)^2}{2 \kappa_{i,k}^{(h)}} \right)$$
Here $x_{i,k}$ denotes the log-magnitude spectrum at the k-th frequency bin of the i-th frame; $h \in \{0,1\}$ indexes the Gaussian component; $w_{i,k}^{(h)}$ denotes the weight coefficient of the GMM; $\mu_{i,k}^{(h)}$ and $\kappa_{i,k}^{(h)}$ denote the mean and variance respectively, where h=1 denotes the speech component and h=0 the noise component; and $\lambda_{i,k} = \{ w_{i,k}^{(h)}, \mu_{i,k}^{(h)}, \kappa_{i,k}^{(h)} \mid h \in \{0,1\} \}$ denotes the parameter set of the Gaussian mixture model;
2) For a segment of speech data, an M-frame buffer is set up, the first M frames of the input signal are stored in the buffer, and the log-magnitude spectra of the M buffered frames are extracted and substituted into the GMM of step 1) for initialization, yielding the initialized model $\lambda_{0,k}$; the initialization uses a constrained EM algorithm;
3) After the initialized model $\lambda_{0,k}$ is obtained, starting from frame M+1, the GMM of each frequency band is updated frame by frame by incremental learning, recursively yielding $\lambda_{i,k}$ in turn and giving the noise value $\mu_{i,k}^{(0)}$ (the mean of the noise component) and the probability of speech presence at the k-th frequency bin of the i-th frame:
$$p(h=1 \mid x_{i,k}, \lambda_{i,k}) = \frac{w_{i,k}^{(1)} \, p(x_{i,k} \mid h=1, \lambda_{i,k})}{\sum_{h' \in \{0,1\}} w_{i,k}^{(h')} \, p(x_{i,k} \mid h', \lambda_{i,k})}$$
where i = 1, 2, 3, …
2. The noise power spectrum estimation and voice activity detection method according to claim 1, characterized in that the incremental learning method of the GMM comprises a recursive weight coefficient, a recursive mean, and a recursive variance;
The recursive weight coefficient is computed as:
The recursive mean is computed as:
or
The recursive variance is computed as:
or
where α is a smoothing factor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2010101781664A CN101853661B (en) | 2010-05-14 | 2010-05-14 | Noise spectrum estimation and voice mobility detection method based on unsupervised learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101853661A (en) | 2010-10-06 |
CN101853661B (en) | 2012-05-30 |
Family
ID=42805116
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2010101781664A Expired - Fee Related CN101853661B (en) | 2010-05-14 | 2010-05-14 | Noise spectrum estimation and voice mobility detection method based on unsupervised learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101853661B (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102800322A (en) * | 2011-05-27 | 2012-11-28 | 中国科学院声学研究所 | Method for estimating noise power spectrum and voice activity |
CN103839544A (en) * | 2012-11-27 | 2014-06-04 | 展讯通信(上海)有限公司 | Voice activity detection method and apparatus |
CN104464728A (en) * | 2014-11-26 | 2015-03-25 | 河海大学 | Speech enhancement method based on Gaussian mixture model (GMM) noise estimation |
CN104575513A (en) * | 2013-10-24 | 2015-04-29 | 展讯通信(上海)有限公司 | Burst noise processing system and burst noise detection and suppression method and device |
CN105989843A (en) * | 2015-01-28 | 2016-10-05 | 中兴通讯股份有限公司 | Method and device of realizing missing feature reconstruction |
WO2017063516A1 (en) * | 2015-10-13 | 2017-04-20 | 阿里巴巴集团控股有限公司 | Method of determining noise signal, and method and device for audio noise removal |
CN107731230A (en) * | 2017-11-10 | 2018-02-23 | 北京联华博创科技有限公司 | A kind of court's trial writing-record system and method |
CN107818780A (en) * | 2017-11-13 | 2018-03-20 | 河海大学 | A kind of robust speech recognition methods based on nonlinear characteristic compensation |
CN110675885A (en) * | 2019-10-17 | 2020-01-10 | 浙江大华技术股份有限公司 | Sound mixing method, device and storage medium |
CN111739562A (en) * | 2020-07-22 | 2020-10-02 | 上海大学 | Voice activity detection method based on data selectivity and Gaussian mixture model |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101226742B (en) * | 2007-12-05 | 2011-01-26 | 浙江大学 | Method for recognizing sound-groove based on affection compensation |
CN101464950B (en) * | 2009-01-16 | 2011-05-04 | 北京航空航天大学 | Video human face identification and retrieval method based on on-line learning and Bayesian inference |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102800322B (en) * | 2011-05-27 | 2014-03-26 | 中国科学院声学研究所 | Method for estimating noise power spectrum and voice activity |
CN102800322A (en) * | 2011-05-27 | 2012-11-28 | 中国科学院声学研究所 | Method for estimating noise power spectrum and voice activity |
CN103839544B (en) * | 2012-11-27 | 2016-09-07 | 展讯通信(上海)有限公司 | Voice-activation detecting method and device |
CN103839544A (en) * | 2012-11-27 | 2014-06-04 | 展讯通信(上海)有限公司 | Voice activity detection method and apparatus |
CN104575513B (en) * | 2013-10-24 | 2017-11-21 | 展讯通信(上海)有限公司 | The processing system of burst noise, the detection of burst noise and suppressing method and device |
CN104575513A (en) * | 2013-10-24 | 2015-04-29 | 展讯通信(上海)有限公司 | Burst noise processing system and burst noise detection and suppression method and device |
CN104464728A (en) * | 2014-11-26 | 2015-03-25 | 河海大学 | Speech enhancement method based on Gaussian mixture model (GMM) noise estimation |
CN105989843A (en) * | 2015-01-28 | 2016-10-05 | 中兴通讯股份有限公司 | Method and device of realizing missing feature reconstruction |
WO2017063516A1 (en) * | 2015-10-13 | 2017-04-20 | 阿里巴巴集团控股有限公司 | Method of determining noise signal, and method and device for audio noise removal |
US10796713B2 (en) | 2015-10-13 | 2020-10-06 | Alibaba Group Holding Limited | Identification of noise signal for voice denoising device |
CN107731230A (en) * | 2017-11-10 | 2018-02-23 | 北京联华博创科技有限公司 | A kind of court's trial writing-record system and method |
CN107818780A (en) * | 2017-11-13 | 2018-03-20 | 河海大学 | A kind of robust speech recognition methods based on nonlinear characteristic compensation |
CN107818780B (en) * | 2017-11-13 | 2020-09-18 | 河海大学 | Robust speech recognition method based on nonlinear feature compensation |
CN110675885A (en) * | 2019-10-17 | 2020-01-10 | 浙江大华技术股份有限公司 | Sound mixing method, device and storage medium |
CN111739562A (en) * | 2020-07-22 | 2020-10-02 | 上海大学 | Voice activity detection method based on data selectivity and Gaussian mixture model |
CN111739562B (en) * | 2020-07-22 | 2022-12-23 | 上海大学 | Voice activity detection method based on data selectivity and Gaussian mixture model |
Also Published As
Publication number | Publication date |
---|---|
CN101853661B (en) | 2012-05-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20120530 |