CN1217315C - Hidden Markov model with frame correlation - Google Patents

Publication number
CN1217315C
Authority
CN
China
Prior art keywords
frame
expectation
maximization
model
calibration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN01823553.0A
Other languages
Chinese (zh)
Other versions
CN1545695A (en)
Inventor
李锦宇
贾颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel China Ltd
Intel Corp
Original Assignee
Intel China Ltd
Intel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel China Ltd, Intel Corp
Publication of CN1545695A
Application granted
Publication of CN1217315C
Legal status: Expired - Fee Related

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142 Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/12 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being prediction coefficients

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Complex Calculations (AREA)
  • Image Analysis (AREA)

Abstract

A method and a system for incorporating frame correlation into a hidden Markov model are disclosed. The method comprises: computing the frame-independent portion of the speech; inputting the frame-independent portion into a Gaussian model; computing the frame probability; and then estimating autoregressive coefficients from the frame probability.

Description

Hidden Markov model with frame correlation
Technical field
The present invention relates to hidden Markov models, and in particular to incorporating frame correlation into a hidden Markov model.
A Markov process is a probabilistic model that is useful in analyzing complex systems. The process comprises states and state transitions. A state consists of the values of the variables that describe the current condition of a system, and a state transition occurs when the state changes. A probabilistic model of a Markov process gives only the current state and the probability of each possible transition out of it. Consequently, each transition the process makes, and the probability of its future trajectory, depend only on the current state. A hidden Markov model can be described as a partially observed stochastic (Markov) system model, in which some of the state information is hidden from view.
The development of the hidden Markov model (HMM) has led to substantial progress in speech recognition technology. This progress has been more pronounced in large-vocabulary continuous speech recognition (LVCSR) than in other areas of speech recognition. However, several assumptions in the hidden Markov model are still regarded as obstacles to the model's potential effectiveness. One problematic assumption is that successive observations are independent and identically distributed within a state, whereas the mechanism of speech production indicates that the observations are fundamentally dependent and correlated. In addition, under the maximum likelihood (ML) criterion, the performance of an HMM-based system depends on how well the model represents the nature of real speech.
Description of the drawings
Fig. 1 is a flowchart of a frame correlation process in an HMM-based environment according to an embodiment of the invention.
Fig. 2 is a block diagram of an HMM-based frame correlation system according to an embodiment of the invention.
Detailed description
Recognizing the above difficulties, which arise from the unrealistic assumptions made when hidden Markov models (HMMs) are used for speech recognition, the present invention describes a method and system for frame correlation in an HMM-based environment. The illustrated embodiments are described in a manner consistent with this use for purposes of illustration rather than limitation; the invention is clearly not limited to them.
In the statistical approach to automatic speech recognition, the optimal mathematical solution is for the recognizer to follow the maximum a posteriori (MAP) decision rule. The MAP decision rule can be expressed as:
$$\hat{W} = \arg\max_{W} p(W \mid O) = \arg\max_{W} p(O \mid W)\, p(W), \qquad [1]$$
where W is a word-string hypothesis for the given acoustic observation O, p(O|W) is the acoustic model, and $p(W) = \prod_{i=1}^{L} p(w_i \mid w_{i-1}, \ldots, w_{i-N})$ is the N-gram language model. When the acoustic model score p(O|W) is derived, a hidden state sequence $q_1^T$ is usually introduced, so that:
$$p(O \mid W) = \sum_{\Gamma} p(o_1^T, q_1^T \mid W) = \sum_{\Gamma} p(o_1^T \mid q_1^T, W)\, p(q_1^T \mid W). \qquad [2]$$
The hidden process is thus assumed to account fully for the conditional probability of the speech signal.
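As a rough illustration of decision rule [1], the sketch below picks the word-string hypothesis with the highest combined log score log p(O|W) + log p(W). The hypothesis strings and all scores are invented for the example and do not come from the patent; the function name `map_decode` is likewise hypothetical.

```python
import math

def map_decode(hypotheses):
    """Return the word string W that maximizes log p(O|W) + log p(W)."""
    best = max(hypotheses, key=lambda h: h["log_acoustic"] + h["log_lm"])
    return best["words"]

# Hypothetical scores for two competing word strings (illustration only).
hypotheses = [
    {"words": "recognize speech", "log_acoustic": -120.0, "log_lm": math.log(1e-4)},
    {"words": "wreck a nice beach", "log_acoustic": -118.5, "log_lm": math.log(1e-7)},
]

best = map_decode(hypotheses)
print(best)
```

Working in the log domain turns the product p(O|W) p(W) into a sum, which avoids numerical underflow for long utterances. In this example the second hypothesis has the better acoustic score, but the language model prior outweighs it.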
In the frame-based HMM approach, the state-sequence probability $p(q_1^T \mid W)$ can be rewritten, using the first-order Markov assumption, as
$$p(q_1^T \mid W) = p(q_0) \prod_{t=1}^{T} p(q_t \mid q_{t-1}, W) = \pi_{q_0}\, a_{q_0 q_1}\, a_{q_1 q_2} \cdots a_{q_{T-1} q_T}. \qquad [3]$$
Thus, given a hidden state sequence $q_1^T$, the joint observation probability $p(o_1^T \mid q_1^T, W)$ along that state sequence can be written as the product of the probabilities of the individual observation vectors $o_t$, each conditioned on the preceding observations $o_1^{t-1}$ and the partial state sequence $q_1^{t}$. This can be expressed as follows:
$$p(o_1^T \mid q_1^T) = \prod_{t=1}^{T} p(o_t \mid o_1^{t-1}, q_t, q_1^{t-1}). \qquad [4]$$
To make the above computation tractable (for the standard HMM), the frame-independence assumption is made. This assumption means that the observations depend statistically only on the state that generated them and not on preceding observations. Hence $p(o_t \mid o_1^{t-1}, q_t, q_1^{t-1}) = p(o_t \mid q_t)$. Under the frame-independence assumption, the joint observation probability can be rewritten as:
$$p(o_1^T \mid q_1^T) = \prod_{t=1}^{T} p(o_t \mid o_1^{t-1}, q_t, q_1^{t-1}) = \prod_{t=1}^{T} p(o_t \mid q_t). \qquad [5]$$
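The following minimal sketch illustrates equations [2], [3] and [5] on a toy discrete HMM. It sums the joint probability over every state sequence and checks the result against the standard forward recursion, which is tractable only because of the frame-independence assumption. The two-state model, its probabilities and the observation symbols are all made up for the example, and the initial distribution `pi` is applied directly to the first state as a simplification of the p(q_0) term in equation [3].

```python
from itertools import product

# Hypothetical toy model (not from the patent).
pi = [0.6, 0.4]                   # initial state probabilities
A = [[0.7, 0.3], [0.2, 0.8]]      # A[i][j] = p(q_t = j | q_{t-1} = i)
B = [[0.9, 0.1], [0.3, 0.7]]      # B[j][o] = p(o_t = o | q_t = j)
obs = [0, 1, 1]                   # discrete observation symbols

def brute_force_likelihood(obs):
    """Sum p(o | q) p(q) over every possible state sequence q."""
    total = 0.0
    for q in product(range(len(pi)), repeat=len(obs)):
        p = pi[q[0]] * B[q[0]][obs[0]]
        for t in range(1, len(obs)):
            # First-order Markov transition times frame-independent emission.
            p *= A[q[t - 1]][q[t]] * B[q[t]][obs[t]]
        total += p
    return total

def forward_likelihood(obs):
    """Same quantity computed with the forward recursion."""
    n = len(pi)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)

print(brute_force_likelihood(obs), forward_likelihood(obs))
```

The brute-force sum grows exponentially with the number of frames, whereas the forward recursion is linear in it; the two agree because each emission term depends only on the current state.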
Under the maximum likelihood (ML) criterion, the performance of an HMM-based system depends on how well the hidden Markov model characterizes the nature of real speech. For this reason, various approaches have been tried in order to provide more realistic models of frame correlation. Much of this work has been devoted to decomposing the probability $p(o_t \mid o_1^{t-1}, q_j, \lambda)$.
Recognizing the above difficulties with existing models, this specification describes a system and method comprising a novel hidden Markov model (HMM) with frame correlation modeling. In one embodiment of the system, the frame correlation within a segment of cepstral vectors belonging to one HMM state (or Gaussian mixture) is modeled with an autoregressive (AR) technique. This technique relaxes the frame-independence assumption to a correlation among N+1 successive frames for a hypothesized HMM state. The expectation-maximization (EM) procedure is then used to derive estimation formulas for the new HMM parameters, which comprise mean vectors, variance matrices and a set of correlation matrices. When frame correlation is ignored, the technique reduces to the standard hidden Markov model. Initial experiments on the Wall Street Journal 20K English task showed that, with the additional parameters, the word error rate is reduced from 11.8 (baseline) to 11.4.
1. State-dependent autoregressive feature model
In one embodiment of the system, a state-dependent autoregressive (AR) model is used to capture the cross-correlation between successive observation vectors. The observation vector for a state is generated as follows:
$$o_t = \sum_{i=1}^{N} a_i\, o_{t-i} + e_t + n_t, \qquad [6]$$
where $a_i$ is a diagonal matrix, so that an AR model is applied to each component of the vector $o_t$; $e_t$ is the mean vector associated with a component in the HMM state; and $n_t$ is zero-mean Gaussian noise, which can be regarded as the error between the actual observation $o_t$ and the predicted observation $\sum_{i=1}^{N} a_i\, o_{t-i} + e_t$.
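A minimal sketch of the generative model in equation [6] follows, assuming a single state with fixed parameters. Because each $a_i$ is diagonal, the recursion runs independently per feature dimension. The function name, the argument layout, and the choice to treat frames before t = 0 as zero are assumptions made for the illustration.

```python
import random

def generate_observations(a, e, T, noise_std, seed=0):
    """Generate T frames with o_t = sum_i a_i o_{t-i} + e + n_t (equation [6]).

    a[i-1][k] is the k-th diagonal element of a_i, e[k] is the mean of
    dimension k, and noise_std[k] is the standard deviation of n_t.
    Frames before t = 0 are treated as zero (an assumption of this sketch).
    """
    rng = random.Random(seed)
    N, D = len(a), len(e)
    obs = []
    for t in range(T):
        frame = []
        for k in range(D):
            pred = e[k]
            for i in range(1, N + 1):
                if t - i >= 0:
                    pred += a[i - 1][k] * obs[t - i][k]
            frame.append(pred + rng.gauss(0.0, noise_std[k]))
        obs.append(frame)
    return obs

# Order-2 model on a single feature dimension, noise switched off so the
# recursion can be checked by hand.
frames = generate_observations(a=[[0.5], [-0.3]], e=[1.0], T=4, noise_std=[0.0])
print(frames)
```

With the noise disabled, each frame equals the predicted observation, so the difference between an actual frame and this prediction is exactly the term $n_t$ of equation [6].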
The advantages of using a state-dependent autoregressive model to characterize frame correlation include the model's grounding in speech production and the benefits it has delivered in speech coding applications. In the time domain, the speech waveform is produced directly by an excitation source and the vocal tract. The vocal tract can be fully parameterized by a time-varying autoregressive filter model. Within this modeling framework, known as linear predictive coding, considerable progress has been made in speech coding to reduce the bit rate. In the cepstral domain, each cepstral frame is extracted from a window of speech samples.
2. Relaxation of the frame-independence assumption
According to the state-dependent autoregressive feature model above, it can be assumed that, given the current state $q_t$ and the preceding N frames $o_{t-N}, \ldots, o_{t-1}$, the observation $o_t$ has the same distribution as $n_t$. This assumption can be formulated as follows:
$$p(o_t \mid o_1^{t-1}, q_t) = p(n_t \mid o_{t-N}^{t-1}, q_t). \qquad [7]$$
Hence, the likelihood of a state-sequence hypothesis can be written as:
$$p(o_1^T \mid q_1^T) = \prod_{t=1}^{T} p(o_t \mid o_1^{t-1}, q_t, q_1^{t-1}) = \prod_{t=1}^{T} p(n_t \mid o_{t-N}^{t-1}, q_t). \qquad [8]$$
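The product in equation [8] can be sketched as a sum of per-frame log-likelihoods of the prediction residual. The code below assumes a single state held for the whole sequence, a diagonal-covariance Gaussian for $n_t$, and scoring that starts at frame N so that a full history is available; these simplifications and the function names are assumptions of the example.

```python
import math

def ar_frame_loglik(o_t, history, a, e, var):
    """log p(n_t | o_{t-N}..o_{t-1}, q_t) under a diagonal Gaussian.

    history holds the N preceding frames, oldest first; a[i][k] is the k-th
    diagonal element of a_{i+1}; e and var are per-dimension mean and variance.
    """
    ll = 0.0
    for k in range(len(o_t)):
        pred = e[k] + sum(a[i][k] * history[-(i + 1)][k] for i in range(len(a)))
        resid = o_t[k] - pred          # n_t, the part not explained by past frames
        ll += -0.5 * (math.log(2.0 * math.pi * var[k]) + resid * resid / var[k])
    return ll

def sequence_loglik(obs, a, e, var):
    """Log of equation [8] for one state, scored from frame N onwards."""
    N = len(a)
    return sum(ar_frame_loglik(obs[t], obs[t - N:t], a, e, var)
               for t in range(N, len(obs)))

obs = [[0.0], [1.0], [1.5], [1.45]]    # follows o_t = 0.5 o_{t-1} - 0.3 o_{t-2} + 1
ll_ar = sequence_loglik(obs, [[0.5], [-0.3]], [1.0], [1.0])
ll_std = sequence_loglik(obs, [[0.0], [0.0]], [1.0], [1.0])
print(ll_ar, ll_std)
```

Setting every correlation coefficient to zero makes the residual equal to $o_t - e$, so the score reduces to that of a standard Gaussian HMM state, which matches the statement that the model reduces to the standard HMM when frame correlation is ignored.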
3. Expectation-maximization procedure
For state sequences modeled by Gaussian mixtures, maximizing the likelihood function p(O|W) has been shown to be equivalent to maximizing the function Q, where:
$$Q = \sum_{t=1}^{T} \sum_{m=1}^{M} \gamma_{q_t,m}(t)\, \ln p(n_t \mid o_{t-N}^{t-1}, q_t). \qquad [9]$$
Applying the state-dependent autoregressive feature model, the Q function can be rewritten as:
$$Q = \sum_{t=1}^{T} \sum_{m=1}^{M} \gamma_{q_t,m}(t)\, \ln p(n_t \mid o_{t-N}^{t-1}, q_t) \qquad [10]$$
$$= \sum_{t=1}^{T} \sum_{m=1}^{M} \gamma_{q_t,m}(t)\, \ln p\Big(o_t - \sum_{i=1}^{N} a_{m,i}\, o_{t-i} - e_{m,t} \,\Big|\, q_t\Big)$$
$$= \sum_{t=1}^{T} \sum_{m=1}^{M} \gamma_m(t) \Big[ \ln 2\pi |W_m| + \Big(o_t - \sum_{i=1}^{N} a_{m,i}\, o_{t-i} - e_{m,t}\Big)^{\top} W_m^{-1} \Big(o_t - \sum_{i=1}^{N} a_{m,i}\, o_{t-i} - e_{m,t}\Big) \Big]$$
To maximize the Q function with respect to the mixture parameters, the expectation-maximization (EM) procedure can be used. For each utterance, the mixture occupancy is the missing data. The following iterative EM procedure can therefore be formulated.
Expectation step: given the means $e_m$, variances $W_m$ and correlation matrices $a_{m,i}$, the expected alignment $\gamma_m(t)$ can be obtained with the forward-backward technique as follows:
$$\gamma_m(t) = p(q_{s,m} \mid e_{m,t}, W_m, a_{m,i}, o_{t-N}^{t-1}) = \alpha_m(t)\, \beta_m(t). \qquad [11]$$
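The expectation step of equation [11] can be sketched with a plain forward-backward pass over per-frame likelihoods. In the model described here those likelihoods would be the residual likelihoods of equation [7]; the sketch simply takes them as input. The toy numbers are invented, no scaling is applied (so it is only suitable for short sequences), and the product of α and β is divided by the total likelihood so that the occupancies at each frame sum to one, which is the usual convention and an assumption relative to the formula as printed.

```python
def forward_backward(pi, A, frame_lik):
    """Return gamma[t][j] = alpha_j(t) * beta_j(t) / p(O).

    pi[j] is the initial probability of state j, A[i][j] the transition
    probability, and frame_lik[t][j] the likelihood of frame t in state j.
    """
    T, S = len(frame_lik), len(pi)
    alpha = [[0.0] * S for _ in range(T)]
    beta = [[1.0] * S for _ in range(T)]
    for j in range(S):
        alpha[0][j] = pi[j] * frame_lik[0][j]
    for t in range(1, T):                      # forward pass
        for j in range(S):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(S)) * frame_lik[t][j]
    for t in range(T - 2, -1, -1):             # backward pass
        for i in range(S):
            beta[t][i] = sum(A[i][j] * frame_lik[t + 1][j] * beta[t + 1][j] for j in range(S))
    total = sum(alpha[T - 1])                  # p(O)
    return [[alpha[t][j] * beta[t][j] / total for j in range(S)] for t in range(T)]

# Hypothetical two-state example.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
frame_lik = [[0.9, 0.3], [0.1, 0.7], [0.1, 0.7]]
gamma = forward_backward(pi, A, frame_lik)
print(gamma)
```

Each row of `gamma` gives the posterior weight with which a frame contributes to the re-estimation of each state's parameters in the maximization step.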
Maximization step: given the expectation of the missing data, differentiating Q with respect to the mixture parameters (means, variances and correlation matrices) and setting the derivatives to zero yields the following estimation formulas:
$$e_{m,t} = \frac{\sum_{t=1}^{T} \gamma_m(t) \Big(o_t - \sum_{i=1}^{N} a_{m,i}\, o_{t-i}\Big)}{\sum_{t=1}^{T} \gamma_m(t)} \qquad [12]$$
$$W_m = \mathrm{diag}\left[ \frac{\sum_{t=1}^{T} \gamma_m(t) \Big(o_t - \sum_{i=1}^{N} a_{m,i}\, o_{t-i} - e_{m,t}\Big) \Big(o_t - \sum_{i=1}^{N} a_{m,i}\, o_{t-i} - e_{m,t}\Big)^{\top}}{\sum_{t=1}^{T} \gamma_m(t)} \right].$$
For the diagonal matrices $a_{m,i}$ (1≤i≤N), the vector formed by the k-th diagonal elements of the N diagonal matrices can be estimated as:
[Equation [13], given in the source as image C0182355300097]
The N diagonal correlation matrices can thus be estimated simultaneously, element by element, using the above formula.
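Since the formula for the correlation coefficients is available only as an image, the sketch below does not reproduce it. Instead it shows one way to carry out the maximization step for a single feature dimension of a single mixture, under the assumption that setting the derivatives of Q to zero amounts to a γ-weighted least-squares problem in the unknowns (a_1, ..., a_N, e). The function names, the joint solve for the coefficients and the mean, and the small Gaussian-elimination helper are all choices of this example rather than the patent's stated procedure.

```python
def solve_linear(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[r]] for r, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

def m_step_1d(o, gamma, N):
    """Re-estimate (a_1..a_N, e, W) for one dimension of one mixture."""
    dim = N + 1
    M = [[0.0] * dim for _ in range(dim)]
    b = [0.0] * dim
    for t in range(N, len(o)):
        x = [o[t - i] for i in range(1, N + 1)] + [1.0]   # past frames plus bias
        for r in range(dim):
            b[r] += gamma[t] * x[r] * o[t]
            for c in range(dim):
                M[r][c] += gamma[t] * x[r] * x[c]
    theta = solve_linear(M, b)                            # weighted normal equations
    a, e = theta[:N], theta[N]
    resid = [o[t] - sum(a[i] * o[t - 1 - i] for i in range(N)) - e
             for t in range(N, len(o))]
    occ = sum(gamma[N:])
    W = sum(g * r * r for g, r in zip(gamma[N:], resid)) / occ
    return a, e, W

# Noise-free data from o_t = 0.5 o_{t-1} - 0.3 o_{t-2} + 1 (made-up values).
o = [0.0, 1.0]
for t in range(2, 12):
    o.append(0.5 * o[t - 1] - 0.3 * o[t - 2] + 1.0)
gamma = [1.0] * len(o)
a_hat, e_hat, W_hat = m_step_1d(o, gamma, N=2)
print(a_hat, e_hat, W_hat)
```

The last row of the normal equations is the mean update of equation [12], so the jointly estimated mean satisfies that formula for the estimated coefficients. Because the matrices are diagonal, the same routine can be run independently for each feature dimension k.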
4. Embodiment
Fig. 1 is a flowchart of a frame correlation process in an HMM-based environment according to an embodiment of the invention. In the illustrated embodiment, the HMM-based frame correlation process is applied to speech recognition. It should be understood, however, that the frame correlation can be used in other applications, such as speech synthesis or audio processing.
The frame correlation process includes computing the frame-independent portion of the speech at step 100. The frame-independent portion is computed by calculating the observation vector according to the state-dependent autoregressive (AR) model. As described above, the state-dependent AR model is used to capture the correlation between successive observation vectors. The observation vector is therefore generated according to equation [6] above:
$$o_t = \sum_{i=1}^{N} a_i\, o_{t-i} + e_t + n_t,$$
where $a_i$ is a diagonal matrix, $e_t$ is the mean vector associated with a component in the HMM state, and $n_t$ is zero-mean Gaussian noise.
At step 102, the frame-independent portion of the speech is input into the Gaussian model. The frame probability is then computed at step 104. At step 106, the state-dependent AR coefficients are estimated by expectation-maximization, following the steps described in Section 3 above. In one embodiment, equation [13] above can be used to estimate the N diagonal correlation matrices simultaneously.
Fig. 2 is a block diagram of an HMM-based frame correlation system 200 according to an embodiment of the invention. In the illustrated embodiment, the system 200 comprises an autoregressive (AR) modeling unit 202 and an expectation-maximization unit 204.
The AR modeling unit 202 receives the diagonal matrices ($a_i$), the component-dependent mean vector ($e_t$) and the zero-mean Gaussian noise ($n_t$). The AR modeling unit 202 then computes an observation vector ($o_t$).
The expectation-maximization unit 204 may comprise a Gaussian model block 206, an expectation block 208 and a maximization block 210. The Gaussian model block 206 receives the computed observation vector ($o_t$) and computes the frame probability. The observation vector ($o_t$), together with the means ($e_m$), variances ($W_m$) and correlation matrices ($a_{m,i}$), is passed to the expectation block 208 to compute the expected alignment $\gamma_m(t)$. In one embodiment, the expectation block 208 uses the forward-backward technique to compute $\gamma_m(t)$.
The maximization block 210 receives the expected alignment ($\gamma_m(t)$) and estimates the state-dependent AR coefficients ($a_i$). These coefficients can be expressed as diagonal matrices.
5. Experimental results
The model described above was applied to a large-vocabulary, speaker-independent continuous speech recognition task. The experiments were carried out on the Wall Street Journal 20k English task. The baseline system is a gender-independent, within-word-triphone, Gaussian-mixture, tied-state HMM system. In this model set, each speech model has three emitting states and a left-to-right topology. Two silence models are also used. The first silence model, a short-pause model, has a single emitting state that can be skipped. The second silence model is a fully connected three-emitting-state model used to represent longer periods of silence. The speech is parameterized into 12 mel-frequency cepstral coefficients (MFCCs), together with the normalized log-energy and the first and second differentials of these parameters. This parameterization yields a 39-dimensional feature vector, to which cepstral mean normalization is applied. The acoustic training data comprise 36,696 utterances from the SI-284 WSJ0 and WSJ1 sets. The ICRC large-vocabulary continuous speech recognition (LVCSR) system is trained with decision-tree-based state tying, which clusters the states into 6,617 tied triphone states. A 24k word list and dictionary are used with the trigram language model. All decoding is performed with a dynamic network decoder.
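The 39-dimensional feature vector follows from 12 MFCCs plus log-energy (13 static values) extended with first and second differentials (13 × 3 = 39). The sketch below shows how such a vector could be assembled and mean-normalized. The patent does not specify the differential formula, so the central difference with clamped edges used here is an assumption, as are the function names and the synthetic input frames.

```python
def add_deltas(frames):
    """Append first and second differentials to each static frame.

    Uses a central difference with clamped edges, which is only one of
    several common definitions (an assumption of this sketch).
    """
    def delta(seq):
        T = len(seq)
        return [[(seq[min(t + 1, T - 1)][k] - seq[max(t - 1, 0)][k]) / 2.0
                 for k in range(len(seq[0]))] for t in range(T)]
    d1 = delta(frames)
    d2 = delta(d1)
    return [f + x + y for f, x, y in zip(frames, d1, d2)]

def cepstral_mean_normalize(frames):
    """Subtract the per-dimension mean computed over the whole utterance."""
    T, D = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / T for k in range(D)]
    return [[f[k] - mean[k] for k in range(D)] for f in frames]

# Synthetic static features: 5 frames of 12 MFCCs + log-energy (made-up values).
static = [[float(t * t + k) for k in range(13)] for t in range(5)]
features = cepstral_mean_normalize(add_deltas(static))
print(len(features), len(features[0]))
```

After normalization each of the 39 dimensions has zero mean over the utterance, which removes constant channel offsets from the cepstral features.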
For the particular application of the model considered here, the context-dependent phone states that belong to the same monophone are assigned the same set of diagonal correlation matrices. The order of the autoregressive feature model is chosen as 3. This results in only 117 additional parameters (3 diagonal matrices of 39 elements each). The correlation matrices are built on models that already have their final number of mixture components. The conversion from the standard to the new hidden Markov model is achieved by setting the 117 additional correlation parameters to 0. Finally, 5 iterations of embedded forward-backward re-estimation are performed.
The experimental results are compared in Table 1. They show that the average word error rate (WER) is reduced from 11.8 (baseline) to 11.4. In addition, the data in the table show that the WER is reduced for most speakers when the new system is used.
               Average    440    441    442    443    444    445    446    447
Standard HMM    11.8      9.9   21.0   12.4   14.6   11.9    6.1   10.3    8.4
New HMM         11.4      9.5   20.8   11.8   13.0   12.0    5.9    9.7    9.6
Table 1: Performance of the standard system and the frame-correlation system on the 333 test utterances, by speaker
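The figures in Table 1 can be checked with a few lines of arithmetic. The sketch below copies the per-speaker and average values from the table and derives the relative reduction in the average error rate and the list of speakers whose error rate fell; the variable names are chosen for the example.

```python
# Values copied from Table 1.
speakers = ["440", "441", "442", "443", "444", "445", "446", "447"]
standard = [9.9, 21.0, 12.4, 14.6, 11.9, 6.1, 10.3, 8.4]
new = [9.5, 20.8, 11.8, 13.0, 12.0, 5.9, 9.7, 9.6]
avg_standard, avg_new = 11.8, 11.4

# Relative reduction of the average word error rate.
relative_reduction = (avg_standard - avg_new) / avg_standard

# Speakers for whom the new HMM gives a lower error rate.
improved = [s for s, b, n in zip(speakers, standard, new) if n < b]
worse = [s for s, b, n in zip(speakers, standard, new) if n > b]

print(f"relative reduction: {relative_reduction:.1%}")
print("improved:", improved)
print("worse:", worse)
```

The average falls by about 3.4% relative, and six of the eight speakers improve while speakers 444 and 447 get worse, which is consistent with the statement that the error rate is reduced for most speakers.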
Although specific embodiments of the invention have been shown and described, the description is for purposes of illustration only and not limitation. Accordingly, throughout this detailed description, numerous specific details are set forth for purposes of explanation in order to provide a thorough understanding of the invention. It will be apparent to those of ordinary skill in the art, however, that the system and method can be practiced without these specific details. For example, although the illustrated embodiments and examples are described in terms of modeling frame correlation in a hidden Markov process for speech recognition, the frame correlation method shown can be used in other applications, such as speech synthesis and/or audio processing. In other instances, well-known structures and functions have not been described in detail, to avoid obscuring the subject matter of the invention. Accordingly, the scope and spirit of the invention should be determined by the appended claims.

Claims (22)

1. A method for incorporating frame correlation into a hidden Markov model, comprising:
computing a frame-independent portion of speech;
inputting the frame-independent portion of the speech into a Gaussian model;
computing a frame probability; and
estimating autoregressive coefficients.
2. The method according to claim 1, wherein computing the frame-independent portion of speech comprises computing an observation vector.
3. The method according to claim 2, wherein the observation vector is based on a state-dependent autoregressive (AR) model.
4. The method according to claim 2, wherein the observation vector for the current state is computed by multiplying diagonal matrices by past observation vectors, summing the products, and adding the resulting sum to a mean vector.
5. The method according to claim 4, further comprising:
adding Gaussian noise to the observation vector.
6. The method according to claim 5, wherein the Gaussian noise has zero mean.
7. The method according to claim 1, wherein computing the frame probability comprises maximizing the likelihood of a state sequence.
8. The method according to claim 1, wherein computing the frame probability comprises maximizing a Q function.
9. The method according to claim 8, wherein maximizing the Q function comprises an iterative expectation-maximization procedure.
10. The method according to claim 9, wherein the expectation-maximization comprises formulating an expectation of the data.
11. The method according to claim 10, wherein formulating the expectation of the data comprises receiving means, variances and correlation matrices, and computing an expected alignment.
12. The method according to claim 11, wherein computing the expected alignment comprises using a forward-backward technique.
13. The method according to claim 9, wherein the expectation-maximization comprises performing a Q-function maximization.
14. The method according to claim 13, wherein performing the Q-function maximization comprises receiving means, variances and correlation matrices, differentiating the Q function with respect to the means, variances and correlation matrices, and setting the derivatives of the Q function equal to zero in order to estimate new values of the means and variances.
15. The method according to claim 11, wherein estimating the autoregressive coefficients comprises estimating a set of cross-correlation matrices using the estimated means, the variances and the expected alignment.
16. A frame correlation system for a hidden Markov model, comprising:
means for computing a frame-independent portion of speech;
means for computing a frame probability, coupled to the means for computing the frame-independent portion of speech;
means for computing an expected alignment, coupled to the means for computing the frame-independent portion of speech; and
means for estimating autoregressive coefficients, coupled to the means for computing the expected alignment.
17. The system according to claim 16, wherein the means for computing the frame-independent portion of speech comprises an autoregressive modeling unit.
18. A system for incorporating frame correlation into a hidden Markov model, comprising:
an autoregressive modeling unit for computing an observation vector; and
an expectation-maximization unit for estimating coefficients for the autoregressive modeling unit.
19. The system according to claim 18, wherein the expectation-maximization unit comprises a Gaussian model block, an expectation block and a maximization block.
20. The system according to claim 19, wherein the Gaussian model block receives the observation vector and computes a frame probability.
21. The system according to claim 19, wherein the expectation block receives means and variances and computes an expected alignment.
22. The system according to claim 21, wherein the maximization block receives the expected alignment and estimates the coefficients for the autoregressive modeling unit.
CN01823553.0A 2001-06-22 2001-06-22 Hidden Markov model with frame correlation Expired - Fee Related CN1217315C (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2001/001037 WO2003001507A1 (en) 2001-06-22 2001-06-22 Hidden markov model with frame correlation

Publications (2)

Publication Number Publication Date
CN1545695A CN1545695A (en) 2004-11-10
CN1217315C true CN1217315C (en) 2005-08-31

Family

ID=4574818

Family Applications (1)

Application Number Title Priority Date Filing Date
CN01823553.0A Expired - Fee Related CN1217315C (en) 2001-06-22 2001-06-22 Hidden Markov model with frame correlation

Country Status (2)

Country Link
CN (1) CN1217315C (en)
WO (1) WO2003001507A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9721569B2 (en) * 2015-05-27 2017-08-01 Intel Corporation Gaussian mixture model accelerator with direct memory access engines corresponding to individual data streams
CN113057850B (en) * 2021-03-11 2022-06-10 东南大学 Recovery robot control method based on probability motion primitive and hidden semi-Markov

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08241095A (en) * 1995-03-06 1996-09-17 Atr Onsei Honyaku Tsushin Kenkyusho:Kk Speaker adaptation device and speech recognizing device
US5924066A (en) * 1997-09-26 1999-07-13 U S West, Inc. System and method for classifying a speech signal
JPH11212591A (en) * 1998-01-23 1999-08-06 Pioneer Electron Corp Pattern recognition method, device therefor and recording medium recorded with pattern recognizing program

Also Published As

Publication number Publication date
WO2003001507A1 (en) 2003-01-03
CN1545695A (en) 2004-11-10


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20050831

Termination date: 20150622

EXPY Termination of patent right or utility model