CN101807397B - Voice detection method of noise robustness based on hidden semi-Markov model - Google Patents

Voice detection method of noise robustness based on hidden semi-Markov model

Info

Publication number
CN101807397B
CN101807397B CN2010101175378A CN201010117537A CN101807397B CN 101807397 B CN101807397 B CN 101807397B CN 2010101175378 A CN2010101175378 A CN 2010101175378A CN 201010117537 A CN201010117537 A CN 201010117537A CN 101807397 B CN101807397 B CN 101807397B
Authority
CN
China
Prior art keywords
sigma
noise
markov model
parameter
likelihood ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2010101175378A
Other languages
Chinese (zh)
Other versions
CN101807397A (en)
Inventor
刘祥龙
梁苑
单宝松
楼奕华
李未
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN2010101175378A
Publication of CN101807397A
Application granted
Publication of CN101807397B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a noise-robust voice detection method based on a hidden semi-Markov model, comprising the following steps: (1) building the hidden semi-Markov model λ = (A, B, π, τ); (2) initializing the parameters π and τ of the model; (3) applying the DCT to each non-empty input signal; (4) estimating the parameters of B from the first several input frames and the likelihood ratio test threshold from their likelihood ratios, then performing the likelihood ratio test to complete voice detection; and (5) dynamically adjusting the parameters of B and the likelihood ratio test threshold. The method adjusts the model parameters and the test threshold dynamically according to the duration characteristics of speech and noise, and uses the likelihood ratio test to achieve real-time, noise-robust voice detection.

Description

A noise-robust speech detection method based on a hidden semi-Markov model
Field of the invention
The present invention relates to speech signal processing in noisy environments, and in particular to a noise-robust speech detection method based on a hidden semi-Markov model.
Background of the invention
Speech detection separates the speech portions of a signal from the noise portions and is widely used in speech coding, transmission, speech enhancement, and speech recognition. Methods based on statistical models already achieve fairly good detection performance, but their performance fluctuates considerably across noise types and signal-to-noise ratios (SNRs). In practical applications, noise environments are varied and unavoidable, so noise robustness has become a focus of current speech detection research. A robust speech detection algorithm that adapts to different noise environments is therefore significant for speech coding, enhancement, recognition, and similar applications.
Summary of the invention
The technical problem to be solved by the present invention: traditional speech detection lacks robustness in noisy environments. The invention provides a noise-robust speech detection method, based on a hidden semi-Markov model, that works across different SNRs and different noise environments.
The technical solution adopted by the present invention: a noise-robust speech detection method based on a hidden semi-Markov model, characterized by the following steps:
(1) Build a hidden semi-Markov model $\lambda = (A, B, \pi, \tau)$ with two states $Q = \{q_0, q_1\}$, non-speech and speech, where:
$q_0$ is non-speech and $q_1$ is speech;
$A = \{a_{ij}\}$, $i, j = 0, 1$, are the transition probabilities from state $q_i$ to state $q_j$;
$B = \{b_i(O_t)\}$, $i = 0, 1$, $t > 0$, where $b_i(O_t) = P(O_t \mid q_i)$ is the conditional probability of the input-signal DCT coefficients $O_t = \{o_1, o_2, \ldots, o_K\}$, $K > 0$, given state $q_i$, and $o_1, o_2, \ldots, o_K$ are mutually independent;
$\pi = \{\pi_i\}$, $i = 0, 1$, $\pi_i > 0$, is the prior probability of state $q_i$;
$\tau = \{P(d \mid q_i)\}$, $i = 0, 1$, $d > 0$, is the probability that state $q_i$ lasts for duration $d$;
(2) Using training-data statistics, initialize the prior state probabilities $\pi = \{\pi_i\}$ of the hidden semi-Markov model and the parameters $(k_i, \omega_i)$ of the Weibull state-duration distribution, and set the signal frame index $t = 0$;
(3) If the input speech signal $S$ is empty, finish; otherwise, apply the DCT to $S$ and set $t = t + 1$;
(4) If $t < P$, judge the current signal to be noise (VAD = 0) and go to (3). If $t = P$, estimate the Gaussian parameters $(\mu_i^G, \sigma_i)$ and the Laplace parameters $(\mu_i^L, l_i)$ of the distributions of the DCT coefficients $O_t$ under each state, compute the likelihood ratios $LRT_t$ of the first $P$ frames, initialize the likelihood ratio test threshold $\eta$, judge the current signal to be noise (VAD = 0), and go to (3). If $t > P$, compute the likelihood ratio $LRT_t$; if $LRT_t \ge \eta$, judge the current signal to be speech (VAD = 1); if $LRT_t < \eta$, judge it to be noise (VAD = 0); then go to (5);
(5) Adjust the Gaussian parameters $(\mu_i^G, \sigma_i)$ and the Laplace parameters $(\mu_i^L, l_i)$ of the distributions of the DCT coefficients $O_t$ under each state, and update the likelihood ratio test threshold $\eta$; go to (3).
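For illustration only, the control flow of steps (3) to (5) could be sketched in Python as follows. The names `dct2` and `run_vad` and the three callback hooks (`init_model`, `lrt`, `update`) are hypothetical and are not part of the disclosure; the orthonormal DCT-II is one common choice of DCT, which the text does not specify.

```python
import numpy as np

def dct2(frame):
    """Orthonormal DCT-II of a 1-D frame, computed directly from its definition."""
    x = np.asarray(frame, dtype=float)
    n = x.size
    k = np.arange(n)[:, None]          # output coefficient index
    m = np.arange(n)[None, :]          # input sample index
    basis = np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return scale * (basis @ x)

def run_vad(frames, P, init_model, lrt, update):
    """Return one 0/1 decision per frame, following steps (3)-(5).

    Hypothetical hooks (assumptions of this sketch):
      init_model(coeffs) -> (model, eta)   called once, at t == P
      lrt(model, O)      -> float          likelihood ratio for frame O
      update(model, eta, O, score, vad) -> (model, eta)
    """
    decisions, warmup = [], []
    model, eta = None, None
    for t, frame in enumerate(frames, start=1):
        O = dct2(frame)                        # step (3)
        if t < P:                              # step (4), t < P: assume noise
            warmup.append(O)
            decisions.append(0)
        elif t == P:                           # step (4), t = P: initialize
            warmup.append(O)
            model, eta = init_model(np.array(warmup))
            decisions.append(0)
        else:                                  # step (4), t > P: test
            score = lrt(model, O)
            vad = 1 if score >= eta else 0
            decisions.append(vad)
            model, eta = update(model, eta, O, score, vad)   # step (5)
    return decisions
```

Passing the model-specific parts in as callbacks keeps the loop itself independent of how the likelihood ratio and the parameter updates are computed.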
According to a further aspect of the invention, step (1) further comprises:
Determining, from training-data statistics:
(1) $a_{00} = a_{11} = 0$, $a_{10} = a_{01} = 1$;
(2) for $q_0$, $b_0(o_i^t)$ is the Gaussian distribution
$$N(o_i^t;\ \mu_i^G, \sigma_i) = \frac{1}{\sigma_i\sqrt{2\pi}}\, e^{-\frac{(o_i^t - \mu_i^G)^2}{\sigma_i^2}};$$
(3) for $q_1$, $b_1(o_i^t)$ is the distribution
$$L(o_i^t;\ \mu_i^L, l_i) = \frac{1}{4 l_i}\, e^{-\frac{\sigma_i^2}{l_i^2}} \left[ e^{\frac{o_i'}{l_i}}\, \operatorname{erfc}\!\left(\frac{l_i o_i' + \sigma_i^2}{\sqrt{2}\, l_i \sigma_i}\right) + e^{-\frac{o_i'}{l_i}}\, \operatorname{erfc}\!\left(\frac{-l_i o_i' + \sigma_i^2}{\sqrt{2}\, l_i \sigma_i}\right) \right],$$
where $o_i' = o_i^t - \mu_i^G - \mu_i^L$;
(4) for $q_0$ and $q_1$, $P(d \mid q_i)$ is the Weibull distribution
$$W(d;\ k_i, \omega_i) = \frac{k_i}{\omega_i} \left(\frac{d}{\omega_i}\right)^{k_i - 1} e^{-\left(\frac{d}{\omega_i}\right)^{k_i}}.$$
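As a small illustration of the Weibull duration model in item (4), the sketch below evaluates the density and tabulates duration probabilities for d = 1..D frames. The function names and the renormalization over the truncated range 1..D are assumptions of this sketch; the text does not say how the continuous density is discretized.

```python
import math

def weibull_pdf(d, k, omega):
    """W(d; k, omega) = (k/omega) * (d/omega)**(k-1) * exp(-(d/omega)**k)."""
    z = d / omega
    return (k / omega) * z ** (k - 1) * math.exp(-(z ** k))

def duration_probs(k, omega, D):
    """P(d | q_i) for d = 1..D, renormalized to sum to 1 (an assumption)."""
    raw = [weibull_pdf(d, k, omega) for d in range(1, D + 1)]
    total = sum(raw)
    return [r / total for r in raw]
```

For example, `duration_probs(1.5, 10.0, 50)` returns a list of 50 probabilities, one per candidate duration in frames.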
According to a further aspect of the invention, step (2) further comprises:
(a) from the labels in the training set, count the noise-duration frequencies $F_0$ and the speech-duration frequencies $F_1$;
(b) from $F_i$, approximate the maximum-likelihood estimate of the parameters $(k_i, \omega_i)$ of $W(d; k_i, \omega_i)$;
(c) determine the prior probabilities of the states in the hidden semi-Markov model.
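One way to obtain the maximum-likelihood Weibull parameters in item (b) is sketched below. It assumes the raw segment durations are available as a list (rather than only a histogram), solves the standard Weibull shape equation by bisection, and then recovers the scale in closed form. The function name `fit_weibull` and the bisection bracket are assumptions of this sketch.

```python
import numpy as np

def fit_weibull(durations, iters=100):
    """Maximum-likelihood (k, omega) for positive, non-identical durations."""
    x = np.asarray(durations, dtype=float)
    y = x / x.max()              # rescaling leaves the shape equation unchanged
    log_y = np.log(y)
    mean_log = log_y.mean()

    def score(k):
        # Zero of this increasing function is the ML shape estimate:
        # sum(y^k ln y) / sum(y^k) - 1/k - mean(ln y) = 0
        w = y ** k
        return (w * log_y).sum() / w.sum() - 1.0 / k - mean_log

    lo, hi = 1e-3, 100.0         # assumed bracket for the shape parameter
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    k = 0.5 * (lo + hi)
    omega = x.max() * np.mean(y ** k) ** (1.0 / k)   # omega^k = mean(x^k)
    return k, omega
```

The fit would be run once on the noise durations and once on the speech durations to obtain $(k_0, \omega_0)$ and $(k_1, \omega_1)$.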
According to a further aspect of the invention, step (4) further comprises:
(a) compute the forward variables $\alpha_i^t$, $i = 0, 1$:
If $t = 1$: $$\alpha_i^{t*} = \pi_i\, P(d = 1 \mid q_i)\, b_i(O_t);$$
If $t > 1$: $$\alpha_i^{t*} = \sum_{d=1}^{D} \sum_{j \ne i} \alpha_i^{(t-d)*}\, a_{ji}\, P(d \mid q_i) \prod_{s=t-d+1}^{t} b_i(O_s),$$
$$\alpha_i^{t} = \sum_{d=1}^{D} \sum_{d'=0}^{d} \sum_{j \ne i} \alpha_i^{(t-d')*}\, a_{ji}\, P(d \mid q_i) \prod_{s=t-d'+1}^{t} b_i(O_s);$$
(b) compute the likelihood ratio $$LRT_t = \ln(\pi_0\, \alpha_1^t) - \ln(\pi_1\, \alpha_0^t);$$
(c) at $t = P$, from the DCT coefficients $O_t$ of the first $P$ frames of the input signal, where $P > 0$ and $1 \le t \le P$, estimate the parameters $(\mu_i^G, \sigma_i)$ and $(\mu_i^L, l_i)$ of the distributions in $B$ as:
$$\mu_i^L = \mu_i^G = \frac{1}{P} \sum_{t=1}^{P} o_i^t; \qquad \sigma_i = \frac{1}{P-1} \sum_{t=1}^{P} \left(o_i^t - \mu_i^G\right)^2; \qquad l_i = R\,\sigma_i^2 / 2;$$
where $P$ and $R$ are constants;
(d) at $t = P$, from the DCT coefficients $O_t$ of the first $P$ frames of the input signal, where $P > 0$ and $1 \le t \le P$, estimate the likelihood ratio test threshold as
Figure GSA00000046689500039
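The initial parameter estimate in item (c) can be illustrated with a few lines of NumPy. This sketch follows one literal reading of the printed formulas (a per-coefficient sample mean, a sum of squared deviations divided by P - 1, and l = R·σ²/2); any root signs lost in extraction are not restored here, and the function name and dictionary keys are hypothetical. The threshold estimate in item (d) is not sketched because its formula is only available as an image.

```python
import numpy as np

def init_emission_params(coeffs, R):
    """Estimate (mu_G, sigma) and (mu_L, l) from the first P frames.

    coeffs: array of shape (P, K), one row of DCT coefficients per frame.
    R: scaling constant from the text.
    """
    o = np.asarray(coeffs, dtype=float)
    P = o.shape[0]
    mu = o.mean(axis=0)                              # mu_L = mu_G
    sigma = ((o - mu) ** 2).sum(axis=0) / (P - 1)    # literal reading of sigma_i
    l = R * sigma ** 2 / 2.0                         # literal reading of l_i
    return {"mu_G": mu, "mu_L": mu.copy(), "sigma": sigma, "l": l}
```

Each returned array has one entry per DCT coefficient, matching the assumption that the K coefficients are modeled independently.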
According to a further aspect of the invention, step (5) further comprises:
(a) if the current frame is judged to be noise, adjust the parameters $(\mu_i^G, \sigma_i)$ and the threshold $\eta$:
$$\mu_i^G = \rho_0\, \mu_i^G + (1 - \rho_0)\, o_i$$
$$\sigma_i = \rho_0\, \sigma_i + (1 - \rho_0) \left(o_i - \mu_i^G\right)^2$$
$$\eta = \rho_0\, \eta + (1 - \rho_0)\, LRT_t$$
otherwise, adjust the parameters $(\mu_i^L, l_i)$ and the threshold $\eta$:
$$\mu_i^L = \rho_1\, \mu_i^L + (1 - \rho_1)\, o_i$$
$$l_i = \rho_1\, l_i + (1 - \rho_1) \left| o_i - \mu_i^G \right|$$
$$\eta = \rho_1\, \eta + (1 - \rho_1)\, LRT_t$$
where $0 < \rho_0, \rho_1 < 1$ are update constants.
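The update rules above are exponential smoothing steps, which can be sketched as two small functions. The function names are hypothetical, and one detail is an assumption: the deviation terms use the value of $\mu_i^G$ from before the update, since the text does not say whether the mean is updated first.

```python
def update_noise(mu_G, sigma, eta, o, lrt, rho0):
    """Frame judged as noise: smooth the Gaussian parameters and the threshold."""
    new_mu_G = rho0 * mu_G + (1 - rho0) * o
    new_sigma = rho0 * sigma + (1 - rho0) * (o - mu_G) ** 2   # uses the old mean
    new_eta = rho0 * eta + (1 - rho0) * lrt
    return new_mu_G, new_sigma, new_eta

def update_speech(mu_L, l, eta, o, lrt, mu_G, rho1):
    """Frame judged as speech: smooth the Laplace parameters and the threshold."""
    new_mu_L = rho1 * mu_L + (1 - rho1) * o
    new_l = rho1 * l + (1 - rho1) * abs(o - mu_G)
    new_eta = rho1 * eta + (1 - rho1) * lrt
    return new_mu_L, new_l, new_eta
```

Both functions work elementwise, so `o`, the means, and the scale parameters may be plain floats or NumPy arrays with one entry per DCT coefficient.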
Description of the drawings
Fig. 1 is the basic flowchart of the method of the invention.
Embodiments
Embodiments of the invention are described in detail below with reference to the accompanying drawing.
The principle of the invention is described first.
Human speech is produced when the vocal cords vibrate under an external force and the sound is then shaped by a series of coordinated resonance organs. The whole voicing process can therefore be regarded as a life cycle. Constrained by the characteristics of the human organs themselves, the life cycle of voicing can be assumed to follow certain statistical regularities. These regularities are generally noise-robust, that is, human voicing can be regarded as unaffected by noise in the environment. Describing them accurately therefore makes the modeling of speech activity in noisy environments more realistic and improves the noise robustness of speech detection. In engineering, the Birnbaum-Saunders and Weibull distributions are commonly used to describe life cycles.
Specifically, the basic flow of the proposed method is shown in Fig. 1.
The core ideas of the invention are: build a hidden semi-Markov model of the input speech signal; determine the types of the distributions used in the model by testing on a training data set, and estimate the model parameters from that data set and from the first several frames of the input speech signal; perform speech detection by a likelihood ratio test; and then dynamically update the model parameters and the likelihood ratio test threshold.
The algorithm of the invention is described as follows:
1. Build a hidden semi-Markov model $\lambda = (A, B, \pi, \tau)$ with two states $Q = \{q_0, q_1\}$, non-speech and speech, where: $q_0$ is non-speech and $q_1$ is speech;
$A = \{a_{ij}\}$, $i, j = 0, 1$, are the transition probabilities from state $q_i$ to state $q_j$;
$B = \{b_i(O_t)\}$, $i = 0, 1$, $t > 0$, where $b_i(O_t) = P(O_t \mid q_i)$ is the conditional probability of the input-signal DCT coefficients $O_t = \{o_1, o_2, \ldots, o_K\}$ given state $q_i$, and $o_1, o_2, \ldots, o_K$ are mutually independent;
$\pi = \{\pi_i\}$, $i = 0, 1$, $\pi_i > 0$, is the prior probability of state $q_i$;
$\tau = \{P(d \mid q_i)\}$, $i = 0, 1$, $d > 0$, is the probability that state $q_i$ lasts for duration $d$;
According to statistics on the TIMIT training data set, the distribution types used in the model are as follows:
(1) $a_{00} = a_{11} = 0$, $a_{10} = a_{01} = 1$;
(2) for $q_0$, $b_0(o_i^t)$ is the Gaussian distribution
$$N(o_i^t;\ \mu_i^G, \sigma_i) = \frac{1}{\sigma_i\sqrt{2\pi}}\, e^{-\frac{(o_i^t - \mu_i^G)^2}{\sigma_i^2}};$$
(3) for $q_1$, $b_1(o_i^t)$ is the distribution
$$L(o_i^t;\ \mu_i^L, l_i) = \frac{1}{4 l_i}\, e^{-\frac{\sigma_i^2}{l_i^2}} \left[ e^{\frac{o_i'}{l_i}}\, \operatorname{erfc}\!\left(\frac{l_i o_i' + \sigma_i^2}{\sqrt{2}\, l_i \sigma_i}\right) + e^{-\frac{o_i'}{l_i}}\, \operatorname{erfc}\!\left(\frac{-l_i o_i' + \sigma_i^2}{\sqrt{2}\, l_i \sigma_i}\right) \right],$$
where $o_i' = o_i^t - \mu_i^G - \mu_i^L$;
(4) for $q_0$ and $q_1$, $P(d \mid q_i)$ is the Weibull distribution
$$W(d;\ k_i, \omega_i) = \frac{k_i}{\omega_i} \left(\frac{d}{\omega_i}\right)^{k_i - 1} e^{-\left(\frac{d}{\omega_i}\right)^{k_i}}.$$
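To make the two emission models concrete, the sketch below evaluates a Gaussian density and the density of a Laplace variable plus independent Gaussian noise, then sums log-densities over the independent DCT coefficients. Two assumptions apply. First, the Gaussian uses the textbook variance parameterization. Second, the Laplace-plus-Gaussian density uses the textbook closed form of that convolution, which may differ in exponent sign and normalization from the formula as printed, since the extracted formula is hard to read. All function names are hypothetical.

```python
import math

def gaussian_pdf(o, mu, var):
    """Textbook normal density with mean mu and variance var."""
    return math.exp(-(o - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def laplace_gauss_pdf(o, mu, l, sd):
    """Density of Laplace(scale l) plus Gaussian(std sd), centered at mu.

    Textbook convolution closed form; exp() may overflow for very large |o - mu| / l.
    """
    x = o - mu
    a = math.exp(sd * sd / (2.0 * l * l)) / (4.0 * l)
    t1 = math.exp(x / l) * math.erfc((sd / l + x / sd) / math.sqrt(2.0))
    t2 = math.exp(-x / l) * math.erfc((sd / l - x / sd) / math.sqrt(2.0))
    return a * (t1 + t2)

def log_emission(O, pdf, params):
    """ln b_i(O_t) as a sum over independent coefficients.

    params holds one tuple of pdf arguments per coefficient.
    """
    return sum(math.log(pdf(o, *p)) for o, p in zip(O, params))
```

Working with the sum of logs rather than the product of densities avoids underflow when K is large.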
2. Using training-data statistics, initialize the prior state probabilities $\pi = \{\pi_i\}$ of the hidden semi-Markov model and the parameters $(k_i, \omega_i)$ of the state-duration distribution, and set the signal frame index $t = 0$. The method is as follows:
(a) from the labels in the training set, count the noise-duration frequencies $F_0$ and the speech-duration frequencies $F_1$;
(b) from $F_i$, approximate the maximum-likelihood estimate of the parameters $(k_i, \omega_i)$ of $W(d; k_i, \omega_i)$;
(c) the prior probabilities of the states in the hidden semi-Markov model are
Figure GSA00000046689500055
3. If the input speech signal $S$ is empty, finish; otherwise, apply the DCT to $S$
Figure GSA00000046689500056
and set $t = t + 1$;
4. If $t < P$, judge the current signal to be noise (VAD = 0) and go to (3). If $t = P$, estimate the parameters $(\mu_i^G, \sigma_i)$ and $(\mu_i^L, l_i)$ of the distributions of the DCT coefficients $O_t$ under each state, compute the likelihood ratios $LRT_t$ of the first $P$ frames, initialize the likelihood ratio test threshold $\eta$, judge the current signal to be noise (VAD = 0), and go to (3). If $t > P$, compute the likelihood ratio $LRT_t$; if $LRT_t \ge \eta$, judge the current signal to be speech (VAD = 1); if $LRT_t < \eta$, judge it to be noise (VAD = 0); then go to (5). The method is as follows:
(a) compute the forward variables $\alpha_i^t$, $i = 0, 1$:
If $t = 1$: $$\alpha_i^{t*} = \pi_i\, P(d = 1 \mid q_i)\, b_i(O_t);$$
If $t > 1$: $$\alpha_i^{t*} = \sum_{d=1}^{D} \sum_{j \ne i} \alpha_i^{(t-d)*}\, a_{ji}\, P(d \mid q_i) \prod_{s=t-d+1}^{t} b_i(O_s),$$
$$\alpha_i^{t} = \sum_{d=1}^{D} \sum_{d'=0}^{d} \sum_{j \ne i} \alpha_i^{(t-d')*}\, a_{ji}\, P(d \mid q_i) \prod_{s=t-d'+1}^{t} b_i(O_s);$$
(b) compute the likelihood ratio $$LRT_t = \ln(\pi_0\, \alpha_1^t) - \ln(\pi_1\, \alpha_0^t);$$
(c) at $t = P$, from the DCT coefficients $O_t$ of the first $P$ frames of the input signal, where $P > 0$ and $1 \le t \le P$, estimate the parameters $(\mu_i^G, \sigma_i)$ and $(\mu_i^L, l_i)$ of the distributions in $B$ as:
$$\mu_i^L = \mu_i^G = \frac{1}{P} \sum_{t=1}^{P} o_i^t; \qquad \sigma_i = \frac{1}{P-1} \sum_{t=1}^{P} \left(o_i^t - \mu_i^G\right)^2; \qquad l_i = R\,\sigma_i^2 / 2;$$
where $P$ and $R$ are constants;
(d) at $t = P$, from the DCT coefficients $O_t$ of the first $P$ frames of the input signal, where $P > 0$ and $1 \le t \le P$, estimate the likelihood ratio test threshold as
Figure GSA00000046689500066
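The forward pass and likelihood ratio of items (a) and (b) can be illustrated with a standard two-state hidden semi-Markov forward recursion. This is a sketch under stated assumptions, not the patent's exact indexing: the predecessor of a segment in state i is taken to be the other state j (consistent with $a_{01} = a_{10} = 1$), the elapsed duration d' starts at 1, the double sum over d and d' is folded into a survival probability P(duration ≥ d'), and everything is computed in the linear domain, which would underflow on long inputs. The function names are hypothetical.

```python
import math
import numpy as np

def hsmm_forward(b, dur, prior):
    """Two-state HSMM forward pass.

    b[t-1, i]   = b_i(O_t)      emission likelihoods, shape (T, 2)
    dur[i, d-1] = P(d | q_i)    duration probabilities, shape (2, D)
    prior[i]    = pi_i
    Returns (alpha_star, alpha), each of shape (T+1, 2); row 0 is time 0.
    alpha_star[t, i]: a segment in state i ends exactly at t.
    alpha[t, i]:      the model is in state i at time t.
    """
    b, dur, prior = (np.asarray(v, dtype=float) for v in (b, dur, prior))
    T, D = b.shape[0], dur.shape[1]
    surv = dur[:, ::-1].cumsum(axis=1)[:, ::-1]   # surv[i, d-1] = P(duration >= d | q_i)
    alpha_star = np.zeros((T + 1, 2))
    alpha = np.zeros((T + 1, 2))
    # Boundary: "state j ended at time 0" feeds state i = 1 - j with weight pi_i.
    alpha_star[0] = prior[::-1]
    for t in range(1, T + 1):
        for i in (0, 1):
            j = 1 - i
            emit = 1.0
            for d in range(1, min(D, t) + 1):
                emit *= b[t - d, i]                # product of b_i(O_s), s = t-d+1..t
                enter = alpha_star[t - d, j]       # a_ji = 1
                alpha_star[t, i] += enter * dur[i, d - 1] * emit
                alpha[t, i] += enter * surv[i, d - 1] * emit
    return alpha_star, alpha

def likelihood_ratio(alpha_t, prior):
    """LRT_t = ln(pi_0 * alpha_1^t) - ln(pi_1 * alpha_0^t)."""
    return math.log(prior[0] * alpha_t[1]) - math.log(prior[1] * alpha_t[0])
```

At t = 1 the recursion reduces to $\pi_i\, P(d = 1 \mid q_i)\, b_i(O_1)$ for the starred variable, which matches the initialization given in item (a).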
5. Adjust the parameters $(\mu_i^G, \sigma_i)$ and $(\mu_i^L, l_i)$ of the distributions of the DCT coefficients $O_t$ under each state, and update the likelihood ratio test threshold $\eta$; go to (3). The method is as follows:
(a) if the current frame is judged to be noise, adjust the parameters $(\mu_i^G, \sigma_i)$ and the threshold $\eta$:
$$\mu_i^G = \rho_0\, \mu_i^G + (1 - \rho_0)\, o_i$$
$$\sigma_i = \rho_0\, \sigma_i + (1 - \rho_0) \left(o_i - \mu_i^G\right)^2$$
$$\eta = \rho_0\, \eta + (1 - \rho_0)\, LRT_t$$
otherwise, adjust the parameters $(\mu_i^L, l_i)$ and the threshold $\eta$:
$$\mu_i^L = \rho_1\, \mu_i^L + (1 - \rho_1)\, o_i$$
$$l_i = \rho_1\, l_i + (1 - \rho_1) \left| o_i - \mu_i^G \right|$$
$$\eta = \rho_1\, \eta + (1 - \rho_1)\, LRT_t$$
where $\rho_0$ and $\rho_1$ are constants.
In the speech detection experiments on the NOIZEUS data set, the constants were $P = 15$, $R = 20$, $\rho_0 = 0.99$, and $\rho_1 = 0.79$.
The experimental data are shown in the following table:
Figure GSA00000046689500071
As can be seen, the invention achieves almost consistent performance across multiple noise environments, and in most cases outperforms the international standards G.729B and AMR2.
In summary, the method described above detects the speech frames and noise frames of an input signal in noisy environments.
Other advantages and modifications will be readily apparent to those of ordinary skill in the art. Therefore, the invention in its broader aspects is not limited to the specific details and exemplary embodiments shown and described here. Accordingly, various modifications may be made without departing from the spirit and scope of the general inventive concept as defined by the following claims and their equivalents.

Claims (5)

1. A noise-robust speech detection method based on a hidden semi-Markov model, characterized by the following steps:
(1) build a hidden semi-Markov model $\lambda = (A, B, \pi, \tau)$ with two states $Q = \{q_0, q_1\}$, non-speech and speech, where:
$q_0$ is non-speech and $q_1$ is speech;
$A = \{a_{ij}\}$, where $a_{ij}$ is the transition probability from state $q_i$ to state $q_j$; $i = 0, 1$; $j = 0, 1$;
$B = \{b_i(O_t)\}$; $i = 0, 1$; $t > 0$; where $b_i(O_t) = P(O_t \mid q_i)$ is the conditional probability of the input-signal DCT coefficients $O_t = \{o_1, o_2, \ldots, o_K\}$, $K > 0$, given state $q_i$, and $o_1, o_2, \ldots, o_K$ are mutually independent;
$\pi = \{\pi_i\}$, $i = 0, 1$; $\pi_i > 0$ is the prior probability of state $q_i$;
$\tau = \{P(d \mid q_i)\}$, $i = 0, 1$; $d > 0$; is the probability that state $q_i$ lasts for duration $d$;
(2) using training-data statistics, initialize the prior state probabilities $\pi = \{\pi_i\}$ of the hidden semi-Markov model and the parameters $(k_i, \omega_i)$ of the Weibull state-duration distribution, and set the signal frame index $t = 0$;
(3) if the input speech signal $S$ is empty, finish; otherwise, apply the DCT to $S$
Figure FSB00000552027600011
and set $t = t + 1$;
(4) if $t < P$, judge the current signal to be noise (VAD = 0) and go to (3); if $t = P$, estimate, for the distributions of the input-signal DCT coefficients $O_t$ under each state, the Gaussian parameters
Figure FSB00000552027600012
and the Laplace parameters
Figure FSB00000552027600013
compute the likelihood ratios $LRT_t$ of the first $P$ frames, initialize the likelihood ratio test threshold $\eta$, judge the current signal to be noise (VAD = 0), and go to (3); if $t > P$, compute the likelihood ratio $LRT_t$; if $LRT_t \ge \eta$, judge the current signal to be speech (VAD = 1); if $LRT_t < \eta$, judge it to be noise (VAD = 0); then go to (5);
(5) adjust, for the distributions of the DCT coefficients $O_t$ under each state, the Gaussian parameters
Figure FSB00000552027600014
and the Laplace parameters
Figure FSB00000552027600015
and update the likelihood ratio test threshold $\eta$; go to (3).
2. The noise-robust speech detection method based on a hidden semi-Markov model according to claim 1, characterized in that step (1) further comprises:
determining, from training-data statistics:
(1.1) $a_{00} = a_{11} = 0$, $a_{10} = a_{01} = 1$;
(1.2) for $q_0$,
Figure FSB00000552027600021
is the Gaussian distribution
$$N(o_i^t;\ \mu_i^G, \sigma_i) = \frac{1}{\sigma_i\sqrt{2\pi}}\, e^{-\frac{(o_i^t - \mu_i^G)^2}{\sigma_i^2}};$$
(1.3) for $q_1$, $b_1(o_i^t)$ is the distribution
$$L(o_i^t;\ \mu_i^L, l_i) = \frac{1}{4 l_i}\, e^{-\frac{\sigma_i^2}{l_i^2}} \left[ e^{\frac{o_i'}{l_i}}\, \operatorname{erfc}\!\left(\frac{l_i o_i' + \sigma_i^2}{\sqrt{2}\, l_i \sigma_i}\right) + e^{-\frac{o_i'}{l_i}}\, \operatorname{erfc}\!\left(\frac{-l_i o_i' + \sigma_i^2}{\sqrt{2}\, l_i \sigma_i}\right) \right],$$
where $o_i' = o_i^t - \mu_i^G - \mu_i^L$;
(1.4) for $q_0$ and $q_1$, $P(d \mid q_i)$ is the Weibull distribution
$$W(d;\ k_i, \omega_i) = \frac{k_i}{\omega_i} \left(\frac{d}{\omega_i}\right)^{k_i - 1} e^{-\left(\frac{d}{\omega_i}\right)^{k_i}}.$$
3. The noise-robust speech detection method based on a hidden semi-Markov model according to claim 1, characterized in that step (2) further comprises:
(2.1) from the labels in the training data set, count the noise-duration frequencies $F_0$ and the speech-duration frequencies $F_1$;
(2.2) from $F_i$, approximate the maximum-likelihood estimate of the parameters $(k_i, \omega_i)$ of $W(d; k_i, \omega_i)$;
(2.3) the prior probabilities of the states in the hidden semi-Markov model are
Figure FSB00000552027600027
4. The noise-robust speech detection method based on a hidden semi-Markov model according to claim 1, characterized in that step (4) further comprises:
(4.1) compute the forward variables
Figure FSB00000552027600028
$i = 0, 1$:
If $t = 1$: $$\alpha_i^{t*} = \pi_i\, P(d = 1 \mid q_i)\, b_i(O_t);$$
If $t > 1$: $$\alpha_i^{t*} = \sum_{d=1}^{D} \sum_{j \ne i} \alpha_i^{(t-d)*}\, a_{ji}\, P(d \mid q_i) \prod_{s=t-d+1}^{t} b_i(O_s),$$
$$\alpha_i^{t} = \sum_{d=1}^{D} \sum_{d'=0}^{d} \sum_{j \ne i} \alpha_i^{(t-d')*}\, a_{ji}\, P(d \mid q_i) \prod_{s=t-d'+1}^{t} b_i(O_s);$$
(4.2) compute the likelihood ratio $$LRT_t = \ln(\pi_0\, \alpha_1^t) - \ln(\pi_1\, \alpha_0^t);$$
(4.3) at $t = P$, from the DCT coefficients $O_t$ of the first $P$ frames of the input signal, where $P > 0$ and $1 \le t \le P$, estimate the parameters $(\mu_i^G, \sigma_i)$ and
Figure FSB00000552027600032
of the distributions in $B$ as:
$$\mu_i^L = \mu_i^G = \frac{1}{P} \sum_{t=1}^{P} o_i^t; \qquad \sigma_i = \frac{1}{P-1} \sum_{t=1}^{P} \left(o_i^t - \mu_i^G\right)^2; \qquad l_i = R\,\sigma_i^2 / 2;$$
where $P$ and $R$ are constants;
(4.4) at $t = P$, from the DCT coefficients $O_t$ of the first $P$ frames of the input signal, where $P > 0$ and $1 \le t \le P$, estimate the likelihood ratio test threshold as
Figure FSB00000552027600036
5. The noise-robust speech detection method based on a hidden semi-Markov model according to claim 1, characterized in that step (5) further comprises:
(5.1) if the current frame is judged to be noise, adjust the parameters $(\mu_i^G, \sigma_i)$ and the threshold $\eta$:
$$\mu_i^G = \rho_0\, \mu_i^G + (1 - \rho_0)\, o_i$$
$$\sigma_i = \rho_0\, \sigma_i + (1 - \rho_0) \left(o_i - \mu_i^G\right)^2$$
$$\eta = \rho_0\, \eta + (1 - \rho_0)\, LRT_t$$
otherwise, adjust the parameters
Figure FSB000005520276000310
and the threshold $\eta$:
$$\mu_i^L = \rho_1\, \mu_i^L + (1 - \rho_1)\, o_i$$
$$l_i = \rho_1\, l_i + (1 - \rho_1) \left| o_i - \mu_i^G \right|$$
$$\eta = \rho_1\, \eta + (1 - \rho_1)\, LRT_t$$
where $0 < \rho_0, \rho_1 < 1$ are update constants.
CN2010101175378A 2010-03-03 2010-03-03 Voice detection method of noise robustness based on hidden semi-Markov model Expired - Fee Related CN101807397B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010101175378A CN101807397B (en) 2010-03-03 2010-03-03 Voice detection method of noise robustness based on hidden semi-Markov model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2010101175378A CN101807397B (en) 2010-03-03 2010-03-03 Voice detection method of noise robustness based on hidden semi-Markov model

Publications (2)

Publication Number Publication Date
CN101807397A CN101807397A (en) 2010-08-18
CN101807397B true CN101807397B (en) 2011-11-16

Family

ID=42609166

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010101175378A Expired - Fee Related CN101807397B (en) 2010-03-03 2010-03-03 Voice detection method of noise robustness based on hidden semi-Markov model

Country Status (1)

Country Link
CN (1) CN101807397B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103021405A (en) * 2012-12-05 2013-04-03 渤海大学 Voice signal dynamic feature extraction method based on MUSIC and modulation spectrum filter
CN103730124A (en) * 2013-12-31 2014-04-16 上海交通大学无锡研究院 Noise robustness endpoint detection method based on likelihood ratio test
CN106599920A (en) * 2016-12-14 2017-04-26 中国航空工业集团公司上海航空测控技术研究所 Aircraft bearing fault diagnosis method based on coupled hidden semi-Markov model
CN109856977A (en) * 2019-03-13 2019-06-07 济南大学 A kind of Design of Feedback Controller method of the Markov jump system with noise and time lag

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3888543B2 (en) * 2000-07-13 2007-03-07 旭化成株式会社 Speech recognition apparatus and speech recognition method
CN1320372C (en) * 2004-11-25 2007-06-06 上海交通大学 Method for testing and identifying underwater sound noise based on small wave area
JP4241771B2 (en) * 2006-07-04 2009-03-18 株式会社東芝 Speech recognition apparatus and method
CN101030369B (en) * 2007-03-30 2011-06-29 清华大学 Built-in speech discriminating method based on sub-word hidden Markov model

Also Published As

Publication number Publication date
CN101807397A (en) 2010-08-18

Similar Documents

Publication Publication Date Title
CN107564513B (en) Voice recognition method and device
CN102238190B (en) Identity authentication method and system
WO2020181824A1 (en) Voiceprint recognition method, apparatus and device, and computer-readable storage medium
CN110706692B (en) Training method and system of child voice recognition model
JP2020524308A (en) Method, apparatus, computer device, program and storage medium for constructing voiceprint model
JP4765461B2 (en) Noise suppression system, method and program
CN108538293B (en) Voice awakening method and device and intelligent device
JP6464650B2 (en) Audio processing apparatus, audio processing method, and program
CN106486131A (en) A kind of method and device of speech de-noising
CN105161092A (en) Speech recognition method and device
CN105304080A (en) Speech synthesis device and speech synthesis method
JP5842056B2 (en) Noise estimation device, noise estimation method, noise estimation program, and recording medium
CN101807397B (en) Voice detection method of noise robustness based on hidden semi-Markov model
WO2018051945A1 (en) Speech processing device, speech processing method, and recording medium
CN109616139A (en) Pronunciation signal noise power spectral density estimation method and device
CN106611598A (en) VAD dynamic parameter adjusting method and device
CN109616105A (en) A kind of noisy speech recognition methods based on transfer learning
CN105023570A (en) method and system of transforming speech
CN105023574A (en) Method and system of enhancing TTS
Takamichi et al. Sampling-based speech parameter generation using moment-matching networks
CN105895104B (en) Speaker adaptation recognition methods and system
CN107274892A (en) Method for distinguishing speek person and device
CN110268471A (en) The method and apparatus of ASR with embedded noise reduction
CN103366737B (en) The apparatus and method of tone feature are applied in automatic speech recognition
Sharma et al. Automatic speech recognition systems: challenges and recent implementation trends

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20111116

Termination date: 20170303