JPH04125599A - Reference pattern generating method - Google Patents

Reference pattern generating method

Info

Publication number
JPH04125599A
Authority
JP
Japan
Prior art keywords
vector output
states
output probability
transition
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2246863A
Other languages
Japanese (ja)
Other versions
JP3251005B2 (en)
Inventor
Ryosuke Isotani
亮輔 磯谷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Priority to JP24686390A priority Critical patent/JP3251005B2/en
Publication of JPH04125599A publication Critical patent/JPH04125599A/en
Application granted granted Critical
Publication of JP3251005B2 publication Critical patent/JP3251005B2/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current


Abstract

PURPOSE: To easily determine the parameters of a reference pattern by obtaining a vector output probability distribution represented as a mixed continuous distribution, synthesized from the vector output probability distributions of plural trained reference patterns.

CONSTITUTION: An HMM (hidden Markov model) A (3) is generated from the training data (1) of a speaker A, and an HMM model B (4) from the training data (2) of a speaker B. An HMM model C (5) for speaker-independent speech recognition is then generated from the models A and B. The model C can be used as-is as the HMM model for speaker-independent recognition, and a better model C' (7) can also be generated by using the training data (6) of a large number of speakers. In this way, the parameters of a reference pattern whose vector output probability is represented as a mixed continuous distribution can be determined easily by using plural trained reference patterns.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a method for creating standard patterns used in pattern recognition such as speech recognition.

[Conventional Technology]

In the field of pattern recognition, including speech recognition, methods that use probabilistic models as the standard patterns for recognition have attracted attention in recent years. In particular, the hidden Markov model (hereinafter, HMM) is widely used as a model for representing standard patterns in speech recognition.

An HMM is defined by a set of states, transition probabilities between the states, and vector output probabilities of the states or transitions; recognition is performed by computing the likelihood of each HMM for an input pattern. Speech recognition with HMMs is described in detail in the book "Speech Recognition by Probabilistic Models" (Seiichi Nakagawa, IEICE).
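To make the likelihood computation concrete, the following is a minimal sketch of the forward algorithm for an HMM with Gaussian emissions. It is illustrative only, not code from the patent; the function name, the data layout, and the use of log probabilities are our assumptions.

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def hmm_log_likelihood(obs, log_init, log_trans, means, covs):
    """Compute log P(obs | model) with the forward algorithm.
    obs: (T, D) input vector sequence; log_init: (N,) log initial-state probs;
    log_trans: (N, N) log transition probs a_ij; means, covs: per-state Gaussians."""
    T, N = obs.shape[0], log_init.shape[0]
    # Log emission probabilities log b_i(y_t) for every state and frame.
    log_b = np.stack([multivariate_normal.logpdf(obs, means[i], covs[i])
                      for i in range(N)], axis=1)              # (T, N)
    alpha = log_init + log_b[0]                                # forward variable at t = 0
    for t in range(1, T):
        # alpha_j(t) = [sum_i alpha_i(t-1) * a_ij] * b_j(y_t), in the log domain
        alpha = logsumexp(alpha[:, None] + log_trans, axis=0) + log_b[t]
    return logsumexp(alpha)
```

Recognition then amounts to evaluating this likelihood for the HMM of each category and choosing the category with the largest value.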

As a way of determining the parameters of an HMM whose vector output probability for each state (or transition) is represented by a mixed continuous distribution, training methods such as the Baum-Welch algorithm are known, which start from some initial values and repeatedly update the parameters using training data. In this case, initial values of parameters such as the mean of the output probability distribution must be determined for each of the component distributions being mixed. Known methods of providing the initial values of these parameters include:

(a) assigning them with random numbers;

(b) applying a random "blurring" operation to the parameters obtained for the single-distribution case ("A Study of Japanese Phoneme Recognition Using Continuous Output Distribution HMMs," IEICE Technical Report on Speech, SP89-48).
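Method (b) might look like the following minimal sketch, which is an assumption of ours rather than code from the cited report: the single-Gaussian mean is copied once per mixture component and jittered with small random offsets so the components start from distinct positions.

```python
import numpy as np

def init_mixture_by_blurring(mean, cov, n_components, scale=0.1, seed=0):
    """Derive initial Gaussian-mixture parameters from a single-distribution
    model by randomly 'blurring' the mean (sketch of method (b))."""
    rng = np.random.default_rng(seed)
    std = np.sqrt(np.diag(cov))                          # per-dimension spread
    means = mean + scale * std * rng.standard_normal((n_components, mean.shape[0]))
    covs = np.tile(cov, (n_components, 1, 1))            # components share the covariance
    weights = np.full(n_components, 1.0 / n_components)  # equal initial mixture weights
    return weights, means, covs
```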

On the other hand, as a method of determining the parameters directly from the training data, rather than by updating from initial values, the following is known:

(c) the training data are segmented and then clustered into as many clusters as there are mixture components, and parameters such as means are computed from the data of each cluster ("High Performance Connected Digit Recognition Using Hidden Markov Models," IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. 37, No. 8, pp. 1214-1224, August 1989).
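A minimal sketch of the per-state estimation step of method (c) follows; the use of k-means and the data layout are our assumptions standing in for the procedure of the cited paper, not details taken from it.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def init_mixture_by_clustering(state_frames, n_components, seed=0):
    """Sketch of method (c): 'state_frames' is an (n_frames, D) array of all
    feature vectors segmented to one HMM state. Cluster them and estimate
    one Gaussian (weight, mean, covariance) per cluster.
    Assumes every cluster ends up non-empty."""
    _, labels = kmeans2(state_frames, n_components, minit='++', seed=seed)
    weights, means, covs = [], [], []
    for k in range(n_components):
        cluster = state_frames[labels == k]
        weights.append(len(cluster) / len(state_frames))  # relative cluster size
        means.append(cluster.mean(axis=0))
        covs.append(np.cov(cluster, rowvar=False))
    return np.array(weights), np.array(means), np.array(covs)
```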

The values determined in this way can also be used as initial values and then updated with the Baum-Welch algorithm or the like.

[Problem to Be Solved by the Invention]

When a method that repeatedly updates the parameters by training is used, it is known that the setting of the initial values is important for the training to proceed efficiently. If random numbers are used as in (a), or the parameters of the single-distribution case are used as in (b), the training takes a long time to converge, and the values it converges to are likely to be local optima rather than the global optimum.

Method (c) does not necessarily require training for parameter updating, and even when its result is used as the initial values for updating, convergence is expected within a small number of iterations. However, it has the drawback of a large amount of computation, since calculations such as those for clustering are required.

An object of the present invention is to provide a standard pattern creation method that eliminates these drawbacks.

[Means for Solving the Problems]

The first invention is a method of creating a standard pattern defined by a set of states, transition probabilities between the states, and vector output probabilities of the states or transitions, characterized in that a standard pattern is created whose vector output probability for each state or transition is a mixed continuous distribution obtained by mixing, with weights, the vector output probability distributions of the corresponding states or transitions of a plurality of standard patterns whose vector output probabilities are represented by continuous distributions.

The second invention is a method of creating a standard pattern for speech recognition defined by a set of states, transition probabilities between the states, and vector output probabilities of the states or transitions, characterized in that a standard pattern is created whose vector output probability for each state or transition is a mixed continuous distribution obtained by mixing, with weights, the vector output probability distributions of the corresponding states or transitions of standard patterns, each created by training, for each of a plurality of speakers, on that speaker's speech data, whose vector output probabilities are represented by continuous distributions.

The third invention is a method of creating a standard pattern for speech recognition defined by a set of states, transition probabilities between the states, and vector output probabilities of the states or transitions, characterized in that a standard pattern is created whose vector output probability for each state or transition is a mixed continuous distribution obtained by mixing, with weights, the vector output probability distributions of the corresponding states or transitions of standard patterns, each created by training, for each environment, on speech data uttered or recorded in different environments, whose vector output probabilities are represented by continuous distributions.

[Operation]

According to the present invention, the vector output probability distribution represented by a mixed continuous distribution is obtained by synthesizing the vector output probability distributions of a plurality of already-trained standard patterns, so the parameters of the standard pattern can be determined simply. Furthermore, if the standard patterns used for the synthesis are chosen appropriately, then when the result is used as the initial parameters for training such as the Baum-Welch method, it is expected to converge in fewer training iterations than when the initial parameters are determined with random numbers, and the probability of converging to a local optimum is also expected to be smaller. The result can also be used as-is, without updating the parameters by training.

As in the second invention, if standard patterns created by training, for each of a plurality of speakers, on that speaker's speech data are used for the synthesis, a standard pattern for speaker-independent speech recognition whose vector output probabilities are represented by mixed continuous output distributions can be created simply.

As in the third invention, if standard patterns created by training, for each environment, on speech data uttered or recorded in different environments are used for the synthesis, a standard pattern whose vector output probabilities are represented by mixed continuous output distributions and which is robust to environmental variation can be created simply.

[Embodiments]

FIG. 1 is a block diagram for explaining an embodiment in which the first invention is applied to creating an HMM model for speaker-independent speech recognition. An HMM model A (3) is created from the training data (1) of speaker A, and an HMM model B (4) from the training data (2) of speaker B. As speakers A and B, for example, one standard speaker is selected from male speakers and one from female speakers.

The HMM model has the form shown in FIG. 2.

For each state i, the state transition probabilities a_ii and a_ii+1 (with a_ii + a_ii+1 = 1) and the output probability distribution b_i(y) for the output vector y are defined. The state transition probabilities and output probability distributions of model A are written a_ii^A, b_i^A(y), and so on. If the output vector probability distributions are represented by single Gaussian distributions, they are expressed as

b_i^A(y) = N(y; μ_i^A, Σ_i^A)
b_i^B(y) = N(y; μ_i^B, Σ_i^B)

where N(y; μ_i, Σ_i) denotes a multidimensional Gaussian distribution with mean vector μ_i and covariance matrix Σ_i. From model A and model B, an HMM model C (5) for speaker-independent speech recognition is created. Let the state transition probabilities of model C be a_ii^C and a_ii+1^C, and its output probability distribution b_i^C. Suppose the output probability distribution is represented by a Gaussian mixture with two components, as follows.

tle(y)=λ’N (y、  μi’+  Σ、1
)十λ”N (y、 IJi”+ Σ、′)このとき、
モデルCの各パラメータを次のように定める。
tle(y)=λ'N (y, μi'+ Σ, 1
) 10λ”N (y, IJi”+ Σ, ′) At this time,
Each parameter of model C is determined as follows.

a_ii^C = (a_ii^A + a_ii^B) / 2
a_ii+1^C = (a_ii+1^A + a_ii+1^B) / 2
μ_i^1 = μ_i^A, Σ_i^1 = Σ_i^A
μ_i^2 = μ_i^B, Σ_i^2 = Σ_i^B
λ^1 = λ^2 = 1/2

The model C created in this way can be used as-is as an HMM model for speaker-independent speech recognition, or it can be used as an initial model for creating a better model C' (7) by training with the Baum-Welch method or the like on the training data (6) of many more speakers.
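This construction amounts to a simple parameter merge. The following minimal sketch implements the equations above under assumed data structures (dicts of NumPy arrays for a left-to-right HMM; nothing about the representation is specified in the patent).

```python
import numpy as np

def merge_models(model_a, model_b):
    """Build model C from two single-Gaussian HMMs: average the transition
    probabilities and stack the two Gaussians into a 2-component mixture
    with equal weights lambda^1 = lambda^2 = 1/2.
    Each model is a dict with per-state arrays:
      'a_self' (N,), 'a_next' (N,), 'means' (N, D), 'covs' (N, D, D)."""
    n_states = model_a['a_self'].shape[0]
    return {
        'a_self': (model_a['a_self'] + model_b['a_self']) / 2,            # a_ii^C
        'a_next': (model_a['a_next'] + model_b['a_next']) / 2,            # a_ii+1^C
        # Component 1 comes from model A, component 2 from model B.
        'means': np.stack([model_a['means'], model_b['means']], axis=1),  # (N, 2, D)
        'covs':  np.stack([model_a['covs'],  model_b['covs']],  axis=1),  # (N, 2, D, D)
        'weights': np.full((n_states, 2), 0.5),                           # lambda^m
    }
```

With M source models the construction is identical except that the mixture weights become 1/M.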

Model C can be created in the same way when the prepared models A and B have output probability distributions that are themselves Gaussian mixtures. In this case, the number of mixture components of model C's output probability distributions is the sum of the numbers of mixture components of the output probability distributions of models A and B.

Next, an embodiment of the second invention will be described.

Speakers are divided into M clusters by clustering speech data of a small vocabulary uttered by many speakers, and from each cluster the speaker at the cluster center is selected, giving M speakers. For each of the M speakers, an HMM model whose output probability distributions are represented by single Gaussian distributions is created by training on the amount of speech data required for HMM training. From the M models thus created, an HMM model for speaker-independent speech recognition is obtained by creating, as in the embodiment of the first invention, an HMM model whose output probability distributions are Gaussian mixtures with M components. Since only a small amount of data is needed for the clustering used to select the M speakers, the amount of computation is smaller than in conventional method (c).
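The speaker-selection step might look like the following sketch; representing each speaker by the mean of their feature vectors and using k-means are our illustrative assumptions, not specifics from the patent.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def select_representative_speakers(speaker_features, m, seed=0):
    """speaker_features: dict mapping speaker id -> (n_frames, D) array of
    feature vectors from a small amount of speech. Returns the ids of the
    M speakers closest to their cluster centers.
    Assumes every cluster ends up non-empty."""
    ids = list(speaker_features)
    # One profile vector per speaker: the mean of that speaker's frames.
    profile = np.stack([speaker_features[s].mean(axis=0) for s in ids])
    centroids, labels = kmeans2(profile, m, minit='++', seed=seed)
    chosen = []
    for k in range(m):
        members = np.flatnonzero(labels == k)
        dists = np.linalg.norm(profile[members] - centroids[k], axis=1)
        chosen.append(ids[members[np.argmin(dists)]])  # cluster-center speaker
    return chosen
```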

Finally, an embodiment of the third invention will be described. In the embodiment of the first invention, if models A and B are chosen to be models trained on data uttered by a given speaker under different environments (for example, a quiet environment and a noisy environment), a recognition model robust to environmental variation can be created as model C.

[Effects of the Invention]

As described above, according to the first invention, the parameters of a standard pattern whose vector output probabilities are represented by mixed continuous distributions can be determined simply by using a plurality of already-trained standard patterns, and the result can be used for pattern recognition either as-is or after a small number of training iterations starting from these values.

Furthermore, according to the second and third inventions, a standard pattern for speaker-independent recognition and a standard pattern robust to environmental variation, respectively, can be created simply.

[Brief Description of the Drawings]

FIG. 1 is a block diagram for explaining an embodiment in which the first invention is applied to creating an HMM model for speaker-independent speech recognition. FIG. 2 is a diagram showing the form of the HMM model in the embodiment.

1: training data of speaker A
2: training data of speaker B
3: HMM model A
4: HMM model B
5: HMM model C
6: training data of many speakers
7: HMM model C'

Agent: Patent Attorney Yoshiyuki Iwasa

Claims (3)

[Claims]

(1) A standard pattern creation method, in a method of creating a standard pattern defined by a set of states, transition probabilities between the states, and vector output probabilities of states or transitions, characterized in that a standard pattern is created whose vector output probability for each state or transition is a mixed continuous distribution obtained by mixing, with weights, the vector output probability distributions of the corresponding states or transitions of a plurality of standard patterns whose vector output probabilities are represented by continuous distributions.
(2) A standard pattern creation method, in a method of creating a standard pattern for speech recognition defined by a set of states, transition probabilities between the states, and vector output probabilities of states or transitions, characterized in that a standard pattern is created whose vector output probability for each state or transition is a mixed continuous distribution obtained by mixing, with weights, the vector output probability distributions of the corresponding states or transitions of standard patterns, each created by training, for each of a plurality of speakers, on that speaker's speech data, whose vector output probabilities are represented by continuous distributions.
(3) A standard pattern creation method, in a method of creating a standard pattern for speech recognition defined by a set of states, transition probabilities between the states, and vector output probabilities of states or transitions, characterized in that a standard pattern is created whose vector output probability for each state or transition is a mixed continuous distribution obtained by mixing, with weights, the vector output probability distributions of the corresponding states or transitions of standard patterns, each created by training, for each environment, on speech data uttered or recorded in different environments, whose vector output probabilities are represented by continuous distributions.
JP24686390A 1990-09-17 1990-09-17 Standard pattern creation method Expired - Fee Related JP3251005B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP24686390A JP3251005B2 (en) 1990-09-17 1990-09-17 Standard pattern creation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP24686390A JP3251005B2 (en) 1990-09-17 1990-09-17 Standard pattern creation method

Publications (2)

Publication Number Publication Date
JPH04125599A true JPH04125599A (en) 1992-04-27
JP3251005B2 JP3251005B2 (en) 2002-01-28

Family

ID=17154851

Family Applications (1)

Application Number Title Priority Date Filing Date
JP24686390A Expired - Fee Related JP3251005B2 (en) 1990-09-17 1990-09-17 Standard pattern creation method

Country Status (1)

Country Link
JP (1) JP3251005B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08123468A (en) * 1994-10-24 1996-05-17 Atr Onsei Honyaku Tsushin Kenkyusho:Kk Unspecified speaker model generating device and speech recognition device
US7603276B2 (en) 2002-11-21 2009-10-13 Panasonic Corporation Standard-model generation for speech recognition using a reference model

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS6367197A (en) * 1986-09-09 1988-03-25 松田 健次 Elliptic trammel


Also Published As

Publication number Publication date
JP3251005B2 (en) 2002-01-28

Similar Documents

Publication Publication Date Title
CN110444214B (en) Speech signal processing model training method and device, electronic equipment and storage medium
Chen et al. Deep attractor network for single-microphone speaker separation
Huo et al. On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate
Qi et al. Voiced-unvoiced-silence classifications of speech using hybrid features and a network classifier
Skowronski et al. Automatic speech recognition using a predictive echo state network classifier
US10629185B2 (en) Statistical acoustic model adaptation method, acoustic model learning method suitable for statistical acoustic model adaptation, storage medium storing parameters for building deep neural network, and computer program for adapting statistical acoustic model
Nefian et al. Dynamic Bayesian networks for audio-visual speech recognition
US6343267B1 (en) Dimensionality reduction for speaker normalization and speaker and environment adaptation using eigenvoice techniques
EP0750293B1 (en) Triphone hidden Markov model (HMM) design method and apparatus
WO2019017403A1 (en) Mask calculating device, cluster-weight learning device, mask-calculating neural-network learning device, mask calculating method, cluster-weight learning method, and mask-calculating neural-network learning method
Fujita et al. Neural speaker diarization with speaker-wise chain rule
Lee et al. Ensemble of jointly trained deep neural network-based acoustic models for reverberant speech recognition
Zweig et al. Probabilistic modeling with Bayesian networks for automatic speech recognition.
Delcroix et al. Context adaptive neural network based acoustic models for rapid adaptation
Yu et al. Cam: Context-aware masking for robust speaker verification
WO2020170907A1 (en) Signal processing device, learning device, signal processing method, learning method, and program
KR100574769B1 (en) Speaker and environment adaptation based on eigenvoices imcluding maximum likelihood method
Sagi et al. A biologically motivated solution to the cocktail party problem
Girin et al. Audio source separation into the wild
Srijiranon et al. Thai speech recognition using Neuro-fuzzy system
US20050021335A1 (en) Method of modeling single-enrollment classes in verification and identification tasks
JPH04125599A (en) Reference pattern generating method
CN116434758A (en) Voiceprint recognition model training method and device, electronic equipment and storage medium
JPH0486899A (en) Standard pattern adaption system
Delfarah et al. Talker-independent speaker separation in reverberant conditions

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20071116

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20081116

Year of fee payment: 7


FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20091116

Year of fee payment: 8

LAPS Cancellation because of no payment of annual fees