JP5113797B2 - Dissimilarity utilization type discriminative learning apparatus and method, and program thereof


Info

Publication number: JP5113797B2 (application JP2009100865A; also published as JP2010250161A)
Inventors: Atsushi Nakamura, Erik McDermott
Assignee (original and current): Nippon Telegraph and Telephone Corp
Legal status: Active


Description

The present invention relates to a discriminative learning method, apparatus, and program for pattern recognition that identifies, from a feature sequence extracted by some method from a signal that varies dynamically along the time axis, the spatial axis, or both — such as speech, still images, or video — and that expresses some conceptual information, a symbol sequence representing a predetermined signal category as discrete values.

Many pattern recognition errors arise from confusion between patterns located near the boundary with adjacent symbols in feature space. To suppress such errors, it is effective to draw information at the training stage from the training data of both the correct symbol and its adjacent symbols, and to estimate the model parameters so as to reduce the confusion. Frameworks that improve inter-symbol discrimination in this way are collectively called discriminative training.

As an example, consider applying Minimum Classification Error (MCE) training, one of the representative realizations of discriminative training, to pattern recognition that identifies symbol sequences. FIG. 7 shows an example functional configuration of a discriminative learning device 700 that applies MCE training to continuous word speech recognition, a kind of pattern recognition.

The discriminative learning device 700 comprises an acoustic model recording unit 70, a positive example language model recording unit 71, a positive example discriminant function value generation unit 73, a positive example speech recognition unit 74, a negative example discriminant function value generation unit 75, a speech recognition unit 76, a positive/negative example comparison unit 77, and a model parameter optimization unit 78. The acoustic model recording unit 70 holds an acoustic model 701 and a language model 702. The acoustic model 701 is realized, for example, by a hidden Markov model (HMM), which is widely used in continuous word speech recognition. The language model 702 is a word N-gram probability model and also includes a word pronunciation dictionary holding part-of-speech attribute information, pronunciation information, and the like.

The positive example language model recording unit 71 records a positive example language model, i.e., the correct language symbol sequence corresponding to the feature sequence X of the input speech signal. Multiple feature sequences form the feature sequence group Z = {X₁, X₂, X₃, …}. The positive example discriminant function value generation unit 73 takes the feature sequence X and its correct answer R(X) as input, refers to the positive example language model recording unit 71, and outputs a discriminant function value G (Equation (1)) for evaluating whether the feature sequence X corresponds to the correct symbol sequence R(X) to which it belongs (hereinafter, the positive example symbol sequence R(X)).

$$G = \log\bigl(P_{\Lambda_A}(X \mid R(X))\; P_{\Lambda_L}(R(X))^{\eta}\bigr) \tag{1}$$

Here, Λ is the set of model parameters of the symbols recorded in the acoustic model recording unit 70.

The positive example speech recognition unit 74 rearranges the positive example symbol sequences R(X) according to their discriminant function values G and outputs the speech recognition result estimated to be correct.

The speech recognition unit 76 takes the feature sequence X and its correct answer R(X) as input, generates symbol sequences S other than the positive example symbol sequence R(X), i.e., other than the correct answer (hereinafter, negative example symbol sequences S (S ≠ R(X))), and outputs the feature sequence X and the negative example symbol sequences S (S ≠ R(X)) to the negative example discriminant function value generation unit 75.

The negative example discriminant function value generation unit 75 refers to the acoustic model recording unit 70 and outputs a discriminant function value Ḡ (Equation (2)) for evaluating whether the feature sequence X of the input speech signal corresponds to a negative example symbol sequence S (S ≠ R(X)).

$$\bar{G} = \log\bigl(P_{\Lambda_A}(X \mid S)\; P_{\Lambda_L}(S)^{\eta}\bigr),\qquad S \in W,\; S \neq R(X) \tag{2}$$

Here, W is the set of all assumed symbol sequences. P_{Λ_A}(X|S) is the probability, computed with the acoustic model, that the feature sequence of speech uttered with the intention of the negative example symbol sequence S (S ≠ R(X)) is X. P_{Λ_L}(S) is the prior probability, computed with the language model, of the appearance of the negative example symbol sequence S (S ≠ R(X)). η is a coefficient, which can be set manually, that controls the effect of the prior probability P_{Λ_L}(S); the larger η is, the larger the contribution of P_{Λ_L}(S) in Equation (2).

The positive/negative example comparison unit 77 takes as input the discriminant function value G evaluating the positive example symbol sequence and the discriminant function values Ḡ evaluating the negative example symbol sequences S and, using all of these, outputs the misclassification measure d(X; Λ) of Equation (3), a measure of misclassification for the feature sequence X of the input speech signal.

$$d(X;\Lambda) = -\,G + \max_{S \in W,\, S \neq R(X)} \bar{G} \tag{3}$$

The misclassification measure d(X; Λ) is the difference between the maximum of the discriminant function values Ḡ given by the several negative example symbol sequences S (S ≠ R(X)) and the discriminant function value G given by the positive example symbol sequence. If it is positive, the discriminant function value Ḡ of at least one negative example symbol sequence S (S ≠ R(X)) exceeds the discriminant function value G of the positive example symbol sequence R(X), meaning the feature sequence X of the input speech signal has been misclassified.
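As a minimal numeric sketch (the scores below are made up, not from the patent), the measure of Equation (3) is a one-liner:

```python
# Hypothetical discriminant scores for one utterance (illustrative values only).
G = 2.0                    # discriminant value of the positive example sequence
G_bar = [1.2, 2.3, 0.7]    # discriminant values of negative example sequences

# Misclassification measure of Eq. (3): positive iff some competitor beats G.
d = max(G_bar) - G
print(d)  # 0.3 > 0, so the feature sequence X would be misclassified
```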

More generally, the misclassification measure d(X; Λ) can be expressed as Equation (4).

$$d(X;\Lambda) = -\,G + h^{-1}\!\Bigl(\,\sum_{S \in W,\, S \neq R(X)} h(\bar{G})\Bigr) \tag{4}$$

In this way, the second term on the right side of Equation (4) allows a misclassification measure that takes the influence of more negative example symbol sequences into account. Here h(·) is an arbitrary monotonically increasing invertible function and h⁻¹(·) is its inverse. In continuous word speech recognition, the misclassification measure d(X; Λ) is defined, for example, by Equation (5).

$$d(X;\Lambda) = -\,g(X, R(X); \Lambda) + \frac{1}{\varphi}\,\log \sum_{S \in W,\, S \neq R(X)} \exp\bigl(\varphi\, g(X, S; \Lambda)\bigr) \tag{5}$$

Here φ is a positive constant; the larger φ is, the more the term with the maximum exp(φ g(X, S; Λ)) among those satisfying S ≠ R(X) dominates the second term on the right side.
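The following sketch, again with illustrative scores, shows how the φ-smoothed second term of Equation (5) approaches the hard maximum of Equation (3) as φ grows:

```python
import math

def smooth_max(scores, phi):
    """phi-smoothed maximum: the second term of Eq. (5) (sketch)."""
    return math.log(sum(math.exp(phi * g) for g in scores)) / phi

competitors = [1.2, 2.3, 0.7]   # hypothetical g(X, S; Lambda) for S != R(X)
for phi in (1.0, 5.0, 50.0):
    print(phi, smooth_max(competitors, phi))
# As phi grows the value approaches max(competitors) = 2.3, i.e., Eq. (3)
```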

The model parameter optimization unit 78 takes the misclassification measure d(X; Λ) as input, defines a loss function loss(d) representing the magnitude of the loss incurred according to d(X; Λ), and finds the model parameters Λ that minimize the total loss. As the loss function, a continuous nonlinear one such as Equation (6) can be considered, for example.

$$\mathrm{loss}(d) = \frac{1}{1 + \exp(-\alpha d)},\qquad \alpha > 0 \tag{6}$$

The loss function loss(d) of Equation (6) takes a value between 0 and 1 according to the misclassification measure d(X; Λ) in a narrow region around the symbol sequence boundary d = 0, asymptotically approaching 0 for d < 0 and 1 for d > 0. As the simplest linear loss function loss(d), Equation (7) is conceivable.

$$\mathrm{loss}(d) = d \tag{7}$$

With the loss function of Equation (7), the loss value coincides with the misclassification measure d(X; Λ).
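A small sketch of the two loss functions; the slope parameter alpha in the sigmoid is an assumption, since Equation (6) is rendered only as an image in the original:

```python
import math

def sigmoid_loss(d, alpha=1.0):
    """Smoothed 0-1 loss of Eq. (6); alpha (slope) is an assumed parameter."""
    return 1.0 / (1.0 + math.exp(-alpha * d))

def linear_loss(d):
    """Linear loss of Eq. (7): the loss equals the measure d itself."""
    return d

for d in (-2.0, -0.1, 0.0, 0.1, 2.0):
    print(d, round(sigmoid_loss(d, alpha=5.0), 4), linear_loss(d))
# sigmoid_loss approaches 0 for d < 0 and 1 for d > 0
```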

Now, when a group of feature sequences Z = {X₁, X₂, X₃, …} and the positive example symbol sequences {R(X₁), R(X₂), R(X₃), …} for each member of Z are given as training data, the total loss L(Z; Λ) over the whole feature sequence group Z is obtained by Equation (8).

$$L(Z;\Lambda) = \sum_{X \in Z} \mathrm{loss}\bigl(d(X;\Lambda)\bigr) \tag{8}$$

Finding the model parameters Λ that minimize the total loss L(Z; Λ) with an optimization method corresponds to raising the discriminative power of the discriminative training method. As optimization methods, the Probabilistic Descent (PD) method, the Quickprop method, and the like can be used.
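A toy sketch of minimizing the total loss of Equation (8) by a gradient-descent update in the spirit of the PD method; the scalar parameterization of d is purely illustrative:

```python
import math

def sigmoid_loss(d, alpha=5.0):
    return 1.0 / (1.0 + math.exp(-alpha * d))

# Toy setup: d(X; lam) = x - lam for a scalar "model parameter" lam.
data = [0.5, -0.2, 1.0]

def total_loss(lam):
    """Total loss of Eq. (8) over the training group Z."""
    return sum(sigmoid_loss(x - lam) for x in data)

# Descend the total loss with a central-difference gradient.
lam, step, eps = 0.0, 0.5, 1e-6
for _ in range(200):
    grad = (total_loss(lam + eps) - total_loss(lam - eps)) / (2 * eps)
    lam -= step * grad
print(lam, total_loss(lam))
```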

In the Maximum Mutual Information (MMI) training method, another representative example of discriminative training, the model parameters Λ that maximize the MMI objective function F_MMI(Z; Λ) defined by Equation (9) are found with an optimization method.

$$F_{\mathrm{MMI}}(Z;\Lambda) = \sum_{X \in Z} f_{\mathrm{MMI}}(X;\Lambda),\qquad f_{\mathrm{MMI}}(X;\Lambda) = \log \frac{P_{\Lambda_A}(X \mid R(X))\; P_{\Lambda_L}(R(X))^{\eta}}{\sum_{S \in W'} P_{\Lambda_A}(X \mid S)\; P_{\Lambda_L}(S)^{\eta}} \tag{9}$$

Here, W′ (W′ ⊂ W) is the set of symbol sequences obtained as the result of continuous word speech recognition over the whole assumed symbol sequence set W. Transforming Equation (5) yields Equation (10).

$$-\,d(X;\Lambda) = \frac{1}{\varphi}\,\log \frac{\exp\bigl(\varphi\, g(X, R(X); \Lambda)\bigr)}{\sum_{S \neq R(X)} \exp\bigl(\varphi\, g(X, S; \Lambda)\bigr)} \tag{10}$$

Thus f_MMI(X; Λ) and the misclassification measure d(X; Λ) of the MCE training method can be computed by almost the same procedure. In particular, minimizing the total loss when the linear loss function (Equation (7)) is applied in MCE training follows almost the same procedure as maximizing the MMI objective function.

In the computations of Equations (9) and (10), multiple competing positive example and negative example word sequences obtained by continuous word speech recognition of the feature sequence X are used. In discriminative training it is important to use a sufficiently large variety of positive and negative recognized word sequences so as to take more diverse recognition errors into account. For this reason, Equations (9) and (10) are computed using a word lattice or the like, which can efficiently represent a group of word sequences consisting of many words as a network structure of words. The model parameters Λ are then optimized so that the total loss is minimized, using the positive example word sequences as supervision.

[Juang & Katagiri 92] Biing-Hwang Juang and Shigeru Katagiri: Discriminative Learning for Minimum Error Classification, IEEE Trans. on Signal Processing, Vol. 40, No. 12, pp. 3043-3054 (1992). [Katagiri et al. 98] Shigeru Katagiri, Biing-Hwang Juang and Chin-Hui Lee: Pattern Recognition Using a Family of Design Algorithms Based Upon the Generalized Probabilistic Descent Method, Proc. IEEE, Vol. 86, No. 11, pp. 2345-2373 (1998). [McDermott & Katagiri 97] Erik McDermott and Shigeru Katagiri: String-Level MCE for Continuous Phoneme Recognition, Proc. Eurospeech 97, Vol. 1, pp. 123-126 (1997). [McDermott et al. 07] E. McDermott, T. Hazen, J. Le Roux, A. Nakamura, and S. Katagiri: Discriminative Training for Large Vocabulary Speech Recognition Using Minimum Classification Error, IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, No. 1, pp. 203-223 (2007). [Macherey et al. 05] W. Macherey, L. Haferkamp, R. Schlueter, and H. Ney: Investigations on Error Minimizing Training Criteria for Discriminative Training in Automatic Speech Recognition, in Proc. Interspeech '05 - Eurospeech, pp. 2133-2136 (2005).

Conventional discriminative training methods compute the positive example symbol sequences and the negative example symbol sequences separately, and seek symbol sequences that minimize the respective misclassification measures or maximize the difference between the respective discriminant function values. This causes the following problems. First, the required computational resources are large, because the discriminant function values of both the positive example and the negative example symbol sequences must be computed.

Second, the selection of symbol sequences treated as quasi-positive examples has been arbitrary and manual. Negative example symbol sequences {R′₁(X), R′₂(X), R′₃(X), …} that differ from the positive example symbol sequence R(X) of the feature sequence X only in ways that barely affect the meaning of the sentence are also treated as quasi-positive examples, increasing the amount of knowledge about the positive example. By increasing this amount of knowledge, a discriminative learning device can be realized that generates model parameters more robust to feature sequences not contained in the training data. However, the selection of these negative example symbol sequences {R′₁(X), R′₂(X), R′₃(X), …} has been performed arbitrarily.

The present invention has been made in view of these points. Its object is to provide a dissimilarity utilization type discriminative learning device, method, and program that reduce the amount of computation by generalizing the positive/negative distinction for individual symbol sequences into a degree of dissimilarity from the positive example, and that automatically reflect quasi-positive symbol sequences in the objective function on the basis of an objective criterion, eliminating the need to select them manually.

The dissimilarity utilization type discriminative learning device of the present invention comprises a model parameter recording unit, a pattern recognition unit, a discriminant function value generation unit, a dissimilarity calculation unit, a positive example recognition comparison unit, and a model parameter optimization unit. The model parameter recording unit records model parameters. The pattern recognition unit generates recognized symbol sequences by pattern recognition of the training data. The discriminant function value generation unit refers to the model parameter recording unit and outputs a discriminant function value evaluating whether a recognized symbol sequence corresponds to the feature values of the training data. The dissimilarity calculation unit calculates the dissimilarity between a recognized symbol sequence and the positive example. The positive example recognition comparison unit takes N (N ≥ 2) attenuation coefficients, the discriminant function values, and the dissimilarities as input, uses the N attenuation coefficients to obtain a positive-side integrated value and an integrated value for correcting it, and outputs an objective function in which the positive-side integrated value has been corrected. The model parameter optimization unit optimizes the model parameters corresponding to the recognized symbol sequences using the objective function.

According to the dissimilarity utilization type discriminative learning device of this invention, the dissimilarity estimation unit estimates the dissimilarity between a recognized symbol sequence and the positive example, generalizing the positive/negative distinction into a degree of dissimilarity from the positive example. The model parameters are then optimized with an objective function that uses this dissimilarity. Unlike conventional discriminative learning devices, therefore, no pattern recognition pass for generating negative example symbol sequences is required. Moreover, the discriminant function values of negative example symbol sequences no longer need to be computed, so the required computational resources can be reduced.

In addition, since the positive example and the recognized symbol sequences close to it are automatically weighted by the attenuation coefficients and reflected in the objective function, there is no longer any need to select quasi-positive symbol sequences arbitrarily and by hand.

FIG. 1 shows an example functional configuration of the dissimilarity utilization type discriminative learning device 100 of this invention. FIG. 2 shows the operation flow of the dissimilarity utilization type discriminative learning device 100. FIG. 3 shows an example functional configuration of the positive example recognition comparison unit 14 for the maximum mutual information training method. FIG. 4 shows the operation flow of the positive example recognition comparison unit 14. FIG. 5 shows an example functional configuration of the positive example recognition comparison unit 50 for the minimum classification error training method. FIG. 6 shows the operation flow of the positive example recognition comparison unit 50. FIG. 7 shows an example functional configuration of a conventional discriminative learning device.

The dissimilarity utilization type discriminative learning device of this invention is new in that it requires no functional components corresponding to the positive example speech recognition unit 74 and the positive example discriminant function value generation unit 73 of the conventional discriminative learning device 700, which perform pattern recognition of the positive example symbol sequence. Before describing embodiments of the invention, its basic idea is explained.

[Basic idea]
The dissimilarity utilization type discriminative training method of this invention abstracts the positive/negative distinction for each recognized symbol sequence into a degree of dissimilarity from the positive example, and learns using a weighted sum of discriminant function values based on that dissimilarity.

First, a function Δ(V, S) representing the degree of dissimilarity between two arbitrary symbol sequences V and S is introduced. Δ(V, S) can be realized, for example, by the edit distance between V and S. When V and S are both associated with a common feature sequence X, a dissimilarity measure based on the correspondence between the symbols of each sequence and the feature values of the feature sequence can also be used (reference: J. Zheng and A. Stolcke: Improved Discriminative Training Using Phone Lattices, in Proc. Interspeech, pp. 2125-2128 (2005)).
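As a sketch, the edit-distance realization of Δ(V, S) can be computed by the standard dynamic program over the two symbol sequences:

```python
def edit_distance(v, s):
    """Levenshtein distance between symbol sequences v and s,
    one possible realization of the dissimilarity Delta(V, S)."""
    m, n = len(v), len(s)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if v[i - 1] == s[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

print(edit_distance("recognize speech".split(), "wreck a nice beach".split()))
```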

The dissimilarity Δ(R(X), S) between the positive example symbol sequence R(X) and an arbitrary recognized symbol sequence S (S ∈ W′) can be regarded as an error measure of the recognized symbol sequence S (S ∈ W′). Using this dissimilarity Δ(R(X), S), a new objective function F⁺(Z; Λ) can be defined as in Equation (11). The superscript + of F⁺ indicates a quantity proposed in this invention.

$$F^{+}(Z;\Lambda) = \sum_{X \in Z} f^{+}(X;\Lambda),\qquad f^{+}(X;\Lambda) = \log \sum_{S \in W'} e^{-\sigma_1 \Delta(R(X),S)}\, h\bigl(g(X,S;\Lambda)\bigr) \;-\; \log \sum_{S^{*} \in W'} e^{-\sigma_2 \Delta(R(X),S^{*})}\, h\bigl(g(X,S^{*};\Lambda)\bigr) \tag{11}$$

Each term of f⁺(X; Λ) passes the discriminant function value g(X, S; Λ) of each recognized symbol sequence S (S ∈ W′) through an arbitrary monotonically increasing function h(·), and then multiplies that value by the exponential of the dissimilarity Δ(R(X), S) from the positive example times an attenuation coefficient σ. That is, f⁺ is built from a weighted sum with exponentially decaying weights. By setting the attenuation coefficients σ₁ and σ₂ appropriately, discriminative training can be performed by maximizing the objective function F⁺(Z; Λ).

For example, by making the attenuation coefficient σ₁ large, the first term on the right side of Equation (11) becomes larger the smaller the dissimilarity Δ(R(X), S) is. The discriminant function values of the positive example (Δ(R(X), S) = 0) and of recognized symbol sequences S (S ∈ W′) with very small dissimilarity Δ(R(X), S), i.e., quasi-positive examples, therefore dominate. With σ₂ = 0, the weights in the second term on the right side of Equation (11) are all 1, so all discriminant function values in the recognized symbol sequence group (S* ∈ W′) are treated equally. In other words, the first term on the right side of Equation (11) is a weighted sum of the discriminant function values of the positive example and of symbol sequences very close to it, while the second term accumulates the discriminant function values of what are mostly negative example symbol sequences.

Thus, according to this invention, the objective function can be generated with a single pattern recognition pass, so the amount of computation can be reduced compared with conventional methods. Moreover, the exponential attenuation coefficient σ₁ in the first term on the right side of the second expression of Equation (11) automatically weights the positive example and the quasi-positive recognized symbol sequences S (S ∈ W′) according to their dissimilarity and reflects them in the objective function. As a result, there is no need to select quasi-positive symbol sequences by hand as in conventional methods.
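A small demonstration (with made-up Δ values) of the exponential weights e^(−σΔ) that appear in Equation (11):

```python
import math

deltas = [0, 1, 2, 5, 10]   # illustrative dissimilarities Delta(R(X), S)

for sigma in (0.0, 1.0, 4.0):
    weights = [math.exp(-sigma * d) for d in deltas]
    print(sigma, [round(w, 4) for w in weights])
# sigma = 0 gives uniform weights (second term of Eq. (11)); a large sigma
# concentrates weight on the positive and near-positive sequences (first term).
```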

Embodiments of the invention are described below with reference to the drawings. Identical elements in different drawings carry the same reference numerals, and their description is not repeated.

FIG. 1 shows an example functional configuration of the dissimilarity utilization type discriminative learning device 100 of this invention; its operation flow is shown in FIG. 2. The dissimilarity utilization type discriminative learning device 100 comprises a model parameter recording unit 12, a discriminant function value generation unit 11, a pattern recognition unit 10, a dissimilarity calculation unit 13, a positive example recognition comparison unit 14, and a model parameter optimization unit 15. The device 100 is realized by loading a predetermined program into a computer composed of, for example, ROM, RAM, and a CPU, and having the CPU execute the program.

The dissimilarity utilization type discriminative learning device 100 takes the feature sequence X of training data and its correct positive example symbol sequence R(X) as input signals and outputs optimized model parameters Λ_m. In the input-signal notation of FIGS. 1, 3, and 5, the many feature sequences {X₁, X₂, X₃, …} and the many positive example symbol sequences {R(X₁), R(X₂), R(X₃), …} are written X* and R(X*); this notation is not used in the text.

The model parameter recording unit 12 records the model parameters corresponding to the recognition target symbol sequences, consisting of an acoustic model and a language model. The pattern recognition unit 10 generates recognized symbol sequences S by pattern recognition of the externally input feature sequence X of the training data (step S10). The discriminant function value generation unit 11 takes a recognized symbol sequence S as input, refers to the model parameter recording unit 12, and outputs a discriminant function value g(X, S; Λ) evaluating whether the recognized symbol sequence S corresponds to the feature sequence X of the training data (step S11).

The discriminant function value g(X, S; Λ) is input to the positive example recognition comparison unit 14 via the pattern recognition unit 10. The dissimilarity calculation unit 13 takes the positive example symbol sequence R(X) corresponding to the feature sequence X of the training data and the recognized symbol sequence S as input and calculates the dissimilarity Δ(R(X), S) between them (step S13).

The positive example recognition comparison unit 14 takes N (N ≥ 2) predetermined attenuation coefficients σ₁, σ₂, the discriminant function values g(X, S; Λ), and the dissimilarities Δ(R(X), S) as input, uses the N attenuation coefficients to obtain the positive-side integrated value and the integrated value for correcting it, and outputs the objective function f⁺(X; Λ) in which the positive-side integrated value has been corrected (step S14).

The model parameter optimization unit 15 uses the objective function f⁺(X; Λ) to optimize the model parameters in the parameter set Λ corresponding to the recognized symbol sequences so as to increase f⁺(X; Λ) (step S15). The model parameter optimization unit 15 continues optimizing the model parameters until the increment of f⁺(X; Λ) becomes smaller than a predetermined convergence threshold.
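A minimal sketch of the optimization loop of step S15; the optimizer, step size, and numeric-gradient helper are assumptions, since the patent only requires that iteration stop once the increment of f⁺(X; Λ) falls below a convergence threshold:

```python
def numeric_grad(f, params, eps=1e-6):
    """Central-difference gradient of f at params (illustrative helper)."""
    grad = []
    for i in range(len(params)):
        hi = params[:]; hi[i] += eps
        lo = params[:]; lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def train(objective, params, step=0.01, threshold=1e-4, max_iters=1000):
    """Ascend the objective f+(X; Lambda) until its increment falls below
    the convergence threshold, as in step S15 (a sketch, not the patent's code)."""
    prev = objective(params)
    for _ in range(max_iters):
        grad = numeric_grad(objective, params)
        params = [p + step * g for p, g in zip(params, grad)]
        cur = objective(params)
        if cur - prev < threshold:   # convergence condition of step S15
            break
        prev = cur
    return params

# Toy usage: maximize f(lam) = -(lam - 1)^2 as a stand-in objective.
print(train(lambda p: -(p[0] - 1.0) ** 2, [0.0]))
```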

As described above, the dissimilarity utilization type discriminative learning device 100 of Embodiment 1 has no functional components corresponding to the positive example side pattern recognition unit (the positive example speech recognition unit 74) and the positive example discriminant function value generation unit 73 required by the conventional discriminative learning device 700. The objective function f⁺(X; Λ) is generated by a single pattern recognition pass of the pattern recognition unit 10. The amount of computation can therefore be reduced compared with the conventional discriminative learning device 700.

Note that the pattern recognition unit 10, the discriminant function value generation unit 11, the model parameter recording unit 12, and the model parameter optimization unit 15 of Embodiment 1 correspond respectively to the speech recognition unit 76, the negative example discriminant function value generation unit 75, the acoustic model recording unit 70, and the model parameter optimization unit 78 of the conventional discriminative learning device 700, and operate in the same way.

What is new in the dissimilarity utilization type discriminative learning device 100 is the functional configuration of the dissimilarity calculation unit 13 and the positive example recognition comparison unit 14; only these new components are described below. The device 100 was described above with the correct symbol sequence R(X) corresponding to the feature sequence X of the training data given as input, but the dissimilarity utilization type discriminative learning device of this invention can also be realized without input of the correct symbol sequence R(X).

[Modification]
The broken lines in FIG. 1 show an example functional configuration of a dissimilarity utilization type discriminative learning device 100′, a modification of Embodiment 1 that does not require input of the correct symbol sequence R(X). The modification differs only in the input signals and operation of the dissimilarity estimation unit 13′. The dissimilarity estimation unit 13′ takes the feature sequence X of the training data, the recognized symbol sequences S (S ∈ W′) obtained by pattern recognition of X, and the discriminant function values g(X, S; Λ) as input, and produces an estimate Δ̂(S) of the dissimilarity Δ(R(X), S) between each recognized symbol sequence S (S ∈ W′) and the positive example. The estimate Δ̂(S) can be computed, for example, by estimating a confidence θ(S) of the recognition result with the method of [Wessel et al., 01]. (Reference: F. Wessel, R. Schlueter, K. Macherey, and H. Ney: Confidence Measures for Large Vocabulary Continuous Speech Recognition, IEEE Transactions on Speech and Audio Processing, vol. 9, no. 3, pp. 288-298 (2001))

The confidence θ(S) is a measure of the degree to which a recognition result can be trusted as correct. For example, consider the confidence of S₁ among multiple recognized symbol sequences S₁, S₂, S₃, … obtained by continuous pattern recognition of the feature sequence X. If the discriminant function value g(X, S₁; Λ) is markedly larger in comparison with g(X, S₂; Λ), g(X, S₃; Λ), …, then the confidence that S₁ is correct can be regarded as high.

Conversely, if many of g(X, S₂; Λ), g(X, S₃; Λ), … have values comparable to g(X, S₁; Λ), the confidence that S₁ is correct becomes small. In addition, taking into account the plausibility of the lengths of the partial feature sequences corresponding to the symbols of S₁, and the plausibility of those symbols occurring together in the same symbol sequence, the degree to which S₁ can be trusted as correct is expressed by a number θ(S₁) between 0 and 1.

The dissimilarity estimate Δ̂(S₁) between S₁ and the positive example symbol sequence is defined so that it approaches 0 the more trustworthy S₁ is as the correct answer, i.e., the closer θ(S₁) is to 1, and grows the less trustworthy S₁ is. This dissimilarity estimate Δ̂(S₁) can be computed, for example, by Equation (12).

[Equation (12): the dissimilarity estimate Δ̂(S₁) computed from the confidence θ(S₁); rendered only as an image in the original.]

By using this dissimilarity estimate Δ̂(S), discriminative training can be carried out even when the positive example symbol sequence R(X) is not explicitly given. Conventional discriminative training cannot be performed unless positive example symbol sequences are given, so they had to be attached to all of a large amount of data. Providing the dissimilarity estimation unit 13′, which computes the dissimilarity estimate Δ̂(S), removes the need to prepare positive example symbol sequences by hand.
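Since Equation (12) is rendered only as an image, the following is a hypothetical realization assuming the mapping Δ̂(S) = −log θ(S), which satisfies the stated requirements (0 when θ(S) = 1, growing as θ(S) decreases):

```python
import math

def dissimilarity_estimate(theta):
    """Hypothetical Eq. (12): maps a confidence theta(S) in (0, 1] to a
    dissimilarity estimate that is 0 at theta = 1 and grows as theta shrinks.
    The exact form of Eq. (12) is not recoverable from the text."""
    return -math.log(theta)

for theta in (1.0, 0.9, 0.5, 0.1):
    print(theta, round(dissimilarity_estimate(theta), 3))
```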

FIG. 3 shows a more concrete example functional configuration of the positive example recognition comparison unit 14, describing Embodiment 1 in more detail; its operation flow is shown in FIG. 4. FIG. 3 shows an implementation of the computation of the objective function f⁺(X; Λ) for the MMI training method.

The positive example recognition comparison unit 14 comprises discriminant function smoothing/antilogarithm means 140, positive-side weighting means 141, recognition-side weighting means 142, positive-side integration/logarithm means 143, recognition-side integration/logarithm means 144, and integrated value comparison means 145. Although one each of the means 140, 141, and 142 is shown here, one set of means 140, 141, 142 may instead be provided for each of the many input recognized symbol sequences S (S ∈ W′) so that the discriminant function values g(X, S; Λ) and the dissimilarities Δ(R(X), S) are processed simultaneously. The example shown in FIG. 3 is a functional configuration that processes the many input recognized symbol sequences S (S ∈ W′) one at a time.

The discriminant function smoothing/antilogarithm means 140 computes the discriminant function smoothed value A by Equation (13) (step S140).

$$A = \exp\bigl(\varphi\, g(X, S; \Lambda)\bigr) \tag{13}$$

The discriminant function smoothed value A is the exponential of the discriminant function value g(X, S; Λ) multiplied by a predetermined positive constant φ.

The positive-side weighting means 141 computes the positive-side weighted value B by Equation (14) (step S141).

$$B = \exp\bigl(-\sigma_1\, \Delta(R(X), S)\bigr)\cdot A \tag{14}$$

The positive-side weighted value B is the discriminant function smoothed value A multiplied by the exponential of the dissimilarity Δ(R(X), S) times the first attenuation coefficient −σ₁.

The recognition-side weighting means 142 computes the recognition-side weighted value C by Equation (15) (step S142).

$$C = \exp\bigl(-\sigma_2\, \Delta(R(X), S)\bigr)\cdot A \tag{15}$$

The recognition-side weighted value C is the discriminant function smoothed value A multiplied by the exponential of the dissimilarity Δ(R(X), S) times the second attenuation coefficient −σ₂.

The positive-side integration/logarithm means 143 computes the positive-side integrated value D by Equation (16) (step S143).

$$D = \log \sum_{S \in W'} B \tag{16}$$

The positive-side integrated value D is the logarithm of the sum of the positive-side weighted values B over all recognized symbol sequences S (S ∈ W′).

The recognition-side integration/logarithm means 144 computes the recognition-side integrated value E by Equation (17) (step S144).

$$E = \log \sum_{S \in W'} C \tag{17}$$

The recognition-side integrated value E is the logarithm of the accumulated recognition-side weighted values C over all recognized symbol sequences S (S ∈ W′); it is the integrated value for correcting the positive-side integrated value.

The integrated value comparison means 145 takes the positive-side integrated value D and the recognition-side integrated value E as input and outputs the objective function f⁺(X; Λ) of Equation (18) (step S145).

$$f^{+}(X;\Lambda) = D - E \tag{18}$$
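A minimal sketch of the computation chain of Equations (13)-(18); the scores and dissimilarities are illustrative, and the first entry plays the role of the positive example (Δ = 0):

```python
import math

def f_plus(scores, deltas, phi=1.0, sigma1=4.0, sigma2=0.0):
    """Objective f+(X; Lambda) = D - E of Eq. (18), built from Eqs. (13)-(17).
    scores: g(X, S; Lambda) for each S in W'; deltas: Delta(R(X), S)."""
    A = [math.exp(phi * g) for g in scores]                    # Eq. (13)
    B = [math.exp(-sigma1 * d) * a for d, a in zip(deltas, A)] # Eq. (14)
    C = [math.exp(-sigma2 * d) * a for d, a in zip(deltas, A)] # Eq. (15)
    D = math.log(sum(B))                                       # Eq. (16)
    E = math.log(sum(C))                                       # Eq. (17)
    return D - E                                               # Eq. (18)

# Hypothetical recognition results: the first sequence is the positive example.
print(f_plus(scores=[2.0, 1.5, 1.2], deltas=[0, 2, 5]))
```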

In Equation (18), for example, by taking the attenuation coefficient σ₁ sufficiently large, setting σ₂ = 0, and maximizing with one of various optimization methods, recognition accuracy can be improved to the same degree as with the conventional MMI training method while requiring fewer computational resources than the conventional MMI training method. The way to reduce the required computational resources is not limited to the configuration of Embodiment 1; another method is described as Embodiment 2.

FIG. 5 shows an example functional configuration of the positive example recognition comparison unit 50 of Embodiment 2; its operation flow is shown in FIG. 6. The positive example recognition comparison unit 50 comprises the discriminant function smoothing/antilogarithm means 140, the positive-side weighting means 141, the positive-side integration/logarithm means 143, negative-side weighting means 242, negative-side integration/logarithm means 244, and integrated value comparison means 245. The discriminant function smoothing/antilogarithm means 140, the positive-side weighting means 141, and the positive-side integration/logarithm means 143 are the same as in the positive example recognition comparison unit 14 of Embodiment 1, and the positive-side integration/logarithm means 143 outputs the positive-side integrated value D (Equation (16)); their description is therefore omitted.

The negative-side weighting means 242 computes the negative-side weighted value K by Equation (19) (step S242).

$$K = \Bigl(\exp\bigl(-\sigma_2\, \Delta(R(X), S)\bigr) - \exp\bigl(-\sigma_3\, \Delta(R(X), S)\bigr)\Bigr)\cdot A \tag{19}$$

The negative-side weighted value K is computed as follows: a first negative-side weighted value is the discriminant function smoothed value A multiplied by the exponential of the dissimilarity Δ(R(X), S) times a second attenuation coefficient −σ₂, where σ₂ is smaller than the first attenuation coefficient σ₁; a second negative-side weighted value is A multiplied by the exponential of Δ(R(X), S) times a third attenuation coefficient −σ₃, where σ₃ is larger than the second attenuation coefficient; subtracting the second negative-side weighted value from the first gives K, which is accumulated into the integrated value for correcting the positive-side integrated value D.

The negative-side integration/logarithm means 244 computes the negative-side integrated value L by Equation (20) (step S244).

$$L = \log \sum_{S \in W'} K \tag{20}$$

The negative-side integrated value L is the logarithm of the sum of the negative-side weighted values K over all recognized symbol sequences S (S ∈ W′).

The integrated value comparison means 245 computes the measure d⁺(X; Λ) of Equation (21) by subtracting the negative-side integrated value L from the positive-side integrated value D, and outputs as the objective function loss(d⁺(X; Λ)) the result of passing d⁺(X; Λ) through a loss function, for example one like Equation (6) above.

$$d^{+}(X;\Lambda) = D - L \tag{21}$$

By taking the attenuation coefficient σ₁ in the positive-side integrated value D large and setting the attenuation coefficient σ₂ = 0 in the negative-side integrated value L, with for example σ₃ ≈ σ₁, only negative example symbol sequences are used as the rival symbol sequences in training. That is, setting σ₃ to a value close to the attenuation coefficient σ₁ removes the positive example and the negative example symbol sequences extremely close to it from the negative-side symbol sequences.
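A corresponding sketch for Embodiment 2, building the negative-side quantities of Equations (19)-(21) on top of Equations (13), (14), and (16); the sign convention d⁺ = D − L follows the text as written:

```python
import math

def d_measure(scores, deltas, phi=1.0, sigma1=4.0, sigma2=0.0, sigma3=4.0):
    """Sketch of Eqs. (19)-(21): negative-side weights K, integrated value L,
    and the measure d+ = D - L (illustrative inputs, assumed coefficients)."""
    A = [math.exp(phi * g) for g in scores]                    # Eq. (13)
    B = [math.exp(-sigma1 * d) * a for d, a in zip(deltas, A)] # Eq. (14)
    D = math.log(sum(B))                                       # Eq. (16)
    K = [(math.exp(-sigma2 * d) - math.exp(-sigma3 * d)) * a   # Eq. (19)
         for d, a in zip(deltas, A)]
    L = math.log(sum(K))                                       # Eq. (20)
    return D - L                                               # Eq. (21)

# The first sequence (Delta = 0) is the positive example; with sigma3 close to
# sigma1 its contribution to the negative side vanishes, as described above.
print(d_measure(scores=[2.0, 1.5, 1.2], deltas=[0, 2, 5]))
```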

As described above, a dissimilarity utilization type discriminative learning device equipped with the positive example recognition comparison unit 50 can improve recognition accuracy to the same degree as the MCE training method with a smaller amount of required computational resources than the MCE training method.

[Evaluation experiment]
An experiment to obtain model parameters was carried out with the dissimilarity utilization type discriminative learning device 100 of Embodiment 1. About 230 hours of Japanese academic lecture speech were used as training data. First, an initial model was trained by the maximum likelihood training method, an existing technique; operating a continuous word speech recognizer with this initial model as-is gave a word error rate of 21.2%.

Against this 21.2%, the word error rate using model parameters obtained by the existing MMI training method on the same training data was 18.6%. With this word error rate as the point of comparison, Table 1 shows the word error rates of continuous word speech recognition using model parameters obtained by the dissimilarity utilization type discriminative learning device 100 with the second attenuation coefficient σ₂ = −4 and the first attenuation coefficient σ₁ set to 1, 2, 3, and 4.

[Table 1: word error rates for σ₁ = 1, 2, 3, 4 (σ₂ = −4); rendered only as an image in the original. Per the text below, the values range from 18.5% to 18.7%.]

The word error rate when a continuous word speech recognizer was operated with model parameters from the dissimilarity utilization type discriminative learning device 100 of this invention was 18.5% to 18.7%, a result equivalent to the word error rate of the conventional MMI training method. A comparison experiment between Embodiment 2 of this invention and the conventional MCE training method has not been conducted, but a word error rate at the same level is predicted, as with the MMI comparison result.

The dissimilarity utilization type discriminative learning device described above can be used, for example, in a speech recognition device. It can also be applied to recognition devices that take as pattern recognition targets feature sequences that vary along the time axis, the spatial axis, or both — such as still images and video — and that express some conceptual information. As a concrete example, it can be used for pattern recognition of image information of handwritten characters.

Although the positive constant φ and the attenuation coefficients σ₁ to σ₃ were described as being preset in the positive example recognition comparison units 14 and 50, these values may also be supplied externally. Moreover, the processing described for the above method and device need not be executed in time sequence in the order described; it may be executed in parallel or individually according to the processing capability of the executing device or as needed.

When the processing means of the above device are realized by a computer, the processing content of the functions each device should have is described by a program. Executing this program on a computer realizes the processing means of each device on the computer.

The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, for example, a hard disk device, flexible disk, or magnetic tape can be used as the magnetic recording device; a DVD (Digital Versatile Disc), DVD-RAM (Random Access Memory), CD-ROM (Compact Disc Read Only Memory), or CD-R (Recordable)/RW (ReWritable) as the optical disc; an MO (Magneto Optical disc) as the magneto-optical recording medium; and an EEP-ROM (Electronically Erasable and Programmable-Read Only Memory) as the semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which it is recorded. The program may also be distributed by storing it in a recording device of a server computer and transferring it from the server computer to other computers via a network.

Each means may be configured by executing a predetermined program on a computer, or at least part of its processing contents may be realized in hardware.

Claims (12)

1. A dissimilarity utilization type discriminative learning device comprising:
a model parameter recording unit that records model parameters;
a pattern recognition unit that generates recognition symbol sequences by pattern recognition of a feature amount sequence of learning data;
a discriminant function value generation unit that refers to the model parameter recording unit and outputs a discriminant function value evaluating whether each recognition symbol sequence corresponds to the feature amount sequence of the learning data;
a dissimilarity calculation unit that calculates the dissimilarity between each recognition symbol sequence and the positive example;
a positive example recognition comparison unit that receives N (N ≧ 2) predetermined attenuation coefficients, the discriminant function value, and the dissimilarity, obtains, using the N attenuation coefficients, a positive example side integrated value and an integrated value for correcting the positive example side integrated value, and outputs an objective function in which the positive example side integrated value has been corrected; and
a model parameter optimization unit that optimizes the model parameters corresponding to the recognition symbol sequences using the objective function.
2. The dissimilarity utilization type discriminative learning device according to claim 1, wherein the positive example recognition comparison unit has two attenuation coefficients, and the positive example recognition comparison unit:
obtains a discriminant function smoothing value, which is the exponential function value of the discriminant function value multiplied by a predetermined positive constant;
multiplies the exponential function value of the dissimilarity multiplied by a first attenuation coefficient by the discriminant function smoothing value to obtain a positive example side load value, and passes the sum of the positive example side load values over all recognition symbol sequences through an arbitrary monotonically increasing invertible function to obtain the positive example side integrated value;
multiplies the exponential function value of the dissimilarity multiplied by a second attenuation coefficient smaller than the first attenuation coefficient by the discriminant function smoothing value to obtain a recognition side load value, and passes the sum of the recognition side load values over all recognition symbol sequences through an arbitrary monotonically increasing invertible function to obtain a recognition side integrated value, which is the integrated value for correcting the positive example side integrated value; and
outputs, as the objective function, the positive example side integrated value minus the recognition side integrated value.
3. The dissimilarity utilization type discriminative learning device according to claim 1 or 2, wherein the positive example recognition comparison unit comprises:
discriminant function smoothing/antilogarithm means that multiplies the discriminant function value by a predetermined positive constant and computes the exponential function value of the product as the discriminant function smoothing value;
positive example side load means that computes a positive example side load value by multiplying the exponential function value of the dissimilarity multiplied by a first attenuation coefficient by the discriminant function smoothing value;
positive example side integration/logarithm means that computes, as the positive example side integrated value, the logarithmic function value of the sum of the positive example side load values over all recognition symbol sequences;
recognition side load means that computes a recognition side load value by multiplying the exponential function value of the dissimilarity multiplied by a second attenuation coefficient smaller than the first attenuation coefficient by the discriminant function smoothing value;
recognition side integration/logarithm means that totals the recognition side load values over all recognition symbol sequences and outputs the logarithmic function value of the total as a recognition side integrated value, which is the integrated value for correcting the positive example side integrated value; and
integrated value comparison means that outputs, as the objective function, the positive example side integrated value minus the recognition side integrated value.
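
For illustration only, and not as part of the claims, the two-coefficient computation of claims 2 and 3 can be sketched in Python as follows, taking the logarithm as the monotonically increasing invertible function, as claim 3 does. The function and variable names (objective_two_coefficients, g, d, phi, sigma1, sigma2) are hypothetical, and the decaying form exp(-sigma * d) is an assumption chosen so that each load value shrinks as the dissimilarity from the positive example grows; the claims state only that the exponential function value of the product of the dissimilarity and the attenuation coefficient is taken.

    import numpy as np

    def objective_two_coefficients(g, d, phi, sigma1, sigma2):
        # g:      discriminant function values, one per recognition symbol sequence
        # d:      dissimilarities between each sequence and the positive example
        # phi:    predetermined positive smoothing constant
        # sigma1: first attenuation coefficient; sigma2 < sigma1
        g = np.asarray(g, dtype=float)
        d = np.asarray(d, dtype=float)
        # Discriminant function smoothing value: exponential of phi times the discriminant value.
        s = np.exp(phi * g)
        # Positive example side integrated value: log of the sum of the positive example side load values.
        f_pos = np.log(np.sum(np.exp(-sigma1 * d) * s))
        # Recognition side integrated value: log of the sum of the recognition side load values.
        f_rec = np.log(np.sum(np.exp(-sigma2 * d) * s))
        # Objective function: positive example side integrated value minus recognition side integrated value.
        return f_pos - f_rec

Under this assumed sign convention, the larger coefficient sigma1 makes the positive example side concentrate on recognition symbol sequences close to the positive example, while the smaller sigma2 lets the recognition side integrate over the recognized sequences more broadly, so increasing the objective favors model parameters that score low-dissimilarity sequences highly.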
4. The dissimilarity utilization type discriminative learning device according to claim 1, wherein the positive example recognition comparison unit has three attenuation coefficients, the objective function is a misclassification measure passed through a loss function, and the positive example recognition comparison unit:
obtains a discriminant function smoothing value, which is the exponential function value of the discriminant function value multiplied by a preset positive constant;
multiplies the exponential function value of the dissimilarity multiplied by a first attenuation coefficient by the discriminant function smoothing value to obtain a positive example side load value, and passes the sum of the positive example side load values over all recognition symbol sequences through an arbitrary monotonically increasing invertible function to obtain the positive example side integrated value;
computes a first negative example side load value by multiplying the exponential function value of the dissimilarity multiplied by a second attenuation coefficient smaller than the first attenuation coefficient by the discriminant function smoothing value, and a second negative example side load value by multiplying the exponential function value of the dissimilarity multiplied by a third attenuation coefficient larger than the second attenuation coefficient by the discriminant function smoothing value, subtracts the second negative example side load value from the first negative example side load value to obtain a negative example side load value, and passes the sum of the negative example side load values over all recognition symbol sequences through an arbitrary monotonically increasing invertible function to obtain a negative example side integrated value, which is the integrated value for correcting the positive example side integrated value; and
subtracts the positive example side integrated value from the negative example side integrated value to obtain the misclassification measure.
5. The dissimilarity utilization type discriminative learning device according to claim 1 or 4, wherein the positive example recognition comparison unit comprises:
discriminant function smoothing/antilogarithm means that multiplies the discriminant function value by a predetermined positive constant and computes the exponential function value of the product as the discriminant function smoothing value;
positive example side load means that computes a positive example side load value by multiplying the exponential function value of the dissimilarity multiplied by a first attenuation coefficient by the discriminant function smoothing value;
positive example side integration/logarithm means that computes, as the positive example side integrated value, the logarithmic function value of the sum of the positive example side load values over all recognition symbol sequences;
negative example side load means that computes a first negative example side load value by multiplying the exponential function value of the dissimilarity multiplied by a second attenuation coefficient smaller than the first attenuation coefficient by the discriminant function smoothing value, and a second negative example side load value by multiplying the exponential function value of the dissimilarity multiplied by a third attenuation coefficient larger than the second attenuation coefficient by the discriminant function smoothing value, and subtracts the second negative example side load value from the first negative example side load value;
negative side integration/logarithm means that totals the negative example side load values over all recognition symbol sequences and outputs the logarithmic function value of the total as a negative example side integrated value, which is the integrated value for correcting the positive example side integrated value; and
integrated value comparison means that subtracts the positive example side integrated value from the negative example side integrated value to obtain the misclassification measure, and outputs the misclassification measure passed through a loss function as the objective function.
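
Purely as a further hedged illustration, the three-coefficient variant of claims 4 and 5 can be sketched in the same style. The sigmoid used here is only one common choice of loss function in MCE-style learning and alpha is an assumed slope parameter; neither is fixed by the claims. The same exp(-sigma * d) convention is assumed, which keeps each negative example side load value non-negative when sigma3 > sigma2.

    import numpy as np

    def objective_three_coefficients(g, d, phi, sigma1, sigma2, sigma3, alpha=1.0):
        # Requires sigma2 < sigma1 and sigma3 > sigma2, as stated in claim 4.
        g = np.asarray(g, dtype=float)
        d = np.asarray(d, dtype=float)
        s = np.exp(phi * g)  # discriminant function smoothing value
        # Positive example side integrated value, as in the two-coefficient case.
        f_pos = np.log(np.sum(np.exp(-sigma1 * d) * s))
        # Negative example side load value: first negative example side load value minus
        # second negative example side load value (zero for a sequence with d == 0).
        neg_load = (np.exp(-sigma2 * d) - np.exp(-sigma3 * d)) * s
        # Negative example side integrated value, the integrated value correcting the positive side.
        f_neg = np.log(np.sum(neg_load))
        # Misclassification measure: negative example side minus positive example side.
        m = f_neg - f_pos
        # Loss function: sigmoid (an assumed choice); small when the positive side dominates.
        return 1.0 / (1.0 + np.exp(-alpha * m))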
6. The dissimilarity utilization type discriminative learning device according to any one of claims 1 to 5, wherein the dissimilarity calculation unit receives the feature amount of the learning data, the recognition symbol sequence, and the discriminant function value as inputs, estimates the dissimilarity as a confidence, a measure representing the degree to which the recognition symbol sequence can be trusted as the correct answer, and outputs the result as a dissimilarity estimate.
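
Claim 6 leaves the form of the confidence measure open. As one plausible, purely illustrative instantiation that the patent does not specify, the dissimilarity estimate could be taken as the complement of a softmax-style posterior computed from the smoothed discriminant function values:

    import numpy as np

    def estimate_dissimilarity(g, phi):
        # Hypothetical sketch for claim 6: treat a softmax posterior over the
        # smoothed discriminant function values as the confidence that each
        # recognition symbol sequence is the correct answer, and use its
        # complement as the estimated dissimilarity.
        z = phi * np.asarray(g, dtype=float)
        p = np.exp(z - np.max(z))  # numerically stable softmax numerator
        p /= p.sum()               # confidence per recognition symbol sequence
        return 1.0 - p             # high confidence gives a low estimated dissimilarity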
7. A dissimilarity utilization type discriminative learning method comprising:
a pattern recognition process in which a pattern recognition unit generates recognition symbol sequences by pattern recognition of a feature amount sequence of learning data;
a discriminant function value generation process in which a discriminant function value generation unit refers to the model parameters in a model parameter recording unit and outputs a discriminant function value evaluating whether each recognition symbol sequence corresponds to the feature amount of the learning data;
a dissimilarity calculation process in which a dissimilarity calculation unit calculates the dissimilarity between each recognition symbol sequence and the positive example;
a positive example recognition comparison process in which a positive example recognition comparison unit receives N (N ≧ 2) attenuation coefficients, the discriminant function value, and the dissimilarity, obtains, using the N attenuation coefficients, a positive example side integrated value and an integrated value for correcting the positive example side integrated value, and outputs an objective function in which the positive example side integrated value has been corrected; and
a model parameter optimization process in which a model parameter optimization unit optimizes the model parameters corresponding to the recognition symbol sequences using the objective function,
wherein the positive example recognition comparison process has two attenuation coefficients and:
obtains a discriminant function smoothing value, which is the exponential function value of the discriminant function value multiplied by a predetermined positive constant;
multiplies the exponential function value of the dissimilarity multiplied by a first attenuation coefficient by the discriminant function smoothing value to obtain a positive example side load value, and passes the sum of the positive example side load values over all recognition symbol sequences through an arbitrary monotonically increasing invertible function to obtain the positive example side integrated value;
multiplies the exponential function value of the dissimilarity multiplied by a second attenuation coefficient smaller than the first attenuation coefficient by the discriminant function smoothing value to obtain a recognition side load value, and passes the sum of the recognition side load values over all recognition symbol sequences through an arbitrary monotonically increasing invertible function to obtain a recognition side integrated value, which is the integrated value for correcting the positive example side integrated value; and
outputs, as the objective function, the positive example side integrated value minus the recognition side integrated value.
8. The dissimilarity utilization type discriminative learning method according to claim 7, wherein the positive example recognition comparison process comprises:
a discriminant function smoothing/antilogarithm step in which discriminant function smoothing/antilogarithm means multiplies the discriminant function value by a positive constant and computes the exponential function value of the product as the discriminant function smoothing value;
a positive example side load step in which positive example side load means computes a positive example side load value by multiplying the exponential function value of the dissimilarity multiplied by a first attenuation coefficient by the discriminant function smoothing value;
a positive example side integration/logarithm step in which positive example side integration/logarithm means computes, as the positive example side integrated value, the logarithmic function value of the sum of the positive example side load values over all recognition symbol sequences;
a recognition side load step in which recognition side load means computes a recognition side load value by multiplying the exponential function value of the dissimilarity multiplied by a second attenuation coefficient smaller than the first attenuation coefficient by the discriminant function smoothing value;
a recognition side integration/logarithm step in which recognition side integration/logarithm means totals the recognition side load values over all recognition symbol sequences and outputs the logarithmic function value of the total as a recognition side integrated value, which is the integrated value for correcting the positive example side integrated value; and
an integrated value comparison step in which integrated value comparison means outputs, as the objective function, the positive example side integrated value minus the recognition side integrated value.
9. The dissimilarity utilization type discriminative learning method according to claim 7, wherein the positive example recognition comparison process has three attenuation coefficients, the objective function is a misclassification measure passed through a loss function, and the positive example recognition comparison process:
obtains a discriminant function smoothing value, which is the exponential function value of the discriminant function value multiplied by a preset positive constant;
multiplies the exponential function value of the dissimilarity multiplied by a first attenuation coefficient by the discriminant function smoothing value to obtain a positive example side load value, and passes the sum of the positive example side load values over all recognition symbol sequences through an arbitrary monotonically increasing invertible function to obtain the positive example side integrated value;
computes a first negative example side load value by multiplying the exponential function value of the dissimilarity multiplied by a second attenuation coefficient smaller than the first attenuation coefficient by the discriminant function smoothing value, and a second negative example side load value by multiplying the exponential function value of the dissimilarity multiplied by a third attenuation coefficient larger than the second attenuation coefficient by the discriminant function smoothing value, subtracts the second negative example side load value from the first negative example side load value to obtain a negative example side load value, and passes the sum of the negative example side load values over all recognition symbol sequences through an arbitrary monotonically increasing invertible function to obtain a negative example side integrated value, which is the integrated value for correcting the positive example side integrated value; and
subtracts the positive example side integrated value from the negative example side integrated value to obtain the misclassification measure.
10. The dissimilarity utilization type discriminative learning method according to claim 7, wherein the positive example recognition comparison process comprises:
a discriminant function smoothing/antilogarithm step in which discriminant function smoothing/antilogarithm means multiplies the discriminant function value by a predetermined positive constant and computes the exponential function value of the product as the discriminant function smoothing value;
a positive example side load step of computing a positive example side load value by multiplying the exponential function value of the dissimilarity multiplied by a first attenuation coefficient by the discriminant function smoothing value;
a positive example side integration/logarithm step in which positive example side integration/logarithm means computes, as the positive example side integrated value, the logarithmic function value of the sum of the positive example side load values over all recognition symbol sequences;
a negative example side load step in which negative example side load means computes a first negative example side load value by multiplying the exponential function value of the dissimilarity multiplied by a second attenuation coefficient smaller than the first attenuation coefficient by the discriminant function smoothing value, and a second negative example side load value by multiplying the exponential function value of the dissimilarity multiplied by a third attenuation coefficient larger than the second attenuation coefficient by the discriminant function smoothing value, and computes, as a negative example side load value, the value obtained by subtracting the second negative example side load value from the first negative example side load value;
a negative side integration/logarithm step in which negative side integration/logarithm means totals the negative example side load values over all recognition symbol sequences and outputs the logarithmic function value of the total as a negative example side integrated value, which is the integrated value for correcting the positive example side integrated value; and
an integrated value comparison step in which integrated value comparison means subtracts the positive example side integrated value from the negative example side integrated value to obtain the misclassification measure, and outputs the misclassification measure passed through a loss function as the objective function.
11. The dissimilarity utilization type discriminative learning method according to any one of claims 7 to 10, wherein the dissimilarity calculation process receives the feature amount of the learning data, the recognition symbol sequence, and the discriminant function value as inputs, estimates the dissimilarity as a confidence, a measure representing the degree to which the recognition symbol sequence can be trusted as the correct answer, and outputs the result as a dissimilarity estimate.
12. A program for causing a computer to function as the dissimilarity utilization type discriminative learning device according to any one of claims 1 to 6.
JP2009100865A 2009-04-17 2009-04-17 Dissimilarity utilization type discriminative learning apparatus and method, and program thereof Active JP5113797B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2009100865A JP5113797B2 (en) 2009-04-17 2009-04-17 Dissimilarity utilization type discriminative learning apparatus and method, and program thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2009100865A JP5113797B2 (en) 2009-04-17 2009-04-17 Dissimilarity utilization type discriminative learning apparatus and method, and program thereof

Publications (2)

Publication Number Publication Date
JP2010250161A JP2010250161A (en) 2010-11-04
JP5113797B2 true JP5113797B2 (en) 2013-01-09

Family

ID=43312546

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2009100865A Active JP5113797B2 (en) 2009-04-17 2009-04-17 Dissimilarity utilization type discriminative learning apparatus and method, and program thereof

Country Status (1)

Country Link
JP (1) JP5113797B2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5264649B2 (en) * 2009-08-18 2013-08-14 日本電信電話株式会社 Information compression model parameter estimation apparatus, method and program
JP5749187B2 (en) * 2012-02-07 2015-07-15 日本電信電話株式会社 Parameter estimation device, parameter estimation method, speech recognition device, speech recognition method and program
EP3690739A1 (en) * 2019-02-01 2020-08-05 Koninklijke Philips N.V. Confidence measure for a deployed machine learning model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4533160B2 (en) * 2005-01-21 2010-09-01 日本電信電話株式会社 Discriminative learning method, apparatus, program, and recording medium on which discriminative learning program is recorded

Also Published As

Publication number Publication date
JP2010250161A (en) 2010-11-04

Similar Documents

Publication Publication Date Title
JP2775140B2 (en) Pattern recognition method, voice recognition method, and voice recognition device
JP5177561B2 (en) Recognizer weight learning device, speech recognition device, and system
WO2008004666A1 (en) Voice recognition device, voice recognition method and voice recognition program
JP5113797B2 (en) Dissimilarity utilization type discriminative learning apparatus and method, and program thereof
US20080189109A1 (en) Segmentation posterior based boundary point determination
JP2005084436A (en) Speech recognition apparatus and computer program
JP7409381B2 (en) Utterance section detection device, utterance section detection method, program
JP5079760B2 (en) Acoustic model parameter learning device, acoustic model parameter learning method, acoustic model parameter learning program
JP4533160B2 (en) Discriminative learning method, apparatus, program, and recording medium on which discriminative learning program is recorded
JP5852550B2 (en) Acoustic model generation apparatus, method and program thereof
JP5738216B2 (en) Feature amount correction parameter estimation device, speech recognition system, feature amount correction parameter estimation method, speech recognition method, and program
JP4981850B2 (en) Voice recognition apparatus and method, program, and recording medium
JP7279800B2 (en) LEARNING APPARATUS, ESTIMATION APPARATUS, THEIR METHOD, AND PROGRAM
JP2008217592A (en) Language analysis model learning device, language analysis model learning method, language analysis model learning program and recording medium
JP5308102B2 (en) Identification score / posterior probability calculation method by number of errors, error number weighted identification learning device using the method, method thereof, speech recognition device using the device, program, and recording medium
JP2014153680A (en) Acoustic model correction parameter estimation device, feature quantity correction parameter estimation device, and methods and programs therefor
JP4843646B2 (en) Voice recognition apparatus and method, program, and recording medium
JP5982265B2 (en) Speech recognition apparatus, speech recognition method, and program
JP5166195B2 (en) Acoustic analysis parameter generation method and apparatus, program, and recording medium
JP5089651B2 (en) Speech recognition device, acoustic model creation device, method thereof, program, and recording medium
JP2007017548A (en) Verification device of voice recognition result and computer program
JP4801108B2 (en) Voice recognition apparatus, method, program, and recording medium thereof
JP2012118441A (en) Method, device, and program for creating acoustic model
JP4801107B2 (en) Voice recognition apparatus, method, program, and recording medium thereof
JP7173327B2 (en) LEARNING APPARATUS, VOICE RECOGNITION APPARATUS, THEIR METHOD, AND PROGRAM

Legal Events

Date Code Title Description
RD03 Notification of appointment of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7423

Effective date: 20110715

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20110825

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20120808

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20120814

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20120903

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20121002


A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20121012

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20151019

Year of fee payment: 3

R150 Certificate of patent or registration of utility model

Ref document number: 5113797

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150


S531 Written request for registration of change of domicile

Free format text: JAPANESE INTERMEDIATE CODE: R313531

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350