JPH08241092A

JPH08241092A - Speaker adaptation method for acoustic model and device therefor

Info

Publication number: JPH08241092A
Application number: JP7044430A
Authority: JP
Inventors: Tomoko Matsui (松井知子); Sadaoki Furui (古井貞煕)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1995-03-03
Filing date: 1995-03-03
Publication date: 1996-09-17

Abstract

PURPOSE: To realize a high-performance recognition system by minimizing the error rate of an acoustic HMM. CONSTITUTION: The device holds a speaker-independent acoustic HMM (hidden Markov model) 6 trained on the speech of many speakers. It has a feature parameter extraction section 1, which extracts feature parameters from the speech data 5 of the target speaker, and a likelihood maximizing adaptation section 2, which optimizes the parameters of the speaker-independent acoustic HMM so as to maximize the likelihood for the target speaker's speech. The device further has a classification error minimizing adaptation section 3, which defines a differentiable loss function from the likelihood-maximized HMM parameters and the time series of feature parameters of the target speaker's speech and selects the parameters that minimize this function, thereby obtaining the acoustic HMM with minimum classification error, and an adapted acoustic HMM accumulation section 4.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech recognition method and apparatus, and more particularly to a speaker adaptation method and apparatus for an acoustic HMM, in which the acoustic features of speech are modeled by HMMs and a speaker-independent acoustic HMM corresponding to recognition categories such as phonemes and words is adapted to a specific target speaker.

[0002]

[Prior Art] In speaker-independent speech recognition, attempts have been made to improve recognition performance by adapting the recognition system to the target speaker. Conventionally, the parameters of the acoustic HMM were estimated so as to maximize the likelihood of the speaker-independent acoustic HMM for the target speaker's speech, for example by the Baum-Welch algorithm described in Seiichi Nakagawa, "Speech Recognition by Probabilistic Models," IEICE, 1988 (hereinafter, Reference 1), and the acoustic HMM was thereby adapted to the target speaker.

[0003]

[Problem to Be Solved by the Invention] In practice, the conventional speaker adaptation method for acoustic models described above does not achieve good performance in terms of recognition error rate.

[0004] An object of the present invention is to provide a speaker adaptation method for an acoustic model, and a device therefor, that achieve better performance in terms of recognition error rate.

[0005]

[Means for Solving the Problem] The speaker adaptation method for an acoustic model according to the present invention is a method in which acoustic features of speech are extracted and statistically modeled to construct acoustic models corresponding to phonemes, words, and other recognition categories; a speaker-independent acoustic model trained on the speech of many speakers is represented by a hidden Markov model (abbreviated HMM); and the parameters of the speaker-independent acoustic HMM are optimized, using the speech of the target speaker, so as to maximize the likelihood for that speech. The method has a step of adapting the parameters of the speaker-independent acoustic HMM, thus optimized for maximum likelihood, so as to minimize the recognition error for the target speaker's speech.

[0006] The present invention also includes a method in which the step of adapting the parameters of the speaker-independent acoustic HMM so as to minimize the recognition error for the target speaker's speech is a step of defining a differentiable loss function and successively updating the parameters of the acoustic HMM so that the value of this function decreases, thereby obtaining the optimum values.

[0007] The speaker adaptation device for an acoustic model according to the present invention is a device in which acoustic features of speech are extracted and statistically modeled to construct acoustic models corresponding to phonemes, words, and other recognition categories; a speaker-independent acoustic model trained on the speech of many speakers is represented by a hidden Markov model (abbreviated HMM); and the parameters of the speaker-independent acoustic HMM are optimized, using the speech of the target speaker, so as to maximize the likelihood for that speech. The device has adaptation means for adapting the parameters of the speaker-independent acoustic HMM, thus optimized for maximum likelihood, so as to minimize the recognition error for the target speaker's speech.

[0008] The speaker adaptation device of the present invention also includes a device in which the adaptation means for adapting the parameters of the speaker-independent acoustic HMM so as to minimize the recognition error for the target speaker's speech includes means for defining a differentiable loss function and successively updating the parameters of the acoustic HMM so that the value of this function decreases, thereby obtaining the optimum values.

[0009]

[Operation] The parameters of the speaker-independent acoustic HMM, trained on the speech of many speakers, are first optimized so as to maximize the likelihood for the speech of a specific speaker, and are then further adapted so as to minimize the classification error for the target speaker's speech. This makes possible speaker adaptation of the acoustic model with a low error rate.

[0010]

[Embodiments] Embodiments of the present invention will now be described with reference to the drawings.

[0011] FIG. 1(A) is a flowchart of one embodiment of the acoustic model speaker adaptation method of the present invention, and FIG. 1(B) is a flowchart of an implementation in which a loss function is used for the classification error rate minimization adaptation of step 13 in FIG. 1(A).

[0012] In this speaker adaptation method, a speaker-independent acoustic model trained on the speech of many speakers is represented by an acoustic HMM (Hidden Markov Model) (step 11). Next, using the speech of a specific target speaker, the parameters of the acoustic HMM are optimized so as to maximize the likelihood for that speaker's speech (step 12). The parameters of the acoustic HMM output by step 12 are then adapted so as to minimize the classification error for the target speaker's speech (step 13). Finally, the acoustic HMM output by step 13 is stored (step 14).
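The four steps can be read as a simple pipeline. The following is a minimal Python sketch of that control flow only. The function names (`adapt_speaker`, `ml_adapt`, `mce_adapt`), the placeholder types, and the dictionary used as the model store are illustrative assumptions and do not come from the patent text.

```python
from typing import Any, Callable, Dict

Model = Any      # placeholder for an acoustic HMM parameter set (assumption)
Features = Any   # placeholder for the target speaker's feature time series


def adapt_speaker(
    si_model: Model,                                  # step 11: speaker-independent HMM
    features: Features,
    ml_adapt: Callable[[Model, Features], Model],     # hypothetical step-12 routine
    mce_adapt: Callable[[Model, Features], Model],    # hypothetical step-13 routine
    store: Dict[str, Model],
    speaker_id: str,
) -> Model:
    """Run steps 12 to 14 in order and return the adapted model."""
    ml_model = ml_adapt(si_model, features)    # step 12: maximize likelihood
    adapted = mce_adapt(ml_model, features)    # step 13: minimize classification error
    store[speaker_id] = adapted                # step 14: store the adapted HMM
    return adapted
```

The point of the sketch is the ordering: the error-minimizing stage starts from the likelihood-maximized model rather than from the speaker-independent model directly.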

[0013] The present invention also includes an embodiment in which the classification error minimization adaptation step 13 reduces the loss function

[0014]

[Equation 1]

by means of the update rule

[0015]

[Equation 2]

where ε_t is a coefficient that adjusts the update amount and is set experimentally.

[0016] In this embodiment, step 13 successively updates Λ_t according to equation (6) to obtain the optimum values.
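The images for [Equation 1] and [Equation 2] are not reproduced in this text. As a hedged reconstruction, assuming the common formulation of discriminative training with a sigmoid loss and a gradient-descent update, they would take roughly the following form. The slope γ and offset θ are assumed symbols that do not appear in the source, and the patent's exact expressions may differ.

```latex
\ell(d_k) = \frac{1}{1 + e^{-\gamma (d_k + \theta)}}, \qquad
\Lambda_{t+1} = \Lambda_t - \varepsilon_t \, \nabla_{\Lambda}\, \ell(d_k)\Big|_{\Lambda = \Lambda_t}
```

Under this reading, ε_t plays the role of a step size: larger values move the parameters further per update.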

[0017] FIG. 2 is a block diagram of a device to which the acoustic model speaker adaptation method of the present invention is applied.

[0018] This acoustic model speaker adaptation device has: a speaker-independent acoustic model 6, obtained by recording the speech of many speakers and statistically modeling its features, and represented as a hidden Markov model (hereinafter, acoustic HMM); a feature parameter extraction unit 1 which, on receiving the speech data 5 of the target speaker, converts the input speech into a representation using feature parameters such as cepstra; a likelihood maximization adaptation unit 2 which receives the output of the feature parameter extraction unit 1 and the acoustic HMM 6, and outputs both a high-likelihood acoustic HMM adapted to the target speaker by means of the target speaker's speech data converted to a time series, and that time-series speech data; a classification error minimization adaptation unit 3 which receives the output of the likelihood maximization adaptation unit 2 and modifies the parameters of the acoustic HMM so as to minimize the classification error; and an adapted acoustic HMM storage unit 4 which stores the output of the classification error minimization adaptation unit 3.

[0019] In the likelihood maximization adaptation unit 2, adaptation to the target speaker is performed, for example by the Baum-Welch algorithm (see Reference 1), so that the HMM parameters Λ =

[0020]

[External Character 1]

of the input acoustic HMM 6 maximize the likelihood function L_k(·) given in equation (1).

[0021] Here, α is a state, m is a mixture component,

[0022]

[External Character 2] is the weight of mixture component m in state α,

[0023]

[External Character 3] is the mean,

[0024]

[External Character 4] is the variance, and

[0025]

[External Character 5] is the transition probability from state α to state β.

[0026]

[Equation 3]

Here, X = {x_1, x_2, x_3, ..., x_T} is the speech data converted to a time series of feature parameters, t is time, π is the initial state probability, and Q = {q_0, q_1, ..., q_T} is the state transition sequence.

[0027] Further,

[0028]

[Equation 4]

is the output probability.
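Because the images for [Equation 3] and [Equation 4] are missing here, the following is a hedged sketch of how a likelihood function and an output probability of this kind are conventionally written for a Gaussian-mixture HMM, using the symbols defined in the surrounding text. The names c_{αm}, μ_m, σ_m², and a_{αβ} are assumptions standing in for the unreproduced symbol images; the patent's own notation and equation numbering may differ.

```latex
L(X;\Lambda) = \sum_{Q} \pi_{q_0} \prod_{t=1}^{T} a_{q_{t-1} q_t}\, b_{q_t}(x_t),
\qquad
b_{\alpha}(x_t) = \sum_{m} c_{\alpha m}\, \mathcal{N}\!\left(x_t;\, \mu_m,\, \sigma_m^{2}\right)
```

In this sketch the Gaussians are indexed by m only, which is how a semi-continuous HMM shares one set of mixture components across states while keeping state-specific weights c_{αm}.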

[0029] In the classification error minimization adaptation unit 3, the acoustic HMM is modified so as to minimize the misclassification function d_k(·). Here, the discriminant function g_k(·) and the misclassification function d_k(·) are defined by equations (3) and (4), respectively.

[0030]

[Equation 5]

L_k is the likelihood function computed by the Viterbi algorithm (see Reference 1).
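The image for equations (3) and (4) is not available in this text. As an illustration only, the sketch below uses a form of misclassification measure that is common in minimum classification error training: the correct class score is compared against a smoothed maximum of the rival scores. The function name, the smoothing constant `eta`, and the example scores are assumptions, not values from the patent.

```python
import numpy as np


def misclassification_measure(g: np.ndarray, k: int, eta: float = 1.0) -> float:
    """Return d_k = -g_k + (1/eta) * log(mean over j != k of exp(eta * g_j)).

    g   : discriminant scores g_j, e.g. Viterbi log-likelihoods, one per class
    k   : index of the correct class
    eta : positive smoothing constant; a large eta approaches the best rival score
    """
    rivals = np.delete(g, k)
    z = eta * rivals
    zmax = z.max()                                   # shift for numerical stability
    log_mean_exp = zmax + np.log(np.mean(np.exp(z - zmax)))
    return float(-g[k] + log_mean_exp / eta)


scores = np.array([-120.0, -100.0, -130.0])          # made-up log-likelihoods
d_correct = misclassification_measure(scores, k=1)   # true class has the best score
d_wrong = misclassification_measure(scores, k=2)     # true class has the worst score
```

With this definition a negative value means the correct class outscores its rivals and a positive value indicates a misclassification, which is what makes the measure a usable stand-in for the error count.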

[0031] In the classification error minimization adaptation unit 3, for example, when modifying the parameters of the acoustic HMM, a differentiable loss function l(d_k) given in equation (5) is defined in place of the misclassification function d_k(·), and the parameters of the acoustic HMM are successively updated according to equation (6) so that this function decreases. Equations (5) and (6) follow the discriminative training algorithm of [B.H. Juang and Katagiri: "Discriminative training," J. Acoust. Soc. Jpn., 13, 6, pp. 333-339, 1992.].

[0032]

[Equation 6]

Here, ε_t is a coefficient that adjusts the update amount and is set experimentally.
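To make the iterative update concrete, here is a toy Python sketch under strong simplifying assumptions: two classes, one-dimensional unit-variance Gaussian scores instead of HMM likelihoods, a sigmoid loss, a constant step size, and a finite-difference gradient in place of an analytic one. All names (`gpd_adapt`, `batch_loss`, `gamma`, `eps`) and all data values are hypothetical; this is not the patent's implementation.

```python
import numpy as np


def misclassification(mu, x, k):
    g = -0.5 * (x - mu) ** 2                  # toy discriminant score per class
    rivals = np.delete(g, k)
    return -g[k] + np.log(np.mean(np.exp(rivals)))


def batch_loss(mu, xs, labels, gamma=1.0):
    # Mean sigmoid loss over all adaptation tokens.
    d = np.array([misclassification(mu, x, k) for x, k in zip(xs, labels)])
    return float(np.mean(1.0 / (1.0 + np.exp(-gamma * d))))


def numerical_grad(f, params, h=1e-5):
    # Central finite differences; adequate for a toy with two parameters.
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = h
        grad[i] = (f(params + step) - f(params - step)) / (2 * h)
    return grad


def gpd_adapt(mu0, xs, labels, eps=0.2, n_iter=10):
    """Update the parameters once per pass over the whole adaptation set."""
    mu = mu0.astype(float).copy()
    history = [batch_loss(mu, xs, labels)]
    for _ in range(n_iter):
        grad = numerical_grad(lambda p: batch_loss(p, xs, labels), mu)
        mu = mu - eps * grad                  # step against the loss gradient
        history.append(batch_loss(mu, xs, labels))
    return mu, history


xs = np.array([-0.5, 0.2, 0.6, 0.4, 1.5, 0.9])   # made-up adaptation tokens
labels = [0, 0, 0, 1, 1, 1]
mu, history = gpd_adapt(np.array([0.0, 1.0]), xs, labels)
```

The `history` list records the loss before the first update and after each pass, so it can be inspected to check that the chosen step size actually reduces the loss.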

[0033] Next, the results obtained with this embodiment are described with reference to an experimental example.

[0034] First, a speaker-independent acoustic HMM was created. The acoustic HMMs in this example are semi-continuous HMMs with 256 mixture components, comprising 43 context-independent phoneme models. To create the speaker-independent acoustic HMM, the HMM parameters were estimated by the Baum-Welch algorithm using a total of 7,016 sentences from 35 male speakers.

[0035] Speaker adaptation was evaluated by the phoneme recognition rate in continuous speech when 10 and 50 sentences of data were used for two male speakers different from those used to create the speaker-independent acoustic HMM. In the phoneme recognition experiment, a transcription of the speech content was given and Viterbi alignment was performed on the speech; the resulting segments were assumed to be the correct phoneme segments, and for each segment the phoneme HMM showing the maximum likelihood among all phoneme HMMs was taken as the recognition result. The phoneme classification experiment was carried out on 100 sentences different from the sentence set used for speaker adaptation. As feature parameters, cepstra were extracted with a sampling frequency of 12 kHz, a frame length of 32 ms, a frame period of 8 ms, and an LPC (Linear Prediction Coefficient) analysis order of 16.
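The analysis conditions translate into a 384-sample frame advanced by 96 samples. The following NumPy sketch shows one conventional way to obtain LPC cepstra under those conditions: autocorrelation-method LPC followed by the standard LPC-to-cepstrum recursion. The Hamming window, the small diagonal regularization, the function names, and the white-noise test signal are assumptions; the patent does not specify these details.

```python
import numpy as np

FS = 12_000                      # sampling frequency [Hz]
FRAME_LEN = FS * 32 // 1000      # 32 ms frame  -> 384 samples
FRAME_SHIFT = FS * 8 // 1000     # 8 ms period  -> 96 samples
LPC_ORDER = 16


def frame_signal(x):
    """Slice x into overlapping frames and apply a Hamming window (assumed)."""
    n_frames = 1 + (len(x) - FRAME_LEN) // FRAME_SHIFT
    idx = np.arange(FRAME_LEN)[None, :] + FRAME_SHIFT * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(FRAME_LEN)


def lpc(frame, order=LPC_ORDER):
    """Predictor coefficients a_1..a_p from the autocorrelation normal equations."""
    r = np.array([frame[: len(frame) - k] @ frame[k:] for k in range(order + 1)])
    lags = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[lags] + 1e-9 * r[0] * np.eye(order)    # Toeplitz matrix, lightly regularized
    return np.linalg.solve(R, r[1:])


def lpc_to_cepstrum(a):
    """c_n = a_n + sum_{k=1}^{n-1} (k/n) c_k a_{n-k}, for n = 1..p."""
    p = len(a)
    c = np.zeros(p)
    for n in range(1, p + 1):
        acc = a[n - 1]
        for k in range(1, n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c


rng = np.random.default_rng(0)
signal = rng.standard_normal(FS)                 # one second of noise as a stand-in
frames = frame_signal(signal)
cepstra = np.array([lpc_to_cepstrum(lpc(f)) for f in frames])
```

Each row of `cepstra` is one 16-dimensional feature vector x_t, so the array as a whole corresponds to the time series X that the adaptation stages consume.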

[0036] In speaker adaptation by likelihood maximization, the number of training iterations was 5, and only the mean of each mixture component was estimated. In speaker adaptation by classification error minimization, the number of iterations was 10, and the mean and weight of each mixture component were estimated. In both cases, the HMM parameters were updated all at once in each iteration, after all of the training data had been applied.

[0037] The results of the above experiment are shown in Table 1. They show clearly that the present method is more effective than the conventional speaker adaptation method for acoustic models.

[0038]

[Table 1]

[0039]

[Effects of the Invention] As described above, the present invention applies classification error minimization adaptation to the HMM parameters of the speaker-independent acoustic HMM after likelihood maximization adaptation. This improves speaker adaptation of the acoustic model and makes it possible to realize a high-performance recognition system.

[Brief description of drawings]

[FIG. 1] (A) is a flowchart of one embodiment of the acoustic model speaker adaptation method of the present invention; (B) is a flowchart of an implementation in which a loss function is used for the classification error rate minimization adaptation of step 13 in (A).

[FIG. 2] A block diagram of one embodiment of the speaker adaptation device for an acoustic model of the present invention.

[Explanation of symbols]

1 Feature parameter extraction unit
2 Likelihood maximization adaptation unit
3 Classification error minimization adaptation unit
4 Adapted acoustic HMM storage unit
5 Speech data of the target speaker
6 Speaker-independent acoustic HMM

Claims

[Claims]

1. A speaker adaptation method for an acoustic model, in which acoustic features of speech are extracted and statistically modeled to construct acoustic models corresponding to phonemes, words, and other recognition categories, a speaker-independent acoustic model trained on the speech of many speakers is represented by a hidden Markov model (abbreviated HMM), and the parameters of the speaker-independent acoustic HMM are optimized, using the speech of a target speaker, so as to maximize the likelihood for the target speaker's speech, the method comprising a step of adapting the parameters of the speaker-independent acoustic HMM, thus optimized for maximum likelihood, so as to minimize the recognition error for the target speaker's speech.

2. The speaker adaptation method for an acoustic model according to claim 1, wherein the step of adapting the parameters of the speaker-independent acoustic HMM so as to minimize the recognition error for the target speaker's speech is a step of defining a differentiable loss function and successively updating the parameters of the acoustic HMM so that the value of this function decreases, thereby obtaining the optimum values.

3. A speaker adaptation device for an acoustic model, in which acoustic features of speech are extracted and statistically modeled to construct acoustic models corresponding to phonemes, words, and other recognition categories, a speaker-independent acoustic model trained on the speech of many speakers is represented by a hidden Markov model (abbreviated HMM), and the parameters of the speaker-independent acoustic HMM are optimized, using the speech of a target speaker, so as to maximize the likelihood for the target speaker's speech, the device comprising adaptation means for adapting the parameters of the speaker-independent acoustic HMM, thus optimized for maximum likelihood, so as to minimize the recognition error for the target speaker's speech.

4. The speaker adaptation device for an acoustic model according to claim 3, wherein the adaptation means for adapting the parameters of the speaker-independent acoustic HMM so as to minimize the recognition error for the target speaker's speech includes means for defining a differentiable loss function and successively updating the parameters of the acoustic HMM so that the value of this function decreases, thereby obtaining the optimum values.