JP2002342323A

JP2002342323A - Language model learning device, voice recognizing device using the same, language model learning method, voice recognizing method using the same, and storage medium with the methods stored therein

Info

Publication number: JP2002342323A
Application number: JP2001144885A
Authority: JP
Inventors: Yohei Okato; 洋平岡登; Jun Ishii; 純石井
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2001-05-15
Filing date: 2001-05-15
Publication date: 2002-11-29
Anticipated expiration: 2021-05-15
Also published as: JP3961780B2

Abstract

PROBLEM TO BE SOLVED: To provide a language model learning device with improved recognition accuracy. SOLUTION: The device comprises target task language data 101, general task language data 102, and, for constructing a task-adapted language model from the text data read from each set of language data, a similar word pair extracting means 103, a similar word string synthesizing means 104, and a language model generating means 105. The similar word pair extracting means 103 reads each set of text data and extracts similar word pairs from combinations of words contained in the target task data and words contained in the general task data. The similar word string synthesizing means 104 reads each set of text data and the similar word pairs, and synthesizes word strings that contain target-task words and are not contained in the language data. The language model generating means 105 obtains the statistics of the word strings, applying a weight to each set of text data.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a language model learning device using a probabilistic language model, a speech recognition device using the same, a language model learning method, a speech recognition method using the same, and a storage medium storing a speech recognition program for those methods.

[0002]

[Description of the Related Art] In speech recognition, the digitized input speech signal is generally converted, by means of speech signal processing techniques, into a time series of vectors that represent the acoustic features of the speech well, and this time series is then matched against speech models.

[0003] The matching process corresponds to the problem of finding the uttered word string W (= [w_1, w_2, ..., w_M], where M is the number of words) on the basis of the acoustic feature vector time series A (= [a_1, a_2, ..., a_K]) consisting of K time frames.

[0004] In the matching process, to estimate the word string W that gives the highest recognition accuracy, it suffices to find the recognized word string W^* that maximizes the probability P(W|A), using equation (1) below.

[0005]

$$W^{*} = \operatorname*{arg\,max}_{W} P(W \mid A) \tag{1}$$

[0006] However, it is usually difficult to obtain the probability P(W|A) in equation (1) directly. Using Bayes' theorem, P(W|A) is therefore rewritten as equation (2) below.

[0007]

$$P(W \mid A) = \frac{P(A \mid W)\,P(W)}{P(A)} \tag{2}$$

[0008] When finding the word string W that maximizes the left side of equation (2), the denominator P(A) on the right side does not depend on the candidate word string W, so it suffices to find the word string W that maximizes the numerator on the right side. The recognized word string W^* is therefore expressed by equation (3) below.

[0009]

$$W^{*} = \operatorname*{arg\,max}_{W} P(A \mid W)\,P(W) \tag{3}$$

[0010] The probability model that gives P(W) in equation (3) is called the language model, and the probability model that gives P(A|W) is called the acoustic model. A modeling approach that has been actively studied in speech recognition in recent years represents the acoustic model by a "hidden Markov model" and the language model by a "probabilistic language model".
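As an illustration of the decision rule in equation (3), the following minimal Python sketch picks, from a fixed list of candidate word strings, the one that maximizes log P(A|W) + log P(W). The candidate list, the score tables, and the function name `decode` are hypothetical and the numbers are made up; a real recognizer searches a far larger hypothesis space and obtains both scores from trained models.

```python
import math

def decode(candidates, acoustic_logprob, lm_logprob):
    """Return the candidate W that maximises log P(A|W) + log P(W)."""
    return max(candidates, key=lambda w: acoustic_logprob[w] + lm_logprob[w])

# Hypothetical scores for two candidate word strings (illustrative numbers only).
candidates = [("book", "a", "hotel"), ("book", "a", "hostel")]
acoustic_logprob = {
    candidates[0]: math.log(0.20),   # log P(A|W) from the acoustic model
    candidates[1]: math.log(0.25),
}
lm_logprob = {
    candidates[0]: math.log(0.010),  # log P(W) from the language model
    candidates[1]: math.log(0.001),
}

best = decode(candidates, acoustic_logprob, lm_logprob)
```

Here the second candidate has the better acoustic score, but the language model score outweighs it, so the first candidate is selected.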

[0011] Details of these modeling approaches are given, for example, in "Fundamentals of Speech Recognition (Vols. 1 and 2)" (L. R. Rabiner and B. H. Juang, Japanese translation supervised by Furui, November 1995, NTT Advanced Technology) (hereinafter "Reference 1") and in "Probabilistic Language Models" (Kenji Kita, University of Tokyo Press) (hereinafter "Reference 2").

[0012] In these approaches, the parameters of the probability models are estimated statistically from large amounts of data. In building an acoustic model, speech data such as words and sentences is collected in advance from a large number of speakers, and statistical methods are used to estimate the parameters so that the recognition accuracy, or an index that correlates well with the recognition accuracy, improves.

[0013] For example, the Baum-Welch algorithm is used to estimate the parameters of the hidden Markov model that constitutes the acoustic model so that the likelihood of the training data increases. Methods for estimating acoustic models are described in detail in Vol. 2 of Reference 1.

[0014] Similarly, in building a language model, the probabilities of occurrence of each utterance and of the words that make up the utterance are computed from text such as newspapers and conversation transcripts, in accordance with the structure of the language model.

[0015] Commonly used language model structures include the "N-gram language model", which predicts the probability of the next word using an (n-1)-th order Markov model over the immediately preceding words, the "probabilistic context-free grammar", and combinations of these.

[0016] The N-gram language model in particular is widely used because it is effective and its parameter estimation is easy to implement. The following description therefore uses the N-gram language model as the example when explaining how a language model is built.

[0017] For example, when N = 2 in the N-gram language model (the so-called bigram language model), P(W) in equation (3) is approximated by equation (4) below.

[0018]

$$P(W) \approx \prod_{i=1}^{M} P(w_i \mid w_{i-1}) \tag{4}$$

Here w_0 is taken to denote the start of the sentence.

[0019] The conditional probability P(w_N | w_1, ..., w_{N-1}), which is the parameter of the N-gram language model, is estimated from the frequency C(w_1, ..., w_N) of adjacent word strings in the training text data, as in equation (5) below.

[0020]

$$P(w_N \mid w_1, \ldots, w_{N-1}) = \frac{C(w_1, \ldots, w_N)}{C(w_1, \ldots, w_{N-1})} \tag{5}$$
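The relative-frequency estimate of equation (5) can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation; the function names and the toy corpus are assumptions made for the example.

```python
from collections import Counter

def ngram_counts(sentences, n):
    """Count every run of n adjacent words in the tokenised sentences."""
    counts = Counter()
    for words in sentences:
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts

def mle_prob(word, history, sentences):
    """Relative-frequency estimate C(history + word) / C(history)."""
    history = tuple(history)
    numerator = ngram_counts(sentences, len(history) + 1)[history + (word,)]
    denominator = ngram_counts(sentences, len(history))[history]
    return numerator / denominator if denominator else 0.0

# Toy training text, already divided into words (illustrative only).
corpus = [["i", "want", "a", "room"], ["i", "want", "it"]]
p_seen = mle_prob("a", ["want"], corpus)        # C(want a) / C(want) = 1 / 2
p_unseen = mle_prob("hotel", ["want"], corpus)  # C(want hotel) = 0
```

The second call returns zero because the word string never occurs in the training text, so any sentence containing it also receives probability zero under this estimate.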

[0021] However, if the conditional probability of a word is estimated simply as in equation (5), the probability of any sentence that contains a word string absent from the training data becomes 0.

[0022] To prevent this, a process that assigns a non-zero probability to word strings that do not appear in the training text (generally called "smoothing") is performed.

[0023] The most common smoothing method is the "back-off smoothing" proposed by Katz. In back-off smoothing, a fixed proportion that depends on the frequency is subtracted from the probability estimated by equation (5) (discounting), and the probability mass freed in this way is assigned to word strings that did not appear in the training data.

[0024] The conditional probability assigned to a word string that did not appear in the training data is taken from a value estimated by a coarser language model. In Katz's method, the (N-1)-gram is used as the model coarser than the N-gram. Details of this method are given on page 67 of Reference 2.
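The discount-and-back-off idea can be sketched as follows for a bigram model. Note that this is a simplification: it subtracts a fixed discount from every observed bigram count (absolute discounting) rather than the frequency-dependent Good-Turing discount of Katz's method, and it ignores sentence boundaries. The function name, the `discount` parameter, and the toy corpus are assumptions made for the example.

```python
from collections import Counter

def train_backoff_bigram(sentences, discount=0.5):
    """Build P(word | history) with fixed discounting and back-off to unigrams."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    total = sum(unigrams.values())
    p_uni = {w: c / total for w, c in unigrams.items()}

    history_count = Counter()   # number of bigram tokens starting with each history
    followers = {}              # words observed after each history
    for (h, w), c in bigrams.items():
        history_count[h] += c
        followers.setdefault(h, set()).add(w)

    def prob(word, history):
        if history not in followers:
            return p_uni.get(word, 0.0)  # no bigram evidence: use the unigram
        if (history, word) in bigrams:
            return (bigrams[(history, word)] - discount) / history_count[history]
        # Mass removed by discounting, shared among unseen words via the unigram.
        reserved = discount * len(followers[history]) / history_count[history]
        unseen_mass = 1.0 - sum(p_uni[w] for w in followers[history])
        if unseen_mass <= 0.0:
            return 0.0
        return reserved * p_uni.get(word, 0.0) / unseen_mass

    return prob

corpus = [["i", "want", "a", "room"], ["i", "want", "a", "hotel"], ["you", "want", "it"]]
prob = train_backoff_bigram(corpus)
vocab = sorted({w for s in corpus for w in s})
total_after_want = sum(prob(w, "want") for w in vocab)
```

With this construction the unseen bigram "want hotel" receives a small non-zero probability, while the probabilities of all vocabulary words after "want" still sum to one.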

[0025] In Japanese, text is not written with spaces between words, so the definition of a word is ambiguous. In this document, each of the units obtained by dividing the text into consistent parts by some means is defined as a word.

[0026] That is, a word may be a linguistic unit such as a character, a morpheme, or a phrase, a unit obtained by dividing the text according to an entropy criterion, or a combination of these, and includes the case in which linguistic information such as the reading or the part of speech is attached to the divided units.

[0027] Building a language model with the statistical methods described above requires large amounts of speech data and text data in order to estimate the model parameters. The N-gram language model in particular depends strongly on the training data, so a large amount of data must be collected for each task of interest (hereinafter "target task").

[0028] However, collecting a large amount of text data for every task is difficult, and it is desirable to be able to build a language model from a small amount of text data on the target task. For this purpose, class language models and task adaptation are used.

[0029] A class language model groups similar words together and treats them as members of the same class (group). It reduces the number of parameters to be estimated for the language model and assigns suitable probabilities to words that do not occur in the training data.

[0030] The relationship between words and classes is defined either manually, according to the words and the task, or on the basis of data, and can also be applied to N-gram language models.

[0031] For example, the probability of a sentence in a bigram class language model is defined as the product of (1) the transition probability between classes, P(c_i | c_{i-1}), and (2) the probability that a particular word is selected from within a class, P(w_i | c_i), as in equation (6) below.

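[0032]

$$P(W) \approx \prod_{i=1}^{M} P(c_i \mid c_{i-1})\,P(w_i \mid c_i) \tag{6}$$

Here c_i is the class to which word w_i belongs, and c_0 is taken to denote the start of the sentence.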
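The product in equation (6) can be illustrated with a minimal Python sketch. The class names, the probability tables, and the sentence-start symbol `<s>` are hypothetical values chosen for the example; they do not come from the patent.

```python
def class_bigram_prob(words, word_class, p_trans, p_emit, start="<s>"):
    """Multiply P(c_i | c_{i-1}) * P(w_i | c_i) over the words of a sentence."""
    prob, prev_class = 1.0, start
    for w in words:
        c = word_class[w]
        prob *= p_trans[(prev_class, c)] * p_emit[(w, c)]
        prev_class = c
    return prob

# Hypothetical class definition and probabilities (illustrative numbers only).
word_class = {"hotel": "PLACE", "inn": "PLACE", "please": "REQUEST"}
p_trans = {("<s>", "PLACE"): 0.5, ("PLACE", "REQUEST"): 0.4}
p_emit = {("hotel", "PLACE"): 0.7, ("inn", "PLACE"): 0.3, ("please", "REQUEST"): 1.0}

p_sentence = class_bigram_prob(["hotel", "please"], word_class, p_trans, p_emit)
# 0.5 * 0.7 * 0.4 * 1.0
```

Because "hotel" and "inn" share a class, the same transition probabilities serve both words; only the within-class probability differs.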

[0033] For example, consider dividing 1000 words into 100 classes of 10 words each. In this case, the number of parameters to be estimated for a word bigram language model is 1000² (= 1,000,000).

[0034] In contrast, the number of parameters to be estimated for a class bigram language model is the sum of (1) the transitions between classes and (2) the mappings between classes and words, which reduces it to 100² + 100 × 10 (= 11,000).
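The parameter counts in this example can be checked with a short calculation. The helper names below are hypothetical and exist only to make the arithmetic explicit.

```python
def word_bigram_params(vocab_size):
    """One parameter per ordered word pair."""
    return vocab_size ** 2

def class_bigram_params(num_classes, words_per_class):
    """Class-to-class transitions plus one word-given-class entry per word."""
    return num_classes ** 2 + num_classes * words_per_class

word_params = word_bigram_params(1000)         # 1000^2
class_params = class_bigram_params(100, 10)    # 100^2 + 100 * 10
reduction = word_params / class_params
```

For the 1000-word, 100-class case the class model needs roughly one ninetieth of the parameters of the word bigram model.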

[0035] The correspondence between words and classes may be determined manually or may be obtained by running word clustering on language data. FIG. 20 is an explanatory diagram showing an example of a class definition. FIG. 20 lists each word w, the class c to which the word w belongs, and the probability P(w|c) that the word w is output from its class c.

[0036] Within the class N-gram language model, the inter-class transition model is estimated in the same way as an ordinary word N-gram. The method of building a class N-gram language model is described in detail from page 72 onward of Reference 2.

[0037] Task adaptation, on the other hand, compensates for the shortage of training data by also using text data from outside the target task. Here, text data that includes tasks other than the target task is called general task language data.

[0038] For task adaptation, the method described in "A Study of Vocabulary Setting Methods in N-gram Task Adaptation" (Akinori Ito and Masaki Kohda, IEICE Technical Report, pp. 51-58, SP97-25, 1997) (hereinafter "Reference 3") has been proposed.

[0039] This method targets N-gram language models and performs task adaptation by weighting the training data of the target task and of the general task and adding them together.

[0040] FIG. 21 is a block diagram schematically showing a device to which the language model construction method for speech recognition described in Reference 3 is applied. In FIG. 21, reference numeral 100 denotes a language model estimating means that generates a task-adapted language model.

[0041] Reference numeral 101 denotes target task language data, in which text data of the target task is accumulated and the text representing the sentences to be recognized in the target task is divided into words. Reference numeral 102 denotes general task language data, in which text data of general tasks, including tasks other than the target task, is accumulated and the text representing the sentences contained in the general tasks is divided into words.

[0042] The language model estimating means 100 reads the target task language data 101 and the general task language data 102, applies a suitable weight to each, counts the frequencies of word strings, and estimates the parameters of the language model using statistical methods.

[0043] A weight is given for each input. For example, suppose the word string "watashi, wa" ("I" followed by the topic particle) appears twice in the target task and four times in the general task. If the frequency weight of the target task is 3 and the frequency weight of the general task is 1, the frequency of the word string "watashi, wa" is estimated to be 10 (= 3 × 2 + 1 × 4).
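The weighted counting in this example can be sketched as follows. The function name `weighted_ngram_counts` and its argument layout are assumptions made for illustration; the corpus contents and weights reproduce the figures in the paragraph above.

```python
from collections import Counter

def weighted_ngram_counts(corpora, max_n):
    """Sum weighted counts of word strings of 1..max_n words.

    corpora is a list of (sentences, weight) pairs, one per input data set.
    """
    counts = Counter()
    for sentences, weight in corpora:
        for words in sentences:
            for n in range(1, max_n + 1):
                for i in range(len(words) - n + 1):
                    counts[tuple(words[i:i + n])] += weight
    return counts

target_task = [["watashi", "wa"]] * 2    # word string appears twice
general_task = [["watashi", "wa"]] * 4   # word string appears four times
counts = weighted_ngram_counts([(target_task, 3), (general_task, 1)], max_n=2)
weighted_frequency = counts[("watashi", "wa")]   # 3 * 2 + 1 * 4
```

Because the weight is simply added per occurrence, non-integer weights work without any change to the code.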

[0044] The weighting coefficients need not be integers. In addition, when counting, low-frequency words can be removed if necessary, and the probability removed in this way can be redistributed in equal shares among the words required for recognition.
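One possible reading of this pruning step is sketched below at the unigram level. The source does not specify the procedure in detail, so the threshold `min_count`, the list `required_words`, and the function name are all assumptions made for illustration.

```python
def prune_and_redistribute(counts, min_count, required_words):
    """Drop words seen fewer than min_count times and share their
    probability mass equally among required_words (assumed non-empty)."""
    total = sum(counts.values())
    probs = {w: c / total for w, c in counts.items() if c >= min_count}
    removed_mass = 1.0 - sum(probs.values())
    share = removed_mass / len(required_words)
    for w in required_words:
        probs[w] = probs.get(w, 0.0) + share
    return probs

# Hypothetical weighted counts; "c" falls below the threshold and is removed.
counts = {"a": 6, "b": 3, "c": 1}
probs = prune_and_redistribute(counts, min_count=2, required_words=["hotel", "inn"])
```

The resulting distribution still sums to one, with the mass of the removed word split evenly between the two required words.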

[0045] From the frequency information obtained in this way (the value 10 in the example above), probabilities are estimated for both seen and unseen word strings, for example by Katz's back-off smoothing method. The frequency weights can be determined using deleted estimation, for example so as to increase the probability that the final language model assigns to test data. Deleted estimation is described on page 49 of Reference 2.

[0046] Next, the procedure for training a language model by task adaptation with the conventional device and method shown in FIG. 21 is described with reference to the flowchart of FIG. 22. First, the language model estimating means 100 reads the weight parameters for its inputs from a weight parameter storage means (not shown) (step S2201).

[0047] Next, it reads the training text, divided into words, from the target task language data 101 and the general task language data 102, and obtains the frequencies of word strings of n words or fewer, weighted according to the weight parameters (step S2202).

[0048] Finally, it performs smoothing, for example by Katz's back-off smoothing method, to estimate the parameters of the language model (step S2203), and the processing routine of FIG. 22 ends.

[0049] By also using the text data of the general task language data 102, the above method can estimate more reasonably the probabilities of word strings representing the wide variety of expressions that are difficult to obtain from a small amount of training data on the target task.

[0050] At the same time, by weighting the target task language data 101, a higher probability can be given to word strings that appear in the corpus of the target task, which improves the recognition accuracy.

[0051] However, although the above task adaptation method for language models can estimate well the probabilities of words specific to the target task and of word strings that appear in the general task, it does not take into account combinations of words specific to the target task with words that appear in the general task. Consequently, when the text data of the target task is scarce, the accuracy of language model parameter estimation deteriorates around words specific to the target task.

[0052] For example, consider the case in which the target task is hotel reservation and text data uttered in similar reservation tasks other than hotel reservation is used as the general task language data 102.

[0053] In this case, the probabilities of a word string that occurs in reservation tasks in general, such as "sore, o, onegai" ("that one, please"), and of a word specific to the target task, such as "hoteru" ("hotel"), are estimated according to their frequencies from the general task language data 102 and the target task language data 101, respectively.

[0054] However, because the number of possible word combinations is very large, when the text data of the target task is small, word strings that contain a word specific to the target task, such as "hoteru, o, onegai" ("a hotel, please"), are often not adequately covered by the text data.

[0055] As a result, inappropriate probabilities are assigned to such word strings, and the recognition accuracy may fall. Words specific to the target task are often important for carrying out the task, so a drop in recognition accuracy around these words is likely to have a large effect on the performance of the system as a whole.

[0056]

[Problems to be Solved by the Invention] As described above, the conventional language model learning device, the speech recognition device using it, the language model learning method, and the speech recognition method using it do not take into account combinations of words specific to the target task with words that appear in the general task. Consequently, when the text data of the target task is scarce, the accuracy of language model parameter estimation deteriorates around words specific to the target task, which adversely affects the performance of the system as a whole.

[0057] The present invention has been made to solve the above problem. Its object is to obtain a language model learning device with improved recognition accuracy, a speech recognition device using the same, a language model learning method, a speech recognition method using the same, and a storage medium storing those methods, by finding similar words from the words specific to the target task and the data of the general task and using them to estimate the probabilities of word strings that contain task-specific words.

[0058]

[Means for Solving the Problems] A language model learning device according to claim 1 of the present invention comprises: target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; and a similar word pair extracting means, a similar word string synthesizing means, and a language model generating means for reading text data for language model learning from the target task language data and the general task language data and constructing a task-adapted language model. The similar word pair extracting means reads each set of text data from the target task language data and the general task language data, and extracts similar word pairs from combinations of words contained in the text data of the target task and words contained in the text data of the general task. The similar word string synthesizing means reads each set of text data, reads the similar word pairs from the similar word pair extracting means, and synthesizes and outputs word strings that contain target-task words and are not contained in the language data. The language model generating means reads each set of text data, reads the word strings from the similar word string synthesizing means, and generates the task-adapted language model by obtaining word string statistics with a weight applied to each set of text data.
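The role of the similar word string synthesizing means can be pictured with the following minimal Python sketch. It assumes one simple way of synthesizing, namely substituting the target-task word for its similar general-task word inside general-task sentences; the section above does not specify the synthesis procedure, so this substitution rule, the function name, and the data are illustrative assumptions only.

```python
def synthesize_word_strings(general_sentences, similar_pairs, known_sentences):
    """For each (target_word, general_word) pair, replace general_word with
    target_word in general-task sentences, keeping only new word strings."""
    known = {tuple(s) for s in known_sentences}
    synthesized = []
    for target_word, general_word in similar_pairs:
        for sentence in general_sentences:
            if general_word not in sentence:
                continue
            new = tuple(target_word if w == general_word else w for w in sentence)
            if new not in known and new not in synthesized:
                synthesized.append(new)
    return synthesized

# Hypothetical data following the hotel reservation example in the description.
target_sentences = [["hoteru", "wa", "doko"]]
general_sentences = [["sore", "o", "onegai"]]
similar_pairs = [("hoteru", "sore")]   # assumed output of similar word pair extraction

new_strings = synthesize_word_strings(
    general_sentences, similar_pairs, target_sentences + general_sentences
)
```

The synthesized word string "hoteru, o, onegai" can then be counted, with its own weight, together with the original text data when the language model statistics are gathered.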

[0059] A language model learning device according to claim 2 of the present invention comprises: target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; and a target task word classifying means, a general task word classifying means, and a language model generating means for constructing a task-adapted language model from the target task language data and the general task language data. The target task word classifying means reads the text data of the target task from the target task language data, replaces words with the classes indicated in a class definition, and outputs classified first text data for language model learning. The general task word classifying means reads the text data of the general task from the general task language data, replaces words with the classes indicated in the class definition, and outputs classified second text data for language model learning. The language model generating means reads the first and second text data and generates the language model by obtaining word string statistics with a weight applied to each set of text data.
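The word classifying step can be sketched as a simple lookup against the class definition. The class labels, the dictionary layout, and the choice to leave words without a class entry unchanged are assumptions made for this illustration.

```python
def classify_text(sentences, class_definition):
    """Replace each word by its class label; words with no entry are kept as is."""
    return [[class_definition.get(w, w) for w in words] for words in sentences]

# Hypothetical class definition mapping words to class labels.
class_definition = {"hoteru": "<PLACE>", "ryokan": "<PLACE>"}

target_text = [["hoteru", "o", "onegai"]]
general_text = [["ryokan", "o", "onegai"], ["sore", "o", "onegai"]]

first_text = classify_text(target_text, class_definition)     # classified target task text
second_text = classify_text(general_text, class_definition)   # classified general task text
```

After classification, the target-task sentence and the first general-task sentence become the same class sequence, so their counts reinforce each other when the weighted statistics are computed.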

[0060] A language model learning device according to claim 3 of the present invention comprises: target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; and a target task word classifying means, a general task word classifying means, a similar word pair extracting means, a similar word string synthesizing means, and a language model generating means for constructing a task-adapted language model from the target task language data and the general task language data. The target task word classifying means reads the text data of the target task from the target task language data, replaces words with the classes indicated in a class definition, and outputs classified first text data for language model learning. The general task word classifying means reads the text data of the general task from the general task language data, replaces words with the classes indicated in the class definition, and outputs classified second text data for language model learning. The similar word pair extracting means reads the first and second text data and extracts similar word pairs from combinations of words contained in the text data of the target task and words contained in the text data of the general task. The similar word string synthesizing means reads the first and second text data, reads the similar word pairs from the similar word pair extracting means, and synthesizes and outputs word strings that contain target-task words and are not contained in the language data. The language model generating means reads the first and second text data, reads the word strings from the similar word string synthesizing means, and generates the task-adapted language model by obtaining word string statistics with a weight applied to each set of text data.

[0061] A language model learning device according to claim 4 of the present invention comprises: target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; an initial language model created using text data prepared in advance; and a similar word pair extracting means and a similar word probability correcting means for constructing a task-adapted statistical language model from the target task language data, the general task language data, and the initial language model. The similar word pair extracting means reads text data for language model learning from the target task language data and the general task language data, and extracts similar word pairs from combinations of words contained in the text data of the target task and words contained in the text data of the general task. The similar word probability correcting means reads the similar word pairs from the similar word pair extracting means, reads the initial language model, and generates the task-adapted statistical language model by smoothing the probabilities of the words that appear in the target task.

【００６２】また、この発明の請求項５に係る言語モデ
ル学習装置は、対象タスクのテキストデータを集積した
対象タスク言語データと、対象タスク以外のタスクを含
む一般タスクのテキストデータを集積した一般タスク言
語データと、あらかじめ作成された初期クラス言語モデ
ルと、対象タスク言語データ、一般タスク言語データお
よび初期クラス言語モデルから、タスク適応化済みクラ
ス言語モデルを構築するための、対象タスク単語クラス
化手段、一般タスク単語クラス化手段、類似単語対抽出
手段および類似単語確率補正手段とを備え、対象タスク
単語クラス化手段は、対象タスク言語データから対象タ
スクのテキストデータを読み込み、クラス定義に示され
たクラスに単語を置き換えて、言語モデル学習用のクラ
ス化された第１のテキストデータを出力し、一般タスク
単語クラス化手段は、一般タスク言語データから一般タ
スクのテキストデータを読み込み、クラス定義に示され
たクラスに単語を置き換えて、言語モデル学習用のクラ
ス化された第２のテキストデータを出力し、類似単語対
抽出手段は、第１および第２のテキストデータを読み込
み、対象タスクのテキストデータに含まれる単語と一般
タスクのテキストデータに含まれる単語との組み合わせ
から類似単語対を抽出し、類似単語確率補正手段は、類
似単語対抽出手段から類似単語対を読み込むとともに、
初期クラス言語モデルを読み込み、対象タスクで出現す
る単語の出現確率のスムージングを行うことにより、タ
スク適応化済みクラス言語モデルを生成するものであ
る。A language model learning apparatus according to claim 5 of the present invention comprises: target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; an initial class language model created in advance; and target task word classifying means, general task word classifying means, similar word pair extracting means, and similar word probability correcting means for constructing a task-adapted class language model from the target task language data, the general task language data, and the initial class language model. The target task word classifying means reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying means reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The similar word pair extracting means reads the first and second text data and extracts similar word pairs from combinations of words contained in the text data of the target task and words contained in the text data of the general task. The similar word probability correcting means reads the similar word pairs from the similar word pair extracting means, reads the initial class language model, and smooths the occurrence probabilities of words appearing in the target task, thereby generating the task-adapted class language model.

【００６３】また、この発明の請求項６に係る言語モデ
ル学習装置は、請求項１または請求項４において、類似
単語抽出手段は、距離算出用言語モデル生成手段、統計
的単語間距離算出手段およびしきい値判定手段を含み、
距離算出用言語モデル生成手段は、対象タスク言語デー
タおよび一般タスク言語データから、それぞれ言語モデ
ル学習用のテキストデータを読み込み、各テキストデー
タ毎に重み付けて単語列の統計量を求めて、距離算出用
の統計的言語モデルを生成し、統計的単語間距離算出手
段は、距離算出用言語モデル生成手段から統計的言語モ
デルを読み込み、各テキストデータから抽出した単語か
らなる単語対について、統計的言語モデル上の統計的な
距離を単語間距離として求め、しきい値判定手段は、統
計的単語間距離算出手段から単語対および単語間距離を
読み込み、所定のしきい値を越える単語対を出力するも
のである。In a language model learning apparatus according to claim 6 of the present invention, in claim 1 or claim 4, the similar word extracting means includes distance-calculation language model generating means, statistical inter-word distance calculating means, and threshold determining means. The distance-calculation language model generating means reads text data for language model learning from each of the target task language data and the general task language data, weights each text data to obtain statistics of word strings, and generates a statistical language model for distance calculation. The statistical inter-word distance calculating means reads the statistical language model from the distance-calculation language model generating means and, for each word pair composed of words extracted from the respective text data, obtains the statistical distance on the statistical language model as the inter-word distance. The threshold determining means reads the word pairs and inter-word distances from the statistical inter-word distance calculating means and outputs the word pairs that exceed a predetermined threshold.

【００６４】また、この発明の請求項７に係る言語モデ
ル学習装置は、請求項１または請求項４において、類似
単語抽出手段は、距離算出用言語モデル、統計的単語間
距離算出手段およびしきい値判定手段を含み、距離算出
用言語モデルは、事前に準備したテキストデータを用い
て作成されており、統計的単語間距離算出手段は、距離
算出用言語モデルを読み込み、各テキストデータから抽
出した単語からなる単語対について、距離算出用言語モ
デル上の統計的な距離を単語間距離として求め、しきい
値判定手段は、統計的単語間距離算出手段から単語対お
よび単語間距離を読み込み、所定のしきい値を越える単
語対を出力するものである。In a language model learning apparatus according to claim 7 of the present invention, in claim 1 or claim 4, the similar word extracting means includes a distance-calculation language model, statistical inter-word distance calculating means, and threshold determining means. The distance-calculation language model is created using text data prepared in advance. The statistical inter-word distance calculating means reads the distance-calculation language model and, for each word pair composed of words extracted from the respective text data, obtains the statistical distance on the distance-calculation language model as the inter-word distance. The threshold determining means reads the word pairs and inter-word distances from the statistical inter-word distance calculating means and outputs the word pairs that exceed a predetermined threshold.

【００６５】また、この発明の請求項８に係る言語モデ
ル学習装置は、請求項３または請求項５において、類似
単語抽出手段は、距離算出用言語モデル生成手段、統計
的単語間距離算出手段およびしきい値判定手段を含み、
距離算出用言語モデル生成手段は、対象タスク単語クラ
ス化手段および一般タスク単語クラス化手段から第１お
よび第２のテキストデータを読み込み、各テキストデー
タ毎に重み付けて単語列の統計量を求めて、距離算出用
の統計的言語モデルを生成し、統計的単語間距離算出手
段は、距離算出用言語モデル生成手段から統計的言語モ
デルを読み込み、各テキストデータから抽出した単語か
らなる単語対について、統計的言語モデル上の統計的な
距離を単語間距離として求め、しきい値判定手段は、統
計的単語間距離算出手段から単語対および単語間距離を
読み込み、所定のしきい値を越える単語対を出力するも
のである。In a language model learning apparatus according to claim 8 of the present invention, in claim 3 or claim 5, the similar word extracting means includes distance-calculation language model generating means, statistical inter-word distance calculating means, and threshold determining means. The distance-calculation language model generating means reads the first and second text data from the target task word classifying means and the general task word classifying means, weights each text data to obtain statistics of word strings, and generates a statistical language model for distance calculation. The statistical inter-word distance calculating means reads the statistical language model from the distance-calculation language model generating means and, for each word pair composed of words extracted from the respective text data, obtains the statistical distance on the statistical language model as the inter-word distance. The threshold determining means reads the word pairs and inter-word distances from the statistical inter-word distance calculating means and outputs the word pairs that exceed a predetermined threshold.

【００６６】また、この発明の請求項９に係る言語モデ
ル学習装置は、請求項３または請求項５において、類似
単語抽出手段は、距離算出用クラス言語モデル、統計的
単語間距離算出手段およびしきい値判定手段を含み、距
離算出用クラス言語モデルは、事前に準備したテキスト
データを用いて作成されており、統計的単語間距離算出
手段は、距離算出用クラス言語モデルを読み込むととも
に、対象タスク単語クラス化手段および一般タスク単語
クラス化手段から第１および第２のテキストデータを読
み込み、各テキストデータから抽出した単語からなる単
語対について、距離算出用クラス言語モデル上の統計的
な距離を単語間距離として求め、しきい値判定手段は、
統計的単語間距離算出手段から単語対および単語間距離
を読み込み、所定のしきい値を越える単語対を出力する
ものである。In a language model learning apparatus according to claim 9 of the present invention, in claim 3 or claim 5, the similar word extracting means includes a distance-calculation class language model, statistical inter-word distance calculating means, and threshold determining means. The distance-calculation class language model is created using text data prepared in advance. The statistical inter-word distance calculating means reads the distance-calculation class language model, reads the first and second text data from the target task word classifying means and the general task word classifying means, and, for each word pair composed of words extracted from the respective text data, obtains the statistical distance on the distance-calculation class language model as the inter-word distance. The threshold determining means reads the word pairs and inter-word distances from the statistical inter-word distance calculating means and outputs the word pairs that exceed a predetermined threshold.

【００６７】また、この発明の請求項１０に係る言語モ
デル学習装置は、請求項６から請求項９までのいずれか
において、統計的単語間距離算出手段は、Ｎグラム言語
モデル上のユークリッド距離を用いて、単語間距離を測
定するものである。In a language model learning apparatus according to claim 10 of the present invention, in any one of claims 6 to 9, the statistical inter-word distance calculating means measures the inter-word distance using the Euclidean distance on an N-gram language model.
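
As a rough illustration of the distance named in claim 10, the sketch below represents each word by its vector of bigram successor probabilities P(next | word) and takes the Euclidean distance between two such vectors. The successor-only bigram context, the unsmoothed maximum-likelihood estimate, and the names `bigram_probs` and `euclidean_word_distance` are assumptions made for illustration; this passage of the patent does not fix those details.

```python
import math
from collections import Counter, defaultdict


def bigram_probs(sentences):
    """Maximum-likelihood P(next | word) estimated from tokenized sentences."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for w, nxt in zip(sent, sent[1:]):
            counts[w][nxt] += 1
    probs = {}
    for w, c in counts.items():
        total = sum(c.values())
        probs[w] = {nxt: n / total for nxt, n in c.items()}
    return probs


def euclidean_word_distance(probs, w1, w2):
    """Euclidean distance between the successor distributions of w1 and w2."""
    p1, p2 = probs.get(w1, {}), probs.get(w2, {})
    keys = set(p1) | set(p2)
    return math.sqrt(sum((p1.get(k, 0.0) - p2.get(k, 0.0)) ** 2 for k in keys))
```

Words that tend to be followed by the same words receive a small distance under this measure, which is the property the similar word pair extraction relies on.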

【００６８】また、この発明の請求項１１に係る言語モ
デル学習装置は、請求項６から請求項９までのいずれか
において、統計的単語間距離算出手段は、Ｎグラム言語
モデル上のクロスエントロピーを用いて、単語間距離を
測定するものである。In a language model learning apparatus according to claim 11 of the present invention, in any one of claims 6 to 9, the statistical inter-word distance calculating means measures the inter-word distance using the cross entropy on an N-gram language model.
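
The cross-entropy variant of claim 11 could be sketched as follows, assuming each word is already described by a probability distribution over context words taken from the N-gram model. The flooring constant `eps`, the renormalization, and the symmetrization in `cross_entropy_distance` are assumptions added so that the value is finite and order-independent; the patent text here only names cross entropy as the measure.

```python
import math


def cross_entropy(p, q, vocab, eps=1e-6):
    """H(p, q) = -sum_x p(x) * log q(x), with q floored at eps and renormalized."""
    q_floor = {x: max(q.get(x, 0.0), eps) for x in vocab}
    z = sum(q_floor.values())
    return -sum(
        p.get(x, 0.0) * math.log(q_floor[x] / z)
        for x in vocab
        if p.get(x, 0.0) > 0.0
    )


def cross_entropy_distance(p, q, vocab, eps=1e-6):
    """Symmetrized cross entropy used as an inter-word distance (an assumption)."""
    return 0.5 * (cross_entropy(p, q, vocab, eps) + cross_entropy(q, p, vocab, eps))
```

Note that the cross entropy of a distribution with itself equals its entropy rather than zero, so any threshold would have to be chosen with that offset in mind.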

【００６９】また、この発明の請求項１２に係る音声認
識装置は、請求項１から請求項１１までのいずれかの言
語モデル学習装置を用いた音声認識装置であって、言語
モデルまたはクラス言語モデルは、音声認識に用いられ
るものである。A speech recognition apparatus according to claim 12 of the present invention is a speech recognition apparatus using the language model learning apparatus of any one of claims 1 to 11, wherein the language model or the class language model is used for speech recognition.

【００７０】また、この発明の請求項１３に係る言語モ
デルの学習方法は、対象タスクのテキストデータを集積
する第１のステップと、対象タスク以外のタスクを含む
一般タスクのテキストデータを集積する第２のステップ
と、対象タスク言語データおよび一般タスク言語データ
から、それぞれ言語モデル学習用のテキストデータを読
み込み、タスク適応化済み言語モデルを構築するため
の、類似単語対抽出ステップ、類似単語列合成ステップ
および言語モデル生成ステップとを備え、類似単語対抽
出ステップは、対象タスク言語データおよび一般タスク
言語データから各テキストデータを読み込み、対象タス
クのテキストデータに含まれる単語と一般タスクのテキ
ストデータに含まれる単語との組み合わせから類似単語
対を抽出し、類似単語列合成ステップは、各テキストデ
ータおよび類似単語対を読み込み、言語データに含まれ
ない対象タスク内の単語を含む単語列を合成して出力
し、言語モデル生成ステップは、各テキストデータおよ
び単語列を読み込み、各テキストデータ毎に重み付けて
単語列の統計量を求めることにより、タスク適応化済み
言語モデルを生成するものである。A language model learning method according to claim 13 of the present invention comprises: a first step of accumulating text data of a target task; a second step of accumulating text data of general tasks, including tasks other than the target task; and a similar word pair extracting step, a similar word string synthesizing step, and a language model generating step for reading text data for language model learning from each of the target task language data and the general task language data and constructing a task-adapted language model. The similar word pair extracting step reads each text data from the target task language data and the general task language data and extracts similar word pairs from combinations of words contained in the text data of the target task and words contained in the text data of the general task. The similar word string synthesizing step reads each text data and the similar word pairs, and synthesizes and outputs word strings that contain words of the target task and are not contained in the language data. The language model generating step reads each text data and the word strings, and weights each text data to obtain statistics of the word strings, thereby generating the task-adapted language model.

【００７１】また、この発明の請求項１４に係る言語モ
デルの学習方法は、対象タスクのテキストデータを集積
する第１のステップと、対象タスク以外のタスクを含む
一般タスクのテキストデータを集積する第２のステップ
と、対象タスク言語データおよび一般タスク言語データ
からタスク適応化済み言語モデルを構築するための、対
象タスク単語クラス化ステップ、一般タスク単語クラス
化ステップおよび言語モデル生成ステップとを備え、対
象タスク単語クラス化ステップは、対象タスク言語デー
タから対象タスクのテキストデータを読み込み、クラス
定義に示されたクラスに単語を置き換えて、言語モデル
学習用のクラス化された第１のテキストデータを出力
し、一般タスク単語クラス化ステップは、一般タスク言
語データから一般タスクのテキストデータを読み込み、
クラス定義に示されたクラスに単語を置き換えて、言語
モデル学習用のクラス化された第２のテキストデータを
出力し、言語モデル生成ステップは、第１および第２の
テキストデータを読み込み、各テキストデータ毎に重み
付けて単語列の統計量を求めることにより、言語モデル
を生成するものである。A language model learning method according to claim 14 of the present invention comprises: a first step of accumulating text data of a target task; a second step of accumulating text data of general tasks, including tasks other than the target task; and a target task word classifying step, a general task word classifying step, and a language model generating step for constructing a task-adapted language model from the target task language data and the general task language data. The target task word classifying step reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying step reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The language model generating step reads the first and second text data and weights each text data to obtain statistics of word strings, thereby generating the language model.

【００７２】また、この発明の請求項１５に係る言語モ
デルの学習方法は、対象タスクのテキストデータを集積
する第１のステップと、対象タスク以外のタスクを含む
一般タスクのテキストデータを集積する第２のステップ
と、対象タスク言語データおよび一般タスク言語データ
からタスク適応化済み言語モデルを構築するための、対
象タスク単語クラス化ステップ、一般タスク単語クラス
化ステップ、類似単語対抽出ステップ、類似単語列合成
ステップおよび言語モデル生成ステップとを備え、対象
タスク単語クラス化ステップは、対象タスク言語データ
から対象タスクのテキストデータを読み込み、クラス定
義に示されたクラスに単語を置き換えて、言語モデル学
習用のクラス化された第１のテキストデータを出力し、
一般タスク単語クラス化ステップは、一般タスク言語デ
ータから一般タスクのテキストデータを読み込み、クラ
ス定義に示されたクラスに単語を置き換えて、言語モデ
ル学習用のクラス化された第２のテキストデータを出力
し、類似単語対抽出ステップは、第１および第２のテキ
ストデータを読み込み、対象タスクのテキストデータに
含まれる単語と一般タスクのテキストデータに含まれる
単語との組み合わせから類似単語対を抽出し、類似単語
列合成ステップは、第１および第２のテキストデータと
類似単語対とを読み込み、言語データに含まれない対象
タスク内の単語を含む単語列を合成して出力し、言語モ
デル生成ステップは、第１および第２のテキストデータ
と単語列とを読み込み、各テキストデータ毎に重み付け
て単語列の統計量を求めることにより、タスク適応化済
み言語モデルを生成するものである。A language model learning method according to claim 15 of the present invention comprises: a first step of accumulating text data of a target task; a second step of accumulating text data of general tasks, including tasks other than the target task; and a target task word classifying step, a general task word classifying step, a similar word pair extracting step, a similar word string synthesizing step, and a language model generating step for constructing a task-adapted language model from the target task language data and the general task language data. The target task word classifying step reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying step reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The similar word pair extracting step reads the first and second text data and extracts similar word pairs from combinations of words contained in the text data of the target task and words contained in the text data of the general task. The similar word string synthesizing step reads the first and second text data and the similar word pairs, and synthesizes and outputs word strings that contain words of the target task and are not contained in the language data. The language model generating step reads the first and second text data and the word strings, and weights each text data to obtain statistics of the word strings, thereby generating the task-adapted language model.

【００７３】また、この発明の請求項１６に係る言語モ
デルの学習方法は、対象タスクのテキストデータを集積
する第１のステップと、対象タスク以外のタスクを含む
一般タスクのテキストデータを集積する第２のステップ
と、事前に準備したテキストデータを用いて初期言語モ
デルを作成する第３のステップと、対象タスク言語デー
タ、一般タスク言語データおよび初期言語モデルから、
タスク適応化済み統計的言語モデルを構築するための、
類似単語対抽出ステップおよび類似単語確率補正ステッ
プとを備え、類似単語対抽出ステップは、対象タスク言
語データおよび一般タスク言語データから、それぞれ言
語モデル学習用のテキストデータを読み込み、対象タス
クのテキストデータに含まれる単語と一般タスクのテキ
ストデータに含まれる単語との組み合わせから類似単語
対を抽出し、類似単語確率補正ステップは、類似単語対
および初期言語モデルを読み込み、対象タスクで出現す
る単語の出現確率のスムージングを行うことにより、タ
スク適応化済み統計的言語モデルを生成するものであ
る。A language model learning method according to claim 16 of the present invention comprises: a first step of accumulating text data of a target task; a second step of accumulating text data of general tasks, including tasks other than the target task; a third step of creating an initial language model using text data prepared in advance; and a similar word pair extracting step and a similar word probability correcting step for constructing a task-adapted statistical language model from the target task language data, the general task language data, and the initial language model. The similar word pair extracting step reads text data for language model learning from each of the target task language data and the general task language data, and extracts similar word pairs from combinations of words contained in the text data of the target task and words contained in the text data of the general task. The similar word probability correcting step reads the similar word pairs and the initial language model and smooths the occurrence probabilities of words appearing in the target task, thereby generating the task-adapted statistical language model.

【００７４】また、この発明の請求項１７に係る言語モ
デルの学習方法は、対象タスクのテキストデータを集積
する第１のステップと、対象タスク以外のタスクを含む
一般タスクのテキストデータを集積する第２のステップ
と、初期クラス言語モデルをあらかじめ作成する第４の
ステップと、対象タスク言語データ、一般タスク言語デ
ータおよび初期クラス言語モデルから、タスク適応化済
みクラス言語モデルを構築するための、対象タスク単語
クラス化ステップ、一般タスク単語クラス化ステップ、
類似単語対抽出ステップおよび類似単語確率補正ステッ
プとを備え、対象タスク単語クラス化ステップは、対象
タスク言語データから対象タスクのテキストデータを読
み込み、クラス定義に示されたクラスに単語を置き換え
て、言語モデル学習用のクラス化された第１のテキスト
データを出力し、一般タスク単語クラス化ステップは、
一般タスク言語データから一般タスクのテキストデータ
を読み込み、クラス定義に示されたクラスに単語を置き
換えて、言語モデル学習用のクラス化された第２のテキ
ストデータを出力し、類似単語対抽出ステップは、第１
および第２のテキストデータを読み込み、対象タスクの
テキストデータに含まれる単語と一般タスクのテキスト
データに含まれる単語との組み合わせから類似単語対を
抽出し、類似単語確率補正ステップは、類似単語対およ
び初期クラス言語モデルを読み込み、対象タスクで出現
する単語の出現確率のスムージングを行うことにより、
タスク適応化済みクラス言語モデルを生成するものであ
る。A language model learning method according to claim 17 of the present invention comprises: a first step of accumulating text data of a target task; a second step of accumulating text data of general tasks, including tasks other than the target task; a fourth step of creating an initial class language model in advance; and a target task word classifying step, a general task word classifying step, a similar word pair extracting step, and a similar word probability correcting step for constructing a task-adapted class language model from the target task language data, the general task language data, and the initial class language model. The target task word classifying step reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying step reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The similar word pair extracting step reads the first and second text data and extracts similar word pairs from combinations of words contained in the text data of the target task and words contained in the text data of the general task. The similar word probability correcting step reads the similar word pairs and the initial class language model and smooths the occurrence probabilities of words appearing in the target task, thereby generating the task-adapted class language model.

【００７５】また、この発明の請求項１８に係る言語モ
デルの学習方法は、請求項１３または請求項１６におい
て、類似単語抽出ステップは、距離算出用言語モデル生
成ステップ、統計的単語間距離算出ステップおよびしき
い値判定ステップを含み、距離算出用言語モデル生成ス
テップは、対象タスク言語データおよび一般タスク言語
データから、それぞれ言語モデル学習用のテキストデー
タを読み込み、各テキストデータ毎に重み付けて単語列
の統計量を求めて、距離算出用の統計的言語モデルを生
成し、統計的単語間距離算出ステップは、統計的言語モ
デルを読み込み、各テキストデータから抽出した単語か
らなる単語対について、統計的言語モデル上の統計的な
距離を単語間距離として求め、しきい値判定ステップ
は、単語対および単語間距離を読み込み、所定のしきい
値を越える単語対を出力するものである。In a language model learning method according to claim 18 of the present invention, in claim 13 or claim 16, the similar word extracting step includes a distance-calculation language model generating step, a statistical inter-word distance calculating step, and a threshold determining step. The distance-calculation language model generating step reads text data for language model learning from each of the target task language data and the general task language data, weights each text data to obtain statistics of word strings, and generates a statistical language model for distance calculation. The statistical inter-word distance calculating step reads the statistical language model and, for each word pair composed of words extracted from the respective text data, obtains the statistical distance on the statistical language model as the inter-word distance. The threshold determining step reads the word pairs and inter-word distances and outputs the word pairs that exceed a predetermined threshold.

【００７６】また、この発明の請求項１９に係る言語モ
デルの学習方法は、請求項１３または請求項１６におい
て、類似単語抽出ステップは、距離算出用言語モデル作
成ステップ、統計的単語間距離算出ステップおよびしき
い値判定ステップを含み、距離算出用言語モデル作成ス
テップは、事前に準備したテキストデータを用いて距離
算出用言語モデルを作成し、統計的単語間距離算出ステ
ップは、距離算出用言語モデルを読み込み、各テキスト
データから抽出した単語からなる単語対について、距離
算出用言語モデル上の統計的な距離を単語間距離として
求め、しきい値判定ステップは、単語対および単語間距
離を読み込み、所定のしきい値を越える単語対を出力す
るものである。In a language model learning method according to claim 19 of the present invention, in claim 13 or claim 16, the similar word extracting step includes a distance-calculation language model creating step, a statistical inter-word distance calculating step, and a threshold determining step. The distance-calculation language model creating step creates a distance-calculation language model using text data prepared in advance. The statistical inter-word distance calculating step reads the distance-calculation language model and, for each word pair composed of words extracted from the respective text data, obtains the statistical distance on the distance-calculation language model as the inter-word distance. The threshold determining step reads the word pairs and inter-word distances and outputs the word pairs that exceed a predetermined threshold.

【００７７】また、この発明の請求項２０に係る言語モ
デルの学習方法は、請求項１５または請求項１７におい
て、類似単語抽出ステップは、距離算出用言語モデル生
成ステップ、統計的単語間距離算出ステップおよびしき
い値判定ステップを含み、距離算出用言語モデル生成ス
テップは、第１および第２のテキストデータを読み込
み、各テキストデータ毎に重み付けて単語列の統計量を
求めて、距離算出用の統計的言語モデルを生成し、統計
的単語間距離算出ステップは、統計的言語モデルを読み
込み、各テキストデータから抽出した単語からなる単語
対について、統計的言語モデル上の統計的な距離を単語
間距離として求め、しきい値判定ステップは、単語対お
よび単語間距離を読み込み、所定のしきい値を越える単
語対を出力するものである。In a language model learning method according to claim 20 of the present invention, in claim 15 or claim 17, the similar word extracting step includes a distance-calculation language model generating step, a statistical inter-word distance calculating step, and a threshold determining step. The distance-calculation language model generating step reads the first and second text data, weights each text data to obtain statistics of word strings, and generates a statistical language model for distance calculation. The statistical inter-word distance calculating step reads the statistical language model and, for each word pair composed of words extracted from the respective text data, obtains the statistical distance on the statistical language model as the inter-word distance. The threshold determining step reads the word pairs and inter-word distances and outputs the word pairs that exceed a predetermined threshold.

【００７８】また、この発明の請求項２１に係る言語モ
デルの学習方法は、請求項１５または請求項１７におい
て、類似単語抽出ステップは、距離算出用クラス言語モ
デル作成ステップ、統計的単語間距離算出ステップおよ
びしきい値判定ステップを含み、距離算出用クラス言語
モデル作成ステップは、事前に準備したテキストデータ
を用いて距離算出用クラス言語モデルを作成し、統計的
単語間距離算出ステップは、距離算出用クラス言語モデ
ルならびに第１および第２のテキストデータを読み込
み、各テキストデータから抽出した単語からなる単語対
について、距離算出用クラス言語モデル上の統計的な距
離を単語間距離として求め、しきい値判定ステップは、
単語対および単語間距離を読み込み、所定のしきい値を
越える単語対を出力するものである。In a language model learning method according to claim 21 of the present invention, in claim 15 or claim 17, the similar word extracting step includes a distance-calculation class language model creating step, a statistical inter-word distance calculating step, and a threshold determining step. The distance-calculation class language model creating step creates a distance-calculation class language model using text data prepared in advance. The statistical inter-word distance calculating step reads the distance-calculation class language model and the first and second text data and, for each word pair composed of words extracted from the respective text data, obtains the statistical distance on the distance-calculation class language model as the inter-word distance. The threshold determining step reads the word pairs and inter-word distances and outputs the word pairs that exceed a predetermined threshold.

【００７９】また、この発明の請求項２２に係る言語モ
デルの学習方法は、請求項１８から請求項２１までのい
ずれかにおいて、統計的単語間距離算出ステップは、Ｎ
グラム言語モデル上のユークリッド距離を用いて、単語
間距離を測定するものである。In a language model learning method according to claim 22 of the present invention, in any one of claims 18 to 21, the statistical inter-word distance calculating step measures the inter-word distance using the Euclidean distance on an N-gram language model.

【００８０】また、この発明の請求項２３に係る言語モ
デルの学習方法は、請求項１８から請求項２１までのい
ずれかにおいて、統計的単語間距離算出ステップは、Ｎ
グラム言語モデル上のクロスエントロピーを用いて、単
語間距離を測定するものである。In a language model learning method according to claim 23 of the present invention, in any one of claims 18 to 21, the statistical inter-word distance calculating step measures the inter-word distance using the cross entropy on an N-gram language model.

【００８１】また、この発明の請求項２４に係る音声認
識方法は、請求項１３から請求項２３までのいずれかの
言語モデル学習方法を用いた音声認識方法であって、言
語モデルまたはクラス言語モデルは、音声認識に用いら
れるものである。A speech recognition method according to claim 24 of the present invention is a speech recognition method using the language model learning method of any one of claims 13 to 23, wherein the language model or the class language model is used for speech recognition.

【００８２】また、この発明の請求項２５に係る記録媒
体は、請求項１３から請求項２３までのいずれかの言語
モデル学習方法を記憶したものである。A recording medium according to claim 25 of the present invention stores the language model learning method of any one of claims 13 to 23.

【００８３】また、この発明の請求項２６に係る記録媒
体は、請求項２４の言語モデル学習方法を用いた音声認
識方法を記憶したものである。A recording medium according to claim 26 of the present invention stores the speech recognition method of claim 24, which uses the language model learning method.

【００８４】[0084]

【発明の実施の形態】実施の形態１．以下、図面を参照
しながら、この発明の実施の形態１について詳細に説明
する。ここでは、Ｎグラム言語モデルを例にとって説明
するが、任意の統計的言語モデルに対して適用可能であ
ることは言うまでもない。DESCRIPTION OF THE PREFERRED EMBODIMENTS Embodiment 1 Hereinafter, Embodiment 1 of the present invention will be described in detail with reference to the drawings. Here, an N-gram language model will be described as an example, but it goes without saying that the present invention is applicable to any statistical language model.

【００８５】図１はこの発明の実施の形態１による言語
モデル学習装置を概略的に示すブロック構成図であり、
音声認識用の言語モデル学習装置の構成例を示してい
る。図１において、１０１は対象タスクにおける単語に
分割された対象タスク言語データ、１０２は一般タスク
における単語に分割された一般タスク言語データであ
り、これらは前述（図２１参照）と同様のものである。FIG. 1 is a block diagram schematically showing a language model learning apparatus according to Embodiment 1 of the present invention, and shows a configuration example of a language model learning apparatus for speech recognition. In FIG. 1, reference numeral 101 denotes target task language data segmented into words for the target task, and 102 denotes general task language data segmented into words for the general task; these are the same as those described above (see FIG. 21).

【００８６】１０３は類似単語対抽出手段、１０４は類
似単語列合成手段、１０５は言語モデル生成手段であ
り、これらの手段１０３〜１０５は、対象タスク言語デ
ータ１０１および一般タスク言語データ１０２と関連し
て、タスク適応化済み言語モデルを生成する。Reference numeral 103 denotes similar word pair extracting means, 104 denotes similar word string synthesizing means, and 105 denotes language model generating means. These means 103 to 105 operate on the target task language data 101 and the general task language data 102 to generate a task-adapted language model.

【００８７】言語モデル生成手段１０５は、前述の言語
モデル推定手段１００に対応しており、タスク適応化済
み言語モデルを生成する。類似単語対抽出手段１０３お
よび類似単語列合成手段１０４は、前述の従来装置とは
異なり、この発明の特徴的な部分を構成している。The language model generating means 105 corresponds to the language model estimating means 100 and generates a task-adapted language model. The similar word pair extracting means 103 and the similar word string synthesizing means 104 constitute a characteristic part of the present invention, unlike the above-described conventional apparatus.

【００８８】すなわち、各手段１０３および１０４によ
り、対象タスク固有の単語について類似した一般タスク
の単語を求め、学習テキスト中の一般タスクの単語を類
似する対象タスクの単語で置き換えた単語列を合成し
て、言語モデルの学習テキストに追加することにより、
言語モデル構築の際に、対象タスクのテキストデータが
少量であっても、認識精度を高めることができるように
なっている。That is, means 103 and 104 find, for each word specific to the target task, similar words of the general task, synthesize word strings in which a general-task word in the training text is replaced with the similar target-task word, and add these word strings to the training text of the language model. As a result, recognition accuracy can be improved when constructing the language model even if only a small amount of text data is available for the target task.

【００８９】以下、図１内の各手段１０３〜１０５の機
能について、各種モデルおよび各種データと関連させな
がら具体的に説明する。ただし、前述と同様の機能ブロ
ックおよびモデルについては、同一符号を付して詳述を
省略する。Hereinafter, the function of each of the units 103 to 105 in FIG. 1 will be specifically described with reference to various models and various data. However, the same functional blocks and models as described above are denoted by the same reference numerals, and detailed descriptions thereof will be omitted.

【００９０】まず、類似単語対抽出手段１０３は、対象
タスク言語データ１０１に含まれる単語ｗＴと、一般タ
スク言語データ１０２に含まれる単語ｗＧとの任意の組
み合わせ（ｗＴ，ｗＧ）について、あらかじめ定義され
た距離尺度に基づき、単語間の距離を計算する。First, the similar word pair extracting means 103 calculates, based on a predefined distance measure, the inter-word distance for every combination (wT, wG) of a word wT contained in the target task language data 101 and a word wG contained in the general task language data 102.

【００９１】このとき、類似単語対抽出手段１０３は、
単語間距離の算出値があらかじめ設定されたしきい値ｔ
ｈよりも小さい場合に、その類似単語対（ｗＴ，ｗＧ）
を類似単語列合成手段１０４に出力する。At this time, when the calculated inter-word distance is smaller than a preset threshold th, the similar word pair extracting means 103 outputs the similar word pair (wT, wG) to the similar word string synthesizing means 104.

【００９２】単語間の距離ｄ（ｗＴ，ｗＧ）は、たとえ
ば、あらかじめ各単語と対応する意味分類を概念の広さ
にしたがって木構造にしておき、各単語が対応する意味
ノード間のアーク数を距離として用いることにより得ら
れる。The inter-word distance d(wT, wG) is obtained, for example, by arranging the semantic categories corresponding to the words in a tree structure in advance according to conceptual breadth, and using the number of arcs between the semantic nodes corresponding to the two words as the distance.
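
The tree-based distance described above can be sketched as a count of arcs on the path between two nodes. The tree contents in `PARENT`, the mapping `WORD_NODE`, and all function names are hypothetical examples chosen for illustration; the patent does not specify a particular semantic hierarchy.

```python
def path_to_root(node, parent):
    """Return the list of nodes from `node` up to the root of the tree."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def arc_distance(node_a, node_b, parent):
    """Number of arcs on the path between two nodes of the same tree."""
    depth_from_a = {n: i for i, n in enumerate(path_to_root(node_a, parent))}
    for j, n in enumerate(path_to_root(node_b, parent)):
        if n in depth_from_a:  # first common ancestor reached from node_b
            return depth_from_a[n] + j
    raise ValueError("nodes are not in the same tree")


# Hypothetical semantic tree (child -> parent) and word-to-node mapping.
PARENT = {"place": "thing", "food": "thing", "city": "place",
          "station": "place", "noodle": "food"}
WORD_NODE = {"Tokyo": "city", "Osaka": "city",
             "Shinjuku-eki": "station", "ramen": "noodle"}


def word_distance(w_t, w_g):
    """d(wT, wG): arcs between the semantic nodes of the two words."""
    return arc_distance(WORD_NODE[w_t], WORD_NODE[w_g], PARENT)
```

With this toy tree, two words in the same semantic category are at distance 0, and the distance grows as the nearest shared category becomes broader.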

【００９３】次に、類似単語列合成手段１０４は、対象
タスク言語データ１０１および一般タスク言語データ１
０２に含まれる任意の長さの単語列を別々に取り出すと
ともに、類似単語対抽出手段１０３から読み込んだ類似
単語対（ｗＴ，ｗＧ）を参照し、対象タスクの単語列の
それぞれについて、一般タスク内の単語ｗＧが含まれる
か否かを判定する。Next, the similar word string synthesizing means 104 separately extracts word strings of arbitrary length contained in the target task language data 101 and the general task language data 102, refers to the similar word pairs (wT, wG) read from the similar word pair extracting means 103, and determines, for each word string of the target task, whether it contains the general-task word wG.

【００９４】この結果、一般タスク内の単語ｗＧを含む
単語列「・・・ｗＧ・・・」が存在する場合には、続い
て、一般タスク内の単語ｗＧを対象タスク内の単語ｗＴ
で置き換えた単語列「・・・ｗＴ・・・」が、一般タス
クまたは対象タスクのデータに存在するか否かを判定す
る。If a word string "... wG ..." containing the general-task word wG exists, the means then determines whether the word string "... wT ...", obtained by replacing the general-task word wG with the target-task word wT, exists in the data of the general task or the target task.

【００９５】この結果、単語列「・・・ｗＴ・・・」が
一般タスクまたは対象タスクのデータに存在しない場
合、類似単語列合成手段１０４は、一般タスクの単語ｗ
Ｇを対象タスクの単語ｗＴで置き換えた単語列「・・・
ｗＴ・・・」を合成し、言語モデル生成手段１０５に出
力する。If the word string "... wT ..." does not exist in the data of the general task or the target task, the similar word string synthesizing means 104 synthesizes the word string "... wT ...", in which the general-task word wG is replaced with the target-task word wT, and outputs it to the language model generating means 105.
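
The substitution procedure of the similar word string synthesizing means 104 might look like the following sketch. It assumes fixed-length word strings (n-grams) and scans the general-task data for strings containing wG, which is one reading of the passage; the function names and the default `n=3` are illustrative and not taken from the patent.

```python
def ngrams(sentences, n):
    """All contiguous word strings of length n, as a set of tuples."""
    return {tuple(s[i:i + n]) for s in sentences for i in range(len(s) - n + 1)}


def synthesize_word_strings(target_sents, general_sents, similar_pairs, n=3):
    """Replace wG by wT in general-task n-grams; keep only unseen results."""
    seen = ngrams(target_sents, n) | ngrams(general_sents, n)
    synthesized = set()
    for gram in ngrams(general_sents, n):
        for w_t, w_g in similar_pairs:
            if w_g in gram:
                candidate = tuple(w_t if w == w_g else w for w in gram)
                # Output only word strings absent from both data sets.
                if candidate not in seen:
                    synthesized.add(candidate)
    return synthesized
```

For example, given the similar pair ("ryokan", "hotel") and a general-task sentence "please reserve hotel tonight", the sketch yields the new word strings "please reserve ryokan" and "reserve ryokan tonight" unless they already occur in either data set.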

【００９６】最後に、言語モデル生成手段１０５は、対
象タスク言語データ１０１、一般タスク言語データ１０
２および類似単語列合成手段１０４から、それぞれテキ
ストデータを読み込み、入力される頻度にそれぞれ適当
な重みをつけて単語列の頻度を求め、統計的手法を用い
て言語モデルのパラメータを推定することにより、タス
ク適応化済みの言語モデルを生成する。Finally, the language model generating means 105 reads text data from the target task language data 101, the general task language data 102, and the similar word string synthesizing means 104, applies an appropriate weight to the frequencies from each source to obtain word string frequencies, and estimates the parameters of the language model by a statistical method, thereby generating the task-adapted language model.
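
A minimal sketch of the weighted frequency counting performed by the language model generating means 105 follows. The weight values, the bigram order, and the unsmoothed maximum-likelihood estimate in `bigram_probability` are assumptions for illustration; the patent only states that appropriate weights are applied and that parameters are estimated by a statistical method.

```python
from collections import Counter


def weighted_ngram_counts(sources, n=2):
    """Accumulate n-gram counts, scaling each source by its weight.

    sources: list of (sentences, weight), where sentences is a list of
    token lists (synthesized word strings can be passed the same way).
    """
    counts = Counter()
    for sentences, weight in sources:
        for s in sentences:
            for i in range(len(s) - n + 1):
                counts[tuple(s[i:i + n])] += weight
    return counts


def bigram_probability(counts, history, word):
    """Maximum-likelihood P(word | history) from weighted bigram counts."""
    total = sum(c for (h, _), c in counts.items() if h == history)
    return counts[(history, word)] / total if total else 0.0
```

Giving the target-task text a larger weight than the general-task text biases the estimated probabilities toward the target task while still drawing on the larger general corpus.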

【００９７】Next, the procedure for learning a language model by task adaptation according to Embodiment 1 of the present invention shown in FIG. 1 will be described more specifically with reference to the flowchart of FIG. 2.

【００９８】In FIG. 2, steps S201 to S203 are executed by the similar word pair extracting means 103, steps S204 to S208 by the similar word string synthesizing means 104, and steps S209 to S211 by the language model generating means 105.

【００９９】First, the similar word pair extracting means 103 reads learning text segmented into words from the target task language data 101 and the general task language data 102, and creates word pairs (wT, wG) (step S201).

【０１００】It then calculates the distance d(wT, wG) for each combination of a word wT contained in the target task language data 101 and a word wG (different from wT) contained in the general task language data 102 (step S202).

【０１０１】Subsequently, the calculated distance d(wT, wG) is compared with a predetermined threshold th to determine whether d(wT, wG) is smaller than th (step S203).

【０１０２】If d(wT, wG) ≥ th (i.e., No) is determined in step S203, the similar word pair extracting means 103 returns to step S202 and repeats the calculation of the distance d(wT, wG). If d(wT, wG) < th (i.e., Yes) is determined, it outputs that word pair (wT, wG) to the similar word string synthesizing means 104.
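
Steps S201 to S203 can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the lookup table `TOY_DISTANCE`, its values, and the threshold 0.5 are all assumptions made for illustration, and the actual distance d(wT, wG) is whatever measure the description defines elsewhere.

```python
from itertools import product

# Hypothetical distance table; the real d(wT, wG) is defined elsewhere
# in the description and is NOT reproduced here.
TOY_DISTANCE = {
    ("横浜駅", "成田空港"): 0.2,
    ("横浜駅", "から"): 0.9,
}


def toy_distance(w_t, w_g):
    """Look up a toy distance; unknown pairs are treated as far apart."""
    return TOY_DISTANCE.get((w_t, w_g), 1.0)


def extract_similar_pairs(target_words, general_words, distance, th):
    """Return every (wT, wG) with wT != wG and distance(wT, wG) < th."""
    pairs = []
    # S201: form candidate pairs from the two vocabularies.
    for w_t, w_g in product(sorted(set(target_words)), sorted(set(general_words))):
        if w_t == w_g:
            continue  # wG must differ from wT
        # S202: compute the distance; S203: compare it with the threshold.
        if distance(w_t, w_g) < th:
            pairs.append((w_t, w_g))
    return pairs


pairs = extract_similar_pairs(
    ["横浜駅", "まで"], ["成田空港", "から", "まで"], toy_distance, th=0.5
)
print(pairs)
```

Sorting the vocabularies only makes the output order deterministic; the flowchart itself does not prescribe an order in which pairs are visited.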

【０１０３】The similar word string synthesizing means 104 reads text data segmented into words from the target task language data 101 and the general task language data 102, and extracts and stores all n-word word strings contained in the data (step S204).

【０１０４】From the word strings read in, it then extracts the word strings "... wG ..." that contain the general-task word wG of a word pair (wT, wG) selected by the similar word pair extracting means 103 (step S205).

【０１０５】Subsequently, for each extracted word string, it determines whether the word string "... wT ...", obtained by replacing the general-task word wG with the target-task word wT, already exists among the stored word strings (step S206).

【０１０６】If it is determined in step S206 that the word string "... wT ..." already exists among the stored word strings (i.e., Yes), the process returns to step S205. If it is determined that the word string "... wT ..." does not exist (i.e., No), that word string is output as text data (step S207).

【０１０７】Next, it is determined whether processing has been completed for all similar word pairs (wT, wG) (step S208). If it has not (i.e., No), the process returns to step S202; if it has (i.e., Yes), the process proceeds to step S209. Steps S202 to S207 are thereby executed for all similar word pairs (wT, wG).

【０１０８】As a specific example, consider the case where the distance between the target-task word "Yokohama Station" (横浜駅) and the general-task word "Narita Airport" (成田空港) is smaller than the threshold th, and the word strings "Narita Airport, to" (成田空港、まで) and "from, Narita Airport" (から、成田空港) both exist in the general text data.

【０１０９】If, in addition, the word string "Yokohama Station, to" (横浜駅、まで) exists in the target text data but the word string "from, Yokohama Station" (から、横浜駅) does not, the similar word string synthesizing means 104 synthesizes and outputs the word string "from, Yokohama Station". As a result, word strings expected to appear in the target task are added to the learning text data by using word similarity information.
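
The Yokohama Station example can be traced with a short Python sketch of steps S204 to S207. This is an illustration under assumptions rather than the patented procedure itself: the function names are hypothetical, the word strings are restricted to a fixed length n (bigrams here), and sentences are represented as lists of already segmented words.

```python
def ngrams(tokens, n):
    """All contiguous n-word strings of a segmented sentence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def synthesize_word_strings(target_sents, general_sents, similar_pairs, n=2):
    """Synthesize word strings "... wT ..." that no corpus contains yet."""
    # S204: extract and store every n-word string from both corpora.
    stored = set()
    for sent in list(target_sents) + list(general_sents):
        stored.update(ngrams(sent, n))

    synthesized = []
    for w_t, w_g in similar_pairs:
        # S205: take the stored strings that contain the general-task word wG.
        for gram in sorted(stored):
            if w_g not in gram:
                continue
            replaced = tuple(w_t if w == w_g else w for w in gram)
            # S206: skip the string if it is already known.
            if replaced in stored or replaced in synthesized:
                continue
            # S207: output the newly synthesized string.
            synthesized.append(replaced)
    return synthesized


target = [["横浜駅", "まで"]]                       # "Yokohama Station, to"
general = [["成田空港", "まで"], ["から", "成田空港"]]  # Narita Airport strings
new_strings = synthesize_word_strings(target, general, [("横浜駅", "成田空港")])
print(new_strings)
```

Only "from, Yokohama Station" is produced, because "Yokohama Station, to" is already present in the target data and is therefore rejected at step S206.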

【０１１０】Next, in FIG. 2, the language model generating means 105 reads the weight parameter corresponding to each input from a weight parameter storage means (not shown) (step S209).

【０１１１】It also reads learning text segmented into words from the target task language data 101, the general task language data 102, and the similar word string synthesizing means 104, and obtains the frequencies of word strings (step S210). In the case of an N-gram language model, frequencies must be calculated for word strings of n words or fewer.
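
The weighted frequency computation of steps S209 and S210 might look like the following Python sketch. The function name, the data layout (a list of `(sentences, weight)` pairs), and the weight values 2.0, 1.0 and 0.5 are assumptions for illustration; the description only says that an appropriate weight is applied to each input.

```python
from collections import defaultdict


def weighted_ngram_counts(sources, max_n):
    """Accumulate weighted frequencies of all word strings of 1..max_n words.

    `sources` is a list of (sentences, weight) pairs, one per input
    (target data, general data, synthesized word strings).
    """
    counts = defaultdict(float)
    for sentences, weight in sources:          # S209: one weight per input
        for sent in sentences:
            for n in range(1, max_n + 1):      # word strings of n words or fewer
                for i in range(len(sent) - n + 1):
                    counts[tuple(sent[i:i + n])] += weight  # S210
    return dict(counts)


sources = [
    ([["横浜駅", "まで"]], 2.0),    # target task data (hypothetical weight)
    ([["成田空港", "まで"]], 1.0),  # general task data (hypothetical weight)
    ([["から", "横浜駅"]], 0.5),    # synthesized word strings (hypothetical weight)
]
counts = weighted_ngram_counts(sources, max_n=2)
print(counts[("まで",)], counts[("横浜駅",)], counts[("から", "横浜駅")])
```

Smoothing (step S211) is deliberately left out; the counts produced here would be the input to whichever smoothing method is chosen.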

【０１１２】Further, the language model generating means 105 performs smoothing using, for example, Katz's back-off smoothing method and estimates the parameters of the language model, thereby generating a task-adapted language model (step S211), and the processing routine of FIG. 2 ends.

【０１１３】Because word strings containing words characteristic of the target task have been added to the learning data of the language model obtained in this way, the prediction accuracy of the language model for the target task improves.

【０１１４】Therefore, a high-accuracy language model for speech recognition can be estimated from a large amount of data that includes tasks other than the target (general task language data 102) and a small amount of data on the target task (target task language data 101).

【０１１５】The language model obtained as described above is applicable not only to speech recognition but also to character recognition that requires language processing and to natural-language text processing.

【０１１６】The language model learning device for speech recognition configured as shown in FIG. 1 can also be recorded on a recording medium as a program.

【０１１７】That is, a language model learning program for speech recognition can be realized by software comprising a similar word pair extraction function that performs the same processing as the similar word pair extracting means 103 in FIG. 1, a similar word string synthesis function that performs the same processing as the similar word string synthesizing means 104, and a language model generation function that performs the same processing as the language model generating means 105.

【０１１８】Embodiment 2. In Embodiment 1 above, the text data from the target task language data 101 and the general task language data 102 were used as they are, but text data converted into classes may be used instead.

【０１１９】FIG. 3 is a block diagram schematically showing a language model learning device for a speech recognition device according to Embodiment 2 of the present invention. Components similar to those described above (see FIG. 1) are given the same reference numerals, or the same numerals followed by "A", and their detailed description is omitted.

【０１２０】In FIG. 3, reference numeral 301 denotes a target task word classifying means, inserted between the target task language data 101 and the language model generating means 105A. Reference numeral 302 denotes a general task word classifying means, inserted between the general task language data 102 and the language model generating means 105A.

【０１２１】The characteristic feature of this embodiment is that the target task word classifying means 301 and the general task word classifying means 302 are provided to convert the words of the target-task and general-task text corpora into classes. This reduces the number of language model parameters to be estimated and thereby enables high-accuracy recognition even when only a small amount of target-task data is available for language model learning.

【０１２２】The functions of the means 301 and 302 in FIG. 3 are described below in relation to the various models and data. The word class definition data (not shown) describes, for example, as described above (see FIG. 20), a word w, the class c to which the word w belongs, and the probability P(w|c) that the word w is output from the class c to which it belongs. Word class definition data such as that in FIG. 20 may be created manually or may be computed from learning data.

【０１２３】In accordance with the word class definition data, the target task word classifying means 301 sequentially converts into classes those words of the input target task language data 101 for which a class is defined, and outputs the result to the language model generating means 105A.

【０１２４】In accordance with the word class definition data, the general task word classifying means 302 sequentially converts into classes those words of the input general task language data 102 for which a class is defined, and outputs the result to the language model generating means 105A.

【０１２５】Next, the procedure for learning a language model by task adaptation according to Embodiment 2 of the present invention shown in FIG. 3 will be described more specifically with reference to the flowchart of FIG. 4.

【０１２６】In FIG. 4, steps S401 to S403 are executed by the target task word classifying means 301 and the general task word classifying means 302.

【０１２７】Steps S404 to S406 are executed by the language model generating means 105A and correspond to steps S209 to S211 described above (see FIG. 2), respectively.

【０１２８】First, the target task word classifying means 301 and the general task word classifying means 302 each read the word class definition data (not shown) (step S401).

【０１２９】The target task word classifying means 301 then reads the target task language data 101, generates text in which each word defined in the word class definition is replaced with its class, and outputs this text (step S402).

【０１３０】Similarly, the general task word classifying means 302 reads the general task language data 102, generates text in which each word defined in the word class definition is replaced with its class, and outputs this text (step S403).
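
The class replacement of steps S401 to S403 amounts to a dictionary lookup, as the following Python sketch suggests. The class label `<PLACE>`, the probabilities, and the function name are hypothetical; the actual word class definition data (FIG. 20) is not reproduced in this part of the description.

```python
# Hypothetical word class definition data: word -> (class, P(w|c)).
# The entries and values are illustrative only.
WORD_CLASS_DEF = {
    "横浜駅": ("<PLACE>", 0.5),
    "成田空港": ("<PLACE>", 0.5),
}


def classify_text(sentences, class_def):
    """Replace every word that has a class definition with its class label.

    Words without a definition are left unchanged.
    """
    return [
        [class_def[w][0] if w in class_def else w for w in sent]
        for sent in sentences
    ]


target_classes = classify_text([["横浜駅", "まで"]], WORD_CLASS_DEF)    # S402
general_classes = classify_text([["から", "成田空港"]], WORD_CLASS_DEF)  # S403
print(target_classes, general_classes)
```

Because both corpora are mapped with the same definition, the two place names collapse into one symbol, which is what reduces the number of parameters the language model has to estimate.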

【０１３１】Next, the language model generating means 105A first reads the weight parameters from the weight parameter storage means (not shown) (step S404). It then reads the learning text, which consists of word strings containing classes, from the target task word classifying means 301 and the general task word classifying means 302, and accumulates the frequencies of words and word strings by multiplying each by its assigned weight parameter (step S405).

【０１３２】In the case of a class N-gram language model, frequencies are calculated for class strings of n words or fewer, as described above. Finally, the language model generating means 105A smooths the calculated frequencies, estimates the parameters of the language model, and generates a task-adapted class language model (step S406), and the processing routine of FIG. 4 ends.

【０１３３】A class language model is obtained from the above procedure and the predefined word class definition data (not shown). In this way, a high-accuracy language model for speech recognition can be estimated from a large amount of data that includes tasks other than the target (general task language data 102) and a small amount of data on the target task (target task language data 101).

【０１３４】The language model obtained in this way is applicable not only to speech recognition but also to character recognition that requires language processing and to natural-language text processing.

【０１３５】The language model learning device for speech recognition shown in FIG. 3 can also be recorded on a recording medium as a program.

【０１３６】That is, a language model learning program for speech recognition can be realized by software comprising a target word classification function that performs the same processing as the target task word classifying means 301 in FIG. 3, a general word classification function that performs the same processing as the general task word classifying means 302, and a language model generation function that performs the same processing as the language model generating means 105A.

【０１３７】Embodiment 3. In Embodiment 2 above, only the language model generating means 105A was used, but similar word pair extracting means and similar word string synthesizing means like those of FIG. 1 (Embodiment 1) may be used in combination with it.

【０１３８】FIG. 5 is a block diagram schematically showing a language model learning device for a speech recognition device according to Embodiment 3 of the present invention. Components similar to those described above (see FIGS. 1 and 3) are given the same reference numerals, or the same numerals followed by "B", and their detailed description is omitted.

【０１３９】The characteristic feature of this embodiment is twofold. The target task word classifying means 301 and the general task word classifying means 302, both following a single class definition, convert words into classes and reduce the number of language model parameters. In addition, the similar word pair extracting means 103B and the similar word string synthesizing means 104B are provided. Together these enable high-accuracy recognition even when only a small amount of target-task data is available for language model construction.

【０１４０】Next, the procedure for learning a language model by task adaptation according to Embodiment 3 of the present invention shown in FIG. 5 will be described more specifically with reference to the flowchart of FIG. 6.

【０１４１】In FIG. 6, steps S601 to S603 correspond to steps S401 to S403 described above (see FIG. 4), respectively, and steps S604 to S614 correspond to steps S201 to S211 described above (see FIG. 2), respectively.

【０１４２】First, the target task word classifying means 301 and the general task word classifying means 302 each read the word class definition data (not shown) (step S601).

【０１４３】The target task word classifying means 301 reads the target task language data 101, and generates and outputs text in which each word defined in the word class definition is replaced with its class (step S602).

【０１４４】Likewise, the general task word classifying means 302 reads the general task language data 102, and generates and outputs text in which each word defined in the word class definition is replaced with its class (step S603).

【０１４５】From the outputs of the target task word classifying means 301 and the general task word classifying means 302, the similar word pair extracting means 103B creates and stores a list of word class pairs (cT, cG), each combining a class cT contained in the target task language data with a class cG (different from cT) contained in the general task language data (step S604).

【０１４６】The similar word pair extracting means 103B also obtains the distance d(cT, cG) between the two word classes, for a class cT contained in the target task language data and a class cG (different from cT) contained in the general task language data (step S605), and determines whether this distance is smaller than a predetermined threshold thc (step S606).

【０１４７】If d(cT, cG) ≥ thc (i.e., No) is determined in step S606, the process returns to step S605. If d(cT, cG) < thc (i.e., Yes) is determined, that class pair (cT, cG) is output as a similar word pair to the similar word string synthesizing means 104B (step S606).

【０１４８】The similar word string synthesizing means 104B reads learning text data segmented into classes from the target task word classifying means 301 and the general task word classifying means 302, divides it into class strings of length n or less, and stores them (step S607).

【０１４９】Based on the class strings read from the word classifying means 301 and 302, it then extracts the class strings "... cG ..." that contain the general-task class cG of a class pair (cT, cG) selected by the similar word pair extracting means 103B (step S608).

【０１５０】Further, the similar word string synthesizing means 104B refers to the class strings read from the word classifying means 301 and 302 and stored, and determines whether the class string "... cT ...", obtained by replacing the general-task class cG with the target-task class cT, exists in the target task language data 101 or the general task language data 102 (step S609).

【０１５１】If it is determined in step S609 that the class string "... cT ..." exists in the language data 101 or 102 (i.e., Yes), the process returns to step S608. If it is determined that the class string does not exist (i.e., No), the class string "... cT ..." is synthesized and output as learning text data (step S610).

【０１５２】Next, it is determined whether processing has been completed for all similar class pairs (step S611). If it has not (i.e., No), the process returns to step S605; if it has (i.e., Yes), the process proceeds to the processing steps (S612 to S614) of the language model generating means 105B. The above processing is thereby repeated for all similar word class pairs (cT, cG).

【０１５３】The language model generating means 105B first reads the weight parameters from the weight parameter storage means (not shown) (step S612). It then reads, from the target task language data 101, the general task language data 102, and the similar word string synthesizing means 104B, the learning text segmented into words, with frequencies weighted by the weight parameters (step S613).

【０１５４】It then estimates the parameters of the language model by smoothing the frequencies (step S614), and the processing routine of FIG. 6 ends. A task-adapted class language model is obtained from the above procedure and the predefined word class definition data (not shown).

【０１５５】In this way, a high-accuracy language model for speech recognition can be learned from a large amount of data that includes tasks other than the target and a small amount of data on the target task.

【０１５６】The language model obtained in this way is applicable not only to speech recognition but also to character recognition that requires language processing, natural-language text processing, and the like.

【０１５７】The language model learning device for speech recognition shown in FIG. 5 can also be recorded on a recording medium as a program.

【０１５８】That is, a language model learning program for speech recognition can be realized by software comprising a target word classification function that performs the same processing as the target task word classifying means 301 in FIG. 5, a general word classification function that performs the same processing as the general task word classifying means 302, a similar word pair extraction function that performs the same processing as the similar word pair extracting means 103B, a similar word string synthesis function that performs the same processing as the similar word string synthesizing means 104B, and a language model generation function that performs the same processing as the language model generating means 105B.

【０１５９】Embodiment 4. In Embodiments 1 to 3 above, the language model generating means 105, 105A, or 105B was used to generate the task-adapted language model, but an initial language model created in advance and a similar word probability correcting means that smooths word occurrence probabilities may be used instead.

【０１６０】FIG. 7 is a block diagram schematically showing a language model learning device for a speech recognition device according to Embodiment 4 of the present invention. Components similar to those described above (see FIG. 1) are given the same reference numerals, and their detailed description is omitted.

【０１６１】In FIG. 7, reference numeral 701 denotes an initial language model and 702 denotes a similar word probability correcting means. The similar word probability correcting means 702 generates a task-adapted statistical language model based on the similar word pairs from the similar word pair extracting means 103 and the previously prepared language model from the initial language model 701.

【０１６２】The characteristic feature of this embodiment is that the similar word pair extracting means 103 and the similar word probability correcting means 702 are provided so that, for words specific to the target task, the properties of similar words appearing in the general-task text data are reflected. This enables high-accuracy recognition even when only a small amount of target-task data is available for constructing the statistical language model.

【０１６３】The function of each means in FIG. 7 is described below in relation to the various models and data. The initial language model 701 is a statistical language model whose parameters have been estimated by a well-known conventional method or by a method such as that of Embodiment 1 above.

【０１６４】The similar word probability correcting means 702 reads the initial language model 701 and, from the similar word pair extracting means 103, the similar word pairs between the target task and the general task, and corrects the conditional occurrence probabilities of word strings that contain target-task words. This correction uses the conditional occurrence probabilities of word strings that contain the similar general-task words.

【０１６５】The probability assigned by the similar word probability correcting means 702 is obtained as the occurrence probability of word strings that do not appear in the learning text data, and is part of the probability mass removed (discounted) from the conditional probabilities of word strings that do appear. In other words, the conditional occurrence probabilities of word strings present in the learning text data are preserved unchanged from the initial language model 701.

【０１６６】Next, the procedure for learning a language model by task adaptation according to Embodiment 4 of the present invention shown in FIG. 7 will be described more specifically with reference to the flowchart of FIG. 8.

【０１６７】In FIG. 8, steps S801 to S803 and S805 correspond to steps S201 to S203 and S208 described above (see FIG. 2), respectively. Steps S806 to S812 are executed by the similar word probability correcting means 702.

【０１６８】First, the similar word pair extracting means 103 reads learning text segmented into words from the target task language data 101 and the general task language data 102 (step S801), and obtains the distance d(wT, wG) for a word wT contained in the target task language data and a word wG (different from wT) contained in the general task language data (step S802).

【０１６９】Subsequently, it is determined whether the inter-word distance d(wT, wG) is smaller than the threshold th (step S803). If d(wT, wG) ≥ th (i.e., No) is determined, the process returns to step S802. If d(wT, wG) < th (i.e., Yes) is determined, that word pair (wT, wG) is added to the similar word pairs (step S804).

【０１７０】It is then determined whether the above calculation has been completed for all word pairs (step S805). If it has not (i.e., No), the process returns to step S802; if it has (i.e., Yes), the process proceeds to the next step S806. Calculations are thereby carried out in turn for all word pairs, and the resulting list of similar word pairs (wT, wG) is output to the similar word probability correcting means 702.

【０１７１】The similar word probability correcting means 702 first reads the initial language model 701 (step S806). Then, for each similar word pair (wT, wG) read from the similar word pair extracting means 103, it retrieves, from the conditional probabilities defined in the initial language model 701, the conditional probabilities PwG(w_n | w_1, ..., w_{n-1}) that contain the general task word wG (step S807).

【０１７２】Next, for each retrieved conditional probability, it is determined whether the conditional probability PwT(w_n | w_1, ..., w_{n-1}), obtained by replacing the general task word wG with the target task word wT, is defined in the initial language model 701 (step S808).

【０１７３】If it is determined in step S808 that the conditional probability PwT(w_n | w_1, ..., w_{n-1}) is not defined in the initial language model 701 (that is, No), the conditional probability is corrected by assigning to it part of the probability set aside for unknown word strings (step S809), and the process proceeds to determination step S810.

【０１７４】On the other hand, if the conditional probability PwG is defined and it is determined in step S808 that the conditional probability PwT is also defined (that is, Yes), the process proceeds directly to determination step S810.

【０１７５】The probability assigned in step S809 is, for example, the minimum of the conditional probabilities that share the same word history (w_1, ..., w_{n-1}).

【０１７６】Next, it is determined whether any other conditional probability of a word string containing the general word wG exists (step S810). If such a word string exists (that is, Yes), the process returns to step S808.

【０１７７】If it is determined in step S810 that no other conditional probability containing the general word wG exists (that is, No), it is determined whether the above processing has been completed for all word pairs (wT, wG) (step S811).

【０１７８】If it is determined in step S811 that not all word pairs have been processed (that is, No), the process returns to step S807; if all have been processed (that is, Yes), the process proceeds to step S812.

【０１７９】In this way, the above processing is executed for every word string containing a general word wG and for every word pair (wT, wG) containing a general word wG. Finally, the total probability removed from the language model for unknown word strings is normalized so that the probabilities of the language model sum to 1 (step S812), and the processing routine of FIG. 8 ends.
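
Steps S807 to S812 can be sketched as below under simplifying assumptions that are not part of the source: the model is a dictionary from a history tuple to `{word: probability}`, each distribution sums to less than 1 (the remainder being the mass discounted for unseen word strings), and only the case where wG is the predicted word is handled. The names `correct_similar_word_probs` and `remaining_backoff_mass` are hypothetical.

```python
def correct_similar_word_probs(model, similar_pairs):
    """Give P(wT | history) a value wherever P(wG | history) is defined.

    model maps a history tuple to {word: P(word | history)}; the input
    model is left unchanged and a corrected copy is returned.
    """
    corrected = {h: dict(dist) for h, dist in model.items()}
    for w_t, w_g in similar_pairs:                     # loop closed by S811
        for history, dist in model.items():            # loop closed by S810
            if w_g not in dist:                        # S807: needs P(wG | history)
                continue
            if w_t in corrected[history]:              # S808: Yes, already defined
                continue
            reserved = 1.0 - sum(corrected[history].values())
            # S809: minimum probability for this history, capped by the
            # mass still reserved for unseen word strings.
            p_new = min(min(dist.values()), reserved)
            if p_new > 0.0:
                corrected[history][w_t] = p_new
    return corrected


def remaining_backoff_mass(model):
    """S812: mass left for unseen word strings so each history sums to 1."""
    return {h: 1.0 - sum(dist.values()) for h, dist in model.items()}


model = {("book", "a"): {"hotel": 0.5, "flight": 0.125}}
adapted = correct_similar_word_probs(model, [("ryokan", "hotel")])
print(adapted, remaining_backoff_mass(adapted))
```

Capping the assigned value by the reserved mass is a design choice of this sketch; it keeps every history's distribution at or below 1 before the final normalization.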

【０１８０】When a conditional probability is not defined, the probability given by a simpler language model is normally used. For example, in an N-gram language model with Katz back-off, the lower-order (N-1)-gram language model is consulted and a small probability is assigned. Because this probability is not very accurate, when there is a word string containing a word similar to one in the target task, a probability larger than the actual one ends up being estimated.
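
As a rough illustration of the back-off lookup mentioned above, the sketch below falls back from a bigram table to a unigram table scaled by a back-off weight. It shows only the lookup, not Katz discounting itself; `backoff_prob`, the table layout and the numbers are assumptions made for this example.

```python
def backoff_prob(word, history, bigram, unigram, alpha):
    """P(word | history), falling back to the lower-order model if undefined."""
    seen = bigram.get(history, {})
    if word in seen:
        return seen[word]               # probability defined in the model
    # Undefined: use the lower-order probability scaled by the back-off weight.
    return alpha.get(history, 1.0) * unigram.get(word, 0.0)


bigram = {"book": {"hotel": 0.5}}
unigram = {"hotel": 0.5, "ryokan": 0.25}
alpha = {"book": 0.5}
print(backoff_prob("hotel", "book", bigram, unigram, alpha))   # defined bigram
print(backoff_prob("ryokan", "book", bigram, unigram, alpha))  # backed-off value
```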

【０１８１】Other conditional probabilities PwG containing the general word wG are processed in the same way by way of step S810, and the processing of steps S806 to S810 is executed for all similar word pairs (wG, wT) by way of step S811.

【０１８２】By using the similar word probability correcting means 702 in this way, smoothing based on the occurrence probabilities of general task words is applied to words whose properties are similar between the general task and the target task, so that a still more accurate model for speech recognition can be estimated.

【０１８３】As described earlier, the language model obtained in this way can also be applied to character recognition, text processing and other applications that require language processing.

【０１８４】The language model learning device for speech recognition shown in FIG. 7 can also be recorded on a recording medium as a program. That is, a language model learning program for speech recognition can be realized by software consisting of a similar word pair extraction function that performs the same processing as the similar word pair extracting means 103 in FIG. 7 and a similar word probability correction function that performs the same processing as the similar word probability correcting means 702.

【０１８５】Embodiment 5. In Embodiment 4 above, the text data from the target task language data 101 and the general task language data 102 were used as they are, but text data converted into classes may be used instead, as in Embodiment 3 (see FIG. 5).

【０１８６】FIG. 9 is a block diagram schematically showing a language model learning device for a speech recognition device according to Embodiment 5 of the present invention. Components identical to those described above (see FIGS. 5 and 7) are given the same reference numerals and are not described in detail.

【０１８７】In FIG. 9, reference numeral 901 denotes an initial class language model, which is connected to the similar word probability correcting means 702 in place of the initial language model 701 described above (see FIG. 7).

【０１８８】The characteristic feature here is that the similar word pair extracting means 103B, the target task word classifying means 301, the general task word classifying means 302 and the similar word probability correcting means 702 are provided, and that the properties of similar classes appearing in the general task text data are reflected in classes specific to the target task. As a result, even when only a small amount of target task data is available, a class language model with further improved recognition accuracy is generated from the initial class language model 901.

【０１８９】The function of each means in FIG. 9 is described concretely below in relation to the various models and data. The initial class language model 901 is a statistical class language model whose parameters have been estimated by a well-known conventional method or by a method such as those of Embodiments 2 and 3 above.

【０１９０】The probability assigned by the similar word probability correcting means 702 is part of the probability that was removed (discounted) from the conditional probabilities of observed word class strings for the benefit of word class strings that do not appear in the learning text data. The conditional occurrence probabilities of the word classes contained in the learning text data are therefore preserved.

【０１９１】For example, when the conditional probability P(c_n | c_1, ..., c_{n-1}) for a word class is changed, the probability is assigned so that it is larger than the original conditional probability of the word class string.

【０１９２】Next, the procedure for learning a language model by task adaptation according to Embodiment 5 of the present invention shown in FIG. 9 is described more concretely with reference to the flowchart of FIG. 10.

【０１９３】In FIG. 10, steps S1001 to S1003 correspond to steps S601 to S603 described above (see FIG. 6), respectively, and steps S1004 to S1015 correspond to steps S801 to S812 described above (see FIG. 8), respectively.

【０１９４】First, the target task word classifying means 301 and the general task word classifying means 302 each read word class definition data (not shown) (step S1001).

【０１９５】The target task word classifying means 301 reads the target task language data 101 and generates and outputs text in which the words defined in the word class definition are replaced with their classes (step S1002).

【０１９６】Likewise, the general task word classifying means 302 reads the general task language data 102 and generates and outputs text in which the words defined in the word class definition are replaced with their classes (step S1003).
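
The class replacement of steps S1002 and S1003 amounts to a dictionary lookup per token. The sketch below assumes a word class definition given as a plain mapping from word to class label; the label `<CITY>` and the function name `replace_words_with_classes` are illustrative only.

```python
def replace_words_with_classes(tokens, class_of):
    """Replace each word that has a class definition with its class label."""
    # Words without a class definition are passed through unchanged.
    return [class_of.get(tok, tok) for tok in tokens]


class_of = {"Tokyo": "<CITY>", "Osaka": "<CITY>"}
print(replace_words_with_classes(["from", "Tokyo", "to", "Osaka"], class_of))
```

The same function would serve both classifying means, applied to the target task text and to the general task text respectively.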

【０１９７】Next, the similar word pair extracting means 103B reads the class strings via the target task word classifying means 301 and the general task word classifying means 302, respectively (step S1004).

【０１９８】It then obtains the distance d(cT, cG) between a class cT contained in the target task language data and a class cG (different from cT) contained in the general task language data (step S1005), and determines whether the inter-class distance d(cT, cG) is smaller than a threshold thc (step S1006).

【０１９９】If it is determined in step S1006 that d(cT, cG) ≥ thc (that is, No), the process returns to step S1005. If it is determined that d(cT, cG) < thc (that is, Yes), the class pair (cT, cG) is added to the similar class pairs (step S1007).

【０２００】Thereafter, by way of determination step S1008, the above processing is executed for all class pairs in turn, and the resulting list of similar class pairs (cT, cG) is output to the similar word probability correcting means 702.

【０２０１】Next, the similar word probability correcting means 702 first reads the initial class language model 901 (step S1009) and then reads the similar class pairs (cT, cG) one by one from the similar word pair extracting means 103B (step S1010).

【０２０２】Then, for each conditional probability PcG(c_n | c_1, ..., c_{n-1}) that contains the general task class cG among the conditional probabilities defined in the initial class language model 901, it is determined whether the conditional probability PcT(c_n | c_1, ..., c_{n-1}), obtained by replacing the general task class cG with the target task class cT, is defined in the learning data (step S1011).

【０２０３】If it is determined in step S1011 that the conditional probability PcT(c_n | c_1, ..., c_{n-1}) is not defined in the initial class language model 901 (that is, No), the conditional probability is corrected by assigning to it part of the probability set aside for unknown class strings (step S1012), and the process proceeds to determination step S1013.

【０２０４】On the other hand, if the conditional probability PcG is defined and it is determined in step S1011 that the conditional probability PcT is also defined (that is, Yes), the process proceeds directly to determination step S1013.

【０２０５】The probability assigned in step S1012 is, for example, the minimum of the conditional probabilities that share the same class history (c_1, ..., c_{n-1}) (step S1012).

【０２０６】Thereafter, by way of step S1013, the same processing is performed for the other conditional probabilities PcG containing the class cG. In addition, by way of step S1014, the processing of steps S1006 to S1010 is executed for all similar class pairs (cG, cT).

【０２０７】Finally, the similar word probability correcting means 702 normalizes the back-off probabilities so that the probabilities of the class language model sum to 1, thereby generating the task-adapted class language model (step S1015), and the processing routine of FIG. 10 ends.

【０２０８】In this way, the similar word pair extracting means 103B and the similar word probability correcting means 702 are provided together with the word classifying means 301 and 302, and smoothing based on the occurrence probabilities of general task word classes is applied to word classes whose properties are similar between the general task and the target task. A class language model for speech recognition can thus be estimated with high accuracy.

【０２０９】The class language model obtained in this way can also be applied to character recognition, natural language text processing and other applications that require language processing.

【０２１０】The language model learning device for speech recognition shown in FIG. 9 can also be recorded on a recording medium as a program.

【０２１１】That is, a language model learning program for speech recognition can be realized by software consisting of a similar word pair extraction function that performs the same processing as the similar word pair extracting means 103B in FIG. 9, a target task word classifying function that performs the same processing as the target task word classifying means 301, a general task word classifying function that performs the same processing as the general task word classifying means 302, and a similar word probability correction function that performs the same processing as the similar word probability correcting means 702.

【０２１２】Embodiment 6. Embodiment 1 above did not specifically describe the functional configuration of the similar word pair extracting means; it may be configured, for example, as shown in FIG. 11.

【０２１３】FIG. 11 is a functional block diagram showing a concrete configuration example of the similar word pair extracting means 103C used in the language model learning device for speech recognition according to Embodiment 6 of the present invention. Components identical to those described above are given the same reference numerals, or the same numerals followed by "C", and are not described in detail.

【０２１４】In FIG. 11, reference numeral 1101 denotes a statistical inter-word distance calculating means, 1102 denotes a threshold determining means, and 1105 denotes a distance calculation language model generating means within the similar word pair extracting means 103C.

【０２１５】The characteristic feature here is that the distance calculation language model generating means 1105, the statistical inter-word distance calculating means 1101 and the threshold determining means 1102 are provided within the similar word pair extracting means 103C. Word pairs are selected by calculating the inter-word distance d(wT, wG) between a target task word wT and a general task word wG on the basis of a statistical distance measure derived from the language data, so that similar word pairs are determined with high accuracy.

【０２１６】The function of each means in FIG. 11 is described concretely below in relation to the various models and data. In the similar word pair extracting means 103C, the statistical inter-word distance calculating means 1101 takes the language model estimated by the distance calculation language model generating means 1105, obtains the language-model-based inter-word distance for each pair of different words extracted from the target task language data 101 and the general task language data 102, and outputs each word pair together with its inter-word distance.

【０２１７】The threshold determining means 1102 reads the word pairs and their statistical inter-word distances one by one from the statistical inter-word distance calculating means 1101 and outputs the word pair (wT, wG) when the inter-word distance is at or below a fixed threshold.

【０２１８】Here, as a method of calculating the statistical inter-word distance between a target task word wT and a general task word wG, the statistical inter-word distance calculating means 1101 uses, for example, the Euclidean distance between conditional probabilities of the N-gram language model and obtains the statistical inter-word distance D1(wT, wG) as in equation (7) below.

【０２１９】[0219]

【数７】 (Equation 7)

【０２２０】In equation (7), V is the population of vocabulary items x of the language data (words) and denotes all vocabulary items contained in the language model.
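
Equation (7) is not reproduced in this text, so the sketch below assumes one plausible form consistent with the description: the square root of the sum, over every vocabulary item x in V, of the squared difference between P(x | wT) and P(x | wG). The table layout `next_word_prob[w][x] = P(x | w)` and the function name are assumptions of this example.

```python
import math


def euclidean_word_distance(next_word_prob, w_t, w_g, vocab):
    """Assumed form: sqrt(sum over x in V of (P(x|wT) - P(x|wG))^2)."""
    p_t = next_word_prob.get(w_t, {})
    p_g = next_word_prob.get(w_g, {})
    # Probabilities missing from the table are treated as 0 in this sketch.
    return math.sqrt(sum((p_t.get(x, 0.0) - p_g.get(x, 0.0)) ** 2 for x in vocab))


probs = {
    "ryokan": {"reserve": 0.5, "eat": 0.5},
    "hotel": {"reserve": 0.5, "eat": 0.5},
    "flight": {"reserve": 1.0},
}
vocab = ["reserve", "eat"]
print(euclidean_word_distance(probs, "ryokan", "hotel", vocab))
print(euclidean_word_distance(probs, "ryokan", "flight", vocab))
```

Words that are followed by the same words with the same probabilities get distance 0, which is the behavior the threshold test relies on.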

【０２２１】The statistical inter-word distance calculating means 1101 can also use the Euclidean distance based on the conditional probability of the preceding word with respect to the following word and obtain the statistical inter-word distance D2(wT, wG) as in equation (8) below.

【０２２２】[0222]

【数８】 (Equation 8)

【０２２３】Equations (7) and (8) need not be used only individually; the sum of equations (7) and (8) can also be used.

【０２２４】The statistical inter-word distance calculating means 1101 can also use, for example, the cross entropy with respect to the word wT and obtain the statistical inter-word distance D3(wT, wG) as in equation (9) below.

【０２２５】[0225]

【数９】 (Equation 9)
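
Equation (9) is likewise not reproduced in this text. The sketch below assumes the standard cross entropy of the successor distribution of wG measured against that of wT, that is, the negative sum over x in V of P(x | wT) log P(x | wG). The flooring constant `floor`, which avoids log(0), and all names are assumptions of this example.

```python
import math


def cross_entropy_distance(next_word_prob, w_t, w_g, vocab, floor=1e-10):
    """Assumed form: -sum over x in V of P(x|wT) * log P(x|wG)."""
    p_t = next_word_prob.get(w_t, {})
    p_g = next_word_prob.get(w_g, {})
    # Probabilities of wG below `floor` are clipped so the logarithm is defined.
    return -sum(p_t.get(x, 0.0) * math.log(max(p_g.get(x, 0.0), floor))
                for x in vocab)


probs = {
    "ryokan": {"reserve": 0.5, "eat": 0.5},
    "hotel": {"reserve": 0.5, "eat": 0.5},
    "flight": {"reserve": 1.0},
}
vocab = ["reserve", "eat"]
print(cross_entropy_distance(probs, "ryokan", "hotel", vocab))
print(cross_entropy_distance(probs, "ryokan", "flight", vocab))
```

Unlike the Euclidean measure, this quantity is not symmetric in its two arguments and does not reach 0 for identical distributions (it equals their entropy), so a threshold would have to be chosen with that in mind.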

【０２２６】As with the Euclidean distance, the conditional probability of the preceding word with respect to the following word can also be used, as shown in equation (10) below.

【０２２７】[0227]

【数１０】 (Equation 10)

【０２２８】Equations (9) and (10) need not be used only individually; the sum of equations (9) and (10) can also be used.

【０２２９】Furthermore, the above statistical measures can be combined with linguistic information. For example, when words represent morphemes and the two words do not have the same part of speech, the distance can be set to infinity so that the pair is excluded from the similar word candidates.
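
The part-of-speech filter can be layered on top of any statistical distance. In the sketch below, `pos_of` is a hypothetical word-to-tag mapping and `statistical_distance` is any callable returning a distance; both are assumptions for illustration.

```python
import math


def pos_filtered_distance(w_t, w_g, pos_of, statistical_distance):
    """Return an infinite distance when the parts of speech differ."""
    if pos_of[w_t] != pos_of[w_g]:
        return math.inf      # never passes a finite threshold test
    return statistical_distance(w_t, w_g)


pos_of = {"ryokan": "noun", "hotel": "noun", "reserve": "verb"}
print(pos_filtered_distance("ryokan", "hotel", pos_of, lambda a, b: 0.1))
print(pos_filtered_distance("ryokan", "reserve", pos_of, lambda a, b: 0.1))
```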

【０２３０】Next, the operation of the similar word pair extracting means 103C in task adaptation according to Embodiment 6 of the present invention shown in FIG. 11 is described more concretely with reference to the flowchart of FIG. 12. In FIG. 12, steps S1203 to S1207 correspond to steps S201 to S203, S207 and S208 described above (see FIG. 2), respectively.

【０２３１】First, the distance calculation language model generating means 1105 reads the target task language data 101 and the general task language data 102 (step S1201) and estimates the parameters of a language model from the input text data (step S1202).

【０２３２】The statistical inter-word distance calculating means 1101 forms word pairs (wT, wG) from arbitrary combinations of a word wT contained in the target task and a word wG contained in the general task (step S1203) and calculates the statistical distance d(wT, wG) on the language model estimated by the distance calculation language model generating means 1105 (step S1204).

【０２３３】Next, the threshold determining means 1102 compares the distance d(wT, wG) of the word pair (wT, wG) obtained from the statistical inter-word distance calculating means 1101 with the threshold th and determines whether d(wT, wG) is less than th (step S1205).

【０２３４】If it is determined in step S1205 that d(wT, wG) ≥ th (that is, No), the process returns to step S1204. If it is determined that d(wT, wG) < th (that is, Yes), the word pair (wT, wG) is output as a similar word pair (step S1206).

【０２３５】Thereafter, by way of end determination step S1207, the above processing is performed for all word pairs (wT, wG).

【０２３６】In this way, by estimating a language model in the similar word pair extracting means 103C and using a distance measure based on statistics, similar word pairs can be determined with high accuracy.

【０２３７】The language model obtained in this way can also be applied to character recognition, natural language text processing and other applications that require language processing. The function of the similar word pair extracting means 103C in FIG. 11 can also be recorded on a recording medium as a program.

【０２３８】That is, a similar word pair extraction program for a language model learning device for speech recognition can be realized by software consisting of a language model generation function that performs the same processing as the distance calculation language model generating means 1105 in FIG. 11, a statistical inter-word distance calculation function that performs the same processing as the statistical inter-word distance calculating means 1101, and a threshold determination function that performs the same processing as the threshold determining means 1102.

【０２３９】Although FIG. 11 uses the distance calculation language model generating means 1105, a distance calculation language model 1301 may be used instead, as shown in FIG. 13. In FIG. 13, the distance calculation language model 1301 in the similar word pair extracting means 103D is the same as the initial language model 701 described above (see FIG. 7) and is created in advance.

【０２４０】Here the input data to the similar word pair extracting means 103C are words, but word classes may be used instead of words, as shown in FIG. 14. In FIG. 14, the distance calculation language model generating means 1105E and the statistical inter-word distance calculating means 1101E in the similar word pair extracting means 103E take in word classes from the word classifying means 301 and 302. In this case too, class pairs can be extracted in the same way as described above.

【０２４１】Furthermore, although FIG. 14 uses the distance calculation language model generating means 1105E, a distance calculation class language model 1501 may be used instead, as shown in FIG. 15. In FIG. 15, the distance calculation class language model 1501 in the similar word pair extracting means 103F is the same as the initial class language model 901 described above (see FIG. 9) and is created in advance.

【０２４２】Embodiment 7. Embodiments 1 to 6 above focused only on the language model learning device and did not specifically describe a speech recognition device; the speech recognition device may be configured, for example, as shown in FIG. 16.

【０２４３】FIG. 16 is a block diagram schematically showing a speech recognition device using a language model according to Embodiment 7 of the present invention. It shows the case where a language model generated by a conventional method or by a method described in Embodiments 1, 4, 6 and so on is used.

【０２４４】In FIG. 16, reference numeral 1601 denotes an acoustic feature extracting means, 1602 an acoustic model, 1603 an acoustic matching means, 1604 a word dictionary, 1605 a language model, and 1606 a language matching means.

【０２４５】The language model 1605 is built using the language model learning device and method described in Embodiments 1, 4 and 6. The characteristic feature here is that the language matching means 1606, which uses the language model 1605, is provided together with the means 1601 to 1604, enabling highly accurate speech recognition even when only a small amount of target task data is available.

【０２４６】The function of each means in FIG. 16 is described concretely below in relation to the various models and data. First, the acoustic feature extracting means 1601 A/D-converts the input speech waveform, extracts it in analysis time frames, and converts each frame into a vector of parameters, such as the mel-cepstrum, that represent speech features well.

【０２４７】The acoustic model 1602 uses, for example, an HMM to represent the properties of the acoustic feature vectors within a speech recognition unit (such as a phoneme or a word) by means of probability distributions, state transitions and the like.

【０２４８】The acoustic matching means 1603 matches the acoustic feature vectors of phonemes obtained from the acoustic feature extracting means 1601 against the acoustic model 1602 and outputs a score representing the degree of match.

【０２４９】The word dictionary 1604 describes the correspondence between sequences of acoustic models 1602 and words, which are the linguistic units. The language model 1605 is obtained from the language model learning device and describes the connection information of the words to be recognized; for example, a word N-gram language model is used to represent transitions between words as an (n-1)-th order Markov process.

【０２５０】The language matching means 1606 receives the matching scores between the acoustic features and the acoustic model from the acoustic matching means 1603, refers to the word dictionary 1604 and the language model 1605, and takes the highest-scoring word string among the word strings to be recognized as the recognition result.

【０２５１】 Next, the speech recognition procedure according to Embodiment 7 of the present invention shown in FIG. 16 is described more concretely with reference to the flowchart of FIG. 17. First, the speech recognition device shown in FIG. 16 reads the acoustic model 1602 and the word dictionary 1604 prepared in advance, together with the language model 1605 generated according to Embodiments 1, 4, and 6 above (see FIGS. 1, 2, 7, 8, and 11 to 13) (step S1701).

【０２５２】 The acoustic feature extraction means 1601 A/D-converts the input speech to be recognized, reads a speech frame covering a fixed time interval (step S1702), and applies signal processing to that frame to extract an acoustic feature vector that represents speech features well, such as the mel-cepstrum (step S1703).

【０２５３】 Subsequently, the acoustic matching means 1603 matches the acoustic feature vector obtained in step S1703 against the acoustic model 1602 to obtain an acoustic matching score (step S1704).

【０２５４】 Next, the language matching means 1606 refers to the word dictionary 1604 and the language model 1605 and accumulates the acoustic matching scores for the words to be recognized (step S1705).

【０２５５】 While executing the above matching process for each frame, the language matching means 1606 determines whether the final frame of the target speech has been reached (step S1706). If it is determined that the final frame has not been reached (that is, No), the process returns to step S1702.

【０２５６】 If it is determined in step S1706 that the final frame of the target speech has been reached (that is, Yes), matching is regarded as complete, the candidate with the best score at that point is output as the recognition result (step S1707), and the processing routine of FIG. 17 ends.
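The frame loop of steps S1702 to S1707 can be sketched as follows. This is a deliberately simplified toy, not the decoder of the device: it assumes a fixed list of candidate word strings, a caller-supplied `acoustic_score` function standing in for the acoustic matching means, and a `language_score` function standing in for the language model. All of these names are hypothetical.

```python
def recognize(frames, candidates, acoustic_score, language_score):
    """Accumulate per-frame scores and return the best candidate.

    frames: feature vectors, one per analysis frame (S1702/S1703 assumed done).
    candidates: hashable candidate word strings, e.g. tuples of words.
    acoustic_score(candidate, t, frame): log score of frame t (S1704).
    language_score(candidate): log probability under the language model.
    """
    # Start each hypothesis from its language model score (an assumption;
    # a real decoder applies it incrementally at word boundaries).
    totals = {cand: language_score(cand) for cand in candidates}
    for t, frame in enumerate(frames):        # repeat until the final frame (S1706)
        for cand in candidates:
            totals[cand] += acoustic_score(cand, t, frame)  # accumulate (S1705)
    return max(totals, key=totals.get)        # best score at the end (S1707)
```

A practical recognizer would search over HMM states with pruning rather than score whole candidates, but the control flow of read frame, score, accumulate, and pick the best at the last frame is the same.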

【０２５７】 In this way, by using the language model 1605, a high-accuracy language model is constructed from a large amount of data that includes tasks other than the target task and a small amount of data on the target task, so that high-accuracy speech recognition can be realized.

【０２５８】 Embodiment 8. In Embodiment 7 above, the language model generated according to Embodiments 1, 4, and 6 was used; however, a class language model generated according to Embodiments 2, 3, 5, and 6 may be used instead.

【０２５９】 FIG. 18 is a block diagram schematically showing a speech recognition device using a language model according to Embodiment 8 of the present invention, for the case where a language model generated by the devices and methods described in Embodiments 2, 3, 5, and 6 above is used.

【０２６０】 In FIG. 18, the means 1601 to 1604 are the same as those described above (see FIG. 16), and the language matching means 1606A corresponds to the language matching means 1606 described above. Reference numeral 1801 denotes a class definition representing the correspondence between classes and words in the language model, and 1802 denotes a class language model that gives the occurrence probabilities of the classes.
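One common way to combine a class definition with a class language model is the class bigram factorization P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i). The sketch below assumes that factorization, which is a standard technique and is not stated in this passage; the dictionary layouts, the within-class probabilities, and the function name are hypothetical.

```python
def class_bigram_prob(prev_word, word, word_to_class, class_bigram, word_given_class):
    """Approximate P(word | prev_word) through word classes.

    word_to_class: class definition, word -> class label; unlisted words
        are treated as their own single-member class.
    class_bigram: (previous class, class) -> P(class | previous class).
    word_given_class: (word, class) -> P(word | class).
    """
    c_prev = word_to_class.get(prev_word, prev_word)
    c_cur = word_to_class.get(word, word)
    p_class = class_bigram.get((c_prev, c_cur), 0.0)
    # A word that is its own class has within-class probability 1.
    default = 1.0 if c_cur == word else 0.0
    p_word = word_given_class.get((word, c_cur), default)
    return p_class * p_word
```

Because class transitions are shared by all members of a class, a target-task word with few occurrences can borrow statistics from the other words of its class.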

【０２６１】 The class language model 1802 is constructed using the devices and methods described in Embodiments 2, 3, 5, and 6 above (see FIGS. 3 to 6, 9, 10, 14, and 15).

【０２６２】 The characteristic function in this case is that providing the language matching means 1606A, which uses the class language model 1802, enables high-accuracy speech recognition even when only a small amount of target-task data was used for training.

【０２６３】 Next, the speech recognition procedure according to Embodiment 8 of the present invention shown in FIG. 18 is described more concretely with reference to the flowchart of FIG. 19. In FIG. 19, steps S1901 to S1907 correspond to steps S1701 to S1707 described above (see FIG. 17), respectively.

【０２６４】 First, the acoustic model 1602, the word dictionary 1604, and the class definition 1801 prepared in advance are read together with the class language model 1802 generated according to Embodiments 2, 3, 5, and 6 above (step S1901).

【０２６５】 The acoustic feature extraction means 1601 A/D-converts the input speech to be recognized, reads a speech frame covering a fixed time interval (step S1902), and applies signal processing to that frame to extract an acoustic feature vector that represents speech features well, such as the mel-cepstrum (step S1903).

【０２６６】 Subsequently, the acoustic matching means 1603 matches the obtained acoustic feature vector against the acoustic model 1602 to obtain an acoustic matching score (step S1904).

【０２６７】 Next, the language matching means 1606A refers to the word dictionary 1604, the class definition 1801, and the class language model 1802 and accumulates the acoustic matching scores for the words to be recognized (step S1905).

【０２６８】 Thereafter, the above matching process is executed for each frame via step S1906; when the final frame of the target speech is reached and matching is complete, the candidate with the best score is output as the recognition result (step S1907), and the processing routine of FIG. 19 ends.

【０２６９】 In this way, by using the class language model 1802, high-accuracy speech recognition can be realized from a large amount of data that includes tasks other than the target task and a small amount of data on the target task.

【０２７０】[0270]

【Effects of the Invention】 As described above, according to claim 1 of the present invention, the device comprises target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; and similar word pair extraction means, similar word string synthesis means, and language model generation means for reading text data for language model learning from the target task language data and the general task language data, respectively, and constructing a task-adapted language model. The similar word pair extraction means reads each text data from the target task language data and the general task language data and extracts similar word pairs from combinations of words contained in the target task text data and words contained in the general task text data. The similar word string synthesis means reads each text data, reads the similar word pairs from the similar word pair extraction means, and synthesizes and outputs word strings, not contained in the language data, that include words of the target task. The language model generation means reads each text data, reads the word strings from the similar word string synthesis means, and obtains word string statistics with a weight applied to each text data, thereby generating the task-adapted language model. This has the effect of providing a language model learning device with improved recognition accuracy.
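The flow from similar word pairs to weighted statistics might look like the sketch below. It makes two assumptions that this passage does not spell out: word strings are synthesized by substituting a general-task word with its similar target-task word, and the "statistics" are bigram counts scaled by a per-source weight. The function names and the weight values used in any call are hypothetical.

```python
from collections import Counter

def synthesize_word_strings(sentences, similar_pairs, known):
    """Create new word strings by swapping in similar target-task words.

    sentences: iterable of word tuples (general-task text).
    similar_pairs: list of (target_word, general_word).
    known: set of word tuples already present in the language data.
    """
    synthesized = []
    for words in sentences:
        for target_word, general_word in similar_pairs:
            if general_word in words:
                new = tuple(target_word if w == general_word else w for w in words)
                if new not in known:          # keep only strings absent from the data
                    synthesized.append(new)
    return synthesized

def weighted_bigram_counts(weighted_texts):
    """weighted_texts: list of (sentences, weight); returns weighted bigram counts."""
    counts = Counter()
    for sentences, weight in weighted_texts:
        for words in sentences:
            for pair in zip(words, words[1:]):
                counts[pair] += weight
    return counts
```

Giving the small target-task text a larger weight than the general-task and synthesized texts lets it dominate wherever it has evidence, while the other sources fill in word sequences it never observed.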

【０２７１】 According to claim 2 of the present invention, the device comprises target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; and target task word classifying means, general task word classifying means, and language model generation means for constructing a task-adapted language model from the target task language data and the general task language data. The target task word classifying means reads the target task text data from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying means reads the general task text data from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The language model generation means reads the first and second text data and obtains word string statistics with a weight applied to each text data, thereby generating the language model. This has the effect of providing a language model learning device with improved recognition accuracy.
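The word-to-class replacement performed by the two classifying means can be illustrated in a few lines. This is a minimal sketch that assumes the class definition is a plain word-to-label mapping and that words without an entry are left unchanged; the function name and the example labels are hypothetical.

```python
def classify_words(sentences, class_definition):
    """Replace each word that has a class with its class label.

    sentences: list of word lists.
    class_definition: dict mapping word -> class label.
    Words without an entry are kept as they are (an assumption).
    """
    return [[class_definition.get(w, w) for w in words] for words in sentences]
```

Applying the same class definition to both the target-task and general-task text yields the "first" and "second" classified text data on which the weighted statistics are then computed.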

【０２７２】 According to claim 3 of the present invention, the device comprises target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; and target task word classifying means, general task word classifying means, similar word pair extraction means, similar word string synthesis means, and language model generation means for constructing a task-adapted language model from the target task language data and the general task language data. The target task word classifying means reads the target task text data from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying means reads the general task text data from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The similar word pair extraction means reads the first and second text data and extracts similar word pairs from combinations of words contained in the target task text data and words contained in the general task text data. The similar word string synthesis means reads the first and second text data, reads the similar word pairs from the similar word pair extraction means, and synthesizes and outputs word strings, not contained in the language data, that include words of the target task. The language model generation means reads the first and second text data, reads the word strings from the similar word string synthesis means, and obtains word string statistics with a weight applied to each text data, thereby generating the task-adapted language model. This has the effect of providing a language model learning device with improved recognition accuracy.

【０２７３】 According to claim 4 of the present invention, the device comprises target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; an initial language model created using text data prepared in advance; and similar word pair extraction means and similar word probability correction means for constructing a task-adapted statistical language model from the target task language data, the general task language data, and the initial language model. The similar word pair extraction means reads text data for language model learning from the target task language data and the general task language data, respectively, and extracts similar word pairs from combinations of words contained in the target task text data and words contained in the general task text data. The similar word probability correction means reads the similar word pairs from the similar word pair extraction means, reads the initial language model, and smooths the occurrence probabilities of words that appear in the target task, thereby generating the task-adapted statistical language model. This has the effect of providing a language model learning device with improved recognition accuracy.
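This passage does not give the smoothing formula, so the following is only one plausible reading, shown on unigram probabilities: each target-task word's probability is interpolated with that of its similar general-task word, and the distribution is then renormalized. The interpolation weight `lam`, the function name, and the unigram simplification are all assumptions.

```python
def smooth_with_similar_words(probs, similar_pairs, lam=0.5):
    """Shift probability mass toward target-task words via similar words.

    probs: dict word -> probability in the initial language model.
    similar_pairs: list of (target_word, general_word).
    lam: interpolation weight (hypothetical parameter).
    """
    adjusted = dict(probs)
    for target_word, general_word in similar_pairs:
        p_target = probs.get(target_word, 0.0)
        p_general = probs.get(general_word, 0.0)
        adjusted[target_word] = (1.0 - lam) * p_target + lam * p_general
    total = sum(adjusted.values())
    # Renormalize so that the result is again a probability distribution.
    return {w: p / total for w, p in adjusted.items()}
```

The point of the sketch is that a target-task word unseen in the initial model receives a nonzero probability borrowed from a word that behaves similarly.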

【０２７４】 According to claim 5 of the present invention, the device comprises target task language data in which text data of a target task is accumulated; general task language data in which text data of general tasks, including tasks other than the target task, is accumulated; an initial class language model created in advance; and target task word classifying means, general task word classifying means, similar word pair extraction means, and similar word probability correction means for constructing a task-adapted class language model from the target task language data, the general task language data, and the initial class language model. The target task word classifying means reads the target task text data from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying means reads the general task text data from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The similar word pair extraction means reads the first and second text data and extracts similar word pairs from combinations of words contained in the target task text data and words contained in the general task text data. The similar word probability correction means reads the similar word pairs from the similar word pair extraction means, reads the initial class language model, and smooths the occurrence probabilities of words that appear in the target task, thereby generating the task-adapted class language model. This has the effect of providing a language model learning device with improved recognition accuracy.

【０２７５】 According to claim 6 of the present invention, in claim 1 or claim 4, the similar word extraction means includes distance-calculation language model generation means, statistical inter-word distance calculation means, and threshold determination means. The distance-calculation language model generation means reads text data for language model learning from the target task language data and the general task language data, respectively, obtains word string statistics with a weight applied to each text data, and generates a statistical language model for distance calculation. The statistical inter-word distance calculation means reads the statistical language model from the distance-calculation language model generation means and, for word pairs made up of words extracted from each text data, obtains the statistical distance on the statistical language model as the inter-word distance. The threshold determination means reads the word pairs and the inter-word distances from the statistical inter-word distance calculation means and outputs the word pairs that exceed a predetermined threshold. This has the effect of providing a language model learning device with improved recognition accuracy.
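The extraction of similar word pairs by thresholding can be sketched as below. The text says that pairs "exceeding" the threshold are output, so the sketch compares a generic score against the threshold and leaves it to the caller to define that score (for example, a negated distance so that closer words score higher). The function name and the score convention are assumptions.

```python
from itertools import product

def extract_similar_pairs(target_words, general_words, score, threshold):
    """Return (target_word, general_word, score) for pairs whose score
    exceeds the threshold.

    score(a, b): caller-supplied measure computed on the language model.
    """
    pairs = []
    # Sorting only makes the output order deterministic.
    for t, g in product(sorted(target_words), sorted(general_words)):
        if t == g:
            continue                      # a word is not paired with itself
        s = score(t, g)
        if s > threshold:
            pairs.append((t, g, s))
    return pairs
```

The threshold trades coverage against noise: a looser threshold yields more pairs for synthesis or smoothing but admits less similar words.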

【０２７６】 According to claim 7 of the present invention, in claim 1 or claim 4, the similar word extraction means includes a distance-calculation language model, statistical inter-word distance calculation means, and threshold determination means. The distance-calculation language model is created using text data prepared in advance. The statistical inter-word distance calculation means reads the distance-calculation language model and, for word pairs made up of words extracted from each text data, obtains the statistical distance on the distance-calculation language model as the inter-word distance. The threshold determination means reads the word pairs and the inter-word distances from the statistical inter-word distance calculation means and outputs the word pairs that exceed a predetermined threshold. This has the effect of providing a language model learning device with improved recognition accuracy.

【０２７７】 According to claim 8 of the present invention, in claim 3 or claim 5, the similar word extraction means includes distance-calculation language model generation means, statistical inter-word distance calculation means, and threshold determination means. The distance-calculation language model generation means reads the first and second text data from the target task word classifying means and the general task word classifying means, obtains word string statistics with a weight applied to each text data, and generates a statistical language model for distance calculation. The statistical inter-word distance calculation means reads the statistical language model from the distance-calculation language model generation means and, for word pairs made up of words extracted from each text data, obtains the statistical distance on the statistical language model as the inter-word distance. The threshold determination means reads the word pairs and the inter-word distances from the statistical inter-word distance calculation means and outputs the word pairs that exceed a predetermined threshold. This has the effect of providing a language model learning device with improved recognition accuracy.

【０２７８】 According to claim 9 of the present invention, in claim 3 or claim 5, the similar word extraction means includes a distance-calculation class language model, statistical inter-word distance calculation means, and threshold determination means. The distance-calculation class language model is created using text data prepared in advance. The statistical inter-word distance calculation means reads the distance-calculation class language model, reads the first and second text data from the target task word classifying means and the general task word classifying means, and, for word pairs made up of words extracted from each text data, obtains the statistical distance on the distance-calculation class language model as the inter-word distance. The threshold determination means reads the word pairs and the inter-word distances from the statistical inter-word distance calculation means and outputs the word pairs that exceed a predetermined threshold. This has the effect of providing a language model learning device with improved recognition accuracy.

【０２７９】 According to claim 10 of the present invention, in any one of claims 6 to 9, the statistical inter-word distance calculation means measures the inter-word distance using the Euclidean distance on the N-gram language model. This has the effect of providing a language model learning device with improved recognition accuracy.
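One way to read "Euclidean distance on the N-gram language model" is to describe each word by the probabilities of the words that follow it and to take the Euclidean distance between those vectors. The sketch below assumes exactly that, using bigram probabilities stored in a dictionary; the representation and the function name are illustrative assumptions, not the definition used in the embodiments.

```python
import math

def euclidean_word_distance(w1, w2, bigram_prob, vocabulary):
    """Euclidean distance between the next-word distributions of w1 and w2.

    bigram_prob: dict (history, next_word) -> P(next_word | history);
        missing entries count as probability 0.
    vocabulary: iterable of candidate next words.
    """
    squared = 0.0
    for v in vocabulary:
        diff = bigram_prob.get((w1, v), 0.0) - bigram_prob.get((w2, v), 0.0)
        squared += diff * diff
    return math.sqrt(squared)
```

Words that are followed by the same words with similar probabilities end up close together, which is the intuition behind treating them as interchangeable.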

【０２８０】 According to claim 11 of the present invention, in any one of claims 6 to 9, the statistical inter-word distance calculation means measures the inter-word distance using the cross entropy on the N-gram language model. This has the effect of providing a language model learning device with improved recognition accuracy.
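A cross-entropy based measure can be sketched in the same setting, assuming each word is described by its next-word bigram distribution and using H(p, q) = −Σ p(v) log q(v). The probability floor that avoids log 0, the dictionary layout, and the function name are assumptions made for this illustration.

```python
import math

def cross_entropy_word_distance(w1, w2, bigram_prob, vocabulary, floor=1e-10):
    """Cross entropy of w1's next-word distribution under w2's.

    bigram_prob: dict (history, next_word) -> P(next_word | history).
    floor: lower bound applied to w2's probabilities to avoid log(0).
    """
    total = 0.0
    for v in vocabulary:
        p = bigram_prob.get((w1, v), 0.0)
        if p > 0.0:
            q = max(bigram_prob.get((w2, v), 0.0), floor)
            total -= p * math.log(q)
    return total
```

Unlike the Euclidean distance, this measure is asymmetric, so a symmetric variant such as the sum of both directions may be preferable when pairs are compared against a single threshold.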

【０２８１】 According to claim 12 of the present invention, in a speech recognition device using the language model learning device of any one of claims 1 to 11, the language model or the class language model is used for speech recognition. This has the effect of providing a high-accuracy speech recognition device.

【０２８２】 According to claim 13 of the present invention, the method comprises a first step of accumulating text data of a target task; a second step of accumulating text data of general tasks, including tasks other than the target task; and a similar word pair extraction step, a similar word string synthesis step, and a language model generation step for reading text data for language model learning from the target task language data and the general task language data, respectively, and constructing a task-adapted language model. The similar word pair extraction step reads each text data from the target task language data and the general task language data and extracts similar word pairs from combinations of words contained in the target task text data and words contained in the general task text data. The similar word string synthesis step reads each text data and the similar word pairs and synthesizes and outputs word strings, not contained in the language data, that include words of the target task. The language model generation step reads each text data and the word strings and obtains word string statistics with a weight applied to each text data, thereby generating the task-adapted language model. This has the effect of providing a language model learning method with improved recognition accuracy.

【０２８３】 According to claim 14 of the present invention, the method comprises a first step of accumulating text data of a target task; a second step of accumulating text data of general tasks, including tasks other than the target task; and a target task word classifying step, a general task word classifying step, and a language model generation step for constructing a task-adapted language model from the target task language data and the general task language data. The target task word classifying step reads the target task text data from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying step reads the general task text data from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The language model generation step reads the first and second text data and obtains word string statistics with a weight applied to each text data, thereby generating the language model. This has the effect of providing a language model learning method with improved recognition accuracy.

【０２８４】 According to claim 15 of the present invention, the method comprises a first step of accumulating text data of a target task; a second step of accumulating text data of general tasks, including tasks other than the target task; and a target task word classifying step, a general task word classifying step, a similar word pair extraction step, a similar word string synthesis step, and a language model generation step for constructing a task-adapted language model from the target task language data and the general task language data. The target task word classifying step reads the target task text data from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying step reads the general task text data from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The similar word pair extraction step reads the first and second text data and extracts similar word pairs from combinations of words contained in the target task text data and words contained in the general task text data. The similar word string synthesis step reads the first and second text data and the similar word pairs and synthesizes and outputs word strings, not contained in the language data, that include words of the target task. The language model generation step reads the first and second text data and the word strings and obtains word string statistics with a weight applied to each text data, thereby generating the task-adapted language model. This has the effect of providing a language model learning method with improved recognition accuracy.

[0285] According to claim 16 of the present invention, the method comprises a first step of accumulating text data of a target task, a second step of accumulating text data of a general task that includes tasks other than the target task, a third step of creating an initial language model using text data prepared in advance, and, for constructing a task-adapted statistical language model from the target task language data, the general task language data, and the initial language model, a similar word pair extracting step and a similar word probability correcting step. The similar word pair extracting step reads text data for language model learning from each of the target task language data and the general task language data, and extracts similar word pairs from combinations of a word contained in the text data of the target task and a word contained in the text data of the general task. The similar word probability correcting step reads the similar word pairs and the initial language model, and generates the task-adapted statistical language model by smoothing the occurrence probabilities of words that appear in the target task. This has the effect of providing a language model learning method with improved recognition accuracy.
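One possible reading of the similar word probability correcting step is sketched below. This passage does not state the smoothing formula, so the linear interpolation, the weight `lam`, the unigram representation of the initial language model, and the function name are all assumptions made for illustration only.

```python
def smooth_with_similar_words(unigram, similar_pairs, lam=0.5):
    """Interpolate a target word's probability with its similar word's (assumed formula)."""
    smoothed = dict(unigram)
    for target_word, general_word in similar_pairs:
        p_target = unigram.get(target_word, 0.0)
        p_general = unigram.get(general_word, 0.0)
        smoothed[target_word] = (1.0 - lam) * p_target + lam * p_general
    # Renormalize so that the probabilities sum to one again.
    total = sum(smoothed.values())
    return {word: p / total for word, p in smoothed.items()}

# Hypothetical initial model in which the target-task word "room" is unseen.
initial = {"flight": 0.4, "room": 0.0, "book": 0.6}
adapted = smooth_with_similar_words(initial, [("room", "flight")], lam=0.5)
```

The point of the sketch is only that a word seen rarely or never in the initial model receives probability mass from a word that behaves similarly in the general task.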

[0286] According to claim 17 of the present invention, the method comprises a first step of accumulating text data of a target task, a second step of accumulating text data of a general task that includes tasks other than the target task, a fourth step of creating an initial class language model in advance, and, for constructing a task-adapted class language model from the target task language data, the general task language data, and the initial class language model, a target task word classifying step, a general task word classifying step, a similar word pair extracting step, and a similar word probability correcting step. The target task word classifying step reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning. The general task word classifying step reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning. The similar word pair extracting step reads the first and second text data and extracts similar word pairs from combinations of a word contained in the text data of the target task and a word contained in the text data of the general task. The similar word probability correcting step reads the similar word pairs and the initial class language model, and generates the task-adapted class language model by smoothing the occurrence probabilities of words that appear in the target task. This has the effect of providing a language model learning method with improved recognition accuracy.
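The word classifying steps can be pictured with a short sketch, assuming the class definition is a simple word-to-class table. The table contents, the class labels, and the function name `classify_words` are hypothetical.

```python
def classify_words(sentences, class_definition):
    """Replace each word listed in the class definition with its class label."""
    return [[class_definition.get(word, word) for word in sentence]
            for sentence in sentences]

# Hypothetical class definition: word -> class label.
class_definition = {"Tokyo": "<CITY>", "Osaka": "<CITY>", "Monday": "<DAY>"}

target_text = [["fly", "to", "Tokyo", "on", "Monday"]]
general_text = [["train", "to", "Osaka"]]

first_text = classify_words(target_text, class_definition)    # classified first text data
second_text = classify_words(general_text, class_definition)  # classified second text data
```

Words that have no entry in the class definition pass through unchanged, so both corpora end up sharing class tokens such as `<CITY>` in place of individual place names.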

[0287] According to claim 18 of the present invention, in claim 13 or claim 16, the similar word extracting step includes a distance calculation language model generating step, a statistical inter-word distance calculating step, and a threshold determining step. The distance calculation language model generating step reads text data for language model learning from each of the target task language data and the general task language data, obtains word string statistics weighted for each text data, and generates a statistical language model for distance calculation. The statistical inter-word distance calculating step reads the statistical language model and, for each word pair made up of words extracted from the respective text data, obtains a statistical distance on the statistical language model as the inter-word distance. The threshold determining step reads the word pairs and the inter-word distances and outputs the word pairs that exceed a predetermined threshold. This has the effect of providing a language model learning method with improved recognition accuracy.
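The threshold determining step can be sketched as a filter over candidate word pairs. The text says that pairs "exceeding" the threshold are output; the sketch below assumes the compared score is a similarity, taken here as the negated distance, so that exceeding the threshold means the two words are closer. The distance function, its values, and the threshold are hypothetical stand-ins for whatever the distance calculation language model would supply.

```python
from itertools import product

def extract_similar_pairs(target_words, general_words, distance, threshold):
    """Return (target, general) pairs whose similarity exceeds the threshold."""
    pairs = []
    for target_word, general_word in product(target_words, general_words):
        if target_word == general_word:
            continue
        # Assumption: similarity is the negated inter-word distance.
        similarity = -distance(target_word, general_word)
        if similarity > threshold:
            pairs.append((target_word, general_word))
    return pairs

# Hypothetical inter-word distances standing in for the model-based distance.
toy_distances = {("room", "flight"): 0.1, ("room", "cancel"): 0.9}

def toy_distance(a, b):
    return toy_distances[(a, b)]

pairs = extract_similar_pairs(["room"], ["flight", "cancel"], toy_distance, threshold=-0.5)
```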

[0288] According to claim 19 of the present invention, in claim 13 or claim 16, the similar word extracting step includes a distance calculation language model creating step, a statistical inter-word distance calculating step, and a threshold determining step. The distance calculation language model creating step creates a distance calculation language model using text data prepared in advance. The statistical inter-word distance calculating step reads the distance calculation language model and, for each word pair made up of words extracted from the respective text data, obtains a statistical distance on the distance calculation language model as the inter-word distance. The threshold determining step reads the word pairs and the inter-word distances and outputs the word pairs that exceed a predetermined threshold. This has the effect of providing a language model learning method with improved recognition accuracy.

[0289] According to claim 20 of the present invention, in claim 15 or claim 17, the similar word extracting step includes a distance calculation language model generating step, a statistical inter-word distance calculating step, and a threshold determining step. The distance calculation language model generating step reads the first and second text data, obtains word string statistics weighted for each text data, and generates a statistical language model for distance calculation. The statistical inter-word distance calculating step reads the statistical language model and, for each word pair made up of words extracted from the respective text data, obtains a statistical distance on the statistical language model as the inter-word distance. The threshold determining step reads the word pairs and the inter-word distances and outputs the word pairs that exceed a predetermined threshold. This has the effect of providing a language model learning method with improved recognition accuracy.

[0290] According to claim 21 of the present invention, in claim 15 or claim 17, the similar word extracting step includes a distance calculation class language model creating step, a statistical inter-word distance calculating step, and a threshold determining step. The distance calculation class language model creating step creates a distance calculation class language model using text data prepared in advance. The statistical inter-word distance calculating step reads the distance calculation class language model and the first and second text data and, for each word pair made up of words extracted from the respective text data, obtains a statistical distance on the distance calculation class language model as the inter-word distance. The threshold determining step reads the word pairs and the inter-word distances and outputs the word pairs that exceed a predetermined threshold. This has the effect of providing a language model learning method with improved recognition accuracy.

[0291] According to claim 22 of the present invention, in any one of claims 18 to 21, the statistical inter-word distance calculating step measures the inter-word distance using a Euclidean distance on an N-gram language model. This has the effect of providing a language model learning method with improved recognition accuracy.
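As an illustration of a Euclidean distance on an N-gram language model, the following sketch assumes a bigram model (N = 2) estimated by relative frequency and represents each word by the vector of probabilities of the words that follow it. This representation, the function names, and the toy sentences are assumptions; the passage does not specify which N-gram probabilities enter the distance.

```python
import math
from collections import Counter, defaultdict

def bigram_probabilities(sentences):
    """Estimate P(next | previous) by relative frequency (no smoothing)."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for prev, nxt in zip(sentence, sentence[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        model[prev] = {nxt: c / total for nxt, c in followers.items()}
    return model

def euclidean_distance(model, word_a, word_b):
    """Euclidean distance between the next-word distributions of two words."""
    dist_a = model.get(word_a, {})
    dist_b = model.get(word_b, {})
    support = set(dist_a) | set(dist_b)
    return math.sqrt(sum((dist_a.get(w, 0.0) - dist_b.get(w, 0.0)) ** 2
                         for w in support))

sentences = [["a", "room", "please"], ["a", "flight", "please"], ["a", "flight", "now"]]
model = bigram_probabilities(sentences)
d = euclidean_distance(model, "room", "flight")
```

Two words that tend to be followed by the same words obtain a small distance, which is the property the similar word extraction relies on.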

[0292] According to claim 23 of the present invention, in any one of claims 18 to 21, the statistical inter-word distance calculating step measures the inter-word distance using cross entropy on an N-gram language model. This has the effect of providing a language model learning method with improved recognition accuracy.
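A cross-entropy-based inter-word distance could look like the sketch below. It assumes each word is described by its next-word distribution under a bigram model and computes H(p_a, p_b) = -Σ p_a(w) log p_b(w), with a small probability floor so that unseen events do not give an infinite value. The distributions, the floor value, and the function name are illustrative assumptions.

```python
import math

def cross_entropy(dist_a, dist_b, floor=1e-6):
    """Cross entropy of dist_b relative to dist_a, with a floor for unseen events."""
    return -sum(p * math.log(max(dist_b.get(w, 0.0), floor))
                for w, p in dist_a.items() if p > 0.0)

# Hypothetical next-word distributions of three words.
room = {"please": 1.0}
flight = {"please": 0.5, "now": 0.5}
taxi = {"now": 1.0}

h_room_flight = cross_entropy(room, flight)
h_room_taxi = cross_entropy(room, taxi)
```

Cross entropy is not symmetric, so a practical distance might average the two directions; that choice is left open here.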

[0293] According to claim 24 of the present invention, in a speech recognition method using the language model learning method of any one of claims 13 to 23, the language model or the class language model is used for speech recognition. This has the effect of providing a highly accurate speech recognition method.

[0294] According to claim 25 of the present invention, there is the effect of providing a recording medium that stores the language model learning method of any one of claims 13 to 23.

[0295] According to claim 26 of the present invention, there is the effect of providing a recording medium that stores the speech recognition method of claim 24, which uses the language model learning method.

[Brief description of the drawings]

FIG. 1 is a block diagram schematically showing a language model learning device according to Embodiment 1 of the present invention.

FIG. 2 is a flowchart showing the processing procedure of the language model learning device and method according to Embodiment 1 of the present invention.

FIG. 3 is a block diagram schematically showing a language model learning device according to Embodiment 2 of the present invention.

FIG. 4 is a flowchart showing the processing procedure of the language model learning device and method according to Embodiment 2 of the present invention.

FIG. 5 is a block diagram schematically showing a language model learning device according to Embodiment 3 of the present invention.

FIG. 6 is a flowchart showing the processing procedure of the language model learning device and method according to Embodiment 3 of the present invention.

FIG. 7 is a block diagram schematically showing a language model learning device according to Embodiment 4 of the present invention.

FIG. 8 is a flowchart showing the processing procedure of the language model learning device and method according to Embodiment 4 of the present invention.

FIG. 9 is a block diagram schematically showing a language model learning device according to Embodiment 5 of the present invention.

FIG. 10 is a flowchart showing the processing procedure of the language model learning device and method according to Embodiment 5 of the present invention.

FIG. 11 is a functional block diagram showing a specific example of the similar word pair extracting means of the language model learning device according to Embodiment 6 of the present invention.

FIG. 12 is a flowchart showing the processing procedure of the similar word pair extracting means of the language model learning device according to Embodiment 6 of the present invention.

FIG. 13 is a functional block diagram showing a second specific example of the similar word pair extracting means according to Embodiment 6 of the present invention.

FIG. 14 is a functional block diagram showing a third specific example of the similar word pair extracting means according to Embodiment 6 of the present invention.

FIG. 15 is a functional block diagram showing a fourth specific example of the similar word pair extracting means according to Embodiment 6 of the present invention.

FIG. 16 is a block diagram schematically showing a speech recognition device using the language model learning device according to Embodiment 7 of the present invention.

FIG. 17 is a flowchart showing the processing procedure of the speech recognition device using the language model learning device according to Embodiment 7 of the present invention.

FIG. 18 is a block diagram schematically showing a speech recognition device using the language model learning device according to Embodiment 8 of the present invention.

FIG. 19 is a flowchart showing the processing procedure of the speech recognition device using the language model learning device according to Embodiment 8 of the present invention.

FIG. 20 is an explanatory diagram showing an example of a general class definition.

FIG. 21 is a block diagram schematically showing a conventional language model learning device.

FIG. 22 is a flowchart showing the processing procedure of a conventional language model learning device and method.

[Explanation of symbols]

101: target task language data; 102: general task language data; 103, 103B, 103C, 103D, 103E, 103F: similar word pair extracting means; 104, 104B: similar word string synthesizing means; 105, 105A, 105B: language model generating means; 301: target task word classifying means; 302: general task word classifying means; 701: initial language model; 702: similar word probability correcting means; 901: initial class language model; 1101, 1101D, 1101F: statistical inter-word distance calculating means; 1102, 1102E: threshold determining means; 1105, 1105E: distance calculation language model generating means; 1301: distance calculation language model; 1501: distance calculation class language model; 1605: language model; 1802: class language model.

Claims

1. A language model learning device comprising: target task language data in which text data of a target task is accumulated; general task language data in which text data of a general task including tasks other than the target task is accumulated; and similar word pair extracting means, similar word string synthesizing means, and language model generating means for reading text data for language model learning from each of the target task language data and the general task language data and constructing a task-adapted language model, wherein the similar word pair extracting means reads the respective text data from the target task language data and the general task language data and extracts a similar word pair from combinations of a word contained in the text data of the target task and a word contained in the text data of the general task; the similar word string synthesizing means reads the respective text data, reads the similar word pair from the similar word pair extracting means, and synthesizes and outputs a word string, not contained in the language data, that contains a word of the target task; and the language model generating means reads the respective text data, reads the word string from the similar word string synthesizing means, and generates the task-adapted language model by obtaining a statistic of the word string weighted for each text data.

2. A language model learning device comprising: target task language data in which text data of a target task is accumulated; general task language data in which text data of a general task including tasks other than the target task is accumulated; and target task word classifying means, general task word classifying means, and language model generating means for constructing a task-adapted language model from the target task language data and the general task language data, wherein the target task word classifying means reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning; the general task word classifying means reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning; and the language model generating means reads the first and second text data and generates the language model by obtaining a statistic of a word string weighted for each text data.

3. A language model learning device comprising: target task language data in which text data of a target task is accumulated; general task language data in which text data of a general task including tasks other than the target task is accumulated; and target task word classifying means, general task word classifying means, similar word pair extracting means, similar word string synthesizing means, and language model generating means for constructing a task-adapted language model from the target task language data and the general task language data, wherein the target task word classifying means reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning; the general task word classifying means reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning; the similar word pair extracting means reads the first and second text data and extracts a similar word pair from combinations of a word contained in the text data of the target task and a word contained in the text data of the general task; the similar word string synthesizing means reads the first and second text data, reads the similar word pair from the similar word pair extracting means, and synthesizes and outputs a word string, not contained in the language data, that contains a word of the target task; and the language model generating means reads the first and second text data, reads the word string from the similar word string synthesizing means, and generates the task-adapted language model by obtaining a statistic of the word string weighted for each text data.

4. A language model learning device comprising: target task language data in which text data of a target task is accumulated; general task language data in which text data of a general task including tasks other than the target task is accumulated; an initial language model created using text data prepared in advance; and similar word pair extracting means and similar word probability correcting means for constructing a task-adapted statistical language model from the target task language data, the general task language data, and the initial language model, wherein the similar word pair extracting means reads text data for language model learning from each of the target task language data and the general task language data and extracts a similar word pair from combinations of a word contained in the text data of the target task and a word contained in the text data of the general task; and the similar word probability correcting means reads the similar word pair from the similar word pair extracting means, reads the initial language model, and generates the task-adapted statistical language model by smoothing the occurrence probabilities of words that appear in the target task.

5. A language model learning device comprising: target task language data in which text data of a target task is accumulated; general task language data in which text data of a general task including tasks other than the target task is accumulated; an initial class language model created in advance; and target task word classifying means, general task word classifying means, similar word pair extracting means, and similar word probability correcting means for constructing a task-adapted class language model from the target task language data, the general task language data, and the initial class language model, wherein the target task word classifying means reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning; the general task word classifying means reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning; the similar word pair extracting means reads the first and second text data and extracts a similar word pair from combinations of a word contained in the text data of the target task and a word contained in the text data of the general task; and the similar word probability correcting means reads the similar word pair from the similar word pair extracting means, reads the initial class language model, and generates the task-adapted class language model by smoothing the occurrence probabilities of words that appear in the target task.

6. The language model learning device according to claim 1, wherein the similar word extracting means includes distance calculation language model generating means, statistical inter-word distance calculating means, and threshold determining means; the distance calculation language model generating means reads text data for language model learning from each of the target task language data and the general task language data, obtains a statistic of a word string weighted for each text data, and generates a statistical language model for distance calculation; the statistical inter-word distance calculating means reads the statistical language model from the distance calculation language model generating means and, for a word pair made up of words extracted from the respective text data, obtains a statistical distance on the statistical language model as an inter-word distance; and the threshold determining means reads the word pair and the inter-word distance and outputs a word pair exceeding a predetermined threshold.

7. The language model learning device according to claim 1 or 4, wherein the similar word extracting means includes a distance calculation language model, statistical inter-word distance calculating means, and threshold determining means; the distance calculation language model is created using text data prepared in advance; the statistical inter-word distance calculating means reads the distance calculation language model and, for a word pair made up of words extracted from the respective text data, obtains a statistical distance on the distance calculation language model as an inter-word distance; and the threshold determining means reads the word pair and the inter-word distance from the statistical inter-word distance calculating means and outputs a word pair exceeding a predetermined threshold.

8. The language model learning device according to claim 3, wherein the similar word extracting means includes distance calculation language model generating means, statistical inter-word distance calculating means, and threshold determining means; the distance calculation language model generating means reads the first and second text data from the target task word classifying means and the general task word classifying means, obtains a statistic of a word string weighted for each text data, and generates a statistical language model for distance calculation; the statistical inter-word distance calculating means reads the statistical language model from the distance calculation language model generating means and, for a word pair made up of words extracted from the respective text data, obtains a statistical distance on the statistical language model as an inter-word distance; and the threshold determining means reads the word pair and the inter-word distance and outputs a word pair exceeding a predetermined threshold.

9. The language model learning device according to claim 3 or 5, wherein the similar word extracting means includes a distance calculation class language model, statistical inter-word distance calculating means, and threshold determining means; the distance calculation class language model is created using text data prepared in advance; the statistical inter-word distance calculating means reads the distance calculation class language model, reads the first and second text data from the target task word classifying means and the general task word classifying means, and, for a word pair made up of words extracted from the respective text data, obtains a statistical distance on the distance calculation class language model as an inter-word distance; and the threshold determining means reads the word pair and the inter-word distance from the statistical inter-word distance calculating means and outputs a word pair exceeding a predetermined threshold.

10. The language model learning device according to claim 6, wherein the statistical inter-word distance calculating means measures the inter-word distance using a Euclidean distance on an N-gram language model.

11. The language model learning device according to claim 6, wherein the statistical inter-word distance calculating means measures the inter-word distance using cross entropy on an N-gram language model.

12. A speech recognition device using the language model learning device according to claim 1, wherein the language model or the class language model is used for speech recognition.

13. A language model learning method comprising: a first step of accumulating text data of a target task; a second step of accumulating text data of a general task including tasks other than the target task; and a similar word pair extracting step, a similar word string synthesizing step, and a language model generating step for reading text data for language model learning from each of the target task language data and the general task language data and constructing a task-adapted language model, wherein the similar word pair extracting step reads the respective text data from the target task language data and the general task language data and extracts a similar word pair from combinations of a word contained in the text data of the target task and a word contained in the text data of the general task; the similar word string synthesizing step reads the respective text data and the similar word pair, and synthesizes and outputs a word string, not contained in the language data, that contains a word of the target task; and the language model generating step reads the respective text data and the word string, and generates the task-adapted language model by obtaining a statistic of the word string weighted for each text data.

14. A language model learning method comprising: a first step of accumulating text data of a target task; a second step of accumulating text data of a general task including tasks other than the target task; and a target task word classifying step, a general task word classifying step, and a language model generating step for constructing a task-adapted language model from the target task language data and the general task language data, wherein the target task word classifying step reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning; the general task word classifying step reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning; and the language model generating step reads the first and second text data and generates the language model by obtaining a statistic of a word string weighted for each text data.

15. A language model learning method comprising: a first step of accumulating text data of a target task; a second step of accumulating text data of a general task including tasks other than the target task; and a target task word classifying step, a general task word classifying step, a similar word pair extracting step, a similar word string synthesizing step, and a language model generating step for constructing a task-adapted language model from the target task language data and the general task language data, wherein the target task word classifying step reads the text data of the target task from the target task language data, replaces words with the classes given in a class definition, and outputs classified first text data for language model learning; the general task word classifying step reads the text data of the general task from the general task language data, replaces words with the classes given in the class definition, and outputs classified second text data for language model learning; the similar word pair extracting step reads the first and second text data and extracts a similar word pair from combinations of a word contained in the text data of the target task and a word contained in the text data of the general task; the similar word string synthesizing step reads the first and second text data and the similar word pair, and synthesizes and outputs a word string, not contained in the language data, that contains a word of the target task; and the language model generating step reads the first and second text data and the word string, and generates the task-adapted language model by obtaining a statistic of the word string weighted for each text data.

16. A language model learning method comprising: a first step of accumulating text data of a target task as target task language data; a second step of accumulating text data of a general task, including tasks other than the target task, as general task language data; a third step of creating an initial language model using text data prepared in advance; and a similar word pair extracting step and a similar word probability correcting step for constructing a task-adapted statistical language model from the target task language data, the general task language data, and the initial language model, wherein the similar word pair extracting step reads text data for language model learning from the target task language data and the general task language data, respectively, and extracts similar word pairs from combinations of a word included in the text data of the target task and a word included in the text data of the general task; and the similar word probability correcting step reads the similar word pairs and the initial language model and smooths the appearance probabilities of words appearing in the target task, thereby generating the task-adapted statistical language model.
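Claim 16 does not state the smoothing formula, so the sketch below substitutes one simple possibility: linear interpolation between the probability of a target-task word and that of its similar general-task word, followed by renormalization. The model layout (context mapped to a word probability table), the interpolation weight `lam`, and the toy numbers are assumptions, not the claimed method.

```python
def smooth_with_similar(model, similar_pairs, lam=0.5):
    """Smooth target-task word probabilities using similar word pairs.

    model: dict mapping a context word to {word: probability}.
    similar_pairs: dict mapping a target-task word to a similar
    general-task word (hypothetical format).
    lam: interpolation weight given to the similar word (assumed).
    """
    smoothed = {}
    for context, dist in model.items():
        new_dist = dict(dist)
        for target_word, general_word in similar_pairs.items():
            new_dist[target_word] = (
                (1.0 - lam) * dist.get(target_word, 0.0)
                + lam * dist.get(general_word, 0.0)
            )
        total = sum(new_dist.values())
        if total > 0:
            # Renormalize so the probabilities for this context sum to 1.
            new_dist = {w: p / total for w, p in new_dist.items()}
        smoothed[context] = new_dist
    return smoothed

# Toy initial model: "room" never followed "a" in the training text.
initial_model = {"a": {"ticket": 0.75, "seat": 0.25, "room": 0.0}}
adapted = smooth_with_similar(initial_model, {"room": "ticket"})
```

After smoothing, "room" receives a nonzero probability in the context "a" because its similar word "ticket" was observed there.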

17. A language model learning method comprising: a first step of accumulating text data of a target task as target task language data; a second step of accumulating text data of a general task, including tasks other than the target task, as general task language data; a fourth step of preparing an initial class language model in advance; and a target task word classifying step, a general task word classifying step, a similar word pair extracting step, and a similar word probability correcting step for constructing a task-adapted class language model from the target task language data, the general task language data, and the initial class language model, wherein the target task word classifying step reads the text data of the target task from the target task language data, replaces words with the classes indicated in a class definition, and outputs classified first text data for language model learning; the general task word classifying step reads the text data of the general task from the general task language data, replaces words with the classes indicated in the class definition, and outputs classified second text data for language model learning; the similar word pair extracting step reads the first and second text data and extracts similar word pairs from combinations of a word included in the text data of the target task and a word included in the text data of the general task; and the similar word probability correcting step reads the similar word pairs and the initial class language model and smooths the appearance probabilities of words appearing in the target task, thereby generating the task-adapted class language model.

18. The similar word extraction step includes a distance calculation language model generation step, a statistical inter-word distance calculation step, and a threshold value determination step, wherein the distance calculation language model generation step reads the respective text data for language model learning from the target task language data and the general task language data, obtains statistics of word strings with a weight applied to each text data, and generates a statistical language model for distance calculation; the statistical inter-word distance calculation step reads the statistical language model and, for each word pair composed of words extracted from the respective text data, obtains a statistical distance on the statistical language model as an inter-word distance; and the threshold value determination step reads the word pairs and the inter-word distances and outputs word pairs exceeding a predetermined threshold value.
Language model learning method described in.

19. The language model learning method according to claim 13 or claim 16, wherein the similar word extraction step includes a distance calculation language model creation step, a statistical inter-word distance calculation step, and a threshold value determination step; the distance calculation language model creation step creates a language model for distance calculation using text data prepared in advance; the statistical inter-word distance calculation step reads the language model for distance calculation and, for each word pair composed of words extracted from the respective text data, obtains a statistical distance on the language model for distance calculation as an inter-word distance; and the threshold value determination step reads the word pairs and the inter-word distances and outputs word pairs exceeding a predetermined threshold value.

20. The language model learning method according to claim 15, wherein the similar word extraction step includes a distance calculation language model generation step, a statistical inter-word distance calculation step, and a threshold value determination step; the distance calculation language model generation step reads the first and second text data, obtains statistics of word strings with a weight applied to each text data, and generates a statistical language model for distance calculation; the statistical inter-word distance calculation step reads the statistical language model and, for each word pair composed of words extracted from the respective text data, obtains a statistical distance on the statistical language model as an inter-word distance; and the threshold value determination step reads the word pairs and the inter-word distances and outputs word pairs exceeding a predetermined threshold value.

21. The language model learning method according to claim 15 or claim 17, wherein the similar word extraction step includes a distance calculation class language model creation step, a statistical inter-word distance calculation step, and a threshold value determination step; the distance calculation class language model creation step creates a class language model for distance calculation using text data prepared in advance; the statistical inter-word distance calculation step reads the class language model for distance calculation and the first and second text data and, for each word pair composed of words extracted from the respective text data, obtains a statistical distance on the class language model for distance calculation as an inter-word distance; and the threshold value determination step reads the word pairs and the inter-word distances and outputs word pairs exceeding a predetermined threshold value.

22. The language model learning method according to claim 18, wherein, in the statistical inter-word distance calculation step, the inter-word distance is measured using a Euclidean distance on an N-gram language model.
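One way to read "a Euclidean distance on an N-gram language model" is to describe each word by the vector of bigram probabilities of the words that follow it and to take the Euclidean distance between two such vectors. The sketch below follows that reading; the vector definition, the bigram table layout, and the toy probabilities are assumptions and may differ from the definition used in the description.

```python
import math

def euclidean_word_distance(bigram_probs, w1, w2, vocab):
    """Euclidean distance between the next-word distributions of w1 and w2.

    bigram_probs maps (previous_word, next_word) to P(next_word | previous_word).
    Missing entries are treated as probability 0.
    """
    return math.sqrt(
        sum(
            (bigram_probs.get((w1, v), 0.0) - bigram_probs.get((w2, v), 0.0)) ** 2
            for v in vocab
        )
    )

# Toy bigram model (assumed values).
bigram_probs = {
    ("room", "please"): 1.0,
    ("ticket", "please"): 0.5,
    ("ticket", "now"): 0.5,
}
vocab = ["please", "now"]

d = euclidean_word_distance(bigram_probs, "room", "ticket", vocab)
```

Words that are followed by the same words with similar probabilities obtain a small distance, and identical distributions give a distance of zero.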

23. The language model learning method according to claim 18, wherein, in the statistical inter-word distance calculation step, the inter-word distance is measured using cross entropy on an N-gram language model.
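A cross-entropy-based inter-word distance can be sketched in the same setting. Here the cross entropy H(p, q) = -Σ p(v) log q(v) is taken between the next-word bigram distributions p of one word and q of the other. The choice of distributions, the probability floor used to avoid log 0, and the toy values are assumptions for illustration only.

```python
import math

def cross_entropy_distance(bigram_probs, w1, w2, vocab, floor=1e-6):
    """Cross entropy of w2's next-word distribution under w1's.

    bigram_probs maps (previous_word, next_word) to P(next_word | previous_word).
    Probabilities of w2 are floored (assumed value) so the logarithm is defined.
    """
    total = 0.0
    for v in vocab:
        p = bigram_probs.get((w1, v), 0.0)
        q = max(bigram_probs.get((w2, v), 0.0), floor)
        if p > 0.0:
            total -= p * math.log(q)
    return total

# Toy bigram model (assumed values).
bigram_probs = {
    ("room", "please"): 1.0,
    ("ticket", "please"): 0.5,
    ("ticket", "now"): 0.5,
}
vocab = ["please", "now"]

h_room_ticket = cross_entropy_distance(bigram_probs, "room", "ticket", vocab)
h_ticket_room = cross_entropy_distance(bigram_probs, "ticket", "room", vocab)
```

Unlike the Euclidean distance, cross entropy is not symmetric: the two values above differ, so an implementation has to fix the direction or combine both directions.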

24. A speech recognition method using the language model learning method according to claim 13, wherein the generated language model or class language model is used for speech recognition.

25. A storage medium storing the language model learning method according to any one of claims 13 to 23.

26. A storage medium storing a speech recognition method using the language model learning method according to claim 24.