JP2005265967A

JP2005265967A - Recording medium where tree structure dictionary is recorded and language score table generating program for tree structure dictionary

Info

Publication number: JP2005265967A
Application number: JP2004074702A
Authority: JP
Inventors: Hiroaki Kokubo; 浩明小窪; Hiroshi Yamamoto; 博史山本
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2004-03-16
Filing date: 2004-03-16
Publication date: 2005-09-29
Anticipated expiration: 2024-03-16
Also published as: JP4521631B2

Abstract

PROBLEM TO BE SOLVED: To provide a tree structure dictionary capable of suppressing search errors while efficiently reducing the required storage capacity.

SOLUTION: For each non-terminal node of the tree structure dictionary stored on the storage medium, it is determined whether the number of words sharing the node (the number of terminal nodes that can be reached from the non-terminal node through child nodes) is larger than a specified threshold (444). When the number exceeds the threshold, a non-approximated bigram language score table is held (446). Otherwise, scores approximated by unigrams are requested from all of the child nodes (448), and the maximum value among the obtained approximate scores is held as the approximate score of the non-terminal node (450).

COPYRIGHT: (C)2005,JPO&NCIPI

Description

The present invention relates to a method for creating a tree structure dictionary used in large-vocabulary speech recognition and the like, and in particular to a tree structure dictionary in which approximate values are introduced into the language scores assigned to its nodes in order to reduce memory consumption, and to a method for creating such a dictionary.

A large-vocabulary speech recognition system calculates language scores using a probabilistic language model, either during speech recognition or during verification after recognition. A probabilistic language model models a natural language by the probabilities with which word strings, character strings, and the like occur in that language. A language score is the likelihood of a word string or the like obtained as a recognition result, calculated according to the language model.

A model that represents a natural language by the probability that a string of N words or characters is generated is called an N-gram model. The cases N = 1, 2, and 3 are called unigram, bigram, and trigram, respectively. Bigrams or trigrams are often used in view of the computational cost and accuracy of parameter estimation. The following description concerns word strings.
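As an illustration of the unigram and bigram cases, the following minimal sketch estimates both from a toy corpus by relative frequency. The corpus, the function names, and the absence of smoothing are all assumptions made for this example; the document does not specify how the language model is estimated.

```python
from collections import Counter

def train_ngrams(sentences):
    """Estimate unigram and bigram probabilities by relative frequency.

    No smoothing is applied (an assumption of this sketch), so unseen
    events get probability 0.
    """
    unigram_counts = Counter()
    bigram_counts = Counter()
    history_counts = Counter()  # how often a word occurs as a predecessor
    for words in sentences:
        unigram_counts.update(words)
        for prev, word in zip(words, words[1:]):
            bigram_counts[(prev, word)] += 1
            history_counts[prev] += 1
    total = sum(unigram_counts.values())

    def unigram_prob(word):
        # p(word)
        return unigram_counts[word] / total

    def bigram_prob(word, prev):
        # p(word | prev)
        if history_counts[prev] == 0:
            return 0.0
        return bigram_counts[(prev, word)] / history_counts[prev]

    return unigram_prob, bigram_prob

# Hypothetical toy corpus (romanized words, already tokenized).
CORPUS = [
    ["sora", "ga", "aoi"],
    ["sora", "ga", "akarui"],
    ["hana", "ga", "akai"],
]
unigram_prob, bigram_prob = train_ngrams(CORPUS)
```

Here `bigram_prob(word, prev)` plays the role of the conditional probability of a word given its preceding word, which is the quantity stored in the bigram language score tables discussed in this description.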

To calculate the language scores of recognition candidates, a word dictionary annotated with precomputed language scores is often prepared. To make the search for recognition candidates more efficient, the word dictionary is generally represented as a tree-structured network.

An outline of the tree structure dictionary is described with reference to FIG. 1. Assume that the acoustic model is a monophone model and that the vocabulary consists only of "akai" (red), "akarui" (bright), and "aoi" (blue). The phoneme strings 20, 22, and 24 of these words are shown in the upper part of FIG. 1. As is apparent from the figure, the word-initial sound "a" 30 is shared by all three words, and the phoneme string "aka" 32, which includes the two sounds that follow, is shared by two of the words.

The lower part of FIG. 1 shows a tree structure dictionary 40 corresponding to this group of words. As shown in the figure, the tree structure dictionary 40 uses shared nodes for the phonemes that words have in common from their beginnings, and branches into separate nodes at the point where the words no longer share a phoneme. For example, since the three words above share the phoneme "a", they share the head node 50 of the tree structure dictionary 40. At the next phoneme, however, the tree branches into two nodes, 52 (k) and 62 (o). The rest of the tree is structured in the same way. As a result, the word "akarui" is found by following the node sequence 50, 52, 54, 56, 58, 60, the word "akai" by following the node sequence 50, 52, 54, 64, and the word "aoi" by following the node sequence 50, 62, 66.
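The structure of FIG. 1 can be sketched as a prefix tree over phonemes. The code below is a minimal illustration, not the implementation described in this document: the class and function names are hypothetical, an artificial root is placed above the head node "a" for convenience, and it is assumed that no word's phoneme string is a prefix of another's (so every word ends at a node without children).

```python
class Node:
    """One node of the tree structure dictionary (one phoneme per node)."""

    def __init__(self, phoneme=None):
        self.phoneme = phoneme
        self.children = {}   # phoneme -> Node
        self.word = None     # set only on terminal nodes

def build_tree_dictionary(lexicon):
    """Build a phoneme prefix tree from {word: [phoneme, ...]}."""
    root = Node()  # artificial root above the head node(s)
    for word, phonemes in lexicon.items():
        node = root
        for ph in phonemes:
            if ph not in node.children:
                node.children[ph] = Node(ph)
            node = node.children[ph]
        node.word = word
    return root

def words_below(node):
    """Return the words (terminal nodes) that share this node."""
    if not node.children:
        return [node.word]
    words = []
    for child in node.children.values():
        words.extend(words_below(child))
    return words

def count_nodes(node):
    """Count this node and all of its descendants."""
    return 1 + sum(count_nodes(c) for c in node.children.values())

# The three-word vocabulary of FIG. 1, as monophone strings.
LEXICON = {
    "akai": ["a", "k", "a", "i"],
    "akarui": ["a", "k", "a", "r", "u", "i"],
    "aoi": ["a", "o", "i"],
}
root = build_tree_dictionary(LEXICON)
head = root.children["a"]                   # corresponds to node 50
aka = head.children["k"].children["a"]      # corresponds to node 54
```

With this lexicon the tree below `head` has nine nodes, matching nodes 50 through 66 of FIG. 1, and `words_below(aka)` yields the two words that share the phoneme string "aka".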

The above is the basic structure of the tree structure dictionary.

Each terminal node of the tree structure dictionary (nodes 60, 64, and 66 in FIG. 1) is given a bigram language score table for its word (language score tables 70, 72, and 74 in the example of FIG. 1). For every word that may precede it, this table holds the probability that the word of the terminal node occurs next. The language score table therefore has as many entries as there are words in the language model. In other words, the size of the language score table depends on the vocabulary size.

In speech recognition, hypotheses are expanded from the head node of the tree structure dictionary toward the terminal nodes in synchronization with the speech recognition. A word hypothesis that reaches a terminal node is registered in the word graph, and the search for the following word hypothesis starts again from the head node of the dictionary.

When searching the tree structure dictionary, exploring every possible path makes the number of searches large. Pruning based on scores is therefore necessary. For this purpose, each node other than a terminal node (for example, nodes 52, 54, 56, 58, and 62 in the example of FIG. 1) is given a language score table (tables 80, 82, 84, 86, and 88 in FIG. 1) that holds, for every preceding word, the maximum likelihood among the words (terminal nodes) sharing that node. If the vocabulary size is N, for example, this language score table has N entries. This is shown as language score table 120 in FIG. 3.

The language score table 120 shown in FIG. 3 is the table to be assigned to a node s when, for example, the node s is shared by a plurality of words w_i. If S denotes the set of words sharing the node s, this can be written w_i ∈ S. As shown in FIG. 3, the language score table 120 contains N entries. The j-th entry (j = 1 to N) is the maximum value max_{w_i ∈ S} p(w_i | w_pj) of the conditional occurrence probabilities p(w_i | w_pj) of the words w_i (w_i ∈ S), calculated for each of the preceding words w_p1 to w_pN.
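The entry-wise maximum that defines this table can be illustrated in a few lines. The function name and the probability values below are hypothetical and chosen only for the example; each inner list is the bigram table of one word w_i in S, indexed by the preceding words w_p1 to w_pN in a fixed order.

```python
def shared_node_table(word_tables):
    """Return the table for a shared node s.

    word_tables holds one bigram table per word w_i in S, each with
    N entries p(w_i | w_pj). Entry j of the result is
    max over w_i in S of p(w_i | w_pj).
    """
    return [max(column) for column in zip(*word_tables)]

# Hypothetical bigram tables for two words sharing a node (N = 3).
table_akai = [0.10, 0.02, 0.30]
table_akarui = [0.05, 0.20, 0.10]
table_s = shared_node_table([table_akai, table_akarui])
```

The result has the same number of entries as each word's table, which is why an unapproximated table at every shared node costs memory proportional to the vocabulary size.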

During the search, when branching at a node, the language score table of the destination node is consulted to update the node's score, and pruning is performed by not searching any further along branches whose score is smaller than a predetermined value.

Repeating the process of finding the maximum language score frequently would require an enormous amount of computation, so the maxima are computed in advance for all preceding words and stored as a table. This is the language score table described above. As noted, its size depends on the vocabulary size N. Nodes shared by different sets of words must be given language score tables with different contents. A language score table therefore has to be prepared for every node whose set of sharing words differs.

However, the tree structure dictionary has an enormous number of nodes and the vocabulary is large, so building these tables requires so much memory that it can be called practically impossible. The memory required for the tables must therefore be reduced. A technique for doing so is disclosed in Non-Patent Document 1, listed below.

The method disclosed in Non-Patent Document 1 approximates the language score tables described above using the depth from the head node of the tree structure dictionary as the criterion. Specifically, a threshold is set. Shared nodes whose depth from the head node is at or below the threshold are given the language score table described above without approximation, while nodes deeper than the threshold are given a predetermined approximation instead. Two approximation methods are proposed: approximating the N entries by the maximum unigram value (a scalar) among the terminal nodes sharing the node, and building the language score table from a class bigram instead of a word bigram.

When the language score table is approximated by a unigram, a table with N entries is replaced by a single scalar value, so the storage requirement is greatly reduced. Approximating with a class bigram does not reduce storage as much as a unigram does, but has the advantage that accuracy does not degrade as much.

Non-Patent Document 1: Kokubo and three others, "Compacting the factoring table and garbage collection of unnecessary word hypotheses in a continuous speech recognition system," IEICE Transactions D-II, Vol. 86, No. 6, pp. 787-795, 2003.

The method disclosed in Non-Patent Document 1 reduces memory consumption efficiently. However, when approximating the language scores causes a search error, the size of the loss caused by that error cannot be predicted. In some cases this may substantially degrade recognition performance.

Accordingly, an object of the present invention is to provide a storage medium on which is recorded a tree structure dictionary that can suppress the occurrence of search errors while efficiently reducing the size of its language score tables, and a language score table creation program for a tree structure dictionary.

The storage medium according to the first aspect of the present invention is a storage medium on which is recorded a tree structure dictionary composed of a plurality of non-terminal nodes, which have child nodes, and a plurality of terminal nodes, which have no child nodes and each correspond to a word. Each of the non-terminal nodes corresponds to a predetermined phoneme. The tree structure dictionary is configured so that, by following child nodes from its head node, every terminal node corresponding to a word whose phoneme string includes the phoneme of a given non-terminal node can be reached via that non-terminal node. Each terminal node is given a first language score table for its word, based on a predetermined language model. Each non-terminal node is given a second language score table created by taking the maximum values over the language score tables of all terminal nodes reachable from that non-terminal node by following child nodes. For a non-terminal node from which the number of terminal nodes reachable by following child nodes is less than a predetermined threshold, the second language score table is replaced with language score information simpler than the second language score table.

Preferably, the predetermined language model is a bigram language model.

More preferably, the simple language score information is the maximum of the unigram scores of all terminal nodes reachable from the non-terminal node. A class bigram score may be used instead of the unigram score.

The second language score table may have the same number of entries as the first language score table, with each entry of the second language score table being the maximum of the corresponding entries of the first language score tables of the terminal nodes reachable from the non-terminal node by following child nodes.

The language score table creation program according to the second aspect of the present invention creates the language score tables assigned to a plurality of non-terminal nodes in a tree structure dictionary composed of a plurality of non-terminal nodes, which have child nodes, and a plurality of terminal nodes, which have no child nodes and each correspond to a word. Each of the non-terminal nodes corresponds to a predetermined phoneme. The tree structure dictionary is configured so that, by following child nodes from the head node of the tree, every terminal node corresponding to a word whose phoneme string includes the phoneme of a given non-terminal node can be reached via that non-terminal node. Each terminal node is given a first language score table for its word, based on a predetermined language model. The language score table creation program includes: a program part for calculating, for each non-terminal node, the number of terminal nodes reachable from that node by following child nodes; a program part that, at each non-terminal node, in response to a request for a language score table from the node's parent, requests the language score table of each of the node's child nodes; a program part that, at each non-terminal node, creates the node's second language score table from the language score tables returned by its child nodes as a result of executing the requesting program part; a program part that returns the second language score table so created to the parent node; a program part that determines whether the number of terminal nodes reachable from the node by following child nodes satisfies a predetermined condition with respect to a predetermined threshold; a program part that, in response to a determination that the condition is satisfied, stores the second language score table created for the node in a storage device as the node's language score table; a program part that, in response to a determination that the condition is not satisfied, requests simple language score information from the node's child nodes; a program part that creates simple language score information for the node from the simple language score information returned by the child nodes in response to that request, and stores it in the storage device; and a program part that, in response to a request for simple language score information from the parent node, returns the simple language score information stored in the storage device to the parent node.

More preferably, the simple language score information is the maximum of the unigram scores of the terminal nodes reachable from each non-terminal node by following child nodes. A class bigram score may be used instead of the unigram score.

The second language score table may also have the same number of entries as the first language score table, with each entry of the second language score table being the maximum of the corresponding entries of the first language score tables of the terminal nodes reachable from the non-terminal node by following child nodes. In this case, the program part that creates the second language score table of a non-terminal node may include a program part that, at each non-terminal node, takes the maximum of each entry over the language score tables returned by the node's child nodes as a result of executing the requesting program part, and thereby creates the node's second language score table.

The loss caused by a search error resulting from approximating the language scores grows with the number of terminal nodes sharing the node. In this embodiment, therefore, the language score tables are approximated based on the number of words sharing each node, rather than on the depth from the head node as in the method disclosed in Non-Patent Document 1. This idea is described with reference to FIG. 2.

FIG. 2 shows an example of a tree structure dictionary 100. Referring to FIG. 2, the tree structure dictionary 100 has terminal nodes W_1 to W_14 (hereinafter called "words" for simplicity) and a plurality of nodes shared by these words (for example, nodes S_A and S_B). Node S_A is shared by the five words W_1 to W_5. Node S_B is shared by the two words W_13 and W_14. In this case, of nodes S_A and S_B, node S_B is approximated preferentially.

By deciding whether to approximate the language likelihood according to the number of words sharing a node in this way, the size of the language score tables for the tree structure dictionary can be reduced efficiently while search errors are suppressed.

The tree structure dictionary creation system according to this embodiment is realized by computer hardware, a program executed by that hardware, and data stored in that hardware. FIG. 4 shows the external appearance of the computer system 330, and FIG. 5 shows its internal configuration.

Referring to FIG. 4, the computer system 330 includes a computer 340 having an FD (flexible disk) drive 352 and a CD-ROM (compact disc read-only memory) drive 350, a keyboard 346, a mouse 348, and a monitor 342.

Referring to FIG. 5, in addition to the FD drive 352 and the CD-ROM drive 350, the computer 340 includes a CPU (central processing unit) 356; a bus 366 connected to the CPU 356, the FD drive 352, and the CD-ROM drive 350; a read-only memory (ROM) 358 that stores a boot-up program and the like; and a random access memory (RAM) 360 that is connected to the bus 366 and stores program instructions, system programs, working data, and the like. The computer system 330 further includes a printer 344.

Although not shown here, the computer 340 may further include a network adapter board that provides a connection to a local area network (LAN).

The computer program that causes the computer system 330 to carry out the tree structure dictionary creation method of this embodiment is stored on a CD-ROM 362 or FD 364 inserted into the CD-ROM drive 350 or the FD drive 352, and is then transferred to a hard disk 354. Alternatively, the program may be transmitted to the computer 340 over a network (not shown) and stored on the hard disk 354. The program is loaded into the RAM 360 when it is executed. The program may also be loaded into the RAM 360 directly from the CD-ROM 362, from the FD 364, or over the network.

This program includes a plurality of instructions that cause the computer 340 to carry out the tree structure dictionary creation method of this embodiment. Some of the basic functions needed to perform the method are provided by the operating system (OS) running on the computer 340, by third-party programs, or by modules of various toolkits installed on the computer 340. The program therefore need not include every function required to realize the system and method of this embodiment. It only needs to include those instructions that realize the tree structure dictionary creation method described above by calling the appropriate functions or "tools" in a controlled manner so as to obtain the desired result. Since the operation of the computer system 330 is well known, its details are not repeated here.

In the following, the tree structure dictionary itself is assumed to have been created already, and the computer programs that implement the processing for creating the language score tables are described. All of these programs are provided as methods of a node class.

Three programs are needed. The first is a program by which each node asks its child nodes how many words (terminal nodes) exist below it. This program is executed recursively. The second is processing by which each node determines the language score table to be stored in that node. This requires querying the language score tables of the child nodes, so this program is also executed recursively. The third is executed in connection with the second program: when the parent node requests the approximated language score, the node returns the approximated language score stored in it to the parent. Since the third process is easy to implement, only the control structures of the programs (methods) implementing the first and second processes are described below.

FIG. 6 shows the control structure of the program by which each node (other than a terminal node) finds out how many terminal nodes exist below it (the number of words sharing the node) by querying its child nodes. Referring to FIG. 6, the program includes step 400 of asking every child node of the node for the number of shared words below that child, and step 402 of summing the numbers of shared words obtained from all the child nodes in step 400 and saving the sum as an attribute of the node. After step 402, the process ends, returning the total number of shared words below the node. When a child node is asked for its number of shared words in step 400, it executes this same process of FIG. 6. That is, the process is executed recursively.

A terminal node has no child nodes below it. When a terminal node is asked by its parent for the number of shared words, it simply returns "1". The processing for terminal nodes is therefore not illustrated here.

Through the above processing, the number of terminal nodes existing below each node in the tree structure dictionary is determined and held as an attribute value of that node.
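The counting process of FIG. 6 can be sketched as a recursive method of a node class. This is a minimal illustration under stated assumptions: the class, method, and attribute names are hypothetical, and storing the value 1 on terminal nodes is a convenience of this sketch (the description only says that a terminal node returns 1).

```python
class Node:
    """Node of a tree structure dictionary; a node without children is terminal."""

    def __init__(self, children=()):
        self.children = list(children)
        self.shared_word_count = None  # filled in by count_shared_words()

    def count_shared_words(self):
        """Return the number of terminal nodes at or below this node."""
        if not self.children:
            # Terminal node: simply answer 1.
            self.shared_word_count = 1
            return 1
        # Step 400: ask every child for the number of shared words below it.
        counts = [child.count_shared_words() for child in self.children]
        # Step 402: sum the answers and save the total as an attribute.
        self.shared_word_count = sum(counts)
        return self.shared_word_count

# The tree of FIG. 1: a -> k -> a -> {i, r -> u -> i}, and a -> o -> i.
akai_end = Node()
akarui_end = Node([])
aoi_end = Node()
node_aka = Node([akai_end, Node([Node([akarui_end])])])  # second "a" (node 54)
node_o = Node([aoi_end])                                 # "o" (node 62)
head = Node([Node([node_aka]), node_o])                  # "a" (node 50)
total = head.count_shared_words()
```

A single call on the head node fills in the count for every node in the tree, because each call stores its own total before returning it to its parent.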

FIG. 7 is a flowchart of the processing executed at each node (other than a terminal node) when the node receives a request for its language score table from its parent node. First, in step 440, the node asks each of its immediate child nodes for that child's language score table. In step 442, the node creates its own language score table from the language score tables returned by the child nodes in step 440.

Here, as shown in FIG. 3, the maximum language score among the terminal nodes below the node is found for each preceding word. One language score table entry is thus created per preceding word. If the vocabulary size is N, the language score table holds N language scores.

Next, in step 444, it is determined whether the number of shared words below the node, as calculated by the process of FIG. 6, is equal to or greater than a predetermined threshold. If it is, the process proceeds to step 446, where the language score table created in step 442 is saved as is as the node's language score table. If the number of shared words is less than the threshold, an approximated score (a scalar) is requested from every child node of the node in step 448. In step 450, the maximum of the approximate scores obtained from all the child nodes in step 448 is stored as the node's approximate score.

After step 450, the process ends, returning the language score table created in step 442 as its return value.

In this embodiment, unigram scores are used as the approximate scores. Each terminal node stores not only the bigram language score table described above but also a unigram score (a scalar). Because the processing shown in FIG. 7 ultimately issues a request for an approximate score to each terminal node, each terminal node simply returns its own unigram score to its parent node.

A node (other than a terminal node) that receives the request for an approximated score issued in step 448 returns its stored approximate score to the parent node if it has already completed the processing of FIG. 7 and, within it, calculated and saved an approximated score in step 450. If it has no stored approximate score, it performs steps 448 and 450 to calculate and save one, and then returns it to the parent node.

A program having the control structure described above can assign either a language score table or an approximated language score (unigram) to each node of the tree structure dictionary.
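The control structure of FIG. 7 can be sketched roughly as below. This is an illustrative reading of steps 440 to 450 only: the `Node` class, its field names, the function names, and the sample scores are assumptions, and the shared-word counts are filled in by a small helper standing in for the processing of FIG. 6.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    children: List["Node"] = field(default_factory=list)
    bigram: Optional[Dict[str, float]] = None  # full table, if one is kept
    unigram: Optional[float] = None            # set on terminal nodes only
    approx: Optional[float] = None             # approximate score, if one is kept
    shared_words: int = 0


def count_shared_words(node: Node) -> int:
    """Stand-in for FIG. 6: number of terminal nodes below each node."""
    node.shared_words = sum(count_shared_words(c) for c in node.children) or 1
    return node.shared_words


def approx_score(node: Node) -> float:
    """Answer a request for an approximate score (steps 448 and 450)."""
    if not node.children:
        return node.unigram                    # terminal: own unigram score
    if node.approx is None:                    # not yet computed: compute and save
        node.approx = max(approx_score(c) for c in node.children)
    return node.approx


def build_table(node: Node, threshold: int) -> Dict[str, float]:
    """Answer a request for a language score table (FIG. 7)."""
    if not node.children:
        return node.bigram                     # terminal: table given in advance
    child_tables = [build_table(c, threshold) for c in node.children]  # step 440
    table: Dict[str, float] = {}
    for child_table in child_tables:           # step 442: per-word maximum
        for prev_word, score in child_table.items():
            table[prev_word] = max(score, table.get(prev_word, float("-inf")))
    if node.shared_words >= threshold:         # step 444
        node.bigram = table                    # step 446: keep the full table
    else:
        node.approx = max(approx_score(c) for c in node.children)  # steps 448, 450
    return table                               # returned to the parent either way


# Hypothetical three-word dictionary: root -> (b -> w1, w2), w3.
w1 = Node(bigram={"x": -1.0, "y": -4.0}, unigram=-2.0)
w2 = Node(bigram={"x": -3.0, "y": -2.0}, unigram=-1.5)
w3 = Node(bigram={"x": -2.5, "y": -0.5}, unigram=-3.0)
b = Node(children=[w1, w2])
root = Node(children=[b, w3])
count_shared_words(root)
build_table(root, threshold=3)
```

In this example the root shares three words and keeps its full table, while node `b` shares only two and keeps a single scalar. Note that the full table is still passed upward from `b`, so the parent's table is exact even when the child stores only an approximation.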

The tree structure creation system according to the embodiment described above operates as follows. First, a tree structure is created in the same way as in the prior art. Each terminal node is given a language score table calculated in advance.

First, the head node of each tree structure is queried for the number of shared words below it. The head node executes the processing shown in FIG. 6. That is, on receiving this query, the head node queries each of its immediate child nodes for the number of shared words below them. Each queried child node in turn queries its own immediate child nodes for their numbers of shared words. The query is thus executed recursively by each node until it finally reaches the terminal nodes. Each terminal node returns "1" in response. A parent node that receives these replies calculates and stores the sum of the values returned from its child nodes, and returns that sum to its own parent node. In this way the number of shared words is passed upward from the lowest nodes, layer by layer, until the head node finally receives the total number of shared words below it. In the course of this process, each node stores the number of shared words below it through the processing of step 402 in FIG. 6.
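The recursive count described above can be sketched as follows. The `Node` class, its `shared_words` attribute, and the sample tree are assumptions made for illustration; storing the value 1 on terminal nodes is also an assumption, since the text only says that terminal nodes return "1".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    children: List["Node"] = field(default_factory=list)
    shared_words: int = 0  # number of terminal nodes at or below this node


def count_shared_words(node: Node) -> int:
    """Return the number of terminal nodes below `node`, storing it on the way up."""
    if not node.children:
        node.shared_words = 1  # a terminal node answers "1"
    else:
        # A non-terminal node sums the answers of its immediate children.
        node.shared_words = sum(count_shared_words(c) for c in node.children)
    return node.shared_words


# Hypothetical tree: the head node has one inner child with two words, plus one word.
inner = Node(children=[Node(), Node()])
head = Node(children=[inner, Node()])
total = count_shared_words(head)
```

A single query at the head node is enough to populate every node, which is what allows the later threshold test to be made without any further traversal.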

Next, the head node is queried for its language score table. The head node executes the processing shown in FIG. 7. That is, on receiving this query, the head node requests a language score table from each of its immediate child nodes (step 440 in FIG. 7). Those child nodes in turn request a language score table from each of their own immediate child nodes. This is repeated until the request finally reaches the terminal nodes.

Each terminal node that receives this request returns the language score table it holds to its parent node. A parent node that has received language score tables from all of its child nodes creates its own language score table from them, as shown in step 442 of FIG. 7. As shown in FIG. 3, this language score table holds, for each preceding word, the maximum of the language scores obtained from the child nodes.

The parent node then determines whether the number of shared words below it is equal to or greater than the threshold (step 444). This determination is possible because the number of shared words below the node has already been calculated in step 402 of FIG. 6. If the number of shared words is equal to or greater than the threshold, the node saves the table created in step 442 as its own language score table and ends the processing. Otherwise, in step 448, it requests an approximated score from all of its immediate child nodes. If an immediate child node is a terminal node, that terminal node returns the unigram score of its corresponding word to the parent node. A parent node that has received unigram scores from all of its child nodes stores the maximum of them as its own approximate score and ends the processing.

When the number of shared words is determined to be less than the threshold in step 444 of FIG. 7, the same determination is also made at the nodes below. Therefore, by the time a node executes step 448, its child nodes have already completed steps 448 and 450, namely at the point when the node queried all of its child nodes for their language score tables in step 440. Consequently, when a node executes step 448, the nodes immediately below it return their approximated scores at once, and step 450 can be executed.

In this way, language score tables are created and approximated recursively for each node, starting from the head node. Through the processing from step 444 of FIG. 7 onward, nodes whose number of shared words is equal to or greater than the threshold hold an unapproximated language score table, and all other nodes hold an approximated score. The size of the tree structure dictionary is therefore reduced, and so is the memory capacity required for speech recognition. Moreover, the language score table is approximated only when the number of shared words of the node is less than the threshold. The effect of search errors caused by the approximation therefore stays within a limited range, and there is little risk that speech recognition accuracy will deteriorate unexpectedly.

For example, in the example shown in FIG. 2, for any threshold value from 3 to 5, the language score table is not approximated for node S_A and is approximated only for node S_B.

In the embodiment described above, unigram scores are used as the approximate scores. However, the present invention is not limited to such an embodiment. For example, as also proposed in Non-Patent Document 1, a class bigram may be used as the approximation.

In the embodiment described above, the language score table is based on bigram language scores. In principle, however, the embodiment is not limited to bigrams. For example, the language score table could be a table of trigram scores, approximated by a bigram language score table or by unigram scores.

The embodiment disclosed herein is merely illustrative, and the present invention is not limited to the embodiment described above. The scope of the present invention is indicated by each claim, read in light of the detailed description of the invention, and includes all modifications within the meaning and scope equivalent to the wording of the claims.

FIG. 1 is a schematic diagram for explaining the concept of a tree structure dictionary.
FIG. 2 is a diagram for explaining the approximation of language score tables in a tree structure dictionary.
FIG. 3 is a diagram schematically showing the structure of a bigram language score table.
FIG. 4 is an external view of a computer.
FIG. 5 is a block diagram of the computer shown in FIG. 4.
FIG. 6 is a flowchart showing the control structure of the program (method) executed at each node when it receives a query about the number of shared words below the node.
FIG. 7 is a flowchart showing the control structure of the program (method) executed at each node when it receives a query about its language score table from its parent node.

Explanation of symbols

40, 100: tree structure dictionary; 50, 52, 54, 56, 58, 62, S_A, S_B: nodes; 60, 64, 66, W_1 to W_14: terminal nodes; 70, 72, 74, 80, 82, 84, 86, 88, 120: language score tables

Claims

A storage medium on which is recorded a computer-readable tree structure dictionary composed of a plurality of non-terminal nodes having child nodes and a plurality of terminal nodes that have no child nodes and each correspond to a word, wherein each of the plurality of non-terminal nodes corresponds to a predetermined phoneme,
the tree structure dictionary is configured such that, by following child nodes from the head node of the tree structure dictionary, all of the terminal nodes corresponding to words that have the phoneme corresponding to a non-terminal node as part of their phoneme sequence can be reached via that non-terminal node,
each of the plurality of terminal nodes is provided with a first language score table of the corresponding word according to a predetermined language model,
each of the plurality of non-terminal nodes is provided with a second language score table created by taking the maximum values over the language score tables of all the terminal nodes that can be reached by following child nodes from that non-terminal node, and
the second language score table of a non-terminal node is replaced with simpler language score information when the number of terminal nodes that can be reached by following child nodes from that non-terminal node is less than a predetermined threshold value.

The storage medium according to claim 1, wherein the predetermined language model is a bigram language model.

The storage medium according to claim 1, wherein the simple language score information is a maximum value of a unigram score for all terminal nodes that can be reached from the non-terminal node.

The storage medium according to claim 1, wherein the simple language score information is a maximum value of a class bigram score for all terminal nodes that can be reached from the non-terminal node.

The storage medium according to claim 1, wherein the second language score table has the same number of entries as the first language score table, and each entry of the second language score table is the maximum value of the corresponding entries in the first language score tables of the terminal nodes that can be reached by following child nodes from the non-terminal node.

A language score table creation program for creating language score tables to be assigned to a plurality of non-terminal nodes in a tree structure dictionary composed of a plurality of non-terminal nodes having child nodes and a plurality of terminal nodes that have no child nodes and each correspond to a word, wherein each of the plurality of non-terminal nodes corresponds to a predetermined phoneme,
the tree structure dictionary is configured such that, by following child nodes from the head node of the tree structure dictionary, all of the terminal nodes corresponding to words that have the phoneme corresponding to a non-terminal node as part of their phoneme sequence can be reached via that non-terminal node,
each of the plurality of terminal nodes is provided with a first language score table of the corresponding word according to a predetermined language model, and
the language score table creation program includes:
a program part that calculates, for each of the plurality of non-terminal nodes, the number of terminal nodes that can be reached by following child nodes from that non-terminal node;
a program part that, at each of the plurality of non-terminal nodes, in response to receiving a request for a language score table from the parent node of the non-terminal node, requests from each child node of the non-terminal node the language score table of that child node;
a program part that, at each of the plurality of non-terminal nodes, creates a second language score table of the non-terminal node based on the language score tables returned from the child nodes of the non-terminal node as a result of executing the program part that requests the language score tables;
a program part that returns the second language score table created by the program part that creates the language score table to the parent node;
a program part that determines whether the number of terminal nodes that can be reached by following child nodes from the non-terminal node satisfies a predetermined condition with respect to a predetermined threshold;
a program part that, in response to a determination that the condition is satisfied, causes a storage device to store the second language score table created for the non-terminal node as the language score table of the non-terminal node;
a program part that, in response to a determination that the condition is not satisfied, requests simple language score information from the child nodes of the non-terminal node;
a program part that, based on the simple language score information returned from the child nodes in response to that request, creates simple language score information for the non-terminal node and stores it in the storage device; and
a program part that, in response to receiving a request for simple language score information from the parent node, returns the simple language score information stored in the storage device to the parent node.

The language score table creation program according to claim 6, wherein the predetermined language model is a bigram language model.

The language score table creation program according to claim 6 or 7, wherein the simple language score information is the maximum value of the unigram scores of the terminal nodes that can be reached by following child nodes from each non-terminal node.

The language score table creation program according to claim 6 or 7, wherein the simple language score information is the maximum value of the class bigram scores of the terminal nodes that can be reached by following child nodes from each non-terminal node.

The language score table creation program, wherein the second language score table has the same number of entries as the first language score table, and each entry of the second language score table is the maximum value of the corresponding entries in the first language score tables of the terminal nodes that can be reached by following child nodes from the non-terminal node, and
the program part that creates the second language score table of the non-terminal node includes a program part that creates the second language score table of the non-terminal node by taking, for each entry, the maximum value over the language score tables returned from the child nodes of the non-terminal node as a result of executing, at each of the plurality of non-terminal nodes, the program part that requests the language score tables.