JPS6126095A

JPS6126095A - Automatic calculation of word-to-word distance

Info

Publication number: JPS6126095A
Application number: JP14718984A
Authority: JP
Inventors: 石垣　由里子; 佐藤　泰雄
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1984-07-16
Filing date: 1984-07-16
Publication date: 1986-02-05
Also published as: JPH0574838B2

Abstract

(57) [Summary] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a method for automatically calculating the distance between words, and is effective for evaluating in advance the suitability of the set of words to be recognized by a speech recognition device.

[Conventional technology]

Some speech recognition devices have a predetermined group of words that may be input by voice (referred to as the recognition target word set). To improve the recognition rate of input words in such a device, it is important that the word set contain no words with similar pronunciations (readings). The words in the word set are themselves determined by the purpose for which the device is used and are expected to be difficult to change, but changing the reading of a word causes no particular problem. The reading of each word in the word set should therefore be chosen so that no readings are confusable with one another.

In some cases it is easy to tell whether two readings are confusable.

For example, the numeral "7" can be read as either "shichi" or "nana". If it is pronounced "shichi", it is easily confused with the numeral "1", that is, "ichi", and the two are prone to misrecognition. If "7" is instead pronounced "nana", it is clearly distinguished from "ichi"; this is also known from experience. However, as the number of words to be recognized increases, it is no longer easy to tell which words are confusable with which, and there is the further problem that, when a confusable pair is found and one of its readings is replaced, the new reading may become confusable with some other word. Conventionally, therefore, for each word in the recognition target word set that has two or three possible readings, an appropriate one is selected and the readings of the word set are fixed; speech recognition is then actually performed to test whether misrecognition occurs, and if it does, the reading of the word in question is replaced and the test is repeated. An appropriate word set is obtained by this cut-and-try method. However, this method imposes a very large burden in time and labor.

It is therefore effective to check the suitability of the reading of each word in the word set in advance, while the words are still in written form, rather than by actually running a speech recognition test. Since confusion arises between any two words in the word set, it is effective to examine, for every word in the set, its degree of confusability with all of the remaining words, and to reject the word set if any pair of words (readings) carries a risk of misrecognition. The inter-word distance is a numerical measure of the similarity or dissimilarity between words.
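The screening described above can be sketched as a loop over every pair of readings. This is only an illustrative sketch: the function name `confusable_pairs`, the threshold, and the stand-in `toy_distance` are assumptions introduced here and are not taken from the patent, which obtains the distance by DP matching over syllables.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple


def confusable_pairs(
    readings: Sequence[str],
    distance: Callable[[str, str], float],
    threshold: float,
) -> List[Tuple[str, str, float]]:
    """Return every pair of readings whose distance is below the threshold."""
    flagged = []
    for a, b in combinations(readings, 2):  # each unordered pair exactly once
        d = distance(a, b)
        if d < threshold:
            flagged.append((a, b, d))
    return flagged


def toy_distance(a: str, b: str) -> float:
    """Hypothetical stand-in distance: differing kana plus length difference."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return float(mismatches + abs(len(a) - len(b)))


# "いち" (ichi) and "しち" (shichi) differ in one kana; "なな" (nana) differs in two.
risky = confusable_pairs(["いち", "しち", "なな"], toy_distance, threshold=2.0)
```

A word set would be rejected whenever the returned list is non-empty, and the reading of one word in each flagged pair would then be reconsidered.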

In the DP matching method (Velichko et al., Int. J. Man-Machine Studies, vol. 2, p. 223, 1970), this inter-word distance is obtained as the distance between character strings. In brief, the local distances between the syllables of the two words are accumulated. The syllables can be paired in various combinations, and different combinations give different cumulative values; the minimum of these values is taken as the inter-word distance.

[Problem that the invention seeks to solve]

However, this DP method does not reflect the number of syllables in each word. For example, suppose the distance between word A with two syllables and word B with three syllables is found to be equal to the distance between word C with two syllables and word D with two syllables. The pair A, B and the pair C, D would then be judged to have the same misrecognition rate, but experience suggests that the pair A, B, whose syllable counts differ, should have a higher recognition rate than the pair C, D, whose lengths are equal. Distance calculation by the conventional DP matching method does not reflect this point.

The present invention compensates for this shortcoming of the DP method and aims to make the advance evaluation of recognition target words more practical.

[Means for solving the problem]

The present invention is a method for calculating the distance between the words of a word set to be recognized by speech, characterized by comprising: a step of decomposing each word into individual syllables; a step of obtaining the distance between a word with M syllables and a word with N syllables by the DP matching method; and a step of obtaining the inter-word distance by multiplying the obtained distance by a normalization constant that applies a correction satisfying the following three conditions: (1) if two words have the same number of syllables, the distance is smaller than if they do not; (2) if two words have the same word length, the normalized distance is the distance to the end point obtained by the DP method divided by the word length; (3) the distance is always symmetric with respect to the two words. The configuration and operation are described in detail below with reference to an embodiment.

[Example]

FIG. 1 is a schematic diagram of a system in which two words A and B written in kana are decomposed into syllables by processing blocks 1 and 2, and the distance between the words is obtained by the DP method in processing block 3. Reference numeral 4 denotes the kana-to-syllable correspondence table used for the decomposition into syllables, and 5 denotes the phoneme distance matrix used for obtaining the distance.

The kana notation of words A and B consists of any one, or a combination of more than one, of the following: the 50 sounds of the kana syllabary (46 sounds), voiced sounds, nasal voiced sounds, semi-voiced sounds, geminate consonants, the syllabic nasal, contracted sounds, loanword sounds such as "スィ" and "ティ", and the long-vowel forms of these. Each syllable produced by processing blocks 1 and 2 consists of a consonant plus a vowel; the vowels are a, i, u, and so on, and the consonants are s, k, t, and so on. Part of the table of romanized kana notation is shown below.

Table 1

Processing block 3 obtains the distance between the two words A and B using a distance matrix between phonemes. Part of the distance matrix is shown in the following table.

FIG. 4 shows the processing flow from word input to distance output. Block 1A converts the input word A, written in hiragana, into a syllable string using the conversion table of Table 1. FIG. 5(a) is a flowchart showing the details of this conversion, and FIG. 5(b) is a specific example. Block 2A likewise applies the same processing to word B. In block 3A, the distance between the resulting syllable strings A and B is calculated using the inter-phoneme distance table of Table 2. Blocks 1A, 2A, and 3A correspond to blocks 1, 2, and 3 of FIG. 1.
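The conversion performed in blocks 1A and 2A can be sketched as a table lookup from kana to (consonant, vowel) pairs. Since Table 1 is not reproduced in this text, the table below is a small hypothetical excerpt: only the entries for "せ" (S, E) and "つ" (TS, U) follow the worked example given later in the description, and the remaining entries, the empty-string convention for a missing consonant, and the name `to_syllables` are assumptions.

```python
from typing import Dict, List, Tuple

Syllable = Tuple[str, str]  # (consonant, vowel)

# Hypothetical excerpt of a kana-to-syllable table; not the patent's Table 1.
KANA_TO_SYLLABLE: Dict[str, Syllable] = {
    "せ": ("S", "E"),   # as in the description's worked example
    "つ": ("TS", "U"),  # as in the description's worked example
    "な": ("N", "A"),   # assumed entry
    "い": ("", "I"),    # assumed entry: bare vowel, empty consonant slot
    "ち": ("CH", "I"),  # assumed entry
    "し": ("SH", "I"),  # assumed entry
}


def to_syllables(word: str) -> List[Syllable]:
    """Convert a hiragana word into a list of (consonant, vowel) syllables.

    This sketch handles only single-kana syllables; contracted sounds,
    geminates, and long vowels would need multi-character lookup.
    """
    syllables = []
    for kana in word:
        if kana not in KANA_TO_SYLLABLE:
            raise KeyError(f"no table entry for kana {kana!r}")
        syllables.append(KANA_TO_SYLLABLE[kana])
    return syllables
```

Keeping the consonant and vowel as separate fields lets the distance stage look each of them up in its own phoneme distance table.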

In the DP method of processing block 3, distances are obtained for all combinations of words, so no window is set. The local distance between each pair of single syllables is obtained, the distances between single syllables are accumulated from the start point, and the distance to the end point is obtained. Referring to FIG. 2, let the distance (local distance) between the i-th syllable of word A and the j-th syllable of word B be $c_{i,j}$ (not shown), and let the cumulative distance up to that point be $d'_{i,j}$. Then $d'_{i,j}$ is given by the following expression.

$$d'_{i,j} = \min\left(d'_{i-1,j},\ d'_{i-1,j-1},\ d'_{i,j-1}\right) + c_{i,j}$$

That is, the distance $d'_{i,j}$ up to point (i, j) is the smallest of the distances $d'_{i-1,j}$, $d'_{i-1,j-1}$, and $d'_{i,j-1}$ to the three preceding points, plus the local distance $c_{i,j}$. Increasing i and j until they reach M and N yields the distance between words A and B. Here M is the number of syllables in word A and N is the number of syllables in word B.
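The recurrence above can be written as a short dynamic-programming routine. This is a minimal sketch: the boundary handling (cumulative distance 0 before the first syllables, infinity elsewhere on the border) and the names `dp_distance` and `unit_local` are assumptions, since the text states only the recurrence itself.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def dp_distance(
    a: Sequence[T],
    b: Sequence[T],
    local: Callable[[T, T], float],
) -> float:
    """Cumulative DP distance d'[M][N] between syllable sequences a and b."""
    m, n = len(a), len(b)
    inf = float("inf")
    # d[i][j] is the cumulative distance up to the i-th syllable of a and
    # the j-th syllable of b (1-indexed); row and column 0 are the border.
    d: List[List[float]] = [[inf] * (n + 1) for _ in range(m + 1)]
    d[0][0] = 0.0  # assumed start condition
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best_previous = min(d[i - 1][j], d[i - 1][j - 1], d[i][j - 1])
            d[i][j] = best_previous + local(a[i - 1], b[j - 1])
    return d[m][n]


def unit_local(x: str, y: str) -> float:
    """Hypothetical local distance: 0 for identical syllables, 1 otherwise."""
    return 0.0 if x == y else 1.0
```

With this unit local distance, the sequences `["a", "b"]` and `["a", "b", "b"]` come out at distance 0 even though their lengths differ, which illustrates why the unnormalized DP distance does not reflect the number of syllables.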

As noted above, the inter-word distance obtained in this way by the DP method does not take the number of syllables of the words into account. In the present invention, therefore, once the absolute distance d' up to the final syllables M and N has been obtained, it is multiplied by a normalization constant K and the result is taken as the inter-word distance d, that is, d = K · d'. In the expression for K, α and β are weighting coefficients. This value of the normalization constant K was chosen in view of the following three points. (1) If two words have the same number of syllables, the distance is smaller than if they do not. (2) If two words have the same word length, the normalized distance is the distance to the end point obtained by the DP method divided by the word length. (3) The distance is always symmetric with respect to the two words (reversing their order gives the same result). When M + N = k (constant), the above expression K = (M − N)² ... satisfies conditions (1) to (3).
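The full expression for K does not survive in this text, so it cannot be quoted here. Purely as an illustration of what the three conditions require, the sketch below uses a hypothetical constant, K = (α(M − N)² + β) / (β · (M + N)/2). This form is an assumption made for this example and should not be read as the patent's formula; it was chosen only because it is symmetric in M and N, reduces to 1/M when M = N, and grows with (M − N)² when M + N is fixed.

```python
def normalization_constant(
    m: int, n: int, alpha: float = 1.0, beta: float = 1.0
) -> float:
    """Hypothetical normalization constant K (NOT the patent's expression).

    K = (alpha * (m - n)**2 + beta) / (beta * (m + n) / 2)
    """
    if m < 1 or n < 1:
        raise ValueError("syllable counts must be at least 1")
    if beta <= 0:
        raise ValueError("beta must be positive")
    mean_length = (m + n) / 2.0
    return (alpha * (m - n) ** 2 + beta) / (beta * mean_length)


def normalized_distance(
    d_abs: float, m: int, n: int, alpha: float = 1.0, beta: float = 1.0
) -> float:
    """Inter-word distance d = K * d', with d' the absolute DP distance."""
    return normalization_constant(m, n, alpha, beta) * d_abs
```

Under this assumed form, two words with two syllables each give K = 0.5, while words with one and three syllables (the same total) give K = 2.5, so the same absolute DP distance yields a larger inter-word distance for the pair with unequal syllable counts.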

A specific example is described below. FIG. 3 shows the flow of processing when the distance between "せ" (se) and "つ" (tsu) is obtained.

The DP-method inter-word distance between "せ" and "つ" is obtained as the sum of the distances between their phonemes. Suppose the distance between consonants S and TS is d_c(S, TS) = 0.5 and the distance between vowels E and U is d_v(E, U) = 500.0. The absolute distance between "せ" and "つ" is then d'_{1,1} = d_c(S, TS) + d_v(E, U) = 500.5. Consonant and vowel distances such as 0.5 and 500 are set from the linguistically known degrees of similarity and dissimilarity between the individual consonants and vowels. The distance between identical consonants, or between identical vowels, is 0. The present invention then performs normalization. Since M = 1 and N = 1, the normalization constant K is 1 when α = 1 and β = 1, and the normalized distance is d = 1 · 500.5 = 500.5.
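The arithmetic of this example can be checked with a few lines of code. The two distance values (0.5 and 500.0) and the value K = 1 are those stated in the description; the table layout, the use of `frozenset` keys, and the function names are assumptions made for this sketch.

```python
from typing import Dict, FrozenSet, Tuple

Syllable = Tuple[str, str]  # (consonant, vowel)

# Phoneme distances stated in the description's example.
CONSONANT_DISTANCE: Dict[FrozenSet[str], float] = {frozenset({"S", "TS"}): 0.5}
VOWEL_DISTANCE: Dict[FrozenSet[str], float] = {frozenset({"E", "U"}): 500.0}


def phoneme_distance(table: Dict[FrozenSet[str], float], p: str, q: str) -> float:
    """Look up a symmetric phoneme distance; identical phonemes are at 0."""
    if p == q:
        return 0.0
    return table[frozenset({p, q})]


def syllable_distance(s1: Syllable, s2: Syllable) -> float:
    """Local distance: consonant distance plus vowel distance."""
    (c1, v1), (c2, v2) = s1, s2
    return (phoneme_distance(CONSONANT_DISTANCE, c1, c2)
            + phoneme_distance(VOWEL_DISTANCE, v1, v2))


se: Syllable = ("S", "E")    # "せ"
tsu: Syllable = ("TS", "U")  # "つ"

d_abs = syllable_distance(se, tsu)  # absolute distance d'_{1,1}
k = 1.0                             # K for M = N = 1, alpha = beta = 1 (from the text)
d = k * d_abs                       # normalized inter-word distance
```

Keying the tables on unordered pairs makes the lookup symmetric by construction, which matches the requirement that the distance not depend on the order of the two words.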

FIG. 6 shows the above inter-word distance system as a hardware image. Reference numeral 10 denotes an input unit to which word pairs are input; 12 denotes a conversion unit that converts character strings into syllable strings using the conversion table, that is, the kana-to-syllable correspondence table 18; 14 denotes a DP operation unit that performs the distance calculation using the distance matrix 20; and 16 denotes an output unit.

[Effect of the invention]

As described above, according to the present invention, differences in the number of syllables, which cannot be reflected by the DP method alone, can be incorporated into the inter-word distance, so that the readings of the words to be recognized by a speech recognition device can be selected so as to give a still higher recognition rate.

[Brief explanation of the drawing]

FIG. 1 is a processing block diagram of distance calculation by the DP method, FIG. 2 is an explanatory diagram of the DP method, FIG. 3 is an explanatory diagram showing a specific example of the present invention, FIGS. 4 and 5 are flowcharts showing the processing procedure of the present invention, and FIG. 6 is a block diagram showing the system configuration. In the drawings, 1 and 2 denote means for decomposing into syllables, and 3 denotes means for calculating distance by the DP method.

Claims

[Claims] A method for automatically calculating the distance between words, for calculating the distance between the words of a word set to be recognized by speech, the method comprising: a step of decomposing each word into individual syllables; a step of obtaining the distance between a word with M syllables and a word with N syllables by the DP matching method; and a step of obtaining the inter-word distance by multiplying the obtained distance by a normalization constant that applies a correction satisfying three conditions: (1) if two words have the same number of syllables, the distance is smaller than if they do not; (2) if two words have the same word length, the normalized distance is the distance to the end point obtained by the DP method divided by the word length; and (3) the distance is always symmetric with respect to the two words.