JPS59157699A

JPS59157699A - Voice registration system

Info

Publication number: JPS59157699A
Application number: JP58031550A
Authority: JP
Inventors: 厚夫田中; 徹上田
Original assignee: Computer Basic Technology Research Association Corp
Current assignee: Computer Basic Technology Research Association Corp
Priority date: 1983-02-25
Filing date: 1983-02-25
Publication date: 1984-09-07
Also published as: JPH0160159B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION <Technical Field> The present invention relates to an improvement in a voice registration method for a voice input device.

<Background Art> In general, when large-vocabulary speech recognition is performed by uttering and registering speech word by word, as in conventional practice, the speaker must utter an enormous number of words, which takes considerable effort. In addition, as the vocabulary grows, confusion between words becomes pronounced and the approach becomes impractical.

Therefore, for large-vocabulary speech recognition, it is necessary to analyze speech in detail and extract its features as fully as possible.

It has therefore been proposed to divide speech not into words but into shorter units of time, such as phonemes or syllables, and to perform identification on those units. Moreover, since there are only tens to at most a few hundred kinds of phonemes or syllables, large-vocabulary speech recognition becomes possible with a small number of registered utterances, which is considered effective.

However, each phoneme or syllable undergoes a wide variety of deformations owing to utterance-to-utterance variation, differences in manner of speaking, the influence of preceding and following sounds, and so on, so registration must take these influences into account in advance.

Consequently, a large number of pre-deformed utterances must be registered for a single phoneme or syllable. This increases both the processing load and the number of utterances to register, so the advantage of identifying at the phoneme or syllable level is lost.

On the other hand, phonemes and syllables do not all appear with the same frequency, nor are their identification rates all the same.

Phonemes and syllables that appear frequently and have a low identification rate must be handled more carefully, whereas those that appear extremely rarely need not receive additional processing.

By registering or re-registering phonemes and syllables in consideration of their appearance frequency and identification rate, it may be possible to perform identification with less processing while maintaining overall performance.

<Object> The present invention has been made from the above viewpoint, and its object is to provide a voice registration method that can maintain high recognition performance with a smaller number of registered utterances.

<Embodiment> The present invention is described in detail below with reference to the drawings. FIG. 1 is a block diagram showing an example of a voice input device implementing the voice registration method of the present invention. In FIG. 1, reference numeral 1 denotes a syllable speech identification unit. The syllable speech identification unit 1 recognizes monosyllables uttered in isolation and syllable portions extracted from continuous speech, and sends the syllable identification results to a judgment result processing unit 2. The syllable speech identification unit 1 has a memory 11 that stores standard patterns of syllables, and identifies syllables essentially by pattern matching between these standard patterns and an input pattern created from the detected input syllable speech.

The judgment result processing unit 2 has an identification result storage memory 21 that stores the identification results sent from the syllable speech identification unit 1. Based on the contents of the memory 21, it sends syllable codes to the display unit 3 in units of syllables, words, phrases, sentences, or the like. The display unit 3 displays these character strings in an appropriate format.

The operator looks at the identification results shown on the display unit 3 and finds which syllable was misidentified. Then, by a manual input operation using the manual switch, keyboard, light pen, or the like that constitutes the misidentification indicating means 4, the operator enters into the judgment result processing unit 2 a code or number designating the misidentified syllable.

Meanwhile, the syllable appearance count counting means 22 and the error count counting means 23 provided in the judgment result processing unit 2 increment the appearance count of each syllable that was identified and stored in the memory 21, and increment the error count of each syllable designated by the misidentification indicating means 4. The results are stored, syllable by syllable, in the appearance count storage memory 24 and the error count storage memory 25.
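The counting step can be pictured with a small sketch. This is only an illustration under assumptions: the class name `SyllableCounters` and its method names are hypothetical, and the two `Counter` objects merely stand in for memories 24 and 25.

```python
from collections import Counter


class SyllableCounters:
    """Hypothetical stand-in for counting means 22/23 and memories 24/25."""

    def __init__(self):
        self.appearances = Counter()  # memory 24: appearance count per syllable
        self.errors = Counter()       # memory 25: error count per syllable

    def record_identified(self, syllables):
        # Counting means 22: each identified syllable counts as one appearance.
        self.appearances.update(syllables)

    def record_misidentified(self, syllable):
        # Counting means 23: the operator flagged this syllable as wrong.
        self.errors[syllable] += 1


counters = SyllableCounters()
counters.record_identified(["ka", "ma", "ga", "wa"])
counters.record_misidentified("ka")
```

After these two calls, every identified syllable has one appearance and only the flagged syllable has an error.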

In addition, the conversion means 26 in the judgment result processing unit 2 obtains the appearance rate and identification rate of each syllable from the per-syllable appearance counts and error counts stored in the memories 24 and 25, and revises the contents of the syllable appearance rate table storage memory 51 and the syllable identification rate table storage memory 52.

Specifically, the conversion means 26 is configured, for example, as follows. It first obtains the sum of the appearance counts of all syllables stored in the memory 24, and divides the appearance count of each syllable by this sum to give the appearance rate a_i of that syllable (i is a number denoting the kind of syllable). Using the appearance rate b_i of syllable i already stored in the syllable appearance rate table storage memory 51, it computes, for example, (k·b_i + a_i)/(k + 1), where k is a suitable value (for example, a value from 1 to 10), and stores the result in the memory 51 in place of b_i. The error counts are processed in the same way: the error count of each syllable divided by the sum of the syllable appearance counts gives the error rate d_i of that syllable, and using the error rate e_i of syllable i already stored in the syllable identification rate table storage memory 52, the value given by, for example, (h·e_i + d_i)/(h + 1), where h is a suitable value, is stored in the memory 52 in place of e_i.
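A minimal sketch of this table revision follows, reading the update rule as (k·b_i + a_i)/(k + 1) for appearance rates and (h·e_i + d_i)/(h + 1) for error rates. The function name `update_rates`, the default weight of 4, and all numeric values are assumptions made for illustration.

```python
def update_rates(old_rates, counts, total, k=4.0):
    """Blend stored per-syllable rates with fresh ones: (k*old + new) / (k + 1)."""
    if total <= 0:
        return dict(old_rates)  # nothing observed, keep the stored table
    updated = {}
    for syllable in set(old_rates) | set(counts):
        fresh = counts.get(syllable, 0) / total
        updated[syllable] = (k * old_rates.get(syllable, 0.0) + fresh) / (k + 1)
    return updated


appearance_counts = {"ka": 30, "ta": 10}   # contents of memory 24 (invented)
error_counts = {"ka": 6}                   # contents of memory 25 (invented)
total = sum(appearance_counts.values())    # sum of all appearance counts

# b: appearance rate table (memory 51); e: error rates in memory 52.
b = update_rates({"ka": 0.5, "ta": 0.5}, appearance_counts, total)
e = update_rates({"ka": 0.1, "ta": 0.1}, error_counts, total)
```

Both rates use the same denominator, the sum of appearance counts, as the text specifies. A larger weight makes the stored table change more slowly, so old observations fade gradually rather than being discarded.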

In this way, recent frequency information (appearances and errors) is incorporated into the syllable appearance rate table and the syllable identification rate table.

As described above, the appearances and errors of each syllable are held in the judgment result processing unit 2 in the form of frequencies or counts. At suitable times (for example, when the number of input syllables reaches a certain value) they are converted into appearance rates and identification rates, and the frequencies and counts in the processing unit 2 are reset.

Alternatively, the contents of the memories 51 and 52 may be revised each time the processing of a syllable is completed.

Reference numeral 6 denotes a registration judgment unit. Based on the appearance rate table and the identification rate table stored in the memories 51 and 52, the registration judgment unit 6 determines which of the standard patterns registered in the memory 11 need to be changed and inputs the result to the judgment result processing unit 2. The syllables to be re-registered are then shown on the display unit 3, and the standard patterns for the desired syllables in the memory 11 are made rewritable.

The registration judgment unit 6 makes its judgment by determining that the error rate e_i of a syllable has exceeded a threshold E; as a result, the character for syllable i is shown on the display unit 3.

It is desirable to set several values of the threshold E according to the value of the appearance rate b_i. For example, with appearance rate levels B1, B2, and B3 (B1 > B2 > B3), re-registration is indicated when e_i > E1 for syllables with b_i > B1, when e_i > E2 for syllables with B1 > b_i > B2, and when e_i > E3 for syllables with B2 > b_i > B3 (E1 < E2 < E3). Syllables with a higher appearance rate then receive a re-registration indication even at a lower error rate.
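The tiered decision can be sketched as follows, reading the rule as three appearance rate levels B1 > B2 > B3 paired with error thresholds E1 < E2 < E3. The function name `needs_reregistration` and every numeric value in `TIERS` are assumptions for illustration; the source gives no concrete values.

```python
def needs_reregistration(b_i, e_i, tiers):
    """tiers: (appearance rate bound B, error threshold E) pairs, highest B first."""
    for bound, threshold in tiers:
        if b_i > bound:
            # The first tier whose bound the syllable exceeds decides the outcome.
            return e_i > threshold
    return False  # rarer than the lowest bound: no re-registration indicated


# Illustrative values only, satisfying B1 > B2 > B3 and E1 < E2 < E3.
TIERS = [(0.05, 0.02), (0.02, 0.05), (0.005, 0.10)]
```

With these values a frequent syllable is flagged at a 3% error rate, while a less frequent one is flagged only once its error rate passes 5%.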

The operator checks the syllable characters shown on the display unit 3 and decides whether to redo the registration.

At the time of initial registration, the number of standard patterns for each syllable is calculated based on appearance rate and identification rate tables obtained in advance from other speakers as standard ones.

Also, even in continuous speech, at the beginning of a sentence or a word the leading edge of the syllable sound appears immediately after a silent interval.

Therefore, the standard pattern for the same syllable differs greatly depending on the utterance context, and syllable standard patterns are needed both as extracted from within a voiced interval and as extracted after a silent interval. Considering other coarticulation effects as well, several kinds of standard patterns are needed for a single syllable.

At re-registration, the registration judgment unit 6 sends the information on the syllables to be re-registered to the judgment result processing unit 2, including information on which utterance condition of each syllable needs to be registered.

When a syllable has a high appearance rate and a poor identification rate, re-registration is urgent. In such a case, the judgment result processing unit 2 may instruct the display unit 3 to show the character for that syllable distinctively, for example by blinking it in a corner of the display or showing it in a different color.

The embodiment described above applies to a device in which identification errors can be designated syllable by syllable. When words, phrases, or the like are recognized through syllable-level identification, however, it may not be possible to designate the erroneous syllable.

For example, in word recognition, a result displayed in kanji is easier to read than one displayed as a hiragana or katakana string. Accordingly, overall processing efficiency can be higher when errors are designated word by word rather than syllable by syllable.

An example of a processing method for such a case is shown below.

Suppose that, as the identification result for the spoken input "tamagawa", the syllable candidates "ka", "ba", and "ta" are obtained for the syllable "ta", as shown in FIG. 2(a).

The numerical value shown below each syllable character represents a quantity related to the reliability of that candidate.

As this reliability measure, likelihood, similarity, distance, the distance ratio relative to the first candidate, or the like can be used.

Here, the Euclidean distance between the input syllable pattern and the syllable standard pattern is used as an example. The reliability of each syllable candidate decreases as the distance increases.
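A minimal sketch of ranking syllable candidates by Euclidean distance is given here. The two-dimensional feature vectors are invented for illustration, and the function name `rank_candidates` is hypothetical; the source does not describe how patterns are represented.

```python
import math


def rank_candidates(input_pattern, standard_patterns):
    """Return (syllable, distance) pairs, nearest standard pattern first."""
    scored = [
        (syllable, math.dist(input_pattern, pattern))
        for syllable, pattern in standard_patterns.items()
    ]
    # Smaller distance means higher reliability, so sort ascending.
    return sorted(scored, key=lambda pair: pair[1])


# Toy feature vectors, invented for illustration.
standards = {"ka": (1.0, 0.0), "ba": (0.0, 2.0), "ta": (3.0, 4.0)}
ranking = rank_candidates((0.0, 0.0), standards)
```

Here `ranking` lists "ka" first, then "ba", then "ta", mirroring the situation in which the uttered syllable is only the third candidate.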

The distance for a word candidate is expressed as the sum of the distances of its syllable candidates; arranging the candidates in descending order of reliability gives FIG. 2(b). If the input words are limited to place names and the word dictionary (described later) contains neither "kamagawa" nor "bamagawa", the candidates that remain as place-name words are those shown in FIG. 2(c).

When words are recognized through syllable-level identification, the judgment result processing unit 2 must include the functional means shown in FIG. 3: a syllable string candidate generation unit 27, a word dictionary 28, and a dictionary matching unit 29. The syllable speech identification unit 1 outputs syllable candidates together with reliability-related quantities, as in FIG. 2(a). The syllable string candidate generation unit 27 generates syllable string candidates in order of reliability, as in FIG. 2(b), and sends them to the dictionary matching unit 29. The dictionary matching unit 29 checks whether each syllable string candidate is in the word dictionary 28, discards those that are not, and sends the word candidates found in the dictionary to the display unit 3.
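The flow through units 27, 28, and 29 might be sketched as follows, taking a word's distance to be the sum of its syllable distances. The function name `word_candidates`, the distances, and the dictionary entries are all invented for illustration, and exhaustive enumeration is a simplification that suits only short words.

```python
from itertools import product


def word_candidates(per_syllable, dictionary):
    """per_syllable: one list of (syllable, distance) candidates per position."""
    strings = []
    for combo in product(*per_syllable):
        text = "".join(s for s, _ in combo)
        total = sum(d for _, d in combo)    # word distance = sum of syllable distances
        strings.append((text, total))
    strings.sort(key=lambda pair: pair[1])  # smallest distance = most reliable
    # Dictionary matching: keep only strings that are registered words.
    return [(dictionary[t], d) for t, d in strings if t in dictionary]


# Distances and dictionary entries are invented for illustration.
lattice = [
    [("ka", 1.0), ("ba", 1.5), ("ta", 2.0)],
    [("na", 1.0), ("ma", 1.5)],
    [("ga", 1.0)],
    [("wa", 1.0)],
]
place_names = {"kanagawa": "Kanagawa", "tamagawa": "Tamagawa"}
result = word_candidates(lattice, place_names)
```

With these numbers "Kanagawa" outranks "Tamagawa" although "tamagawa" was the intended input, which is the kind of word-level error the following paragraphs deal with.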

Recognition of sentences and phrases requires not only dictionaries and matching but also more complex processing. In either case, phrase candidates and sentence candidates derived from the syllable candidates are output to the display unit 3.

As shown in FIG. 2(c), suppose that although "tamagawa" was uttered, the first word candidate displayed is "Kanagawa" (神奈川). In this case, if the second and third candidates are also displayed and the operator manually sends information to the judgment result processing unit 2 designating "Tamagawa" as the correct word, it can be determined from the syllable candidates (FIG. 2(a)) that "ta" was mistaken for "ka", and the error count counting means 23 counts the correct and incorrect identifications.

From a practical standpoint, however, when the first word candidate is wrong it is sometimes better simply to repeat the utterance promptly. In that case it is known that "Kanagawa" is wrong, but not which of the syllables "ka", "na", "ga", and "wa" was misidentified. Excluding these syllables from the frequency and correct/incorrect counts is sometimes reasonable, but this is a problem when the counts are needed.

In such a case, the erroneous syllable string (here "kamagawa") is stored temporarily and compared with the syllable string judged correct after the repeated utterance (here "tamagawa"). This shows that "ta" was mistaken for "ka", and the result can be added to the frequency and correct/incorrect counts.
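The comparison of the stored erroneous string with the corrected one can be sketched as follows. The function name `tally_from_retry` is hypothetical, and two choices are assumptions of this sketch: both strings are taken to have the same number of syllables, and an error is charged to the syllable that was output wrongly.

```python
def tally_from_retry(wrong, correct, appearances, errors):
    """Compare equal-length syllable lists; count appearances and errors."""
    if len(wrong) != len(correct):
        raise ValueError("syllable strings must have the same length")
    for guessed, actual in zip(wrong, correct):
        appearances[actual] = appearances.get(actual, 0) + 1
        if guessed != actual:
            # Charged to the syllable that was output in error; this choice
            # is an assumption of the sketch.
            errors[guessed] = errors.get(guessed, 0) + 1


appearances, errors = {}, {}
tally_from_retry(["ka", "ma", "ga", "wa"], ["ta", "ma", "ga", "wa"],
                 appearances, errors)
```

Only the first position differs, so the single error is recorded against "ka" while each uttered syllable gains one appearance.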

Identifying syllables uttered in isolation is generally considered to give a higher identification rate than detecting syllable portions in continuous speech and identifying each one. Therefore, in a method that detects syllable portions in continuous speech and identifies them syllable by syllable, a misidentified syllable could be corrected by moving the display cursor to the corresponding position, for example in the kana string, and uttering only that syllable for identification.

Because the speech is then an isolated syllable utterance, identification is relatively easy. If the syllable is again misidentified as the same wrong syllable, it is automatically replaced by another syllable candidate, which avoids needing many utterances to correct a single kana character.
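The automatic switch to another candidate could look like the following sketch. The function name `pick_correction` and the idea of keeping a set of already rejected syllables are assumptions; the source does not say how the alternative candidate is chosen.

```python
def pick_correction(ranked_candidates, already_rejected):
    """Return the best-ranked syllable that has not already been rejected."""
    for syllable in ranked_candidates:
        if syllable not in already_rejected:
            return syllable
    return None  # every candidate was rejected; ask the speaker again


rejected = {"ka"}  # shown earlier at this cursor position and marked wrong
choice = pick_correction(["ka", "ta", "ba"], rejected)
```

Because "ka" was already rejected, the second-ranked "ta" is chosen without a further utterance.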

FIG. 4 shows an example configuration of a device for performing these operations.

The configuration and operation of the device shown in FIG. 4 are explained using the example of FIG. 2 described above.

The display unit 3 first shows the initial recognition result "Kanagawa". When this is designated as wrong with the keyboard 41, "Kanazawa" is displayed instead, and when that is also designated as wrong, "Tamagawa" is displayed. The dictionary of word entries used here is stored in the memory 28. The syllable identification result string "ka", "ma", "ga", "wa" is stored in the memory 71.

The memory 72 stores the appearance count and the error count for each syllable. In the example of FIG. 2, once the correct answer is found to be "Tamagawa" (the answer can also be judged correct at the moment the next utterance is made), the counts in the memory 72 are updated according to the matching result of the syllable matching unit 20: the appearance counts of the syllables "ma", "ga", and "wa" are incremented, and the error count of the syllable "ka" is incremented.

The memory 72 is thus updated each time a word is recognized. When the speaker so designates from the keyboard at a suitable time, the contents of the memory 72 are processed by the conversion means 26, and the syllable appearance rate table and the syllable identification rate table stored in the memories 31 and 32 are updated using the processed contents (this processing is called updating the tables).

As one method for this processing, the sum of the appearance counts of all syllables is first obtained, and the appearance count of each syllable divided by this sum is taken as the appearance rate a_i of that syllable (i is a number denoting the syllable). Using the value b_i of syllable i already in the syllable appearance rate table, the value given by, for example, (k·b_i + a_i)/(k + 1), where k is a suitable value (for example, 1 to 10), replaces b_i. The latest frequency information can thereby be incorporated into the syllable appearance rate table. The error counts can be processed in the same way. Let e_i be the error rate for each syllable (stored in the syllable identification rate table).

When the registration judgment unit 6 determines that e_i has exceeded a threshold, the character for syllable i is shown on the display unit 3 via the judgment result processing unit 2. The speaker (user) can then decide whether to redo the registration.

The appearance rate need not be a single value per syllable. Where, for example, there are five standard patterns per syllable, a counter and memory may be provided for each standard pattern.

The standard patterns of a syllable are not necessarily obtained under the same utterance conditions. For example, of five standard patterns, two may be made from isolated syllable utterances and the remaining three from syllable portions within spoken words. In this case, the syllable appearance rate table, the syllable identification rate table, and the memory 72 store values not per syllable but per standard pattern.

In the example of FIG. 2, if the speech of the first syllable "ta" was misidentified because its similarity to the third standard pattern of "ka" was the greatest, the appearance count of "ta" is incremented and the error count corresponding to the third standard pattern of "ka" is incremented.

If, as a result of updating the tables, the value e_i in the syllable identification rate table corresponding to the third standard pattern of "ka" exceeds a threshold, this third standard pattern of "ka" must be re-registered under the same utterance conditions as the speech from which it was created. For example, if that speech was taken from the "ka" portion of the spoken word "ika", the display unit can show, for example, "ika: re-registration required". For this purpose, a memory that records the creation conditions of each standard pattern in the syllable speech identification unit must be added to the memory 72.
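Per-pattern bookkeeping with recorded creation conditions might be sketched as follows. The function name `reregistration_prompts`, the `(syllable, pattern number)` keys, the error rates, and the threshold are all assumptions for illustration; only the "ika" example and the message wording come from the text.

```python
def reregistration_prompts(error_rates, creation_context, threshold):
    """error_rates: {(syllable, pattern_no): e}; returns strings to display."""
    prompts = []
    for key, e in sorted(error_rates.items()):
        if e > threshold:
            # Fall back to the bare syllable when no creation context is recorded.
            source_utterance = creation_context.get(key, key[0])
            prompts.append(f"{source_utterance}: re-registration required")
    return prompts


# Invented error rates per standard pattern.
error_rates = {("ka", 1): 0.01, ("ka", 3): 0.12, ("ta", 1): 0.02}
# Hypothetical record: pattern 3 of "ka" was made from the spoken word "ika".
creation_context = {("ka", 3): "ika"}
prompts = reregistration_prompts(error_rates, creation_context, threshold=0.05)
```

Only the third pattern of "ka" exceeds the threshold, so the speaker is asked to re-register it by saying "ika" again rather than an isolated "ka".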

In this way, syllables that need re-registration are displayed based on their appearance frequency and error frequency.

<Effect> As described above, according to the present invention, the correctness of the syllable identification results obtained during voice input is indicated, the appearance frequency and error frequency of each syllable are determined from the identification results and the correctness indications, and the syllables to be registered or re-registered are determined in relation to these frequencies. It therefore becomes possible, while maintaining overall performance, to find syllables that require re-registration efficiently with less processing and to re-register them.

[Brief Description of the Drawings] FIG. 1 is a block diagram showing an example of a voice input device embodying the present invention; FIG. 2 is a diagram used to explain a speech recognition example; FIG. 3 is a block diagram showing an example of the judgment result processing unit 2; and FIG. 4 is a block diagram showing another example of a device embodying the present invention.

1: syllable speech identification unit; 2: judgment result processing unit; 22: appearance count counting means; 23: error count counting means; 24: appearance count storage memory; 25: error count storage memory; 3: display unit; 4: misidentification indicating means; 51: syllable appearance rate table storage memory; 52: syllable identification rate table storage memory; 6: registration judgment unit.

Agent: Patent attorney Aihiko Fukushi (and 2 others)

[FIG. 2: voice input with (a) syllable candidates and distances, (b) syllable string candidates, (c) word candidates]

Claims

[Scope of Claims] 1. A voice registration method characterized by indicating whether syllable identification results obtained during voice input are correct or incorrect, determining the appearance frequency and error frequency of each syllable based on the syllable identification results and the correct/incorrect indications, and determining the syllables to be registered or re-registered in relation to the determined appearance frequency and error frequency of each syllable.