JPS6143337A

JPS6143337A - Voice input device for Japanese word

Info

Publication number: JPS6143337A
Application number: JP59165058A
Authority: JP
Inventors: Toru Ueda; 徹上田; Mitsuhiro Toya; 充宏斗谷
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1984-08-06
Filing date: 1984-08-06
Publication date: 1986-03-01

Abstract

PURPOSE: To improve recognition performance without burdening the user by automatically registering a feature standard pattern obtained from the input speech as the feature standard pattern of the syllable corrected through dictionary collation or the like. CONSTITUTION: Input speech from an analog input part (a) undergoes spectrum analysis in a voice analysis part (b), and syllables are segmented by a syllable segmentation part (c). The feature pattern and the speech waveform of each segmented syllable are stored in a feature pattern temporary memory (d). A single syllable recognizing part (f) then calculates distances between the feature pattern memory 1g and the standard pattern memories 2g and 3g, all of which lie within a pattern memory (g): the contents of the memory (d) are transferred to the memory 1g and collated with the contents of the memories 2g and 3g. The collation results are stored in a memory and displayed on a display device (k). Any mistake detected is then corrected through keyboard operations and collation with a dictionary. If no mistake is detected, the syllable feature pattern is registered automatically in the memory 3g.

Description

[Detailed description of the invention] <Technical field> The present invention relates to an improvement of a Japanese speech input device that recognizes input speech in units of syllables. In particular, it enables the feature standard pattern of the speech uttered at input time to be registered automatically as the standard pattern of the syllable obtained as a result of correction.

<Prior art> Conventional speech recognition devices separate a registration mode, in which feature standard patterns of speech are registered in advance, from a recognition mode (input mode), in which input speech is recognized. A feature pattern obtained by analyzing speech input in recognition mode could not be registered as a feature standard pattern. This is not much of a problem when the word is the unit of recognition. When the syllable is the unit, however, each syllable is influenced by the syllables before and after it (coarticulation), and the loudness and pitch of the speech vary greatly with the position of the syllable within a word or phrase, so recognition performance deteriorates.

To address this problem, each syllable was conventionally registered several times at registration time so that many feature standard patterns were held. However, coarticulation, loudness, and pitch vary from person to person, and it was impossible to register in advance feature standard patterns covering every syllable environment.

<Purpose> The main object of the present invention is therefore to improve recognition performance without burdening the user. The feature standard pattern of the speech uttered at input time is automatically added to, or substituted for, the previously registered standard patterns, as the standard pattern of the syllable string corrected by dictionary collation or the like.

Another object of the present invention is to improve recognition performance by registering at input time only those syllables whose recognition rate is poor, as determined by counters that track the recognition rate of each syllable standard pattern.

A further object of the present invention is to minimize the adverse effect of automatic registration at input time, namely the effect of registering erroneous patterns. To this end, standard patterns are registered at input time only in the main memory, and only the patterns found to be effective are registered in the auxiliary storage device, in response to an external instruction such as a keyboard operation.

<Example> The present invention is described below with reference to the drawings, taking as an example a Japanese speech input device that recognizes continuously uttered speech in units of syllables, corrects the recognition result by dictionary collation, and then transfers it to an external device in units such as words.

FIG. 1 is a block diagram showing the configuration of a device according to one embodiment of the present invention.

In FIG. 1, uttered speech is input to the analog input section a through a microphone or the like, amplified by the amplifier in the analog input section a, and converted into a digital signal by the analog/digital conversion section. The digital signal is input to the speech analysis section b and the syllable segmentation section c.

The speech analysis section b divides the input speech into frames of about 16 ms and performs spectrum analysis. At intervals of about 8 ms it transfers to the syllable segmentation section c the feature pattern and the information needed for syllable segmentation (power, number of zero crossings, etc.).
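The framing step can be illustrated with a minimal Python sketch. It computes only the two segmentation cues named in the text (power and zero-crossing count) for 16 ms frames taken every 8 ms. The function name, the 8000 Hz sample rate, and the use of mean squared amplitude as "power" are assumptions made for illustration; the patent does not specify them, and the spectrum analysis itself is not sketched here.

```python
from typing import List, Tuple


def frame_features(samples: List[float],
                   sample_rate: int = 8000,   # assumed; not given in the patent
                   frame_ms: float = 16.0,
                   hop_ms: float = 8.0) -> List[Tuple[float, int]]:
    """Return (power, zero_crossings) for each overlapping frame."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        # Power taken here as the mean squared amplitude of the frame.
        power = sum(x * x for x in frame) / frame_len
        # A zero crossing is a sign change between neighbouring samples.
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a < 0) != (b < 0))
        features.append((power, crossings))
    return features
```

With these defaults each frame holds 128 samples and successive frames overlap by half, which matches the 16 ms / 8 ms relationship described above.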

The syllable segmentation section c uses the various information from the speech analysis section b to cut syllables out of the input speech.

The feature pattern of the segmented portion and the speech waveform of that interval are stored in the waveform/feature pattern temporary memory d. The segmentation section notifies the CPU e that a syllable has been segmented and, at the same time, passes it the address within the waveform/feature pattern temporary memory d. The waveform/feature pattern temporary memory d is configured so that it can hold a plurality of syllables.

The syllable segmentation section c is configured so that its processing is started and stopped by commands from the CPU e.

f is a single-syllable recognition section. On command from the CPU e, it performs distance calculations and the like between the feature pattern memory 1g and the standard pattern memories 2g and 3g within the pattern memory g, and returns the result to the CPU e. The CPU e stores the result in the recognition result storage memory h and displays it on the display device k. The recognition result storage memory h is configured so that it can hold recognition results for a plurality of syllables.
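As a rough illustration of the distance calculation performed by the recognition section f, the sketch below ranks syllables by the smallest distance between the input feature pattern and any stored standard pattern of that syllable. The patent does not say which distance measure is used or how patterns are represented, so the Euclidean distance, the fixed-length vectors, and all names here are assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def recognize_syllable(feature: Sequence[float],
                       standard_patterns: Dict[str, List[Sequence[float]]],
                       top_n: int = 3) -> List[Tuple[str, float]]:
    """Return up to top_n (syllable, distance) candidates, nearest first."""
    best: Dict[str, float] = {}
    for syllable, patterns in standard_patterns.items():
        for pattern in patterns:
            d = euclidean(feature, pattern)
            # Keep only the closest standard pattern for each syllable.
            if syllable not in best or d < best[syllable]:
                best[syllable] = d
    return sorted(best.items(), key=lambda item: item[1])[:top_n]
```

The returned list plays the role of the per-syllable candidate list that the CPU e stores in the recognition result storage memory h; in this sketch `standard_patterns` would hold the contents of both memories 2g and 3g.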

The pattern memory g is divided into three parts. 1g is a feature pattern memory that can hold just one feature pattern, corresponding to the input syllable. The other two, 2g and 3g, are feature standard pattern memories: area 2g holds the feature patterns of syllables registered in registration mode, and area 3g holds the feature patterns of syllables registered in the recognition (input) mode realized by this patent. As described later, each of the memories 2g and 3g consists of an area that stores each syllable name as a code, a flag area that records whether a pattern is registered, and a feature standard pattern area that stores the feature standard pattern data.

i is an expansion memory that stores the word candidates created by expanding syllable candidates during dictionary collation. j is an input section consisting of a keyboard or the like; as shown in FIG. 2, for example, it has a registration mode key j1, a recognition mode key j2, kana keys j14, and so on. k is a display device such as a CRT. l is an I/F (interface) section that controls the sending and receiving of data when recognition results are transferred to an external device. m is a syllable counter section consisting of correct-answer counters and recognition counters, as described later. n is a dictionary memory that, for automatic correction, stores everything that may be input by voice. o is an auxiliary storage section (for example, a floppy disk) for storing feature patterns.
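The three-field layout of the standard pattern memories (syllable code, registration flag, pattern data) can be pictured with the following sketch. The class and function names, the number of slots per syllable, and the list-of-floats pattern representation are illustrative assumptions, not details from the patent.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class PatternSlot:
    syllable: str                 # syllable name, stored as a code
    registered: int = 0           # registration flag: 1 = in use, 0 = empty
    pattern: List[float] = field(default_factory=list)  # feature standard pattern data


def make_memory(syllables: Iterable[str], slots_per_syllable: int = 3) -> List[PatternSlot]:
    """Build an empty standard pattern memory with several slots per syllable."""
    return [PatternSlot(s) for s in syllables for _ in range(slots_per_syllable)]


def clear_memory(memory: List[PatternSlot]) -> None:
    """Mark every slot as unregistered by resetting its flag to 0."""
    for slot in memory:
        slot.registered = 0
```

Because a slot is treated as empty whenever its flag is 0, clearing a memory in this sketch only needs to reset the flags; the old pattern data is simply overwritten on the next registration.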

Next, the operation of the device configured as above is described for the registration mode and the recognition mode. FIG. 3 shows the processing flow of the CPU e in registration mode.

In FIG. 3, when the device is put into registration mode by operating the registration mode key j1, the pattern memory g is first initialized in step n1 and all standard patterns are erased. Table 1 shows the configuration of the standard pattern memory 2g; the standard pattern memory 3g is configured in the same way.

Table 1: Configuration example of the standard pattern memory 2g

The initialization in step n1 is achieved by writing "0" into the registration flag areas of the standard pattern memories 2g and 3g. The process then moves to step n2, and the single syllable to be uttered is shown on the display device k as follows.

"あ1" (a1) Here the subscript "1" indicates that this is the first of the patterns for "a".

The operator looks at the display device k and utters the indicated single syllable as input.

In response to this speech input, the process moves to step n3, where the syllable segmentation section c is instructed to start segmenting the speech. The syllable segmentation section c cuts out the single syllable and stores the waveform of that interval, together with the feature pattern obtained by the speech analysis section b, in the waveform/feature pattern temporary memory d.

Step n4 checks whether the syllable segmentation section c has cut out a single syllable; once it has, the process moves to step n5. In step n5, the syllable segmentation section c is commanded to stop segmenting, and registration processing continues. In step n6, the speech portion corresponding to the syllable just segmented is read from the waveform/feature pattern temporary memory d and played back through the speech output control section.

In step n7, the operator judges from the played-back speech whether the syllable was segmented correctly, and the resulting instruction from the keyboard j determines whether to segment again or to carry out registration. If the operator hears the playback and judges the segmentation correct, the operator presses the execute key j7 and the process moves to step n8. If the operator calls for re-segmentation by pressing the release key j6, the process returns to step n3.

In step n8, the feature standard pattern is stored in the location of the feature standard pattern memory 2g that corresponds to the syllable shown on the display device k, and the corresponding registration flag is set to "1".

Step n9 determines whether all standard patterns have been registered. If not, the process returns to step n2, displays the next single syllable, for example "a2", and repeats the same processing.

When registration is finished in this way, several feature standard patterns for every single syllable have been registered in the standard pattern memory 2g.

Next, the operation in recognition mode is described.

Description of the recognition mode. FIG. 4 shows the processing flow of the CPU e in recognition mode.

First, the device is put into recognition mode by operating the recognition mode key j2. When the operator utters the speech to be recognized, step n11 commands the syllable segmentation section c to start segmenting syllables. The syllable segmentation section c initializes the waveform/feature pattern temporary memory d, then writes the feature pattern and waveform of each segmented syllable into it starting from the first address, and passes the start and end addresses of each syllable's waveform and feature pattern to the CPU e.

Step n12 checks whether a syllable has been segmented; once one has, the process moves to step n13.

In step n13, the feature pattern in the waveform/feature pattern temporary memory d is transferred to the feature pattern memory 1g area of the pattern memory g and recognition is performed. That is, a recognition command is given to the single-syllable recognition section f, which collates the contents of the feature pattern memory 1g with the contents of the standard pattern memories 2g and 3g. The result is stored in the recognition result storage memory h and shown on the display device k (step n14).

For example, if "kaimono" (かいもの) is uttered as input and the first-ranked recognition result is "kagimomo" (かぎもも), the display device k shows "kagimomo", and the recognition result storage memory h holds multiple recognition result candidates for each syllable, as shown for example in Table 2.

Table 2 (values in brackets are distances normalized by the value of the first candidate)

When input of a word such as "kaimono" above is finished, the operator presses the "end" key j5 on the keyboard j. The syllable segmentation section c is then commanded to stop segmenting (steps n15 and n16).

If the entire character string is correct, pressing the "transfer" key j8 outputs the kana characters to the external device through the I/F section l (steps n18 and n20).

If the displayed recognition result shows that most of the characters are wrong, or if the operator misspoke, pressing the "cancel" key j4 returns the device to the initial state by the decision in step n17. If only part of the recognition result is wrong, pressing the next-word-candidate key j9 triggers correction by dictionary collation (step n19). The first time the next-word-candidate key j9 is pressed after speech input, the word candidates are expanded in ascending order of their total distance, as shown in Table 3.

Table 3

When the expansion is finished, the candidates are checked against the dictionary starting from the first, and the one that matches is shown on the display device k (step n20). The candidate displayed previously is erased at this point.

When the next-word-candidate key j9 is pressed a second or later time, the expansion is not repeated; dictionary matching resumes from the candidate after the one matched previously. Once all the expanded candidates have been checked in this way, the process returns to the top, displays the first candidate, and waits for the next key. The previous-word-candidate key j10 displays the candidate that matched immediately before.
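The expansion and dictionary collation described above can be sketched as follows. Each syllable position contributes a list of (candidate, distance) pairs; word candidates are every combination, sorted by total distance, and only those found in the dictionary are offered to the operator. All names are hypothetical, and the sketch collects every dictionary match up front, whereas the text describes matching one candidate at a time as the keys are pressed.

```python
from itertools import product
from typing import List, Sequence, Set, Tuple

Candidate = Tuple[str, float]  # (syllable, distance)


def expand_word_candidates(per_syllable: Sequence[Sequence[Candidate]]) -> List[Tuple[str, float]]:
    """Combine syllable candidates into word candidates, smallest total distance first."""
    words = []
    for combo in product(*per_syllable):
        word = "".join(syllable for syllable, _ in combo)
        total = sum(distance for _, distance in combo)
        words.append((word, total))
    words.sort(key=lambda item: item[1])
    return words


def dictionary_matches(expanded: List[Tuple[str, float]], dictionary: Set[str]) -> List[str]:
    """Keep only the expanded candidates that appear in the dictionary, in order."""
    return [word for word, _ in expanded if word in dictionary]


class CandidateBrowser:
    """Steps through matches like the next/previous word candidate keys, wrapping around."""

    def __init__(self, matches: List[str]):
        if not matches:
            raise LookupError("no dictionary match among the expanded candidates")
        self.matches = matches
        self.pos = -1  # nothing shown yet

    def next(self) -> str:
        self.pos = (self.pos + 1) % len(self.matches)
        return self.matches[self.pos]

    def prev(self) -> str:
        self.pos = (self.pos - 1) % len(self.matches)
        return self.matches[self.pos]
```

For the "kaimono" example, a call such as `expand_word_candidates([[("ka", 1.0)], [("gi", 1.0), ("i", 1.2)], [("mo", 1.0)], [("mo", 1.0), ("no", 1.1)]])` would place "kagimomo" first, and collation against a dictionary containing "kaimono" would surface the intended word. The distances in that call are made up for illustration; the values of Table 2 are not available in the source.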

Next, in step n21, the counters m corresponding to each syllable are updated. As shown in Table 4, the syllable counter section consists of four kinds of counter: a correct-answer counter, a recognition counter, an overall correct-answer counter, and an overall recognition counter.

Table 4: Configuration of the syllable counters

The counters are updated according to the following items.

1) Add 1 to the recognition counter of each syllable in the output result.

2) Add 1 to the correct-answer counter of each syllable for which the first candidate in the recognition result storage memory h equals the output result.

3) Add the number of recognized syllables to the overall recognition counter.

4) Add to the overall correct-answer counter the number of syllables for which the first candidate in the recognition result storage memory h equals the output result.

When the counters have been updated according to the items above, the process proceeds to step n22.
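The four counter updates can be summarized in a short sketch. The class and attribute names are hypothetical; `first_candidates` stands for the first-ranked result per syllable held in the recognition result storage memory h, and `output` for the corrected syllable string that is finally output.

```python
from collections import defaultdict
from typing import Sequence


class SyllableCounters:
    def __init__(self) -> None:
        self.recognized = defaultdict(int)  # per-syllable recognition counter
        self.correct = defaultdict(int)     # per-syllable correct-answer counter
        self.total_recognized = 0           # overall recognition counter
        self.total_correct = 0              # overall correct-answer counter

    def update(self, first_candidates: Sequence[str], output: Sequence[str]) -> None:
        for first, final in zip(first_candidates, output):
            self.recognized[final] += 1      # item 1
            if first == final:
                self.correct[final] += 1     # item 2
                self.total_correct += 1      # item 4
        self.total_recognized += len(output)  # item 3

    def rate(self, syllable: str) -> float:
        """Recognition rate of one syllable: correct / recognized (0.0 if never seen)."""
        seen = self.recognized[syllable]
        return self.correct[syllable] / seen if seen else 0.0
```

Note that the counters are keyed by the syllable of the output result, so a misrecognition lowers the rate of the syllable that was actually spoken rather than the one that was wrongly reported.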

In step n22, the syllable feature patterns of the speech in the feature pattern temporary memory d that are selected by one of the judgment methods below are registered in the standard pattern memory 3g of the pattern memory g, that is, in the main memory.

<Judgment method>

(1) Regardless of the counters, register every feature pattern as a standard pattern of the syllable given by the output result.

(2) Regardless of the counters, register only the syllables whose first-ranked recognition result differs from the output result, as standard patterns of the syllable specified by the output result.

(3) Register only the syllables for which the counters of the syllable specified by the output result satisfy the following condition:

correct-answer counter / recognition counter < TH1

(4) Register only the syllables that satisfy the conditions of both (2) and (3).

(5) When overall correct-answer counter / overall recognition counter > TH2, use one of methods (1) to (4) or perform no registration; otherwise, use one of methods (1) to (4) that differs from the former choice.
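The judgment methods can be read as a small decision function. In the sketch below, `method` selects one of the first four methods, with `None` standing for "do not register", and `choose_method` implements the switch on the overall rate described in the last method. The function names, the default threshold value, and the reading of the fourth method as the conjunction of the second and third are assumptions.

```python
from typing import Optional


def should_register(first: str, final: str,
                    correct: int, recognized: int,
                    method: Optional[int], th1: float = 0.8) -> bool:
    """Decide whether the input pattern for syllable `final` is registered.

    first      -- first-ranked recognition result for this syllable
    final      -- syllable in the corrected output result
    correct    -- correct-answer counter of `final`
    recognized -- recognition counter of `final`
    th1        -- threshold TH1 (the value 0.8 is purely illustrative)
    """
    misrecognized = first != final
    low_rate = recognized > 0 and correct / recognized < th1
    if method is None:
        return False                       # registration switched off
    if method == 1:
        return True                        # register everything
    if method == 2:
        return misrecognized               # register only misrecognized syllables
    if method == 3:
        return low_rate                    # register only poorly recognized syllables
    if method == 4:
        return misrecognized and low_rate  # both conditions
    raise ValueError(f"unknown method: {method}")


def choose_method(total_correct: int, total_recognized: int, th2: float,
                  method_when_good: Optional[int],
                  method_otherwise: int) -> Optional[int]:
    """Pick the method according to the overall recognition rate versus TH2."""
    good = total_recognized > 0 and total_correct / total_recognized > th2
    return method_when_good if good else method_otherwise
```

A device could, for example, call `choose_method(total_correct, total_recognized, 0.8, None, 2)` so that nothing is registered while the overall rate is high and misrecognized syllables are registered otherwise; that particular pairing is only one of the combinations the text allows.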

The methods above are one example for a device with the counters of Table 4. The same can be achieved in other ways, for instance by keeping a ring counter for each syllable and choosing the syllables to register from the recognition rate over the past several syllables, or by registering a syllable that has been misrecognized several times in a row.
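The ring-counter alternative can be sketched with a bounded queue that remembers only the most recent results for one syllable. The class name, the window length, and the choice to return `None` when no results have been recorded are assumptions for illustration.

```python
from collections import deque
from typing import Optional


class RecentResults:
    """Keeps the last `window` recognition outcomes for a single syllable."""

    def __init__(self, window: int = 5):
        self.results = deque(maxlen=window)  # oldest entries drop out automatically

    def record(self, was_correct: bool) -> None:
        self.results.append(was_correct)

    def recent_rate(self) -> Optional[float]:
        """Recognition rate over the window, or None if nothing was recorded."""
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def consecutive_errors(self) -> int:
        """Number of misrecognitions in a row, counting back from the latest result."""
        count = 0
        for ok in reversed(self.results):
            if ok:
                break
            count += 1
        return count
```

Either `recent_rate()` falling below a threshold or `consecutive_errors()` reaching a chosen count could then serve as the trigger for registering the syllable.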

All of the methods above can be disabled by switching modes from the keyboard, so that the operator can choose not to register.

The configuration of the standard pattern memory 3g is shown in Table 5.

Table 5: Configuration example of the standard pattern memory 3g

In this example the input speech is registered as "i", so the feature pattern is transferred to a slot in the area for the syllable name "i" that has no registration, and its registration flag is set to "1". If every registration flag is already "1", as for "a" in Table 5 where all slots are registered, the data in the oldest slot, "a3", is erased and the pattern is registered there.
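The slot selection rule (use an unregistered slot if one exists, otherwise overwrite the oldest) can be sketched as below. The patent says the oldest pattern is replaced but does not say how age is tracked, so the `time` field, the dictionary-based slot representation, and the function name are assumptions.

```python
from typing import Dict, List, Sequence


def register_pattern(slots: List[Dict], pattern: Sequence[float], now: int) -> int:
    """Store `pattern` in one syllable's slots and return the index used.

    Each slot is a dict with keys 'registered' (0 or 1), 'pattern', and
    'time' (registration time; a hypothetical field used to find the oldest slot).
    """
    empty = [i for i, slot in enumerate(slots) if slot["registered"] == 0]
    if empty:
        index = empty[0]  # first slot with no registration
    else:
        # Every slot is in use: replace the one registered longest ago.
        index = min(range(len(slots)), key=lambda i: slots[i]["time"])
    slots[index] = {"registered": 1, "pattern": list(pattern), "time": now}
    return index
```

Keeping a fixed number of slots per syllable bounds the memory used by automatic registration while letting recent utterances gradually displace stale ones.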

In this way the recognition rate of each syllable is calculated from the correct-answer counter and the recognition counter, and only syllables with a poor recognition rate are automatically registered in the feature standard pattern memory 3g in main memory, which prevents the standard patterns of well-recognized syllables from being replaced. When the pattern write key j13 is subsequently pressed, the contents of the feature standard pattern memory 3g are registered in an auxiliary storage device such as a floppy disk (steps n23 and n24).

As described above, the embodiment provides the following effects.

(a) Once a minimum of registration has been done, additional registration happens automatically at input time, so not all standard patterns need to be registered in advance (the number of registered words can be reduced).

(b) Even with the conventional method of registering every syllable, utterances at registration time often differ considerably from utterances at input time, and a stable recognition rate is not finally reached unless the standard patterns are updated repeatedly through update registration or similar means. With the present invention the standard patterns are updated automatically at input time, so no update registration is needed and the final recognition rate is reached quickly (the recognition rate improves quickly).

(c) Utterances in registration mode can differ considerably from utterances in recognition mode; automatic registration in recognition mode removes this mismatch (the final recognition rate improves).

(d) Even if registration and input are far apart in time, or the voice changes because of a cold or the like, automatic registration at input time creates standard patterns of the voice as it is at that moment, which prevents a sharp drop in the recognition rate (changes in the voice over time can be handled).

(e) Each syllable has a recognition counter and a correct-answer counter, the recognition rate is calculated per syllable, and only syllables with a poor recognition rate are registered as feature standard patterns. This prevents the standard patterns of well-recognized syllables from being replaced and improves the recognition rate.

(f) A pattern registered at input time is written to the auxiliary storage device by an external instruction only when it has proved effective for actual input, and that standard pattern is then used for subsequent input. The adverse effect of automatic registration at input time is thereby kept to a minimum.

In the embodiment above, the standard pattern memory is divided into two parts, 2g and 3g, for registration. Alternatively, by using the counter indicating the quality of each standard pattern disclosed in Japanese Patent Application No. Sho 57-217296, the worst feature standard pattern may be erased and the syllable uttered at input time registered in its area.

<Effect> As described above, the present invention concerns a Japanese speech input device that recognizes input speech syllable by syllable by calculating its similarity to pre-registered feature standard patterns of multiple kinds of syllables, and obtains the final input by correcting the result through dictionary collation or an external instruction such as a keyboard operation. Because the device has means for automatically registering the feature standard pattern obtained by analyzing the speech uttered at input time as the feature standard pattern of the syllable corrected by dictionary collation or the like, the features of the speech at input time can be added to or substituted for the standard patterns without burdening the user, improving the recognition rate.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the configuration of a device according to one embodiment of the present invention, FIG. 2 is a plan view showing an example of the keyboard, FIG. 3 is a processing flow diagram explaining the operation in registration mode, and FIG. 4 is a processing flow diagram explaining the operation in recognition mode.

b is the speech analysis section, c is the syllable segmentation section, d is the feature pattern temporary memory, e is the CPU, g is the pattern memory, i is the expansion memory, m is the syllable counter section, and o is the auxiliary storage section.

Agent: Patent attorney Aihiko Fukushi (and 2 others)

Claims

[Claims]

1. A Japanese speech input device that recognizes input speech syllable by syllable by calculating its similarity to pre-registered feature standard patterns of multiple kinds of syllables, and obtains the final input by correcting the result through collation with a dictionary or an external instruction such as a keyboard operation, characterized by comprising means for automatically registering the feature standard pattern obtained by analyzing the speech uttered at input time as the feature standard pattern of the syllable corrected by dictionary collation or the like.

2. The Japanese speech input device, wherein the above-mentioned registration means includes means for registering only standard patterns with poor recognition rates, as obtained by a counter that determines the recognition rate of each syllable standard pattern.

3. The Japanese speech input device recited in claim 1, wherein the registration means includes means for storing the syllable standard patterns held in the main memory into the auxiliary memory in response to an instruction from the outside.