JPS60120400A

JPS60120400A - Voice recognition equipment

Info

Publication number: JPS60120400A
Application number: JP58228274A
Authority: JP
Inventors: 宇佐美　隆一; 新家　修
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1983-12-02
Filing date: 1983-12-02
Publication date: 1985-06-27

Abstract

(57) [Summary] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical field of the invention]

The present invention relates to a speech recognition device, and in particular to a speech recognition device that performs recognition after a speaker's speech patterns have been registered in advance.

[Prior art and problems]

FIG. 1 is a diagram showing an example configuration of a speech recognition device, and FIG. 2 is a diagram showing an example configuration of a handy panel.

In the figures, 1 denotes the controller, 2 the display, 3 the keyboard, 4 the speech recognition unit, 5 the microphone, 6 the handy panel, 61 the forward key, 62 the return key, 63 the re-registration key, and 64 the pause key.

As shown in FIG. 1, a conventional speaker-dependent speech recognition device comprises a speech recognition unit 4 that recognizes speech input from a microphone 5 and reports the result, a keyboard 3 whose input is merged with that output, a handy panel 6 for issuing speech-specific instructions, a display 2 for interaction with the operator, and a controller 1 that controls their input and output.

As shown in FIG. 2, the handy panel 6 has a forward key 61 and a return key 62 for skipping prompts forward and moving them back, a re-registration key 63 for erasing a dictionary (created from an individual's voice features as the target of voice matching) so that it can be registered again, and a pause key 64 for cutting off voice input and restoring the input-enabled state.

Next, an overview of voice registration is given. When the voice registration processing is started, a prompt instructing the operator to speak is displayed on the display 2, and the operator speaks according to that prompt. When registration succeeds, the prompt for the next word is displayed, and processing continues in the same way. If registration does not succeed, the operator presses the re-registration key 63 on the handy panel 6 to erase the dictionary that was created and speaks the word again. When the operator speaks a word whose voice has already been registered, processing called learning processing is performed, which corrects the existing dictionary so that it can cope with variations in utterance.

In addition, if the dictionaries contain a similar word that is close in distance, processing that corrects the dictionary of the similar word is also performed. When the operator needs to interrupt voice input partway through, pressing the pause key 64 on the handy panel 6 cuts off input. Pressing the pause key 64 once more restores the input-enabled state.
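The idea of "similar words that are close in distance" can be illustrated with a small sketch. The source does not say which features or distance measure the device uses, so the fixed-length feature vectors, the Euclidean distance, and the threshold below are all assumptions chosen only for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similar_words(dictionary, word, threshold):
    """Return the other registered words whose templates lie within `threshold` of `word`.

    `dictionary` maps each word to one feature vector; this layout is hypothetical.
    """
    target = dictionary[word]
    return sorted(
        other
        for other, features in dictionary.items()
        if other != word and euclidean(target, features) < threshold
    )


# Hypothetical templates: "go" and "no" are close, "stop" is far away.
templates = {"go": [0.0, 0.0], "no": [0.1, 0.0], "stop": [5.0, 5.0]}
close = similar_words(templates, "go", threshold=0.5)
```

A real device would compare time-varying speech patterns rather than two-element vectors; the sketch only shows where a distance threshold would decide that two entries count as similar.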

The conventional registration method described above has several problems. Special hardware, the handy panel, must be prepared. Also, because no particular restrictions are placed on the use of the keys on the handy panel, repeated corrections of the similar-word dictionary mentioned above may, for example, end up destroying that dictionary instead.

[Purpose of the invention]

The present invention is based on the above considerations, and its purpose is to provide a speech recognition device that can perform voice registration processing efficiently and effectively without the need to prepare special hardware such as a handy panel.

[Structure of the invention]

To that end, the speech recognition device of the present invention comprises a speech recognition unit that performs speech recognition, a display, a keyboard, and a controller that controls overall input and output, and is characterized in that the keyboard is provided with function keys, and the controller is configured so that, in the voice registration processing for creating voice features, it assigns to the function keys functions for the voice registration processing such as forwarding and returning prompts, pause, re-registration, and termination, gives initial registration in the voice registration processing a higher priority than the learning processing that corrects or adds to a registered dictionary, and displays on the display the results and status of the learning and recognition processing together with a command input field.

[Embodiments of the invention]

Embodiments of the present invention are described below with reference to the drawings.

FIG. 3 is a diagram showing the configuration of one embodiment of the present invention, FIG. 4 is a diagram showing a screen image of a display to which the present invention is applied, and FIG. 5 is a diagram showing an example of the definition of function keys on a keyboard to which the present invention is applied. In FIG. 3, reference numerals 1 to 5 denote the same elements as in FIG. 1.

As shown in FIG. 3, the present invention removes the conventional handy panel and gives its functions to the function keys of the keyboard 3. Function keys can normally be defined freely by the user, so there is no problem in fixing their definitions for registration mode. FIG. 4 shows an example of function key definitions. In the example shown in FIG. 4, function key A1 is defined as the pause key, A3 as the prompt return key, A4 as the prompt forward key, B1 as the end key, and B2 as the re-registration key. Here, the end key is a key that has the function of ending the registration processing.
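The key assignment in this example can be written as a simple lookup table. The following is a minimal sketch: the key labels A1, A3, A4, B1, and B2 come from the text, while the `RegistrationCommand` enum, the `REGISTRATION_KEYMAP` name, and the `decode_function_key` helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class RegistrationCommand(Enum):
    """Commands available in registration mode (names are illustrative)."""
    PAUSE = "pause"
    PROMPT_BACK = "prompt_back"
    PROMPT_FORWARD = "prompt_forward"
    END = "end"
    REREGISTER = "reregister"


# Key labels follow the example in the text; the mapping structure is an assumption.
REGISTRATION_KEYMAP = {
    "A1": RegistrationCommand.PAUSE,
    "A3": RegistrationCommand.PROMPT_BACK,
    "A4": RegistrationCommand.PROMPT_FORWARD,
    "B1": RegistrationCommand.END,
    "B2": RegistrationCommand.REREGISTER,
}


def decode_function_key(key_label):
    """Return the command bound to a function key, or None if the key is undefined."""
    return REGISTRATION_KEYMAP.get(key_label)
```

Keeping the assignment in one table means the same keys could be bound differently outside registration mode, which matches the remark that function keys are normally user-definable.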

Furthermore, in the present invention, learning processing is enabled only after registration processing has been completed for all words, so that dictionary quality is kept uniform. That is, this prevents the dictionary distortion that arises when learning is allowed while registration is only partially complete, in which a high recognition rate is obtained for the particular words that were trained while the recognition rate falls for words that received reverse corrections.
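The rule that learning becomes possible only after every word has been registered can be sketched as a simple gate. This is an illustrative sketch, not the patented implementation: the dictionary layout (a word mapped to a list of feature templates) and the treatment of learning as appending a template are assumptions, and `learning_allowed` and `handle_utterance` are hypothetical names.

```python
def learning_allowed(vocabulary, dictionary):
    """Learning is permitted only once every vocabulary word has an entry."""
    return all(word in dictionary for word in vocabulary)


def handle_utterance(word, features, vocabulary, dictionary):
    """Register a new word, or apply learning to an existing one if permitted."""
    if word not in dictionary:
        dictionary[word] = [features]      # initial registration always takes priority
        return "registered"
    if learning_allowed(vocabulary, dictionary):
        dictionary[word].append(features)  # assumed stand-in for dictionary correction
        return "learned"
    return "rejected"                      # learning is blocked until registration completes
```

With a two-word vocabulary, a second utterance of the first word is rejected until the second word has also been registered, after which the same utterance is accepted as learning.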

Table 1 shows the state transitions during registration in the speaker-dependent speech recognition device of the present invention, and Table 2 likewise shows the state transitions during learning.

As shown in Table 1 and Table 2, the present invention is realized by restricting key operations according to the mode.

Next, the voice registration screen image to which the present invention is applied is described with reference to FIG. 5. In FIG. 5, the utterance level indicates whether the operator's utterance is too loud or too soft, and the registration status tells the operator how many words have been registered out of the total number of words. The similar-utterance field tells the operator that a word close in distance to the registered word (a similar utterance) exists, and indicates that the reading should be changed.

Completeness indicates the degree of completion of the dictionary, and the candidate field displays a candidate when one that is close in distance exists during learning. The command field is used, among other things, as a command input field for moving to a prompt word, because it is difficult to move to an arbitrary word position using prompt forward and return alone.
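The fields of the registration screen can be grouped into a single record. The sketch below is only an illustration of the fields named in the text: the `RegistrationScreen` class, its attribute names, the "n/total" status format, and the amplitude thresholds in `classify_level` are all assumptions, since the source does not specify how the levels or values are computed.

```python
from dataclasses import dataclass, field


def classify_level(peak, low=0.1, high=0.9):
    """Classify a normalized peak amplitude; both thresholds are assumed values."""
    if peak < low:
        return "too soft"
    if peak > high:
        return "too loud"
    return "ok"


@dataclass
class RegistrationScreen:
    """State behind the registration screen fields described in the text."""
    total_words: int
    registered_words: int = 0
    utterance_level: str = "ok"        # result of classify_level
    similar_utterance: bool = False    # a close word exists, so a different reading is advised
    completeness: float = 0.0          # degree of completion of the dictionary
    candidates: list = field(default_factory=list)  # close candidates during learning
    command: str = ""                  # command input field, e.g. for moving to a prompt word

    def registration_status(self):
        """Registered words relative to the total, as shown to the operator."""
        return f"{self.registered_words}/{self.total_words}"


screen = RegistrationScreen(total_words=20, registered_words=5,
                            utterance_level=classify_level(0.95))
```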

FIG. 6 is a diagram explaining the flow of processing by the controller in registration mode. The controller has means for assigning meanings to the function keys, for example a utility function. Specifically, when a function key is pressed, a character string corresponding to that function key is reported to the controller. In this way, as described earlier, the controller gives the function keys of the keyboard the functions that the conventional handy panel provided. The flow of processing in registration mode is described below with reference to FIG. 6.

(1) Check whether all words have been registered. If yes, learning processing is performed; if no, step (2) is performed.

(2) Display the next utterance prompt on the display. Step (3) is performed next.

(3) Check whether an utterance was made or a function key was pressed. If an utterance was made, the uttered word is registered and the process returns to the earlier step shown in FIG. 6; if a function key was pressed, step (4) is performed.

(4) Check whether the key is a re-registration instruction. If yes, step (5) is performed; if no, step (6) is performed.

(5) Clear the contents of the registered dictionary. The process then returns to the earlier step shown in FIG. 6.

(6) Check whether the key is a prompt forward instruction.

(7) Check whether the operation requested by the prompt forward instruction is possible. If yes, step (8) is performed; if no, the process returns to the earlier step shown in FIG. 6.

(8) Advance the utterance prompt pointer to the next word. The process then returns to the earlier step shown in FIG. 6.

(9) Check whether the key is a prompt return instruction. If yes, step (10) is performed; if no, step (12) is performed.

(10) Check whether the operation requested by the prompt return instruction is possible. If yes, step (11) is performed; if no, the process returns to the earlier step shown in FIG. 6.

(11) Move the utterance prompt pointer back to the previous word. The process then returns to the earlier step shown in FIG. 6.

(12) Check whether the key is a termination instruction. If yes, step (14) is performed; if no, step (13) is performed.

(13) Perform pause processing. The process then returns to the earlier step shown in FIG. 6.

(14) Perform termination processing.

As described above, the processing functions in the controller basically consist of determining, when a particular function key is pressed, whether to enable or disable that key's function, and of restricting the operations that are possible next.
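The registration-mode flow can be sketched as a small event loop. This is a hedged illustration under several assumptions that the text does not settle: events arrive as ("utterance", features) or ("key", command) tuples, re-registration clears only the entry for the currently prompted word, a successful registration moves the prompt to the next unregistered word, and utterances are ignored while paused. The function names and command strings are hypothetical.

```python
def next_unregistered(prompts, dictionary, pointer):
    """Index of the next word without a dictionary entry, or the current one if none."""
    for offset in range(1, len(prompts) + 1):
        index = (pointer + offset) % len(prompts)
        if prompts[index] not in dictionary:
            return index
    return pointer


def run_registration(prompts, events):
    """Process events in registration mode and return (dictionary, outcome)."""
    dictionary, pointer, paused = {}, 0, False
    for kind, value in events:
        # All words registered: hand over to learning processing.
        if all(word in dictionary for word in prompts):
            return dictionary, "learning"
        word = prompts[pointer]              # the word currently prompted on the display
        if kind == "utterance":
            if not paused:                   # input is cut off while paused
                dictionary[word] = value
                pointer = next_unregistered(prompts, dictionary, pointer)
        elif value == "reregister":
            dictionary.pop(word, None)       # clear the entry, then prompt the word again
        elif value == "forward":
            if pointer < len(prompts) - 1:   # ignore the key when no next word exists
                pointer += 1
        elif value == "back":
            if pointer > 0:                  # ignore the key when no previous word exists
                pointer -= 1
        elif value == "end":
            return dictionary, "ended"
        else:                                # any remaining key is treated as pause
            paused = not paused
    done = all(word in dictionary for word in prompts)
    return dictionary, "learning" if done else "waiting"
```

The forward and back branches show the two-stage check from the flow: first identify the instruction, then decide whether the operation is currently possible before moving the prompt pointer.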

[Effect of the invention]

As is clear from the above description, according to the present invention, voice registration processing, which is an important factor determining the recognition rate of a speaker-dependent speech recognition device, can be performed efficiently and effectively without preparing special hardware such as a handy panel.

[Brief explanation of drawings]

FIG. 1 is a diagram showing an example configuration of a speech recognition device, FIG. 2 is a diagram showing an example configuration of a handy panel, FIG. 3 is a diagram showing the configuration of one embodiment of the present invention, FIG. 4 is a diagram showing a screen image of a display to which the present invention is applied, FIG. 5 is a diagram showing an example of the definition of function keys on a keyboard to which the present invention is applied, and FIG. 6 is a diagram explaining the flow of processing by the controller in registration mode.

1: controller, 2: display, 3: keyboard, 4: speech recognition unit, 5: microphone, 6: handy panel, 61: forward key, 62: return key, 63: re-registration key, 64: pause key.

Patent applicant: Fujitsu Ltd. Agent: patent attorney 京谷 四部

Claims

[Claims]

A speech recognition device comprising a speech recognition unit that performs speech recognition, a display, a keyboard, and a controller that controls overall input and output, characterized in that the keyboard is provided with function keys, and the controller is configured so that, in voice registration processing for creating voice features, it assigns to the function keys functions for the voice registration processing such as forwarding and returning prompts, pause, re-registration, and termination, gives initial registration in the voice registration processing a higher priority than learning processing that corrects or adds to a registered dictionary, and displays on the display the results and status of the learning and recognition processing together with a command input field.