JP2744039B2

JP2744039B2 - Voice recognition device

Info

Publication number: JP2744039B2
Application number: JP1013426A
Authority: JP
Inventors: 正一亀井; 正幸飯田; 宏樹大西; 真一鶴藤; 計美大倉
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1989-01-23
Filing date: 1989-01-23
Publication date: 1998-04-28
Anticipated expiration: 2013-04-28
Also published as: JPH02193196A

Description

DETAILED DESCRIPTION OF THE INVENTION

(a) Industrial Field of Application

The present invention relates to a speaker-dependent speech recognition device provided with registration means that is easy to operate.

(b) Prior Art

In a speaker-dependent speech recognition device, which the user operates after registering the recognition words in advance, the user must also enter in advance the characters, digits, or other data that are to be shown as the recognition result for each recognition word. Because an input speech pattern and its recognition-result display data form one associated pair, the conventional method was to key in the display data immediately after the speech registration of each word, one word at a time. With this method, however, the user has to alternate between speaking and key entry, which makes the operation cumbersome and increases the burden on the user.

(c) Problem to Be Solved by the Invention

If the speech patterns of all recognition words are registered first and the recognition-result display data is then keyed in as a batch, speech input and key input can be carried out separately and the operation becomes easier to understand. As the number of words grows, however, it becomes difficult to remember which word was spoken in which position, and this is the problem.

(d) Means for Solving the Problem

The speech recognition device according to the present invention is provided with: means for, when a recognition word is registered, analyzing the input speech and storing it as a speech pattern for recognition while at the same time storing data analyzed as a speech pattern for synthesis; means for synthesizing speech from the speech pattern for synthesis; and means for storing recognition-result display data, entered after the speech synthesis, at a position associated with the speech pattern for recognition. After the recognition words have been registered as a batch, the speech patterns for synthesis are synthesized in order from the first one, which makes it possible to enter the recognition-result display data corresponding to each recognition word.

(e) Operation

According to the present invention, the user does not need to remember which recognition word was registered in which position during speech registration. Speech input and key input of the recognition-result display data can each be carried out as a batch, so registration becomes less troublesome.

(f) Embodiment

FIG. 1 shows one embodiment of the speech recognition device of the present invention.

With reference to FIG. 1, the following describes the flow of processing when, at registration time, the speech registration of the recognition words and the key input of the recognition-result display data are each carried out as a batch.

Speech input from the microphone 11 is amplified by the amplifier 12 to a level at which the amplitude does not saturate, and is sent to both the speech analysis unit for recognition 13 and the speech analysis unit for synthesis 14, where it is analyzed into recognition parameters and synthesis parameters, respectively. The recognition parameters are stored in the standard speech pattern memory 15, and the synthesis parameters are stored in the synthesis pattern memory 17. This processing is repeated until all recognition words have been registered.
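The speech-only registration pass described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the patent does not say how units 13 and 14 analyze the signal, so `analyze_for_recognition`, `analyze_for_synthesis`, the `FRAME` length, and the `Memories` container are all hypothetical stand-ins. The only point shown is that both analyses of one utterance land at the same index in their respective memories.

```python
from dataclasses import dataclass, field

FRAME = 4  # hypothetical frame length in samples (not specified in the patent)


def analyze_for_recognition(samples):
    """Stand-in for analysis unit 13: mean absolute amplitude per frame."""
    frames = [samples[i:i + FRAME] for i in range(0, len(samples), FRAME)]
    return [sum(abs(s) for s in f) / len(f) for f in frames]


def analyze_for_synthesis(samples):
    """Stand-in for analysis unit 14: keep a copy of the samples for playback."""
    return list(samples)


@dataclass
class Memories:
    standard_patterns: list = field(default_factory=list)   # memory 15
    synthesis_patterns: list = field(default_factory=list)  # memory 17


def register_all(utterances, mem):
    """Speech-only pass: store both analyses of each word at the same index."""
    for samples in utterances:
        mem.standard_patterns.append(analyze_for_recognition(samples))
        mem.synthesis_patterns.append(analyze_for_synthesis(samples))
    return mem


# Two made-up utterances, already amplified without clipping.
mem = register_all(
    [[0.0, 0.5, -0.5, 1.0], [0.25, 0.25, 0.25, 0.25, 0.5, 0.5]],
    Memories(),
)
```

Because entry i of both lists comes from the same utterance, a later step can play back entry i of the synthesis memory and know which recognition pattern it belongs to.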

Next, the synthesis patterns are sent to the speech synthesis unit 20 in order from the start of the synthesis pattern memory 17, and each recognition word is output as synthesized speech. After confirming the synthesized sound, the user enters the recognition-result display data from the key input unit 18, and the entered data is stored in the memory 18a as the display data for the recognition result. In this way, the user confirms by ear the synthesis pattern output by the speech synthesis unit 20 as synthesized sound, while also checking the display unit 19.

This processing is repeated in the manner described above until the recognition-result display data has been entered for all recognition words.
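The playback-and-label pass can be sketched as a simple loop. The sketch below assumes that playback and key entry are supplied as callables; `play`, `read_keys`, and `label_all` are hypothetical names introduced only for illustration, and the returned list stands in for the memory 18a.

```python
def label_all(synthesis_patterns, play, read_keys):
    """Replay each registered word in order and collect its display data.

    Returns a list standing in for memory 18a; entry i belongs to the
    recognition pattern that was registered at index i.
    """
    display_data = []
    for index, pattern in enumerate(synthesis_patterns):
        play(pattern)                          # speech synthesis unit 20
        display_data.append(read_keys(index))  # key input unit 18
    return display_data


# Stand-ins: "playback" records what was played, key entry comes from a script.
played = []
scripted_keys = iter(["1", "2", "STOP"])
labels = label_all(
    ["pattern-one", "pattern-two", "pattern-stop"],
    play=played.append,
    read_keys=lambda index: next(scripted_keys),
)
```

Since the loop walks the synthesis memory from the first entry, the user never has to recall the order in which the words were spoken; hearing each word is enough to know which display data to enter.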

At recognition time, speech input from the microphone 11 is amplified by the amplifier 12 to a level at which the amplitude does not saturate, and is analyzed by the speech analysis unit for recognition 13 to create an input speech pattern. The matching unit 16 performs pattern matching between this input speech pattern and the standard speech patterns in the standard speech pattern memory 15, finds the standard speech pattern at the smallest distance, and thereby determines the recognition word. The display unit 19 then displays the recognition-result display data for that recognition word.
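The matching step amounts to a nearest-template search followed by a lookup of the associated display data. The patent does not name the distance measure used by the matching unit 16, so the sketch below assumes plain Euclidean distance over equal-length parameter vectors; `distance`, `recognize`, and the sample data are illustrative only.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length in this sketch")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(input_pattern, standard_patterns, display_data):
    """Return the display data of the standard pattern nearest to the input."""
    best = min(
        range(len(standard_patterns)),
        key=lambda i: distance(input_pattern, standard_patterns[i]),
    )
    return display_data[best]


# Made-up contents for the standard pattern memory and the display data memory.
standard_patterns = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
display_data = ["ON", "OFF", "STOP"]
result = recognize([0.1, 0.9, 0.2], standard_patterns, display_data)  # "OFF"
```

The index of the winning standard pattern is what ties recognition back to registration: the display data is read from the same position at which the pattern was stored.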

(g) Effects of the Invention

As described above, according to the present invention, the speech pattern input for the recognition words and the key input of the recognition-result display data can each be carried out as a batch at registration time, so registration becomes less troublesome.

In addition, because listening to the synthesized sound lets the user confirm that the input speech was registered correctly, updating of the registered speech can be dispensed with, which reduces the burden on the user.

[Brief description of the drawings]

FIG. 1 is a block diagram showing one embodiment of the speech recognition device according to the present invention.

11: microphone
12: amplifier
13: speech analysis unit for recognition
14: speech analysis unit for synthesis
15: standard speech pattern memory
16: pattern matching unit
17: speech pattern memory for synthesis
18: display data input unit
19: display unit
20: speech synthesis unit

Continuation of front page
(72) Inventor: Shinichi Tsuruto, c/o Sanyo Electric Co., Ltd., 2-18 Keihan-Hondori, Moriguchi-shi, Osaka
(72) Inventor: Mitsumi Okura, c/o Sanyo Electric Co., Ltd., 2-18 Keihan-Hondori, Moriguchi-shi, Osaka
(56) References: JP-A-63-292196 (JP, A); JP-U-59-147243 (JP, U)

Claims

(57) [Claims]

A speaker-dependent speech recognition device having speech input means and speech analysis means, comprising: memory means for storing speech patterns for recognition at the time of speech registration; memory means for storing speech data for synthesizing the input speech corresponding to each speech pattern; memory means for storing display data for recognition results; means for analyzing input speech into a speech pattern for recognition at the time of speech registration and, at the same time, analyzing and storing speech data for synthesis; means for performing speech synthesis on the stored speech data for synthesis; and means for storing display data of a recognition result, input after the speech synthesis, at a position associated with the speech pattern for recognition; wherein the input speech is recorded when each recognition word is registered, and, once registration of all recognition words is complete, the display data of the recognition results is input while the recorded speech is reproduced in order from the beginning.