JPH04134397A

JPH04134397A - Voice recognizing device

Info

Publication number: JPH04134397A
Application number: JP2258058A
Authority: JP
Inventors: Etsuji Shuda; 周田　悦治
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1990-09-26
Filing date: 1990-09-26
Publication date: 1992-05-08

Abstract

PURPOSE: To prevent the same recognition error from being repeated by outputting, at the first recognition, the recognition word with the highest similarity from a rewritable memory and, at the n-th recognition, the recognition word with the highest similarity among those not already output as recognition results. CONSTITUTION: When a control part 8 determines from a recognition end signal supplied by a collator 4 that the current recognition is the first one, an output switch 12 is connected to a contact (a) so that the device outputs the highest-similarity recognition result stored in a memory 1. The result of the first recognition is also stored in a register 10 and compared with the highest-similarity result of the second recognition, namely the contents of the memory 1 of a rewritable memory 6. If they differ, the output switch 12 is connected to the contact (a). If they are equal, the output switch 12 is connected to a contact (b) to output a high-similarity recognition result different from that of the first recognition. Thus, the same erroneous recognition result is not output again when the user repeats the utterance.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application

The present invention relates to a voice recognition device that uses voice to control various kinds of equipment and to input data such as timer reservations for video tape recorders.

Prior Art

In recent years, as voice recognition technology has advanced, voice recognition devices have begun to be used to control equipment by voice while the user performs other tasks and, because of their ease of operation, to input data such as timer reservations for video tape recorders.

An example of such a conventional voice recognition device is described below with reference to the drawings.

FIG. 2 shows the configuration of a conventional voice recognition device.

In FIG. 2, 11 is a microphone that converts voice into an electrical signal, 12 is an amplifier, 13 is an analog-to-digital converter (hereinafter, A-D converter), 14 is a collator, 15 is a recognition word memory in which the patterns of the words to be recognized are registered, 16 is a rewritable memory that stores recognition results, 17 is a recognition button that starts recognition, and 18 is a control unit that controls recognition.

The operation of the conventional voice recognition device configured as above is described below.

First, when the recognition button 17 is operated to perform voice recognition, the control unit 18 instructs the collator 14 to start voice recognition. The spoken voice is converted into an electrical signal by the microphone 11 and amplified by the amplifier 12 to the amplitude required for collation. The amplified analog signal is converted into a digital signal by the A-D converter 13. The recognition word memory 15 holds the patterns of the words to be recognized, registered as digital signals. It may be a read-only memory whose contents are registered in advance, or a rewritable memory in which the user can register patterns at the time of use. For example, to recognize the digits 0 to 9, the recognition word memory 15 stores a digitized voice pattern for each of the digits 0 to 9. The collator 14 collates the voice input digitized by the A-D converter 13 with the data in the recognition word memory 15 and stores the resulting similarities in the rewritable recognition result memory 16. The word with the highest similarity is output as the recognition result for the spoken voice.
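The conventional behaviour described above can be sketched as follows. This is only an illustration under stated assumptions, not the circuit of FIG. 2: the patent does not say how similarity is computed, so the negative Euclidean distance, the two-element feature vectors, and the names `similarity`, `recognize_conventional`, and `word_memory` are all hypothetical.

```python
import math


def similarity(a, b):
    """Toy similarity (an assumption): negative Euclidean distance
    between two equal-length feature vectors. Higher means more alike."""
    return -math.dist(a, b)


def recognize_conventional(features, word_memory):
    """Collate the input against every registered pattern and return
    the word with the highest similarity, as the conventional device does."""
    scores = {
        word: similarity(features, pattern)
        for word, pattern in word_memory.items()
    }
    return max(scores, key=scores.get)


# Hypothetical registered patterns; "kyuu" (9) and "juu" (10) are close together.
word_memory = {"kyuu": [1.0, 0.0], "juu": [0.9, 0.1], "san": [0.0, 1.0]}

# A hypothetical utterance that lies slightly closer to "juu" than to "kyuu".
utterance = [0.94, 0.06]
result = recognize_conventional(utterance, word_memory)
```

Because the device always returns the top candidate, a speaker whose pronunciation consistently lands nearer the wrong pattern gets the same wrong word on every attempt.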

Problems to Be Solved by the Invention

However, the configuration described above cannot correct voice recognition errors. It is extremely difficult for voice recognition to achieve a 100% recognition rate, so a recognition device must be designed with a certain error rate in mind.

Some words, such as 9 (kyū) and 10 (jū), are pronounced very similarly, and depending on the user's pronunciation habits, the recognition result may differ from what was spoken no matter how many times the word is repeated.

In view of the above problems, the present invention provides an easy-to-use voice recognition device that does not repeat the same recognition error over and over: when recognition is performed more than once, the device has means for outputting, as the recognition result, a high-similarity word different from the one recognized first.

Means for Solving the Problems

To solve the above problems, the voice recognition device of the present invention comprises: rewritable memory means for storing similarities according to the degree of matching between the input voice and a recognition word memory in which a plurality of recognition words are registered; recognition result output means for outputting, at the first recognition, the recognition word with the highest similarity; recognition result output means for outputting, at the n-th recognition, the recognition word with the highest similarity other than the recognition words already output as recognition results; and reset means for returning to the first recognition.

Operation

With the configuration described above, the present invention has rewritable memory means for storing similarities according to the degree of matching between the input voice and the recognition word memory in which a plurality of recognition words are registered. At the first recognition, the recognition word with the highest similarity is output from the rewritable memory. At the n-th recognition, the recognition word with the highest similarity other than the recognition words already output as recognition results can be output. When recognition is to start again from the beginning, the reset means for returning to the first recognition restores the device to the start of recognition.

Embodiment

A voice recognition device according to an embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of a voice recognition device according to an embodiment of the present invention.

In FIG. 1, 1 is a microphone that converts voice into an electrical signal, 2 is an amplifier, 3 is an analog-to-digital converter (hereinafter, A-D converter), 4 is a collator, 5 is a recognition word memory in which the patterns of the words to be recognized are registered, 6 is a rewritable memory that stores recognition results, 7 is a recognition button that starts recognition, 8 is a control unit that controls recognition, 9 is a switch that passes on the recognition result with the highest similarity, 10 is a register that temporarily holds the recognition result with the highest similarity, 11 is a comparator, and 12 is an output switch that selects which recognition result is output.

The operation of the voice recognition device configured as above is described below.

First, when the recognition button 7 is operated to perform voice recognition, the control unit 8 instructs the collator 4 to start voice recognition. The spoken voice is converted into an electrical signal by the microphone 1 and amplified by the amplifier 2 to the amplitude required for collation. The amplified analog signal is converted into a digital signal by the A-D converter 3. The recognition word memory 5 holds the patterns of the words to be recognized, registered as digital signals. It may be a read-only memory whose contents are registered in advance, or a rewritable memory in which the user can register patterns at the time of use. For example, to recognize the digits 0 to 9, the recognition word memory 5 stores a digitized voice pattern for each of the digits 0 to 9. The collator 4 collates the voice input digitized by the A-D converter 3 with the data in the recognition word memory 5 and stores the resulting similarities in the rewritable recognition result memory 6. The results are encoded and stored in the rewritable memory 6 in descending order of similarity, starting from memory 1.
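The ranked storage in the rewritable memory 6 can be pictured as a list sorted by similarity, with index 0 playing the role of memory 1. The sketch below is an assumption-laden illustration: the similarity values and the names `rank_candidates` and `result_memory` are hypothetical, and the patent does not describe how the results are encoded.

```python
def rank_candidates(scores):
    """Return the recognition words ordered from highest to lowest
    similarity. Index 0 corresponds to 'memory 1' in the description."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ordered]


# Hypothetical similarity scores produced by the collator for one utterance.
scores = {"kyuu": 0.81, "juu": 0.86, "san": 0.12}

# Contents of the rewritable result memory after one recognition.
result_memory = rank_candidates(scores)
```

Keeping the whole ranking, rather than only the winner, is what later allows the device to fall back to the runner-up.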

The control unit 8 determines from the recognition end signal sent by the collator 4 whether this is the first voice recognition. If it is, the control unit sets the output switch 12 to position a, and the device outputs the highest-similarity recognition result, which is stored in memory 1. The control unit also closes the switch 9, stores the data of memory 1 in the register 10, and then opens the switch 9 again.

When voice recognition is performed again and the second recognition result has been stored in the rewritable memory 6 by the same process, the control unit 8 detects that this is the second recognition and activates the comparator 11. The first recognition result is held in the register 10, and it is compared with the highest-similarity recognition word of the second recognition, that is, the contents of memory 1 of the rewritable memory 6. If they differ, the output switch 12 is set to position a. If they are the same, the output switch 12 is set to position b, so that a high-similarity recognition result different from the first one is output. By increasing the number of registers 10 so that several past recognition results are stored, this operation can be repeated n times. The device can be returned to its initial state either by setting a limit on the number of repetitions in the control unit 8 or by pressing the switch 7.
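The retry behaviour of this embodiment, generalized to n attempts, can be sketched in software as follows. This is a hypothetical model rather than the hardware of FIG. 1: the list `previous_outputs` stands in for the register(s) 10, the membership test stands in for the comparator 11, and the class name, the `max_attempts` limit, and the fallback when every candidate has been used are assumptions introduced for illustration.

```python
class RetryAwareRecognizer:
    """Outputs the best candidate that has not been output before."""

    def __init__(self, max_attempts=3):
        # Assumed limit on retries before returning to the first recognition.
        self.max_attempts = max_attempts
        # Stands in for register(s) 10: words already output.
        self.previous_outputs = []

    def reset(self):
        """Return to the state of the first recognition."""
        self.previous_outputs.clear()

    def output(self, ranked_words):
        """ranked_words is ordered from highest to lowest similarity."""
        if not ranked_words:
            raise ValueError("ranked_words must not be empty")
        if len(self.previous_outputs) >= self.max_attempts:
            self.reset()
        # Comparator role: skip candidates that were already output.
        for word in ranked_words:
            if word not in self.previous_outputs:
                self.previous_outputs.append(word)
                return word
        # Assumed fallback: every candidate was used, so start over.
        self.reset()
        self.previous_outputs.append(ranked_words[0])
        return ranked_words[0]


ranked = ["juu", "kyuu", "san"]  # hypothetical ranking, best first
recognizer = RetryAwareRecognizer()
first = recognizer.output(ranked)   # top candidate (switch position a)
second = recognizer.output(ranked)  # runner-up (switch position b)
```

With a single stored result the loop reduces to the two-position switch of the embodiment: the top candidate is used when it differs from the stored word, and the runner-up is used when it matches.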

Effects of the Invention

As described above, the present invention comprises rewritable memory means for storing similarities according to the degree of matching between the input voice and a recognition word memory in which a plurality of recognition words are registered, recognition result output means for outputting the recognition word with the highest similarity at the first recognition, recognition result output means for outputting, at the n-th recognition, the recognition word with the highest similarity other than the recognition words already output as recognition results, and reset means for returning to the first recognition. This configuration provides the excellent effect that the same erroneous recognition result is no longer output when the user repeats an utterance.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing a timer reservation device according to an embodiment of the present invention, and FIG. 2 is a block diagram of a conventional timer reservation device.

1: microphone, 2: amplifier, 3: A-D converter, 4: collator, 5: recognition word memory, 6: recognition result memory, 7: recognition switch, 8: control unit, 9: switch, 10: register, 11: comparator, 12: output switch.

Claims

A voice recognition device for recognizing the voice of a plurality of words, comprising: rewritable memory means for storing similarities according to the degree of matching between an input voice and a recognition word memory in which a plurality of recognition words are registered; recognition result output means for outputting, at the first recognition, the recognition word with the highest similarity; recognition result output means for outputting, at the n-th recognition, the recognition word with the highest similarity other than the recognition words already output as recognition results; and reset means for returning to the first recognition.