JPH03160499A

JPH03160499A - Speech recognizing device

Info

Publication number: JPH03160499A
Application number: JP1301133A
Authority: JP
Inventors: Shinichi Tsurufuji; 鶴藤　真一
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1989-11-20
Filing date: 1989-11-20
Publication date: 1991-07-10

Abstract

PURPOSE: To avoid erroneous recognition caused by noise, without controlling the volume of the audio equipment that is the noise source, by prohibiting the comparison processing of the identification section, or invalidating its output, when the speech pattern and the acoustic pattern match. CONSTITUTION: An acoustic analysis section 5 that analyzes ambient sound is provided separately from the speech analysis section 1 used for speech recognition, together with a determination means 7 that decides, from the degree of agreement between the sound captured for speech recognition and the sound output by the audio equipment, whether the sound input to the speech analysis section 1 is ambient sound. When the determination means 7 concludes that the speech analysis section 1 is analyzing ambient noise rather than speech, recognition is not performed on that speech pattern. This prevents ambient noise from being recognized as speech and reduces erroneous recognition.

Description

[Detailed Description of the Invention]

(a) Field of Industrial Application

The present invention relates to a speech recognition device that can recognize speech effectively even near a source of ambient noise, such as a TV, stereo, or other audio equipment that outputs speech or music.

(b) Prior Art

Conventional speech recognition devices placed near audio equipment such as a stereo are prone to frequent misrecognition, because the sound output by that equipment becomes ambient noise. In particular, when the equipment itself is to be operated through a speech recognition device, there has been no way to rule out the risk that speech or music emitted by the audio equipment triggers the speech recognition device on its own.

To prevent such malfunctions, a speech recognition device has been proposed that lowers the output volume of the audio equipment for the duration of speech input [Japanese Unexamined Patent Publication No. 63-29755]. Such a device, however, not only forces an interruption of the speech or music from the audio equipment, but also leaves the following problems.

Namely, the above device requires a switch operation to designate the speech input period. Even if the switch is dispensed with and the speech input period is instead detected from the sound pressure level at the microphone, the risk that sound from the audio equipment interferes with this detection cannot be eliminated completely.

(c) Problem to Be Solved by the Invention

The present invention provides a speech recognition device that avoids misrecognition caused by this noise without controlling the volume of the audio equipment that is the noise source.

(d) Means for Solving the Problem

The speech recognition device of the present invention has a speech analysis section that analyzes input speech, a first pattern creation section that creates a speech pattern from the speech analyzed by the speech analysis section, and an identification section that compares the speech pattern created by the first pattern creation section with standard speech patterns prepared in advance and outputs a recognition result. In addition, it is provided with an acoustic analysis section that analyzes ambient sound separately from the speech analysis section, a second pattern creation section that creates an acoustic pattern from the ambient sound analyzed by the acoustic analysis section, and a determination section that determines whether the acoustic pattern created by the second pattern creation section matches the speech pattern created by the first pattern creation section. When the determination section determines that the speech pattern and the acoustic pattern match, the comparison processing of the identification section is prohibited or its output is invalidated.

(e) Operation

In the speech recognition device of the present invention, an acoustic analysis section that analyzes ambient sound is provided separately from the speech analysis section used for speech recognition, together with a means for determining, from the degree of agreement between the sound captured for speech recognition and the sound output by the audio equipment, whether the sound input to the speech analysis section is ambient sound. When this determination means indicates that the speech analysis section is analyzing ambient noise rather than speech, recognition is not performed on the speech pattern obtained at that time, so misrecognition can be avoided.

(f) Embodiment

FIG. 1 is a block diagram of the configuration of the speech recognition device of the present invention. In the figure, 1 is the speech analysis section. It cuts out the speech time region from the acoustic signal picked up by a microphone 9 aimed at the speaker, and performs speech analysis on the signal in this region, such as spectral analysis or cepstral analysis based on frequency analysis. 2 is the first pattern creation section, which samples and normalizes the analysis result from the speech analysis section 1 to obtain a speech pattern expressed as a vector pattern, a cepstrum pattern, or the like. 3 is the speech pattern storage section, which holds a plurality of speech patterns, extracted in advance by analyzing standard speech of specific or unspecified speakers, as the standard patterns used during recognition. 4 is the identification section, which matches the speech pattern created by the first pattern creation section 2 against the standard patterns stored in the speech pattern storage section 3, one after another, and outputs the most similar one as the recognition result.

The configuration described so far is the same as that of a conventional device. The device of the present invention is characterized by the addition of the following components.

5 in the figure is the acoustic analysis section, which analyzes the speech or music output by audio equipment 8 such as a car stereo. It uses the same analysis method as the speech analysis section 1, namely spectral analysis or cepstral analysis.

6 is the second pattern creation section, which obtains an acoustic pattern from the sound analyzed by the acoustic analysis section 5, using the same sampling and normalization as the first pattern creation section 2.

7 is the speech determination section, which compares the two patterns created at the same time by the first pattern creation section 2 and the second pattern creation section 6 and decides whether the identification section 4 should perform its identification processing.

The operation of the device is explained below, focusing on the processing in the acoustic analysis section 5, the second pattern creation section 6, and the speech determination section 7.

First, the speech analysis section 1 and the acoustic analysis section 5 are described.

The speech analysis section 1 and the acoustic analysis section 5 have the same configuration and are sampled simultaneously at fixed intervals (for example, every 5 msec). As in the analysis section of a conventional device, each consists of an amplifier 101, a pre-emphasis stage 102, a bandpass filter 103, an AD converter 104, a speech detection circuit 105, a buffer 106, and so on, as shown in FIG. 2. In the speech analysis section 1, sound input from the microphone 9 is amplified by the amplifier 101, given high-frequency emphasis by the pre-emphasis stage 102, analyzed by the bandpass filter 103, and converted to digital values by the AD converter 104. The speech detection circuit 105 checks the digitized data for the beginning of an utterance and, when the segmentation threshold is exceeded, stores the data in the buffer 106. At the same time, it notifies the acoustic analysis section 5 that capture has started.
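The front-end steps just described (high-frequency emphasis, then threshold-based detection of where an utterance starts and ends) can be sketched in software. This is only an illustration under stated assumptions: the patent describes analog and digital circuits, and the names `pre_emphasis` and `detect_utterance`, the first-order filter form, the coefficient 0.97, and the use of per-frame levels are all choices made here, not taken from the source.

```python
from typing import List, Optional, Tuple


def pre_emphasis(samples: List[float], coeff: float = 0.97) -> List[float]:
    """First-order high-frequency emphasis: y[n] = x[n] - coeff * x[n-1].

    The filter form and the default coefficient are assumptions.
    """
    if not samples:
        return []
    return [samples[0]] + [
        samples[n] - coeff * samples[n - 1] for n in range(1, len(samples))
    ]


def detect_utterance(
    frame_levels: List[float], threshold: float
) -> Optional[Tuple[int, int]]:
    """Return (start, end) frame indices of the first run above threshold.

    `end` is exclusive. Returns None if no frame exceeds the threshold.
    """
    start: Optional[int] = None
    for i, level in enumerate(frame_levels):
        if start is None and level > threshold:
            start = i            # beginning of the utterance
        elif start is not None and level <= threshold:
            return (start, i)    # end of the utterance
    if start is not None:
        return (start, len(frame_levels))
    return None
```

With one level per 5 msec frame, `detect_utterance` marks the span whose frames would be stored in the buffer.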

The acoustic analysis section 5, which has basically the same configuration as the speech analysis section 1, responds to the capture-start notification from the speech analysis section 1 by receiving the acoustic signal output from the audio output device 8, analyzing it in the same way as described above, and storing the analysis data in its own buffer 106.

Next, while continuing its analysis, the speech analysis section 1 uses the speech detection circuit 105 to detect the end of the utterance. It keeps storing data in its buffer 106 until the end is detected, then stops capturing and notifies the acoustic analysis section 5 that capture has ended. On receiving this notification, the acoustic analysis section 5 stops analyzing the sound from the car stereo and stops storing data in its buffer 106.

In this way, the analysis periods of the speech analysis section 1 and the acoustic analysis section 5 are synchronized.
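The start and end signaling can be pictured as one loop that gates both buffers with the same decision. The following is a minimal sketch, assuming both channels are already available as parallel lists of frames; `capture_synchronized` and its parameters are hypothetical names introduced for illustration.

```python
from typing import List, Tuple

Frame = List[float]


def capture_synchronized(
    mic_frames: List[Frame],
    ref_frames: List[Frame],
    mic_levels: List[float],
    threshold: float,
) -> Tuple[List[Frame], List[Frame]]:
    """Capture the same time span from the microphone and the reference channel.

    Only the microphone level decides when capture starts and stops; the
    reference channel (the audio equipment output) simply follows.
    """
    mic_buf: List[Frame] = []
    ref_buf: List[Frame] = []
    capturing = False
    for mic, ref, level in zip(mic_frames, ref_frames, mic_levels):
        if not capturing and level > threshold:
            capturing = True   # start detected: the reference side starts too
        elif capturing and level <= threshold:
            break              # end detected: both captures stop together
        if capturing:
            mic_buf.append(mic)
            ref_buf.append(ref)
    return mic_buf, ref_buf
```

Because both buffers are filled inside the same branch, they always cover the same frames, which is the property the later pattern comparison relies on.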

Although the two analysis sections are shown as separate hardware, their digital circuitry can be shared on a time-division basis.

Next, the first pattern creation section 2 and the second pattern creation section 6 are described.

These pattern creation sections 2 and 6 can likewise reuse the pattern creation section of a conventional device, and both can be realized with the same configuration. The first pattern creation section 2 converts the analysis result from the speech analysis section 1 into a speech pattern suitable for speech recognition.

The analysis result from the acoustic analysis section 5, meanwhile, is converted into an acoustic pattern by the second pattern creation section 6, using the same method as the first pattern creation section 2. These pattern creation sections may therefore also share the same hardware on a time-division basis.
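Since both pattern creation sections apply the same sampling and normalization, a single routine can serve both channels. The text does not give the exact procedure, so the sketch below assumes linear resampling along the time axis to a fixed number of frames followed by peak scaling; `normalize_pattern` and the default of 16 frames are hypothetical.

```python
from typing import List

Frame = List[float]  # one analysis frame, e.g. bandpass filter outputs


def normalize_pattern(frames: List[Frame], n_out: int = 16) -> List[Frame]:
    """Resample a variable-length frame sequence to n_out frames and scale it.

    Frames are picked at evenly spaced positions along the time axis, then
    all values are divided by the largest absolute value in the pattern.
    """
    if not frames:
        raise ValueError("no frames captured")
    if n_out < 1:
        raise ValueError("n_out must be at least 1")
    last = len(frames) - 1
    if n_out == 1:
        picked = [frames[last // 2]]
    else:
        picked = [frames[round(i * last / (n_out - 1))] for i in range(n_out)]
    peak = max((abs(v) for f in picked for v in f), default=0.0) or 1.0
    return [[v / peak for v in f] for f in picked]
```

Calling the same function on the microphone buffer and on the reference buffer yields two patterns of identical shape, which makes a frame-by-frame comparison possible.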

Next, the speech determination section 7 is described. It matches the speech pattern created by the first pattern creation section 2 against the acoustic pattern created by the second pattern creation section 6. This matching is normally linear matching, which yields a similarity value for the two patterns. The similarity value is compared with a predetermined threshold to decide whether the input is speech.

That is, when the similarity value is below the threshold and the two patterns are therefore alike (the value is treated here as a distance, so a smaller value means a closer match), they are judged to match. The speech pattern can then be regarded as originating from the car stereo sound, so the processing of the identification section 4 is prohibited.

Alternatively, in this case the identification result output by the identification section 4 may be blocked and thereby invalidated. Conversely, when the similarity value is above the threshold and the two patterns are not alike, they are judged not to match. The speaker can then be regarded as having spoken into the microphone, so a signal to that effect is sent to the identification section 4, which performs its identification operation. Because the identification section 4 is disabled or its output invalidated in this way, no recognition takes place, however loud the output of the audio equipment 8 and however much of it leaks into the microphone 9, until the microphone 9 receives some other input.
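The decision rule can be written compactly. The sketch below follows the threshold description above in treating the comparison value as a distance (smaller means more alike); the mean absolute difference, the names `pattern_distance`, `is_ambient`, and `gated_recognize`, and the `recognize` callback are assumptions made for illustration.

```python
from typing import Callable, List, Optional

Pattern = List[List[float]]


def pattern_distance(a: Pattern, b: Pattern) -> float:
    """Mean absolute difference between two equally sized patterns."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same number of frames")
    total, count = 0.0, 0
    for frame_a, frame_b in zip(a, b):
        for va, vb in zip(frame_a, frame_b):
            total += abs(va - vb)
            count += 1
    return total / count if count else 0.0


def is_ambient(speech: Pattern, acoustic: Pattern, threshold: float) -> bool:
    """True when the microphone pattern matches the audio equipment pattern."""
    return pattern_distance(speech, acoustic) < threshold


def gated_recognize(
    speech: Pattern,
    acoustic: Pattern,
    threshold: float,
    recognize: Callable[[Pattern], str],
) -> Optional[str]:
    """Run the recognizer only when the input is not judged to be ambient sound."""
    if is_ambient(speech, acoustic, threshold):
        return None  # identification prohibited (or its output discarded)
    return recognize(speech)
```

Returning `None` here stands for both variants in the text: skipping the identification step and discarding its output look the same to the caller.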

Thus, only after the determination section 7 has indicated that the input is speech does the identification section 4 take the patterns from the speech pattern storage section 3 one by one, perform matching, and output the most similar pattern as the recognition result.

A concrete example is given below.

The premise of this example is that the object controlled by speech recognition is the car stereo itself, that is, the audio equipment 8 that emits the ambient noise. The acoustic signal from the car stereo may be fed to the acoustic analysis section 5 through a microphone placed near the car stereo's speakers, but if the audio output terminal of the car stereo is wired directly to the input of the acoustic analysis section 5, that microphone is unnecessary.

First, consider the case in which music from the car stereo (8) causes the speech analysis section 1 to cut out a segment.

If the amplifier of the speech analysis section 1 is set to a level at which the sound pressure of ordinary music is not taken for a speech time region and cut out, the likelihood of analyzing music is reduced. With this setting, when a loud sound is output from the car stereo (8), it enters the microphone 9 set up for the driver; the speech analysis section 1 starts capturing the sound from the car stereo (8), and the acoustic analysis section 5 starts capturing at the same time.

Analysis then continues while checking for the end of the utterance, and once the end is detected the speech determination section 7 makes its decision. In this case the only input to the speech analysis section 1 is the car stereo 8, so the speech pattern created by the first pattern creation section 2 and the acoustic pattern created by the second pattern creation section 6 match, the degree of similarity is high, and the speech determination section 7 decides that the input is not the driver's speech intended for recognition. The two patterns do not match perfectly, because of the time lag before the output of the car stereo (8) reaches the microphone 9, but this error can be reduced by adding a time delay circuit to the acoustic analysis section 5.

The error can also be absorbed by using linear shift matching or a similar method for the matching in the speech determination section 7.
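One way to picture shift matching is to compare the two patterns at several small frame offsets and keep the best result. The patent names the technique without detailing it, so the sketch below is an assumption-laden illustration: `shifted_distance`, the `max_shift` parameter, and the mean absolute difference are all introduced here.

```python
from typing import List

Pattern = List[List[float]]


def shifted_distance(speech: Pattern, acoustic: Pattern, max_shift: int = 2) -> float:
    """Smallest mean absolute difference over frame shifts in [-max_shift, max_shift].

    For each shift, speech frame i is compared with acoustic frame i + shift,
    using only the frames where both exist.
    """
    best = float("inf")
    for shift in range(-max_shift, max_shift + 1):
        total, count = 0.0, 0
        for i, frame_s in enumerate(speech):
            j = i + shift
            if 0 <= j < len(acoustic):
                for vs, va in zip(frame_s, acoustic[j]):
                    total += abs(vs - va)
                    count += 1
        if count:
            best = min(best, total / count)
    return best
```

Keeping `max_shift` small relative to the pattern length matters: a large shift leaves few overlapping frames, and a distance computed from very few frames can be misleadingly low.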

Next, consider the case in which the car stereo (8) is controlled by voice while it is outputting sound.

Assume that the speech pattern storage section 3 holds patterns such as "eject", "play", "stop", "rewind", and "fast forward", analyzed in advance for voice control of the car stereo. When the driver says "stop" into the microphone 9, the speech analysis section 1 analyzes a mixture of, for example, music from the car stereo (8) and the driver's utterance "stop" picked up by the microphone 9. The speech detection circuit 105 of the speech analysis section 1 detects the beginning of "stop" and starts capturing, and at the same time notifies the acoustic analysis section 5 that capture has started. The acoustic analysis section 5 then analyzes and captures the output sound of the car stereo (8) until the speech detection circuit 105 of the speech analysis section 1 detects the end of the utterance.

When analysis in the speech analysis section 1 and the acoustic analysis section 5 has finished, the results are converted into patterns by the first pattern creation section 2 and the second pattern creation section 6, respectively.

The speech determination section 7 then decides whether the converted pattern is speech. Because the pattern analyzed by the speech analysis section 1 contains both the driver's voice and the sound of the car stereo (8), it does not match the pattern analyzed by the acoustic analysis section 5. The speech determination section 7 therefore concludes that the driver has spoken and sends the identification section 4 a signal to start identification. The identification section 4 first fetches the "eject" pattern from the speech pattern storage section 3 and matches against it, then continues with "play", "stop", and so on through the last pattern, and outputs "stop", which has the highest similarity, as the recognition result. The car stereo (8) can thus be stopped as the driver intended.
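The identification step amounts to nearest-template selection. The sketch below is a minimal illustration, assuming every stored pattern has the same shape as the input and expressing "highest similarity" as the smallest summed absolute difference; the function name `recognize` and the toy templates in the usage note are hypothetical.

```python
from typing import Dict, List, Optional

Pattern = List[List[float]]


def recognize(pattern: Pattern, templates: Dict[str, Pattern]) -> Optional[str]:
    """Return the word whose stored pattern is closest to the input pattern.

    Templates are tried in storage order; the first one with the smallest
    summed absolute difference wins. Returns None if no templates are stored.
    """
    best_word: Optional[str] = None
    best_dist = float("inf")
    for word, template in templates.items():
        dist = sum(
            abs(x - y)
            for frame_p, frame_t in zip(pattern, template)
            for x, y in zip(frame_p, frame_t)
        )
        if dist < best_dist:
            best_word, best_dist = word, dist
    return best_word
```

For instance, with toy one-frame templates `{"eject": [[0.0, 0.0]], "play": [[1.0, 0.0]], "stop": [[0.0, 1.0]]}`, the input `[[0.1, 0.9]]` is closest to "stop".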

In the above, the pattern created by the first pattern creation section 2 was used as is. If instead the pattern created by the second pattern creation section is subtracted from the pattern created by the first pattern creation section 2, and the result is used by the identification section 4, the noise component cancels out and a pattern containing only the speech component is obtained, giving still higher recognition performance.
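The subtraction can be sketched element by element. This is a rough illustration only: it assumes both patterns hold band energies on the same scale and are time-aligned, and the clipping of negative results at a floor is an assumption added here that the source does not mention. `subtract_pattern` is a hypothetical name.

```python
from typing import List

Pattern = List[List[float]]


def subtract_pattern(speech: Pattern, acoustic: Pattern, floor: float = 0.0) -> Pattern:
    """Subtract the acoustic (noise) pattern from the microphone pattern.

    Values that would fall below `floor` are clipped to it, since a negative
    band energy has no physical meaning (assumed band-energy patterns).
    """
    if len(speech) != len(acoustic):
        raise ValueError("patterns must have the same number of frames")
    return [
        [max(s - a, floor) for s, a in zip(frame_s, frame_a)]
        for frame_s, frame_a in zip(speech, acoustic)
    ]
```

In practice the reference signal would also need gain matching against what the microphone actually picks up, which this sketch leaves out.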

The description above covers the case in which the device under voice control is itself the source of ambient acoustic noise, but the present invention is not limited to this. It can also be applied, for example, to voice control of power window opening and closing while a car stereo is playing.

(g) Effects of the Invention

According to the speech recognition device of the present invention, ambient noise is not recognized unless the speaker actually provides speech input, so misrecognition can be reduced.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of the speech recognition device of the present invention, and FIG. 2 is a diagram showing the configuration of the main part of the device of FIG. 1.

1: speech analysis section; 2: first pattern creation section; 3: speech pattern storage section; 4: identification section; 5: acoustic analysis section; 6: second pattern creation section; 7: speech determination section; 8: audio output device; 9: microphone.

Claims

[Claims]

(1) A speech recognition device having a speech analysis section that analyzes input speech, a first pattern creation section that creates a speech pattern from the speech analyzed by the speech analysis section, and an identification section that compares the speech pattern created by the first pattern creation section with standard speech patterns prepared in advance and outputs a recognition result, the device being characterized by further comprising: an acoustic analysis section that analyzes ambient sound separately from the speech analysis section; a second pattern creation section that creates an acoustic pattern from the ambient sound analyzed by the acoustic analysis section; and a determination section that determines whether the acoustic pattern created by the second pattern creation section matches the speech pattern created by the first pattern creation section, wherein, when the determination section determines that the speech pattern and the acoustic pattern match, the comparison processing of the identification section is prohibited or its output is invalidated.

(2) The speech recognition device according to claim 1, wherein the analysis time domain of the acoustic analysis section is synchronized with the analysis time domain of the speech analysis section.

(3) A speech recognition device characterized in that the output signal of the audio equipment that affects the speech input to the first speech analysis section is input directly to the acoustic analysis section.