JPH04155400A

JPH04155400A - Voice recognition device

Info

Publication number: JPH04155400A
Application number: JP2281020A
Authority: JP
Inventors: Shinichi Tsurufuji; 鶴藤　真一
Original assignee: Sanyo Electric Co Ltd
Current assignee: Sanyo Electric Co Ltd
Priority date: 1990-10-18
Filing date: 1990-10-18
Publication date: 1992-05-28

Abstract

PURPOSE: To prevent a voice pattern distorted by ambient noise from being registered, even in a voice registering mode under a noisy ambiance, by prohibiting the operation of any one of the microphone, the sound analyzer, the voice pattern generator, or the registered voice pattern memory when the input sound level of the sound input member is larger than a specific value while the registering mode is set. CONSTITUTION: While a voice analyzer 2 is extracting the voice, a sound input member 7 picks up the ambient noise at every specific time interval; when the voice input is finished, the voice analyzer 2 notifies the sound input member 7 and a deciding member 8. The deciding member 8 decides whether there is noise or not: the sound level from a terminal 9 is compared with a specific threshold level, and when the sound level is higher than the threshold level, a prohibition signal is generated. When there is no ambient noise input, no prohibition signal is generated, so a pattern generator 3 produces a pattern of the analyzed voice, which is stored in a registered voice pattern memory 5. When there is ambient noise input, a prohibition signal is generated and the operation of the generator 3 is prohibited. When the voice registering mode is completed, the voice recognition mode is set by a manual operation using a mode setting member 4, and processing transfers to the voice recognition processing of a recognition processor 6.

Description

DETAILED DESCRIPTION OF THE INVENTION
(A) Field of Industrial Application
The present invention relates to a speech recognition device that recognizes speech in the presence of ambient sound that acts as noise.

(B) Prior Art
Most speech recognition devices currently in practical use are speaker-dependent, and such devices require the specific utterances of a specific speaker to be registered in advance. That is, a large number of utterances are first converted into patterns and registered in a voice registration mode. In the subsequent voice recognition mode, the pattern of an input utterance is compared with the many registered voice patterns, and recognition is performed by finding the registered utterance most similar to the input.
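The matching step described above can be sketched as a nearest-pattern search. This is a minimal illustration, not the device's actual method: the text does not say how similarity is measured, so the Euclidean distance, the fixed-length feature vectors, and the names `distance`, `recognize`, and `registered_patterns` are all assumptions.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(input_pattern, registered):
    """Return the label of the registered pattern most similar to input_pattern."""
    if not registered:
        raise ValueError("no registered patterns")
    return min(registered, key=lambda label: distance(input_pattern, registered[label]))


# Hypothetical patterns stored during the registration mode.
registered_patterns = {
    "radio": [0.9, 0.1, 0.3, 0.2],
    "phone": [0.2, 0.8, 0.7, 0.1],
}

# An utterance input during the recognition mode.
print(recognize([0.85, 0.15, 0.25, 0.2], registered_patterns))
```

Because recognition depends entirely on the stored patterns, a pattern registered under noise degrades every later comparison against it.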

Consequently, if any sound is present around the speaker during voice input in the voice recognition mode, that sound becomes noise, and recognition must be performed on speech buried in this noise. There was therefore a concern that misrecognition would occur frequently.

The adverse effect of such noise is a serious problem not only for voice input for recognition in the voice recognition mode, but also for voice input for registration in the voice registration mode. If a voice distorted by noise is registered in the voice registration mode, then in the subsequent voice recognition mode the utterance ultimately cannot be recognized accurately, even if a noise-free situation is chosen and the utterance is input carefully and accurately.

(C) Problem to Be Solved by the Invention
The present invention has been made in view of the above drawbacks of the prior art, and provides a speech recognition device that can prohibit the operation of any one of the microphone, the voice analysis section, the voice pattern creation section, or the registered voice pattern memory, even in the voice registration mode in a noisy environment.

(D) Means for Solving the Problem
The speech recognition device of the present invention comprises: a microphone for inputting voice; a voice analysis section that analyzes the voice input from the microphone; a voice pattern creation section that creates a voice pattern in which the features of the voice are patterned on the basis of the analysis result of the voice analysis section; a registered voice pattern memory that stores, as a registered voice pattern, the voice pattern obtained from the voice pattern creation section for voice input to the microphone in the voice registration mode; a recognition processing section that, for voice input to the microphone in the recognition mode, performs pattern recognition of the voice pattern obtained from the voice pattern creation section on the basis of the registered voice patterns in the registered voice pattern memory; a mode setting section that selects and sets a mode, such as the voice registration mode for storing registered voice patterns in the registered voice pattern memory or the voice recognition mode for performing voice recognition based on the registered voice patterns; and, in addition, a sound input section that inputs ambient sound. When the input sound level of the sound input section is higher than a predetermined value while the mode setting section has set the registration mode, the operation of any one of the microphone, the voice analysis section, the voice pattern creation section, or the registered voice pattern memory is prohibited.

(E) Operation
According to the speech recognition device of the present invention, when the input sound level of the sound input section is higher than a predetermined value while the mode setting section has set the registration mode, the operation of any one of the microphone, the voice analysis section, the voice pattern creation section, or the registered voice pattern memory is prohibited, so that the creation or storage of a registered voice pattern is prohibited.

(F) Embodiment
FIG. 1 is a block diagram showing the configuration of the speech recognition device of the present invention. In the figure, 1 is a microphone, 2 is a voice analysis section that analyzes the voice input from the microphone 1, and 3 is a pattern creation section that creates a pattern on the basis of the analysis result of the voice analysis section 2.

4 is a mode setting section that sets whether the pattern created by the pattern creation section 3 is a pattern of a voice for registration or of a voice for recognition.

5 is a registered voice pattern memory that stores the pattern created by the pattern creation section 3 as a standard pattern when the operation mode set by the mode setting section 4 is the registration mode. 6 is a recognition processing section that performs recognition processing on the input voice when the operation mode set by the mode setting section 4 is the recognition mode.

7 is a sound input section that inputs ambient noise. For example, when the sound output of a car stereo constitutes the ambient noise, the output from its speakers may be detected by a microphone and input to the sound input section 7; alternatively, the output signal of the audio output terminal 9 of the car stereo can be input to the sound input section 7.

8 is a determination section that judges the validity of a pattern according to the state of the sound input section 7. In this case, it determines whether the input sound level of the sound input section 7 is higher than a predetermined value and controls the operation of the pattern creation section 3 on the basis of the result.

The operation in the registration mode of the speech recognition device of the present invention configured as above is described below.

Before performing voice recognition with this device, the user must register a plurality of voice patterns in the registered voice pattern memory 5 in advance. The user therefore uses the mode setting section 4 to set the voice registration mode manually (for example, by operating a switch).

Once the registration mode has been set in this way, the user speaks into the microphone 1 the utterance that the device is to recognize. The voice analysis section 2 then analyzes the voice from the microphone 1 at fixed intervals (for example, every 5 msec), for example with an 8-channel filter bank, and also extracts the voice segment.
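The frame-by-frame analysis mentioned above can be illustrated with a short sketch. The text only names a 5 msec interval and an 8-channel filter bank; everything else here is an assumption made for illustration, including the 8 kHz sampling rate, the use of DFT band energies as a stand-in for the filter bank, and the names `band_energies` and `analyze`.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz; assumed, the text gives no sampling rate
FRAME_MS = 5         # analysis interval given in the text
NUM_CHANNELS = 8     # filter-bank channels given in the text


def band_energies(frame, num_channels=NUM_CHANNELS):
    """Return the energy in num_channels contiguous frequency bands of one frame."""
    n = len(frame)
    # Power of DFT bins 1..n//2 (the DC bin is skipped), computed naively.
    power = []
    for k in range(1, n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        power.append(abs(acc) ** 2)
    # Band edges that spread all bins as evenly as possible over the channels.
    edges = [c * len(power) // num_channels for c in range(num_channels + 1)]
    return [sum(power[edges[c]:edges[c + 1]]) for c in range(num_channels)]


def analyze(samples, sample_rate=SAMPLE_RATE, frame_ms=FRAME_MS):
    """Split samples into consecutive frames and analyze each complete frame."""
    frame_len = sample_rate * frame_ms // 1000
    return [
        band_energies(samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


# A 1 kHz tone lasting 10 ms yields two 5 ms frames at the assumed sampling rate.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(80)]
frames = analyze(tone)
print(len(frames), [round(e, 1) for e in frames[0]])
```

Each frame's eight band energies form one feature vector; a sequence of such vectors is the kind of analysis result from which a pattern creation stage could build a voice pattern.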

In response to the voice analysis section 2 starting to extract the voice, the sound input section 7 starts capturing ambient noise.

The sound input section 7 continues to capture ambient noise at fixed intervals (for example, every 5 msec) until the voice analysis section 2 finishes the extraction. When the voice input ends, the voice analysis section 2 notifies the sound input section 7 and the determination section 8. On receiving this notification, the determination section 8 determines whether noise was present in the ambient sound input to the sound input section 7 during the voice input period. Specifically, the sound level obtained from the terminal 9 is compared with a predetermined threshold level, and a prohibition signal is generated when the sound level exceeds the threshold.
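The threshold decision can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say whether the peak, the average, or some other statistic of the captured levels is compared with the threshold, so the sketch assumes that any single level above the threshold triggers the prohibition signal. The class name `DecisionUnit` and its methods are hypothetical.

```python
class DecisionUnit:
    """Collects ambient sound levels during an utterance and decides at its end."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.levels = []

    def capture(self, level):
        """Record one ambient level sample (called once per capture interval)."""
        self.levels.append(level)

    def end_of_utterance(self):
        """Return True (prohibition signal) if any captured level exceeded the threshold."""
        prohibit = any(level > self.threshold for level in self.levels)
        self.levels = []  # start fresh for the next utterance
        return prohibit


unit = DecisionUnit(threshold=0.5)
for level in (0.1, 0.2, 0.1):
    unit.capture(level)
print(unit.end_of_utterance())  # quiet surroundings

for level in (0.1, 0.9, 0.2):
    unit.capture(level)
print(unit.end_of_utterance())  # one sample above the threshold
```

Deferring the decision until the utterance ends mirrors the description, in which the determination is made only after the end of voice input has been notified.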

If the determination section 8 determines that no ambient noise strong enough to adversely affect voice recognition was input to the sound input section 7 during the voice input period, no prohibition signal is generated. The voice analyzed by the voice analysis section 2 is therefore treated as valid, the pattern creation section 3 creates a pattern of the analyzed voice, and this pattern is stored in the registered voice pattern memory 5.

If, on the other hand, it is determined that ambient noise strong enough to adversely affect voice recognition was input to the sound input section 7 during the voice input period, a prohibition signal is generated, and this signal prohibits the operation of the voice pattern creation section 3.

As a result, the analysis result produced by the voice analysis section 2 is invalidated. At this point it is preferable to use a display means, a voice response means, or the like to inform the user that the input and storage of the registration voice was invalidated, and to prompt the user to input the voice again.
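The invalidate-and-retry behavior described above might look like the following sketch. The function name `register_with_retry`, the message wording, and the representation of each attempt as a `(pattern, noise_levels)` pair are all hypothetical; the text only states that the result is invalidated and that the user is preferably told to speak again.

```python
def register_with_retry(label, attempts, threshold, memory):
    """Try each (pattern, noise_levels) attempt in turn until one is clean.

    Returns the list of messages that would be shown or spoken to the user.
    """
    messages = []
    for pattern, noise_levels in attempts:
        if any(level > threshold for level in noise_levels):
            # Prohibition signal: discard this analysis result and ask again.
            messages.append(f"'{label}' was not registered because of noise; please speak again.")
            continue
        memory[label] = pattern
        messages.append(f"'{label}' registered.")
        break
    return messages


memory = {}
attempts = [
    ([0.7, 0.4, 0.1], [0.2, 0.9]),  # noisy attempt, rejected
    ([0.9, 0.1, 0.3], [0.1, 0.2]),  # clean attempt, stored
]
for message in register_with_retry("radio", attempts, threshold=0.5, memory=memory):
    print(message)
print(memory)
```

Only the clean attempt reaches the memory, so the stored pattern is never one that was captured while the ambient level exceeded the threshold.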

The above description illustrated a configuration in which the determination result of the determination section 8 prohibits the operation of the voice pattern creation section 3, but the storage operation of the registered voice pattern memory 5 may be prohibited instead. Depending on the case, the operation of the microphone 1 or of the voice analysis section 2 itself may be prohibited.

When the voice registration mode is completed as described above, the user uses the mode setting section 4 to set the voice recognition mode manually (for example, by operating a switch), and the device shifts to voice recognition processing.

(G) Effects of the Invention
According to the speech recognition device of the present invention, when the input sound level of the sound input section is higher than a predetermined value while the mode setting section has set the registration mode, the creation and storage of registered voice patterns by the device is prohibited, so the registration of voice patterns distorted by ambient noise can be avoided. Voice recognition can therefore use highly reliable registered voice patterns, and a drop in the recognition rate of the speech recognition device can be avoided.

[Brief explanation of the drawing]

FIG. 1 is a block diagram showing the configuration of the speech recognition device of the present invention.
1: microphone; 2: voice analysis section; 3: voice pattern creation section; 4: mode setting section; 5: registered voice pattern memory; 6: recognition processing section; 7: sound input section; 8: determination section; 9: terminal.

Claims

[Claims]

(1) A voice recognition device comprising: a microphone for inputting voice; a voice analysis section that analyzes the voice input from the microphone; a voice pattern creation section that creates a voice pattern in which the features of the voice are patterned on the basis of the analysis result of the voice analysis section; a registered voice pattern memory that stores, as a registered voice pattern, the voice pattern obtained from the voice pattern creation section for voice input to the microphone in a voice registration mode; a recognition processing section that, for voice input to the microphone in a recognition mode, performs pattern recognition of the voice pattern obtained from the voice pattern creation section on the basis of the registered voice patterns in the registered voice pattern memory; and a mode setting section that selects and sets a mode, such as the voice registration mode for storing registered voice patterns in the registered voice pattern memory or the voice recognition mode for performing voice recognition based on the registered voice patterns; the voice recognition device further comprising a sound input section that inputs ambient sound, and being characterized in that, when the input sound level of the sound input section is higher than a predetermined value while the mode setting section has set the registration mode, the operation of any one of the microphone, the voice analysis section, the voice pattern creation section, or the registered voice pattern memory is prohibited.