JPS60179798A

JPS60179798A - Voice recognition equipment

Info

Publication number: JPS60179798A
Application number: JP59036446A
Authority: JP
Inventors: 文雄前原
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1984-02-27
Filing date: 1984-02-27
Publication date: 1985-09-13
Also published as: JPH0449955B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

Field of Industrial Application

The present invention relates to a speech recognition device.

Conventional Configuration and Its Problems

Conventionally, a speech recognition device takes the feature vector sequence obtained by analyzing the input speech signal and compares it against a plurality of standard pattern vector sequences registered in advance in the device as a dictionary. The standard pattern at the smallest distance from the input is taken as the recognition result. Misrecognition occurs, however, when the utterance level at the time the speech parameters were registered to create the standard patterns differs from the utterance level at the time of recognition.

To address this, conventional speech recognition devices generally display the level of the input speech using a level meter, an LED array, or the like.

With this display method, however, the operator has to remember the level meter reading from every registration, so it cannot be called an effective way of matching levels.

Object of the Invention

In view of the above drawback, an object of the present invention is to provide a speech recognition device that makes the utterance level uniform between registration and recognition and thereby improves the recognition rate.

Structure of the Invention

To achieve the above object, the present invention comprises: display means for displaying the level of the input speech; parameter analysis means for analyzing the input speech; storage means for storing previously analyzed parameters as standard patterns; pattern comparison means for calculating the distance between the parameters of the input speech analyzed by the parameter analysis means and the standard parameters in the storage means, and for taking the standard pattern that gives the minimum distance as the recognition result; and power calculation means for calculating the power of each standard pattern when the standard patterns are registered. The maximum and minimum values, or the average value, of the power of the standard patterns at registration are displayed in the vicinity of the input level display means.

Description of the Embodiment

An embodiment of the present invention is described below with reference to the drawing.

The figure is a block diagram of a speech recognition device according to an embodiment of the present invention. In the figure, 1 is the input signal and 2 is a level display unit that displays the level of the input speech. 3 is a parameter analysis unit that analyzes the input speech and successively converts it into a parameter vector sequence; a filter bank, a Fourier transformer, a linear prediction coefficient analyzer, or the like is generally used for this. 4 is a switch that is set to the B side when standard patterns are created and to the A side when patterns are compared. 5 is a pattern storage unit that stores the parameter vector sequences created by the parameter analysis unit 3 as standard patterns.

6 is a pattern comparison unit that compares the standard patterns stored in the pattern storage unit 5 with the input pattern and outputs the standard pattern giving the minimum distance to signal line 7 as the recognition result. 8 is a power calculation unit that calculates the average power of each standard pattern as it is created and determines the maximum and minimum of those values. 9 is a range display unit that displays the maximum and minimum values, or the average value, of the standard pattern average power determined by the power calculation unit 8, either at the corresponding position on the level display unit or as numerical values.

Next, the operation of the device configured as described above is explained separately for standard pattern creation and for pattern comparison.

First, when standard patterns are created, switch 4 is connected to the B side. The input speech signal is successively converted into a parameter vector sequence by the parameter analysis unit 3 and then stored in the pattern storage unit 5. Repeating this operation stores the standard pattern vector sequences in the pattern storage unit 5.

Each time a standard pattern is input, the power calculation unit 8 calculates the average power or peak power of that pattern. Once all standard patterns have been stored, the power calculation unit 8 outputs the maximum and minimum power values to the range display unit 9, and the power range of the standard patterns is displayed near the level display unit 2.
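The registration-time power calculation can be sketched as follows. This is a minimal illustration, not the circuit of the embodiment: it assumes each utterance is available as a list of numeric samples, and the function names and the `frame_len` value are hypothetical choices made for the example.

```python
from statistics import fmean


def average_power(samples):
    """Mean squared amplitude of one utterance (a list of numeric samples)."""
    if not samples:
        raise ValueError("utterance has no samples")
    return fmean([s * s for s in samples])


def peak_power(samples, frame_len=4):
    """Largest per-frame average power; frame_len is an illustrative assumption."""
    if not samples:
        raise ValueError("utterance has no samples")
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    return max(average_power(frame) for frame in frames)


def registered_power_range(utterances, power_fn=average_power):
    """Return (minimum, maximum, mean) of the per-pattern power over all patterns."""
    powers = [power_fn(u) for u in utterances]
    if not powers:
        raise ValueError("no standard patterns registered")
    return min(powers), max(powers), fmean(powers)
```

Calling `registered_power_range` once after the last pattern is registered yields the values that the range display would show; passing `power_fn=peak_power` switches to the peak-power variant.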

Next, the case of pattern comparison is explained.

For pattern comparison, switch 4 is connected to the A side. As in standard pattern registration, the parameter analysis unit 3 converts the input speech into a parameter vector sequence. The analyzed input parameter vector sequence is fed through switch 4 to one input of the pattern comparison unit 6. The pattern storage unit 5 feeds one of the standard pattern vector sequences to the other input of the pattern comparison unit, and the distance between the input parameter vector sequence and that standard pattern vector sequence is calculated. This operation is carried out for every standard pattern in the pattern storage unit 5, and the standard pattern with the minimum distance from the input parameter vector sequence is output to output signal line 7 as the recognition result.
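The comparison loop described above can be sketched as follows. The text does not say which distance measure the pattern comparison unit uses, so this example assumes dynamic time warping over Euclidean frame distances; the function names and the dictionary layout are likewise hypothetical.

```python
import math


def sequence_distance(seq_a, seq_b):
    """Dynamic-time-warping distance between two sequences of parameter vectors.

    DTW is an assumption made for this sketch, not a detail from the text.
    """
    n, m = len(seq_a), len(seq_b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(input_seq, standard_patterns):
    """Compare the input against every stored pattern; return (label, distance)."""
    if not standard_patterns:
        raise ValueError("no standard patterns registered")
    scored = ((label, sequence_distance(input_seq, pattern))
              for label, pattern in standard_patterns.items())
    return min(scored, key=lambda pair: pair[1])
```

Here `standard_patterns` plays the role of the pattern storage unit, mapping a word label to its stored vector sequence, and the returned label corresponds to the recognition result placed on the output line.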

Before this recognition operation begins, the range display unit 9 is already showing the level range calculated when the standard patterns were created.

The user can therefore watch the reading on the level display unit 2 while speaking and easily keep the utterance within the level range of the standard patterns.
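One way to picture the combined display is a text meter with the registered range marked underneath. The sketch below is only an illustration: the level units, `full_scale`, the meter width, and the prompt wording are all assumptions, and the embodiment itself uses a meter, LEDs, or a numeric display.

```python
def advice(level, low, high):
    """Tell the speaker how the current level relates to the registered range."""
    if level < low:
        return "speak louder"
    if level > high:
        return "speak softer"
    return "ok"


def render_meter(level, low, high, full_scale, width=20):
    """Return (bar, marks): a bar for the live level and markers for the range."""
    if full_scale <= 0:
        raise ValueError("full_scale must be positive")

    def to_col(value):
        clamped = max(0.0, min(value, full_scale))
        return round(clamped / full_scale * width)

    lit = to_col(level)
    bar = "#" * lit + "." * (width - lit)
    lo_col, hi_col = to_col(low), to_col(high)
    marks = "".join("=" if lo_col <= col <= hi_col else " "
                    for col in range(1, width + 1))
    return bar, marks
```

Printing `bar` directly above `marks` places the range markers next to the live level, so the speaker can see at a glance whether the bar ends inside the marked span.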

As described above, in this embodiment the range display unit 9 is provided near the level display unit 2, and the maximum and minimum values, or the average value, of the power of the registered standard patterns calculated by the power calculation unit 8 are displayed on the range display unit 9. This makes it possible to prompt the speaker to keep the utterance level within the allowable range of the standard patterns during recognition, which improves the recognition rate.

The level display unit 2 and the range display unit 9 described here can also be realized with a numeric display, a combination of a meter and LEDs, a combination of light-emitting elements, or the like.

This embodiment has been described using a registration-type recognition device, in which patterns are registered before use. The invention can also be applied to devices whose standard patterns are analyzed in advance on a separate device, provided that the power is calculated during that analysis.

As the power used in the power calculation unit 8, the power of the steady-state vowel portion may be used instead of the average power or the peak power.

The operation of this embodiment can also be carried out in software, using a computer and a display.

Effects of the Invention

As described above, the speech recognition device of the present invention provides, alongside the display means for displaying the input level of the input speech, display means for displaying the maximum and minimum values, or the average value, of the power of the standard patterns. The speaker can thus be prompted to keep variations in utterance level within the allowable range, which improves the recognition rate. The invention is therefore of great industrial value.

[Brief Description of the Drawing]

The figure is a block diagram of a speech recognition device according to an embodiment of the present invention.

2: level display unit, 3: parameter analysis unit, 4: switch, 5: pattern storage unit, 6: pattern comparison unit, 8: power calculation unit, 9: range display unit.

Claims

[Claims]

(1) A speech recognition device comprising: level display means for displaying the level of input speech; power calculation means for calculating the average power of the input speech when it is registered as a standard pattern; and range display means for displaying the maximum value, minimum value, or average value obtained by the power calculation means.

(2) The speech recognition device according to claim 1, wherein the level display means or the range display means comprises a plurality of light-emitting elements arranged in parallel.

(3) The speech recognition device according to claim 1, wherein the power calculation means uses the power of the steady-state vowel portion as the average power of the corresponding standard pattern.