JPS58143396A

JPS58143396A - Voice recognition unit

Info

Publication number: JPS58143396A
Application number: JP57025701A
Authority: JP
Inventors: 弘之金田
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1982-02-19
Filing date: 1982-02-19
Publication date: 1983-08-25

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention] The present invention relates to a speech recognition device.

Conventional speech recognition devices either compare only one feature (parameter), obtained by analyzing the input speech, against standard parameters, or compare several features one after another in series. As a result, the recognition rate was low and recognition took a long time.

An object of the present invention is to provide a device that performs high-speed recognition. The device extracts a plurality of features and performs the comparison operation for each of them in parallel. In addition, the comparison results are judged with weighting by a determination unit, which selects the candidate that most closely matches the standard parameters, thereby significantly reducing the misrecognition rate.

An embodiment of the present invention is described in detail below with reference to the drawing.

FIG. 1 is a block diagram of the main part of an embodiment of the present invention. It presents an example in which three types of features (speech parameters) are compared. 10 is a speech input terminal, 20 is a feature extraction unit, 30 is comparison unit (a), 31 is comparison unit (b), 32 is comparison unit (c), 40 is a standard feature (a) memory, 41 is a standard feature (b) memory, 42 is a standard feature (c) memory, 50 is a determination unit, and 60 is a recognition result output terminal.

From the speech input at the speech input terminal 10, the feature extraction unit 20 extracts three types of features.

The features extracted here are, for example, (a) the frequency spectrum pattern, (b) the number of zero crossings, and (c) the power variation component, each forming a set that covers just one recognition unit. The features obtained by the feature extraction unit 20 are compared, in the corresponding comparison unit (a) 30, comparison unit (b) 31, and comparison unit (c) 32, with the contents of the standard feature (a) memory 40, the standard feature (b) memory 41, and the standard feature (c) memory 42, which were obtained beforehand in the same way. In the comparison, each feature is set to output an arbitrary number of recognition results, such as one, two, or three, ranked in order of priority. The determination unit 50 assigns a weight to each of the comparison units 30 to 32, also takes the ranking into account when a comparison unit yields several recognition results, finally settles on a single recognition result, and outputs it to the recognition result output terminal 60.
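One comparison unit with a ranked output could be sketched as follows. This is only an illustration: the function name `rank_candidates`, the Euclidean distance, and the template values are assumptions, since the source does not specify a distance measure or any stored data.

```python
import math

def rank_candidates(feature, templates, top_n=3):
    """Return up to top_n labels, best match first (rank 1 = nearest template)."""
    # Assumption: Euclidean distance; the source names no particular measure.
    ordered = sorted(templates.items(), key=lambda item: math.dist(feature, item[1]))
    return [label for label, _ in ordered[:top_n]]

# Hypothetical standard-feature memory for one feature type (made-up values).
standard_memory = {
    "A": [1.0, 0.0],
    "B": [0.8, 0.3],
    "C": [0.0, 1.0],
    "D": [-1.0, 0.0],
}

ranked = rank_candidates([0.9, 0.1], standard_memory, top_n=3)
print(ranked)
```

The `top_n` parameter plays the role of the "one, two, or three" results that each comparison may be set to output.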

As an example, assume that any one of the sounds "A" through "J" is input. Suppose that comparison unit (a) recognizes "A", "B", "C" in that order, comparison unit (b) recognizes "B", "C", "A" in that order, and comparison unit (c) recognizes "B", "D", "C" in that order. Suppose also that the determination unit assigns weights of 1, 2, and 3 to comparison units (a), (b), and (c), respectively. For each recognized sound, the determination unit multiplies the reciprocal of its rank by the weight assigned to the comparison unit that produced it, sums these values over the comparison units, and outputs the sound with the largest sum as the final recognition result. In the above example:

For "A": $1 \times 1 + \frac{1}{3} \times 2 = \frac{5}{3}$
For "B": $\frac{1}{2} \times 1 + 1 \times 2 + 1 \times 3 = \frac{11}{2}$
For "C": $\frac{1}{3} \times 1 + \frac{1}{2} \times 2 + \frac{1}{3} \times 3 = \frac{7}{3}$
For "D": $\frac{1}{2} \times 3 = \frac{3}{2}$

The maximum value, $\frac{11}{2}$, belongs to "B", so "B" is output as the recognition result. A common control clock is supplied to all the comparison units so that these comparison operations run in parallel, which allows them to be executed at high speed.
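The weighted judgment in the worked example can be reproduced with a short sketch. The function name `judge` and the tie-breaking behavior are assumptions (the source does not say how ties are resolved). The rankings and weights are those of the example: unit (a) gives A, B, C; unit (b) gives B, C, A; unit (c) gives B, D, C; the weights are 1, 2, and 3.

```python
from fractions import Fraction

def judge(ranked_lists, weights):
    """Sum weight / rank per label over all comparison units and pick the maximum."""
    scores = {}
    for ranking, weight in zip(ranked_lists, weights):
        for rank, label in enumerate(ranking, start=1):
            # Reciprocal of the rank multiplied by the unit's weight.
            scores[label] = scores.get(label, Fraction(0)) + Fraction(weight, rank)
    # Assumption: on a tie, the label seen first wins; the source is silent on ties.
    winner = max(scores, key=scores.get)
    return winner, scores

rankings = [["A", "B", "C"], ["B", "C", "A"], ["B", "D", "C"]]
weights = [1, 2, 3]

winner, scores = judge(rankings, weights)
for label in sorted(scores):
    print(label, scores[label])
print("recognized:", winner)
```

Exact fractions are used so that the totals can be compared directly with the hand calculation.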

As explained above, the input speech is compared with the standard features for a plurality of features in parallel, and the determination is made by weighting each comparison result. Compared with the conventional approach of comparing only one feature, this significantly reduces the misrecognition rate. Compared with the approach of comparing a plurality of features in series, the comparisons are performed simultaneously in parallel, so the time required for recognition processing is also shortened.
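As a loose software analogy to the parallel comparison, the three comparisons can be submitted as independent tasks. This is a sketch under stated assumptions: the source describes hardware units driven by a common control clock, not threads, and the scalar features, the absolute-difference distance, and all values below are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def nearest(feature, templates):
    """Rank template labels by absolute difference to a scalar feature."""
    # Assumption: scalar features and absolute difference, for brevity only.
    return sorted(templates, key=lambda label: abs(templates[label] - feature))

# Hypothetical extracted features and standard-feature memories (made-up values).
features = {"a": 0.9, "b": 4.0, "c": 2.5}
memories = {
    "a": {"A": 1.0, "B": 0.5, "C": 0.0},
    "b": {"A": 9.0, "B": 4.0, "C": 6.0},
    "c": {"B": 2.4, "C": 5.0, "D": 3.0},
}

# One task per comparison unit; none waits for another to finish.
with ThreadPoolExecutor(max_workers=len(features)) as pool:
    futures = {name: pool.submit(nearest, features[name], memories[name])
               for name in features}
    rankings = {name: future.result() for name, future in futures.items()}

print(rankings)
```

In CPython, threads do not speed up CPU-bound work of this kind; the point of the sketch is the structure, in which each comparison is independent and only the determination step needs all three results.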

In the embodiment of FIG. 1, the feature extraction unit 20 extracts three types of features for each speech input. However, where feature extraction is difficult, as with linear prediction coefficients, a dedicated extraction unit may be provided for each feature.
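The idea of one dedicated extractor per feature might look like the following sketch. The function names, the frame length, and the sample values are assumptions. Only two simple features are shown, a zero-crossing count and a frame-to-frame power change; the spectrum pattern and linear prediction coefficients are omitted.

```python
def zero_crossings(samples):
    """Count sign changes between consecutive samples (0 is treated as non-negative)."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if (prev < 0) != (cur < 0))

def power_changes(samples, frame=4):
    """Differences in mean power between consecutive non-overlapping frames."""
    powers = [sum(s * s for s in samples[i:i + frame]) / frame
              for i in range(0, len(samples) - frame + 1, frame)]
    return [later - earlier for earlier, later in zip(powers, powers[1:])]

# One dedicated extractor per feature; each could equally be a separate unit.
EXTRACTORS = {
    "zero_crossings": zero_crossings,
    "power_changes": power_changes,
}

samples = [1, -1, 1, -1, 2, 2, -2, -2]  # made-up input samples
features = {name: extract(samples) for name, extract in EXTRACTORS.items()}
print(features)
```

Keeping the extractors in a table makes it straightforward to add a more expensive one later without touching the others.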

[Brief explanation of the drawing]

FIG. 1 is a block diagram of the main part of an embodiment of the present invention. 10: speech input terminal; 20: feature extraction unit; 30: comparison unit (a); 31: comparison unit (b); 32: comparison unit (c); 40: standard feature (a) memory; 41: standard feature (b) memory; 42: standard feature (c) memory; 50: determination unit; 60: recognition result output terminal.

Claims

[Claims]

A speech recognition device comprising: a comparison unit that compares different feature parameters of input speech with standard parameters; and a determination unit that makes a weighted overall determination on the output of the comparison unit.