JPS62217298A - Voice recognition equipment - Google Patents

Voice recognition equipment

Info

Publication number
JPS62217298A
Authority
JP
Japan
Prior art keywords
dictionary
word
similarity
recognition
missing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP61061725A
Other languages
Japanese (ja)
Other versions
JPH0792675B2 (en)
Inventor
安田 晴剛
潤一郎 藤本
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP61061725A
Publication of JPS62217298A
Publication of JPH0792675B2
Anticipated expiration
Expired - Fee Related

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

Technical Field

The present invention relates to a speech recognition device and, more particularly, to the recognition processing and dictionary creation processing of such a device.

Prior Art

In speech interval detection in a speech recognition device, it is difficult to obtain a 100% perfect input. In particular, because many devices adopt a variable threshold method to cope with noisy environments, consonant blocks at the beginning or end of a word often go undetected. Furthermore, in speaker-independent recognition methods there is variation between individuals, and the power and duration of consonant blocks can differ greatly from speaker to speaker. In such cases a particular block is missing, and because of this difference a correct dictionary cannot be created during dictionary creation, or misrecognition occurs during recognition.

Conventionally, when a certain speech block is dropped by some speakers and not by others, the word was treated in advance as a word that is difficult to recognize, and multi-template processing was performed using dictionary entries for both the form with the block missing and the form without it missing. In addition, at dictionary creation the inputs were simply added together as they were, so the accuracy of the dictionary was often poor.

Purpose

The present invention was made in view of the circumstances described above. In particular, it aims to improve speech recognition accuracy by improving the accuracy of the dictionary in a speech recognition device.

Configuration

To achieve the above object, the present invention is a speech recognition device comprising means for extracting feature quantities of speech input from a microphone, means for detecting a speech interval, means for separating voiced intervals and silent intervals within a single word, means for registering an additive dictionary in block units based on the silent intervals, and means for performing recognition processing based on the silent intervals. It is characterized in that, at dictionary creation, words having consonant blocks or the like that are relatively prone to being dropped are designated in advance, and, at recognition, the similarity to the input pattern is calculated both for the case where the block is missing and for the case where it is not, and the higher value is taken as the similarity for that word. The invention is described below based on an embodiment.

FIG. 1 is an electrical block diagram for explaining an embodiment of the present invention. In the figure, 1 is a microphone, 2 is a preprocessing unit, 3 is a feature extraction unit, 4 is a speech interval detection unit, 5 is a recognition processing and dictionary creation unit, 6 is an identifier check unit, and 7 is a word dictionary. Below, as an example, a case is described in which the invention is applied to a speaker-independent recognition device that uses the BTSP method and focuses on silent intervals.

FIG. 2 is a diagram for explaining an example of a recognition method that focuses on silent intervals. This method treats a silent interval B occurring within a speech interval A as a node, and performs pattern matching by associating the voiced blocks C on either side of it with each other. However, in the illustrated word "stop" (ストップ), for example, the final "p" block (プ) may be dropped by some speakers, as shown in FIG. 3. If the "p" block is dropped at input, the input "sto" (スト) is matched against the dictionary entry "stop", which causes misrecognition.
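The splitting of a speech interval into voiced blocks at silent intervals can be sketched as follows. This is a minimal illustration, not the patent's procedure: the per-frame power input, the fixed `threshold`, the `min_silence` parameter, and the function name `split_voiced_blocks` are all assumptions introduced here.

```python
from typing import List, Tuple


def split_voiced_blocks(power: List[float], threshold: float,
                        min_silence: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges of voiced blocks, end exclusive.

    Assumption: a frame counts as voiced when its power is at or above
    `threshold`, and a run of fewer than `min_silence` unvoiced frames
    does not split a block.
    """
    blocks = []
    start = None   # first frame of the block being built, if any
    silence = 0    # length of the current run of unvoiced frames
    for i, p in enumerate(power):
        if p >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence:
                # Close the block at the first frame of the silent run.
                blocks.append((start, i - silence + 1))
                start = None
                silence = 0
    if start is not None:
        # Close a block that is still open at the end of the input.
        blocks.append((start, len(power) - silence))
    return blocks
```

With this sketch, an utterance whose last block falls below the threshold simply yields one block fewer, which is the situation the text describes for a dropped word-final consonant.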

The present invention was made to prevent misrecognition of this kind. In the present invention, an identifier is attached to such words in advance, at dictionary creation. For a word carrying this identifier, the recognition computation is performed twice, once with the word-final block of the dictionary entry included and once without it, and the larger of the two similarities is taken as the similarity for that word. Dropped word endings of this kind are particularly common in English and German, and the same applies to words such as "skip" and "up". Moreover, if the identifier is varied according to whether the dropped block is word-final or word-initial, it is easy to determine at recognition time how the two computations should be set up.
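The two recognition computations for a word that carries an identifier can be sketched as follows. Several things here are assumptions made for illustration and are not stated in the source: each block is reduced to a fixed-length feature vector, similarity is the mean cosine similarity over aligned blocks, a block-count mismatch scores 0.0, and the identifier takes the hypothetical values `"tail"`, `"head"`, or `None`.

```python
import math
from typing import Dict, List, Sequence

Block = Sequence[float]


def block_similarity(a: Block, b: Block) -> float:
    """Cosine similarity between two equal-length block feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pattern_similarity(inp: List[Block], ref: List[Block]) -> float:
    """Mean block similarity; 0.0 when the block counts do not line up."""
    if not ref or len(inp) != len(ref):
        return 0.0
    return sum(block_similarity(a, b) for a, b in zip(inp, ref)) / len(ref)


def word_similarity(inp: List[Block], entry: Dict) -> float:
    """Score one dictionary entry, trying the dropped-block form if flagged."""
    ref = entry["blocks"]
    scores = [pattern_similarity(inp, ref)]               # block present
    if entry["identifier"] == "tail":
        scores.append(pattern_similarity(inp, ref[:-1]))  # final block dropped
    elif entry["identifier"] == "head":
        scores.append(pattern_similarity(inp, ref[1:]))   # initial block dropped
    return max(scores)  # the higher value becomes the word's similarity


def recognize(inp: List[Block], dictionary: List[Dict]) -> str:
    """Return the word whose entry has the highest similarity to the input."""
    return max(dictionary, key=lambda e: word_similarity(inp, e))["word"]
```

Only flagged words pay for the second computation, so unflagged entries are scored exactly once.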

Next, the application of the present invention to dictionary creation is described. As with recognition, the dictionary is created with attention to silent intervals, so if a voiced block within a word is missing, the quality of the resulting dictionary degrades. One method of dictionary creation, for example when building an additive dictionary from three utterances, is to calculate and check the similarity as each utterance is added. Accordingly, when calculating that similarity, two similarity computations are performed, one assuming that a block is missing and one assuming that it is not, and the one giving the larger similarity is taken as the correct correspondence. If the block is judged to be missing, that block is excluded and only the other blocks are added to the dictionary, which makes it possible to create an accurate dictionary.
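The dictionary addition with a two-way similarity check can be sketched as follows. This is an illustrative sketch under stated assumptions: the template is held as per-block running sums and counts, only a missing final block is considered, similarity is mean cosine similarity with 0.0 for a block-count mismatch, and the names `new_template` and `add_utterance` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity(inp, ref):
    """Mean block similarity; 0.0 when the block counts do not line up."""
    if not ref or len(inp) != len(ref):
        return 0.0
    return sum(cosine(a, b) for a, b in zip(inp, ref)) / len(ref)


def new_template(utt):
    """Start a template from the first utterance: per-block sums and counts."""
    return [list(block) for block in utt], [1] * len(utt)


def add_utterance(sums, counts, utt):
    """Add one utterance in place; return True if its final block was missing."""
    if len(utt) not in (len(sums), len(sums) - 1):
        raise ValueError("utterance does not align with the template")
    mean = [[v / c for v in s] for s, c in zip(sums, counts)]
    full = similarity(utt, mean)          # assume no block is missing
    dropped = similarity(utt, mean[:-1])  # assume the final block is missing
    tail_missing = dropped > full
    # When the tail is missing, utt is one block shorter, so the final
    # template block is left out of the addition.
    for i, block in enumerate(utt):
        sums[i] = [s + x for s, x in zip(sums[i], block)]
        counts[i] += 1
    return tail_missing
```

Keeping a separate count per block means the excluded block is averaged only over the utterances that actually contained it.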

FIG. 4 is a diagram showing an example of this case, in which (1) is the first matching and (2) is the second matching. As for attaching identifiers in advance, this is easy when the dictionary is prepared elsewhere and supplied to the device, but when the dictionary is created on the device itself, the identifiers must be generated automatically. To do so, if the two similarity computations described above show that the similarity is higher when the block is treated as missing, then input for that word can evidently lose that block, so the identifier is attached to that word. In this way, at recognition, the similarity computation is performed several times according to the identifier assigned at dictionary creation, and the candidate with the highest similarity is taken as the recognition result.
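Automatic assignment of the identifier on the device can be sketched as follows. The identifier values `"tail"` and `"head"`, the function name `infer_identifier`, and the similarity measure (mean cosine similarity, 0.0 on a block-count mismatch) are assumptions introduced for illustration; the source only states that the identifier is attached when the missing-block computation scores higher.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity(inp, ref):
    """Mean block similarity; 0.0 when the block counts do not line up."""
    if not ref or len(inp) != len(ref):
        return 0.0
    return sum(cosine(a, b) for a, b in zip(inp, ref)) / len(ref)


def infer_identifier(template, utterances):
    """Return 'tail', 'head', or None for one dictionary word.

    The word is flagged as soon as one training utterance scores higher
    against a dropped-block form of the template than against the full form.
    """
    variants = {None: template, "tail": template[:-1], "head": template[1:]}
    for utt in utterances:
        scores = {name: similarity(utt, ref) for name, ref in variants.items()}
        best = max(scores, key=scores.get)
        if best is not None and scores[best] > scores[None]:
            return best
    return None
```

A word that is never observed with a dropped block keeps the identifier `None`, so recognition scores it with a single computation.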

Effects

As is clear from the above description, the present invention makes it possible to create an accurate dictionary for, and to recognize correctly, even words in which a block is frequently dropped.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an electrical block diagram for explaining an embodiment of a speech recognition device according to the present invention. FIG. 2 is a diagram for explaining an example of a recognition method that focuses on silent intervals. FIG. 3 is a diagram for explaining a drawback of the conventional recognition method. FIG. 4 is a diagram for explaining one method of dictionary creation.

1: microphone, 2: preprocessing unit, 3: feature extraction unit, 4: speech interval detection unit, 5: recognition processing and dictionary creation unit, 6: identifier check unit, 7: word dictionary.

Claims (2)

[Claims]

(1) A speech recognition device comprising means for extracting feature quantities of speech input from a microphone, means for detecting a speech interval, means for separating voiced intervals and silent intervals within a single word, means for registering an additive dictionary in block units based on the silent intervals, and means for performing recognition processing based on the silent intervals, characterized by having a pattern matching method in which, at dictionary creation, words having consonant blocks or the like that are relatively prone to being dropped are designated in advance, and, at recognition, the similarity to the input pattern is calculated both for the case where the block is missing and for the case where it is not, and the higher value is taken as the similarity for that word.

(2) The speech recognition device according to claim (1), characterized in that, at dictionary creation, dictionary addition is performed while the similarity is being determined.
JP61061725A 1986-03-19 1986-03-19 Voice recognizer Expired - Fee Related JPH0792675B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP61061725A JPH0792675B2 (en) 1986-03-19 1986-03-19 Voice recognizer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP61061725A JPH0792675B2 (en) 1986-03-19 1986-03-19 Voice recognizer

Publications (2)

Publication Number Publication Date
JPS62217298A (en) 1987-09-24
JPH0792675B2 (en) 1995-10-09

Family

ID=13179480

Family Applications (1)

Application Number Title Priority Date Filing Date
JP61061725A Expired - Fee Related JPH0792675B2 (en) 1986-03-19 1986-03-19 Voice recognizer

Country Status (1)

Country Link
JP (1) JPH0792675B2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS62111295A (en) * 1985-11-08 1987-05-22 松下電器産業株式会社 Voice recognition equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003044079A (en) * 2001-08-01 2003-02-14 Sony Corp Device and method for recognizing voice, recording medium, and program
JP4655184B2 (en) * 2001-08-01 2011-03-23 ソニー株式会社 Voice recognition apparatus and method, recording medium, and program

Also Published As

Publication number Publication date
JPH0792675B2 (en) 1995-10-09

Similar Documents

Publication Publication Date Title
JPS62217298A (en) Voice recognition equipment
US6438521B1 (en) Speech recognition method and apparatus and computer-readable memory
Seman et al. Hybrid methods of Brandt’s generalised likelihood ratio and short-term energy for Malay word speech segmentation
JP2882791B2 (en) Pattern comparison method
JP3031081B2 (en) Voice recognition device
JPS6147999A (en) Voice recognition system
JPS62245295A (en) Specified speaker's voice recognition equipment
Khaing et al. Automatic speech segmentation for myanmar language
CN118072717A (en) Speech recognition method, device, equipment and storage medium
JP2655637B2 (en) Voice pattern matching method
JPS61233791A (en) Voice section detection system for voice recognition equipment
JP2901976B2 (en) Pattern matching preliminary selection method
JPS60147799A (en) Voice recognition
JPS6136798A (en) Voice segmentation
JPS60115996A (en) Voice recognition equipment
JPS60159798A (en) Voice recognition equipment
JPS6064396A (en) Voice recognition equipment
JPS6147994A (en) Voice recognition system
JPS62113196A (en) Voice recognition learning system
JPS63131196A (en) Nasal identifier
JPS59125800A (en) Voice recognition equipment
JPS60198598A (en) Voice recognition system
JPS61233792A (en) Voice recognition equipment
JPS6064397A (en) Voice recognition equipment
JPS6078491A (en) Dictionary updating system

Legal Events

Date Code Title Description
LAPS Cancellation because of no payment of annual fees