JPS58220197A - Standard pattern preparation system for recognition of simulated continuous utterance - Google Patents

Standard pattern preparation system for recognition of simulated continuous utterance

Info

Publication number
JPS58220197A
JPS58220197A JP57103567A JP10356782A
Authority
JP
Japan
Prior art keywords
standard
vowel
patterns
pseudo
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP57103567A
Other languages
Japanese (ja)
Inventor
寺尾 修
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to JP57103567A priority Critical patent/JPS58220197A/en
Publication of JPS58220197A publication Critical patent/JPS58220197A/en
Pending legal-status Critical Current

Abstract

(57) [Abstract] This gazette contains application data filed before electronic application; abstract data is therefore not recorded.

Description

[Detailed Description of the Invention]
(a) Technical Field of the Invention
The present invention relates to a speech recognition apparatus that recognizes speech input as pseudo-continuous utterance (speech uttered slowly enough that the features of each syllable appear). In particular, to lighten the user's burden, only monosyllables are used for the standard patterns registered in advance, while the creation and training of the standard patterns that raise the recognition rate for pseudo-continuous input are carried out automatically. The invention concerns a standard-pattern creation method for such pseudo-continuous utterance recognition.

(b) Prior Art and Its Problems
In the prior art, having the user enter sentences into a speech recognition device one monosyllable at a time is close to impractical because the utterance becomes unnatural; speech input of sentences must allow utterance that is as continuous as possible. Continuous utterance, however, raises the problem of coarticulation, in which the acoustic features of each phoneme are affected by the preceding and following phonemes, and the problem that the features of each phoneme do not always appear clearly, both of which lower the recognition rate. Even with pseudo-continuous utterance, separation into monosyllable units is not always possible, so registering only monosyllables in advance as standard patterns and matching them against the monosyllables cut out of the pseudo-continuous utterance does not give good results. It is therefore desirable to register, as standard patterns, not only the monosyllable (consonant + vowel) patterns but also vowel + monosyllable patterns, and to perform pattern matching against these. Uttering and registering each monosyllable (consonant + vowel) even once already amounts to more than 100 items, however, and uttering and registering the vowel + monosyllable patterns as well would impose a burden of several hundred utterances on the user, and the resulting pattern set would still be incomplete.

(c) Object of the Invention
The object of the invention is to lighten the user's burden while also raising the recognition rate for input given as pseudo-continuous utterance. To this end, only monosyllables are registered in advance as standard patterns, each monosyllable being uttered only once, while vowel + monosyllable patterns are used as the standard patterns that raise the recognition rate for pseudo-continuous utterance. These vowel + monosyllable patterns are organized into a specified number of patterns from statistical information on vowels and monosyllables prepared in advance; then, based on the results of the post-processing device, the pseudo-continuous utterance is re-divided to obtain learning-pattern intervals, learning is performed, and the patterns are reorganized into the vowel and monosyllable standard patterns. The invention provides this standard-pattern creation method.

(d) In a speech recognition device that accepts input by pseudo-continuous utterance, only monosyllables are used as the standard patterns registered in advance. Several kinds of standard patterns are generated for each monosyllable; based on statistical information on vowels and monosyllables created in advance, the speech input as pseudo-continuous utterance is divided into N parts to obtain the learning-pattern intervals and learning is performed; and the patterns thus organized are reorganized into the vowel and monosyllable standard patterns.

(e) Embodiment of the Invention
To register standard patterns for an uttered syllable — a (consonant + vowel) pattern — the parameters of the vowel + monosyllable patterns are processed on the basis of the statistical information created from vowel and monosyllable patterns, and several kinds of standard patterns are generated for that monosyllable. For the utterance /ma/ of "ma", for example, the five vowels a, i, u, e and o are prefixed, and the standard patterns /a/-/ma/, /i/-/ma/, /u/-/ma/, /e/-/ma/ and /o/-/ma/ are created by processing. This processing follows a weighting expression in which Q is the processed parameter, P is the input parameter, and Wij is the processing weight coefficient obtained from the statistical information.

Learning then proceeds as follows. Based on the results of the post-processing device — which performs kana-kanji conversion, in phrase (bunsetsu) units, on the results recognized with the monosyllable patterns — the speech input as pseudo-continuous utterance is divided into N parts, the learning-pattern intervals are obtained, and learning is performed according to the following expressions:

Qj(t) ← Pi(t)·Wij(t) + Qj(t)        (j: one of 1 to 5)
Wij(t) ← Wij·( Qj(t) / Pi(t) ) / N

Here N is a weight given by the number of learning iterations, and the time-axis correspondence is obtained by DP matching.
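A minimal sketch of one way this learning step could be realized; the time-axis correspondence is reduced to a simple linear resampling rather than full DP matching, and the averaging form of the update, together with every identifier, is an assumption made for the example.

```python
import numpy as np

def dp_align(segment, pattern):
    """Time-axis correspondence; a real system would use DP matching, but a
    linear resampling keeps this sketch self-contained."""
    idx = np.round(np.linspace(0, len(segment) - 1, len(pattern))).astype(int)
    return segment[idx]

def learn(Q, W, P_segment, n_learn):
    """One learning step for a stored pattern Q and its weights W from a
    confirmed learning-pattern interval P_segment; n_learn >= 1 plays the
    role of the learning-count weight N."""
    P = dp_align(P_segment, Q)
    # Assumed averaging form: blend the weighted input into the stored pattern.
    Q_new = (n_learn * Q + P * W) / (n_learn + 1)
    # Assumed damped pull of the weights toward the ratio Qj / Pi.
    W_new = W + (Q_new / np.maximum(P, 1e-6) - W) / n_learn
    return Q_new, W_new
```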

The figure is a block diagram of a circuit showing one embodiment of the present invention.

For registration, a monosyllable is first entered at the input. The parameter calculation unit 1 extracts the parameters used for pattern matching, the switching unit 2 is set toward the monosyllable standard-pattern registration unit 10, and the parameters are stored in the monosyllable standard-pattern storage unit 11. The monosyllable standard-pattern registration unit 10 then controls the vowel and monosyllable standard-pattern processing/creation unit 9, which reads the monosyllable parameters from the monosyllable standard-pattern storage unit 11 and, using the statistical information recorded in advance in the vowel and monosyllable statistics unit 8, creates the vowel + monosyllable standard patterns required for each vowel. These standard patterns are stored in the vowel and monosyllable standard-pattern storage unit 7 and are used by the recognition processing unit 3 as the standard patterns for pattern matching against input given as pseudo-continuous utterance.
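A compact sketch of this registration-time flow, keyed to the unit numbers of the figure; the class, the toy feature extraction and every identifier are hypothetical stand-ins rather than the patent's implementation.

```python
import numpy as np
from typing import Dict

VOWELS = ["a", "i", "u", "e", "o"]

def extract_parameters(waveform: np.ndarray, frame: int = 256) -> np.ndarray:
    """Stand-in for unit 1: frame the waveform and use log frame energy as a
    one-dimensional matching parameter (a real front end would use spectral
    features)."""
    n = max(len(waveform) // frame, 1)
    frames = np.resize(waveform, (n, frame))
    return np.log(np.maximum((frames ** 2).mean(axis=1, keepdims=True), 1e-10))

class RegistrationPath:
    """Registration-time flow of the block diagram: parameter calculation (1)
    -> switch (2) -> monosyllable registration (10) and storage (11) ->
    pattern creation (9) using the statistics (8) -> standard-pattern
    storage (7)."""

    def __init__(self, statistics: Dict[str, Dict[str, np.ndarray]]):
        self.statistics = statistics                                  # unit 8
        self.monosyllable_store: Dict[str, np.ndarray] = {}           # unit 11
        self.standard_store: Dict[str, Dict[str, np.ndarray]] = {}    # unit 7

    def register(self, label: str, waveform: np.ndarray) -> None:
        params = extract_parameters(waveform)                         # unit 1
        self.monosyllable_store[label] = params                       # units 10/11
        weights = self.statistics[label]                              # unit 8
        # Unit 9: derive the five vowel-context patterns and store them (unit 7).
        self.standard_store[label] = {v: params * weights[v] for v in VOWELS}

# Hypothetical usage: placeholder statistics and one second of dummy audio for /ma/.
stats = {"ma": {v: np.ones(1) for v in VOWELS}}
reg = RegistrationPath(stats)
reg.register("ma", np.random.randn(8000))
```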

Next, to recognize pseudo-continuous utterance, the pseudo-continuous utterance is applied to the input, the parameter calculation unit 1 extracts its parameters, and the switching unit 2 is set toward the recognition processing unit 3. The parameters are pattern-matched against the vowel and monosyllable standard patterns held in the vowel and monosyllable standard-pattern storage unit 7, and the recognition result is output through the input-pattern output unit 4. The post-processing device 5 performs kana-kanji conversion of the recognition result in phrase units and sends its result to the learning vowel and monosyllable standard-pattern processing unit 6, which performs learning from the recognition result supplied by the input-pattern output unit 4 together with the result from the post-processing device 5.

That is, the parameters are either corrected by weighted averaging or, when the discrepancy is large, replaced or re-registered, and the result is stored in the vowel and monosyllable standard-pattern storage unit 7.

The learning result is also sent to the vowel and monosyllable statistics unit 8 so that the statistical information is updated.
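The recognition-side loop (units 3 to 6) and the correction policy just described — weighted averaging for small deviations, replacement or re-registration for large ones — might be organized roughly as below; the distance measure, the threshold value and all identifiers are illustrative assumptions.

```python
import numpy as np

def match_distance(pattern: np.ndarray, segment: np.ndarray) -> float:
    """Stand-in for the recognition unit 3: mean frame distance after a crude
    linear time alignment (a real system would use DP matching here)."""
    idx = np.round(np.linspace(0, len(segment) - 1, len(pattern))).astype(int)
    return float(np.mean(np.linalg.norm(pattern - segment[idx], axis=1)))

def correct_pattern(pattern: np.ndarray, segment: np.ndarray, n_learn: int,
                    replace_threshold: float = 5.0):
    """Unit 6: correct one standard pattern from a learning-pattern interval
    confirmed by the post-processing device (5). A small deviation is merged
    by weighted averaging; a large one triggers replacement / re-registration."""
    idx = np.round(np.linspace(0, len(segment) - 1, len(pattern))).astype(int)
    aligned = segment[idx]
    if match_distance(pattern, segment) > replace_threshold:
        return aligned, 1                                    # replace or re-register
    merged = (n_learn * pattern + aligned) / (n_learn + 1)   # weighted average
    return merged, n_learn + 1
```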

(f) Effects of the Invention
As described above, the present invention registers only monosyllables as the standard patterns prepared in advance and then automatically improves the monosyllable standard patterns, which on their own are insufficient for recognizing input given as pseudo-continuous utterance; its effect is therefore considerable.

[Brief Description of the Drawing]

The figure is a block diagram of a circuit showing one embodiment of the present invention. 1 is the parameter calculation unit, 2 the switching unit, 3 the recognition processing unit, 4 the input-pattern output unit, 5 the post-processing device, 6 the learning vowel and monosyllable standard-pattern processing unit, 7 the vowel and monosyllable standard-pattern storage unit, 8 the vowel and monosyllable statistics unit, 9 the vowel and monosyllable standard-pattern processing/creation unit, 10 the monosyllable standard-pattern registration unit, and 11 the monosyllable standard-pattern storage unit.

Claims (1)

[Claims] In a speech recognition device that accepts input by pseudo-continuous utterance, a standard-pattern creation method for pseudo-continuous utterance recognition characterized in that only monosyllables are used as the standard patterns registered in advance; several kinds of standard patterns are generated for each monosyllable; these kinds of standard patterns are organized into a specified number of patterns from statistical information on vowels and monosyllables created in advance; based on the results of a post-processing device that performs kana-kanji conversion in phrase units, the speech input by pseudo-continuous utterance is re-divided to obtain learning-pattern intervals and learning is performed; and the organized patterns are reorganized into the vowel and monosyllable standard patterns.
JP57103567A 1982-06-16 1982-06-16 Standard pattern preparation system for recognition of simulated continuous utterance Pending JPS58220197A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57103567A JPS58220197A (en) 1982-06-16 1982-06-16 Standard pattern preparation system for recognition of simulated continuous utterance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57103567A JPS58220197A (en) 1982-06-16 1982-06-16 Standard pattern preparation system for recognition of simulated continuous utterance

Publications (1)

Publication Number Publication Date
JPS58220197A true JPS58220197A (en) 1983-12-21

Family

ID=14357374

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57103567A Pending JPS58220197A (en) 1982-06-16 1982-06-16 Standard pattern preparation system for recognition of simulated continuous utterance

Country Status (1)

Country Link
JP (1) JPS58220197A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS60184297A (en) * 1984-03-02 1985-09-19 シャープ株式会社 Japanese language voice input unit


Similar Documents

Publication Publication Date Title
CN101326572B (en) Speech recognition system with huge vocabulary
Rabiner et al. Isolated and connected word recognition-theory and selected applications
DE69832393T2 (en) LANGUAGE RECOGNITION SYSTEM FOR THE DETECTION OF CONTINUOUS AND ISOLATED LANGUAGE
Church Phrase-structure parsing: A method for taking advantage of allophonic constraints
Lee et al. Golden Mandarin (II)-an intelligent Mandarin dictation machine for Chinese character input with adaptation/learning functions
JPH1063290A (en) Editing of weighted finite state transducer from decision tree
CN1187693C (en) Method, apparatus, and system for bottom-up tone integration to Chinese continuous speech recognition system
CN1924994B (en) Embedded language synthetic method and system
Reddy et al. The Hearsay-I speech understanding system: An example of the recognition process
US20020087317A1 (en) Computer-implemented dynamic pronunciation method and system
CN111898342A (en) Chinese pronunciation verification method based on edit distance
JPS58220197A (en) Standard pattern preparation system for recognition of simulated continuous utterance
CN108597493A (en) The audio exchange method and audio exchange system, coded graphics of language semantic
CN114974218A (en) Voice conversion model training method and device and voice conversion method and device
CN114492382A (en) Character extraction method, text reading method, dialog text generation method, device, equipment and storage medium
Fosler-Lussier A tutorial on pronunciation modeling for large vocabulary speech recognition
Lee Machine-to-man communication by speech Part 1: Generation of segmental phonemes from text
CN104916281B (en) Big language material sound library method of cutting out and system
EP3718107A1 (en) Speech signal processing and evaluation
Chen et al. Application of allophonic and lexical constraints in continuous digit recognition
JP3709436B2 (en) Fine segment acoustic model creation device for speech recognition
Reddy Eyes and ears for computers
JP3009709B2 (en) Japanese speech recognition method
Zhang et al. Research on the Mongolian Speech Synthesis Model Based on Transformer and WaveRNN
JPH0827638B2 (en) Phoneme-based speech recognition device