JPS62116999A - Syllable unit voice recognition equipment - Google Patents

Syllable unit voice recognition equipment

Info

Publication number
JPS62116999A
Authority
JP
Japan
Prior art keywords
syllable
voice recognition
monosyllable
sequence
recognition equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP60256564A
Other languages
Japanese (ja)
Inventor
宮岡 伸一郎
舩橋 誠寿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1985-11-18
Filing date
1985-11-18
Publication date
1987-05-28
Application filed by Hitachi Ltd
Priority to JP60256564A
Publication of JPS62116999A

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

[Field of Application of the Invention]

The present invention relates to a speech recognition device, and in particular to a monosyllable-unit speech recognition device suited to large-vocabulary speech recognition, a task to which the word-unit speech recognition devices most commonly used in the past have been difficult to apply.

[Background of the Invention]

In conventional monosyllable-unit recognition, as in "Large-vocabulary word speech recognition using monosyllable-unit input from unspecified speakers" (Transactions of the IEICE, Vol. J65-D, No. 12), the usual approach was to first detect the vowel, which is relatively stable within a monosyllable, and then, to cope with the deformation caused by coarticulation, to recognize consonant-vowel or consonant-vowel-consonant chains by methods such as DP matching.

However, this method requires segmentation into consonant and vowel intervals, which is difficult. In addition, because matching is based on a single distance measure, subtle but essential features are lost, so the recognition rate does not improve.

[Object of the Invention]

The object of the present invention is to provide a monosyllable-unit speech recognition device with a high recognition rate that avoids the difficult problem of segmenting vowel and consonant intervals and can make fine-grained use of weak but essential features without losing them.

[Summary of the Invention]

Input speech is converted by acoustic analysis into a sequence of spectra, each represented as a vector of about 10 dimensions. The spectral sequence is quantized by comparison with standard patterns and converted into a string of symbols called primitives. Up to this point, the method is the same as that used in conventional recognition schemes. In the present invention, recognition of the primitive sequence does not use the conventional segmentation and matching approach; instead, recognition is based on a generative grammar. Each monosyllable (/a/, /ka/, and so on) is represented as a sequence of primitives. Rules that generate primitive sequences are therefore defined in advance for each monosyllable. A monosyllable can then be recognized by analyzing which production rules generate the input primitive sequence.

[Embodiments of the Invention]

Embodiments of the present invention are described below with reference to FIGS. 1 to 3.

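FIG. 1 shows the overall procedure for monosyllable recognition. Input speech 6 is processed by acoustic analysis unit 1 and converted into a sequence of spectra. Quantization unit 2 compares the spectral sequence with the standard patterns 3 of the primitives and converts it into a sequence of primitives. Parser 4 analyzes the primitive sequence using production rules 5 and outputs the result as monosyllable sequence 7.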
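
The quantization step performed by quantization unit 2 can be illustrated with a minimal sketch, assuming a nearest-neighbor comparison under Euclidean distance. The patent does not state the distance measure, and the labels `q`, `c`, `v` and the 2-dimensional pattern values below are invented for illustration (the text describes spectra of about 10 dimensions).

```python
import math

# Hypothetical standard patterns: primitive label -> reference vector.
# Real spectra would have about 10 dimensions; 2 are used here for brevity.
STANDARD_PATTERNS = {
    "q": (0.0, 0.0),  # silence-like primitive (assumed)
    "c": (1.0, 0.2),  # consonant-like primitive (assumed)
    "v": (0.2, 1.0),  # vowel-like primitive (assumed)
}


def quantize(spectra):
    """Map each spectral vector to the label of its nearest standard pattern."""
    return [
        min(
            STANDARD_PATTERNS,
            key=lambda label: math.dist(vec, STANDARD_PATTERNS[label]),
        )
        for vec in spectra
    ]


frames = [(0.1, 0.0), (0.9, 0.3), (0.3, 0.9), (0.1, 1.1)]
primitives = quantize(frames)
print(primitives)
```

The output is one primitive label per frame, which is the symbol string the parser consumes.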
It is processed by the acoustic analysis unit 1 and converted into a series of spectra. The spectral sequence is compared with the primitive standard pattern 3 by the quantization unit 2 and converted into a primitive sequence. The sequence of primitives is analyzed by the parser 4 using production rule 5 and output as a monosyllabic sequence 7.

The analysis performed by parser 4 is explained with reference to FIG. 2.

The sequence of primitives input to the parser is a symbol string such as that shown in FIG. 2(a). FIG. 2(a) shows an example of a CV syllable such as /ka/, which consists of a silent part that occurs between it and the preceding syllable, a consonant part, and a vowel part. Of these, the silent part and the vowel part stretch or shrink in time depending on the speaking conditions, so they are represented as having arbitrary length. The production rules for this monosyllable are shown in FIG. 2(b). Here, S is the start symbol, uppercase letters such as A and B are nonterminal symbols (rewritable variables), and lowercase letters such as a and b are terminal symbols (primitives).

When the production rules of FIG. 2(b) are applied to the symbol string of FIG. 2(a), replacing terminal symbols with nonterminal symbols step by step, the start symbol S is finally obtained. In that case the symbol string conforms to the grammar defined by those production rules and is recognized as the monosyllable corresponding to them.
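
As a hedged illustration of this kind of grammar, the sketch below defines a small right-linear grammar for a CV syllable and checks whether a primitive string belongs to it. The rules of FIG. 2(b) are not legible in this text, so the grammar, the terminal names `q`, `k1`, `k2`, `v`, and the function `accepts` are all assumptions. The description reduces terminals bottom-up to S; for a right-linear grammar the left-to-right simulation used here accepts the same strings.

```python
# Hypothetical right-linear grammar for a CV syllable such as /ka/.
# Each rule is (nonterminal, terminal, next nonterminal); None ends the derivation.
RULES_KA = [
    ("S", "q", "S"),    # S -> q S   (silent part of arbitrary length)
    ("S", "k1", "A"),   # S -> k1 A  (first consonant primitive)
    ("A", "k2", "B"),   # A -> k2 B  (second consonant primitive)
    ("B", "v", "B"),    # B -> v B   (vowel part of arbitrary length)
    ("B", "v", None),   # B -> v
]


def accepts(rules, symbols, start="S"):
    """Return True if the primitive string can be derived from the start symbol."""
    states = {start}
    for sym in symbols:
        # Follow every rule whose left side is active and whose terminal matches.
        states = {nxt for (nt, term, nxt) in rules if nt in states and term == sym}
        if not states:
            return False
    # The string is in the language only if some derivation has just finished.
    return None in states


print(accepts(RULES_KA, ["q", "q", "k1", "k2", "v", "v", "v"]))
print(accepts(RULES_KA, ["q", "k1", "v"]))
```

The self-referencing rules for `q` and `v` are what allow the silent and vowel parts to have arbitrary length.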

Because the symbol string is in practice input continuously, the end of parsing (analysis by production rules) for one monosyllable is taken as the start of parsing for the next. It is therefore unnecessary to segment monosyllable intervals, or vowel and consonant intervals, in advance.
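
A minimal sketch of this chaining, assuming a longest-match policy to decide where one monosyllable ends (the text does not say how the end point is chosen). The two grammars, the primitive names, and the functions `longest_match` and `parse_stream` are hypothetical.

```python
# Hypothetical grammars, one per monosyllable.
# Each rule is (nonterminal, terminal, next nonterminal); None ends the derivation.
GRAMMARS = {
    "ka": [("S", "q", "S"), ("S", "k", "V"), ("V", "a", "V"), ("V", "a", None)],
    "sa": [("S", "q", "S"), ("S", "s", "V"), ("V", "a", "V"), ("V", "a", None)],
}


def longest_match(rules, symbols, pos):
    """Return the end index of the longest derivable prefix starting at pos, or None."""
    states, best = {"S"}, None
    for i in range(pos, len(symbols)):
        states = {nxt for (nt, term, nxt) in rules
                  if nt in states and term == symbols[i]}
        if not states:
            break
        if None in states:
            best = i + 1  # a complete monosyllable ends just before index i + 1
    return best


def parse_stream(symbols):
    """Split a continuous primitive stream into monosyllables without pre-segmentation."""
    pos, syllables = 0, []
    while pos < len(symbols):
        ends = {}
        for name, rules in GRAMMARS.items():
            end = longest_match(rules, symbols, pos)
            if end is not None:
                ends[name] = end
        if not ends:
            raise ValueError(f"no syllable grammar matches at position {pos}")
        name = max(ends, key=ends.get)
        syllables.append(name)
        pos = ends[name]  # the end of one parse is the start of the next
    return syllables


print(parse_stream(list("qqkaaaqsaa")))
```

In this sketch the silence between syllables is absorbed by the leading `q` rule of the following syllable, so no separate boundary detector is needed.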

Production rules are prepared for each syllable, that is, one set per monosyllable. Variations in the symbol string caused by coarticulation are also built into the production rules. As for the parsing method, it is desirable to use an error-correcting parsing method in order to cope with symbol deletions and insertions caused by noise and speaking conditions.
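
The patent does not name a specific error-correcting parsing method. One common formulation, sketched here purely as an assumption, is minimum-distance parsing: find the cheapest set of symbol insertions, deletions, and substitutions that makes the input derivable, and pick the monosyllable whose grammar needs the fewest corrections. The grammars, unit costs, the end marker `$`, and the function names are hypothetical.

```python
import heapq
import math

END = "$"  # hypothetical marker for a finished derivation

# Hypothetical grammars: (nonterminal, terminal, next nonterminal or END).
GRAMMARS = {
    "ka": [("S", "q", "S"), ("S", "k", "V"), ("V", "a", "V"), ("V", "a", END)],
    "sa": [("S", "q", "S"), ("S", "s", "V"), ("V", "a", "V"), ("V", "a", END)],
}


def correction_cost(rules, symbols, start="S"):
    """Minimum number of edits that makes `symbols` derivable from `start`."""
    n = len(symbols)
    heap = [(0, 0, start)]  # (cost so far, input position, grammar state)
    seen = set()
    while heap:
        cost, i, state = heapq.heappop(heap)
        if (i, state) in seen:
            continue
        seen.add((i, state))
        if i == n and state == END:
            return cost
        if i < n:
            # Spurious symbol in the input: skip it at cost 1.
            heapq.heappush(heap, (cost + 1, i + 1, state))
        for nt, term, nxt in rules:
            if nt != state:
                continue
            if i < n:
                # Exact match (cost 0) or substituted symbol (cost 1).
                step = 0 if symbols[i] == term else 1
                heapq.heappush(heap, (cost + step, i + 1, nxt))
            # Symbol dropped from the input: apply the rule without consuming.
            heapq.heappush(heap, (cost + 1, i, nxt))
    return math.inf


def recognize(symbols):
    """Return the monosyllable whose grammar needs the fewest corrections."""
    return min(GRAMMARS, key=lambda name: correction_cost(GRAMMARS[name], symbols))


print(recognize(list("qsxa")))
```

Because every edit has a non-negative cost, the shortest-path search returns the first time it reaches the end of the input in the finished state. A practical system would likely weight the edits by acoustic similarity rather than use unit costs.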

FIG. 3 shows an embodiment of a word speech recognition device according to the present invention. Elements 4, 5, 7, and 9 are the devices for acoustic analysis, quantization, parsing, and matching, respectively. Elements 6, 8, and 10 are, respectively, a memory storing the standard patterns of the primitives, a memory storing the production rules, and a memory for the word dictionary. Input speech 11 is first converted into a spectral sequence by device 4, and the spectral sequence is converted into a primitive sequence by device 5. In device 7, the primitive sequence is parsed and recognized as monosyllables.

The monosyllable sequence is matched against the word dictionary in device 9 and, as a result, recognized as a word. Element 3 is a bus for communication between the devices; synchronization between devices and bus control are handled by controller 1. The recognition results are used by host computer 2.
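
The matching step in device 9 is not specified further. A minimal sketch, assuming the dictionary stores each word as a monosyllable sequence and the closest entry under edit distance is chosen, could look like this; the dictionary contents and function names are invented for illustration.

```python
# Hypothetical word dictionary: word -> monosyllable sequence.
WORD_DICTIONARY = {
    "sakana": ("sa", "ka", "na"),
    "sakura": ("sa", "ku", "ra"),
    "katana": ("ka", "ta", "na"),
}


def edit_distance(a, b):
    """Levenshtein distance between two monosyllable sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # drop x from a
                cur[j - 1] + 1,          # insert y into a
                prev[j - 1] + (x != y),  # match or substitute
            ))
        prev = cur
    return prev[-1]


def match_word(syllables):
    """Return the dictionary word closest to the recognized monosyllable sequence."""
    return min(
        WORD_DICTIONARY,
        key=lambda word: edit_distance(syllables, WORD_DICTIONARY[word]),
    )


print(match_word(["sa", "ku", "ra"]))
print(match_word(["sa", "ka"]))
```

Adding a word only requires a new dictionary entry; the per-syllable grammars stay unchanged, which is the scaling property the description attributes to monosyllable-unit recognition.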

[Effects of the Invention]

According to the present invention, recognition is performed in monosyllable units, so it goes without saying that the effort of creating templates and of matching does not grow as the vocabulary grows. In addition, compared with conventional monosyllable-unit recognition methods, the invention has the following effects, and an improved recognition rate can be expected:
(1) the difficult problem of segmenting syllable intervals, or vowel and consonant intervals, can be avoided;
(2) weak but essential features that would be lost in distance-minimizing matching can be used in fine detail.

[Brief Description of the Drawings]

FIG. 1 is a diagram of the procedure for monosyllable-unit speech recognition, FIG. 2(a) is an example of a primitive sequence, FIG. 2(b) shows the production rules corresponding to (a), and FIG. 3 shows an embodiment of the word speech recognition device.

Claims (1)

[Claims]

1. A syllable-unit speech recognition device characterized by comprising means for acoustically analyzing input speech, means for quantizing the vector sequence obtained from the acoustic analysis, and means for recognizing monosyllables by analyzing, using production rules, the symbol string obtained from the quantization.
JP60256564A 1985-11-18 1985-11-18 Syllable unit voice recognition equipment Pending JPS62116999A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP60256564A JPS62116999A (en) 1985-11-18 1985-11-18 Syllable unit voice recognition equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP60256564A JPS62116999A (en) 1985-11-18 1985-11-18 Syllable unit voice recognition equipment

Publications (1)

Publication Number Publication Date
JPS62116999A true JPS62116999A (en) 1987-05-28

Family

ID=17294391

Family Applications (1)

Application Number Title Priority Date Filing Date
JP60256564A Pending JPS62116999A (en) 1985-11-18 1985-11-18 Syllable unit voice recognition equipment

Country Status (1)

Country Link
JP (1) JPS62116999A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8738378B2 (en) 2007-07-09 2014-05-27 Fujitsu Limited Speech recognizer, speech recognition method, and speech recognition program

Similar Documents

Publication Publication Date Title
US4994983A (en) Automatic speech recognition system using seed templates
US5680510A (en) System and method for generating and using context dependent sub-syllable models to recognize a tonal language
JPS62235998A (en) Syllable identification system
JPS62232691A (en) Voice recognition equipment
JPS62116999A (en) Syllable unit voice recognition equipment
Tunalı A speaker dependent, large vocabulary, isolated word speech recognition system for turkish
JP2615643B2 (en) Word speech recognition device
JP2615649B2 (en) Word speech recognition device
JP2737122B2 (en) Voice dictionary creation device
JPH1097270A (en) Speech recognition device
JPS6270900A (en) Syllable recognition system
JP2707552B2 (en) Word speech recognition device
KR100340688B1 (en) Method for extracting the number of optimal allophone in order to speech recognition
JPH0635494A (en) Speech recognizing device
JPH05303391A (en) Speech recognition device
JPH11175087A (en) Character string matching method for word speech recognition
Akila et al. WORD BASED TAMIL SPEECH RECOGNITION USING TEMPORAL FEATURE BASED SEGMENTATION.
JPS62217297A (en) Word voice recognition equipment
JP2008145996A (en) Speech recognition by template matching using discrete wavelet conversion
JPH1130994A (en) Voice recognizing method and device therefor and recording medium recorded with voice recognition program
JPS6033599A (en) Voice recognition equipment
JPH11175085A (en) Word speech recognition method and word dictionary for speech recognition
JPS6317498A (en) Word voice recognition system
JPS6180298A (en) Voice recognition equipment
JPS61256396A (en) Voice recognition equipment