JPS6031196A

JPS6031196A - Voice pattern generator

Info

Publication number: JPS6031196A
Application number: JP14007783A
Authority: JP
Inventors: 潤一郎藤本
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1983-07-30
Filing date: 1983-07-30
Publication date: 1985-02-16

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

Technical field

The present invention relates to a voice pattern creation device, and more particularly to a device for creating the feature patterns of speech used in a speech recognition device.

Prior art

FIG. 1 is a diagram for explaining an example of a conventional speech recognition device. In the figure, 1 is a microphone, 2 is a filter group, 3 is a sampling circuit, 4 is a voice interval detection circuit, 5 is a register, and 6 is a recognition unit. As is well known, speech input from the microphone 1 is frequency-analyzed by the band-pass filter group 2 and sampled at a constant period; speech is then distinguished from noise, and only the voice interval is recorded in the register 5. Alternatively, when the number of samples is large, the data is compressed before being recorded in the register 5. The recognition unit 6 then uses that data. However, this arrangement has the drawback that words whose meaning differs only in the word-initial sound, such as 6 /roku/ and 億 (hundred million) /oku/, are difficult to distinguish.

To solve this drawback, a method has been proposed (Japanese Unexamined Patent Publication No. Sho 57-191000) in which a detection circuit detects that the speech is in a rising state and the sampling period of the sampling circuit 3 is changed accordingly. This method exploits the fact that consonants are often located at the rising portion of speech: rapidly changing consonants are sampled at a constant period shorter than that used for vowels, so that the features of the consonants are extracted sufficiently. FIG. 2 shows the speech signal and its sampling period when the number 42 /shijyuni/ is uttered under this method. As illustrated, the sampling period is t1 while the speech is rising and t2 at other times, with t1 < t2.

In practice, however, speech energy often does not behave this way; for the earlier example of 6 and 億, it is as shown in FIGS. 3(a) and 3(b). The energy of a consonant is smaller than that of a vowel, and the energy enters the rising state when the vowel is uttered. In addition, the energy of a consonant often becomes small in its latter half, and in particular the /u/ of /roku/ and /oku/ is usually not voiced. In such cases, for both /r/ and /k/, only the first half, where the energy is rising, is sampled at the short period, and the latter half is sampled at the long period. Moreover, the rising portions of long-duration vowels are also finely sampled, so the features of the consonants alone cannot be extracted sufficiently.
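The drawback of the rising-edge approach can be illustrated with a minimal sketch. The function name `rising_edge_periods`, the frame energies, and the periods `t1 = 1` and `t2 = 4` (counted in frames) are assumptions chosen for illustration only; the detection circuit of the cited publication is not described in this document.

```python
def rising_edge_periods(energy, t1=1, t2=4):
    """Pick a sampling period per frame: t1 while energy rises, t2 otherwise.

    `energy` is a hypothetical list of per-frame energies; t1 < t2.
    """
    periods = []
    prev = energy[0]
    for e in energy:
        # The short period is used only while the energy is increasing.
        periods.append(t1 if e > prev else t2)
        prev = e
    return periods


# Assumed energies: a weak, flat consonant (3 frames), a vowel onset
# (3 rising frames), a steady vowel (2 frames), then decay (2 frames).
energy = [1, 1, 1, 2, 5, 9, 9, 9, 6, 3]
periods = rising_edge_periods(energy)
print(periods)
```

In this assumed input the short period lands on the vowel onset rather than on the weak consonant frames, which is the behaviour the text above identifies as the weakness of the prior method.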

Purpose

The present invention has been made to eliminate the drawbacks of the prior art described above, and in particular aims to provide a device that creates a voice pattern in which the features of consonants are emphasized.

Configuration

The configuration of the present invention is described below on the basis of an embodiment.

The present invention focuses on the fact that voiceless consonants consist of higher frequency components than vowels, and makes the sampling period finer (shorter) when the high-frequency component of the speech is larger than the low-frequency component.
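As a rough illustration of this principle, the sketch below compares the power of a signal at one low and one high frequency. Everything here is an assumption made for the example: the patent uses a bank of band-pass filters rather than a single-bin DFT, and the pure tones at 200 Hz and 3000 Hz, the 8000 Hz rate, and the names `band_power` and `is_high_dominant` are hypothetical stand-ins for vowel-like and voiceless-consonant-like sounds.

```python
import math

RATE_HZ = 8000  # assumed sampling rate of the synthetic test signals


def band_power(samples, freq_hz, rate_hz=RATE_HZ):
    """Power of `samples` at one frequency, computed as a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * n / rate_hz)
             for n, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * n / rate_hz)
             for n, s in enumerate(samples))
    return (re * re + im * im) / len(samples) ** 2


def tone(freq_hz, n, rate_hz=RATE_HZ):
    """Synthetic sine wave used as a stand-in for a speech segment."""
    return [math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]


def is_high_dominant(samples, low_hz=200, high_hz=3000):
    """True when the high-frequency power exceeds the low-frequency power."""
    return band_power(samples, high_hz) > band_power(samples, low_hz)


vowel_like = tone(200, 400)       # energy concentrated at a low frequency
consonant_like = tone(3000, 400)  # energy concentrated at a high frequency
```

Under these assumptions `is_high_dominant(consonant_like)` holds and `is_high_dominant(vowel_like)` does not, so only the consonant-like segment would trigger the finer sampling period.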

FIG. 4 is an electrical block diagram for explaining an embodiment of the present invention. In the figure, 7 is a comparator, and the parts that act in the same way as in FIG. 1 carry the same reference numbers as in FIG. 1. In the speech recognition device shown in FIG. 4, speech input from the microphone 1 passes through the band-pass filter group 2. In the present invention, the outputs of the highest-band filter and the lowest-band filter of the band-pass filter group 2 are compared by the comparator 7 at that point, and when the output of the highest-band filter is the larger, the sampling period of the sampling circuit 3, which samples the outputs of the filter group, is shortened. Only the voice interval is then extracted from the signal and stored in the register 5.
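The signal flow of this embodiment can be sketched in software as follows. This is only a digital analogy to the circuit, under stated assumptions: the filter-bank outputs are given as per-frame lists ordered from lowest to highest band, the voice interval decision is supplied as a precomputed list of booleans, and the function name `create_pattern` and the periods of 1 and 4 frames are hypothetical.

```python
def create_pattern(frames, is_speech, short_period=1, long_period=4):
    """Sample filter-bank frames at a variable period and keep speech frames.

    frames:    list of per-frame band outputs, lowest band first (assumed).
    is_speech: per-frame flags from a voice interval detector (assumed given).
    """
    register = []  # plays the role of register 5
    i = 0
    while i < len(frames):
        frame = frames[i]
        # Comparator 7: highest-band output versus lowest-band output.
        high_dominant = frame[-1] > frame[0]
        # Only frames inside the voice interval are stored.
        if is_speech[i]:
            register.append(frame)
        # Sampling circuit 3: shorter period while the high band dominates.
        i += short_period if high_dominant else long_period
    return register


# Assumed input: 4 consonant-like frames, 8 vowel-like frames, 4 silent frames.
consonant, vowel, silence = [1, 2, 6], [8, 5, 1], [0, 0, 0]
frames = [consonant] * 4 + [vowel] * 8 + [silence] * 4
is_speech = [True] * 12 + [False] * 4
register = create_pattern(frames, is_speech)
```

With these assumed values all four consonant-like frames are stored but only two of the eight vowel-like frames, so the stored pattern gives the consonant portion more weight.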

In the above embodiment the outputs of the highest-band and lowest-band filters are compared, but outputs near the highest and lowest bands suffice. It also goes without saying that an equivalent effect is obtained even if the order of the parts in FIG. 4 is changed.
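One way to picture the "near the highest and lowest" variation is a comparator that takes a channel offset. The name `high_dominant`, the `offset` parameter, and the sample frame are assumptions introduced for this sketch; the document does not say how near the chosen channels must be.

```python
def high_dominant(frame, offset=0):
    """Compare a channel near the top of the bank with one near the bottom.

    frame:  band outputs ordered from lowest to highest band (assumed).
    offset: how many channels in from each end to look; 0 reproduces the
            highest-versus-lowest comparison of the embodiment.
    """
    if offset < 0 or 2 * offset >= len(frame):
        raise ValueError("offset must leave distinct low and high channels")
    return frame[-1 - offset] > frame[offset]


frame = [5, 1, 2, 3, 6, 4]  # hypothetical outputs of a six-channel bank
outermost = high_dominant(frame)          # compares 4 with 5
one_in = high_dominant(frame, offset=1)   # compares 6 with 1
```

The two choices can disagree on a given frame, as they do for this assumed input, so which channels to compare is a design decision left open by the text.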

Effect

As is clear from the above description, the present invention makes it possible to create voice parameters that weight the consonant portions of words, which were conventionally difficult to emphasize, and as a result the recognition rate of a speech recognition device can be improved.

[Brief explanation of drawings]

FIG. 1 is a configuration diagram of the main part for explaining an example of a conventional voice pattern creation device; FIGS. 2 and 3 are signal waveform diagrams for explaining the operation of the conventional device shown in FIG. 1; and FIG. 4 is a configuration diagram of the main part for explaining an embodiment of the voice pattern creation device according to the present invention.

1: microphone, 2: filter group, 3: sampling circuit, 4: voice interval detection circuit, 5: register, 6: speech recognition unit, 7: comparator.

Claims

[Claims]

A voice pattern creation device comprising means for frequency-analyzing a speech signal, means for sampling the frequency-analyzed signal, and means for comparing the magnitudes of frequency components, characterized in that the sampling period of the sampling means is shortened when the high-frequency component is equal to or larger in magnitude than the low-frequency component.