JPS59192298A

JPS59192298A - Voice message identification system

Info

Publication number: JPS59192298A
Application number: JP6726183A
Authority: JP
Inventors: 湯浅　啓義; 大村　皓一
Original assignee: Matsushita Electric Works Ltd
Current assignee: Panasonic Electric Works Co Ltd
Priority date: 1983-04-15
Filing date: 1983-04-15
Publication date: 1984-10-31
Also published as: JPH02720B2

Abstract

(57) [Abstract] This publication contains application data filed before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION [Technical Field] The present invention relates to a voice message identification system for operating electronic equipment by voice messages.

[Background technology]

Figures 1 to 4 are diagrams showing the characteristics of the five Japanese vowels. A voiced sound has a frequency spectrum envelope that depends on its phoneme, as shown in Figure 1. If the formant frequencies corresponding to the peaks of this spectral envelope are obtained by frequency spectrum analysis of the speech and are denoted, from the lowest frequency upward, as the first formant F1, the second formant F2, and so on, the five vowels can be represented by the F1 to F4 variation curves shown in Figure 2.

Figure 3 shows the distribution of these formant frequencies plotted on F1 and F2 coordinate axes. As Figure 3 shows, it is said that the five Japanese vowels cannot be separated linearly unless the first through third formants are determined correctly.

In order to manufacture at low cost a device that identifies a very small, limited set of words or sentences, we studied a method of symbolizing speech into pseudo-phonemes that approximately, if not perfectly, resemble the five vowels.

Figure 5 shows the schematic configuration of a voice message identification device previously developed by the present inventors. In the figure, V denotes the short-time average power of the 0 to 1 kHz band of the speech input and corresponds to the energy of voiced sounds.

U denotes the short-time average power of the 5 to 12 kHz band of the speech input and corresponds to the energy of unvoiced sounds. [...] A three-dimensional vector whose components include VB/VL and VF/VB is multiplied by a predetermined matrix Tm to calculate the short-time average powers Va, Vi, Vu, Ve, and Vo of the vowels /a/, /i/, /u/, /e/, and /o/ contained in the speech input, together with the short-time average powers Vh, Vl, Vf, Vb, and Vw of the wide-jaw voiced sound, the narrow-jaw voiced sound, the front-tongue voiced sound, the back-tongue voiced sound, and a voiced sound intermediate between the vowels /a/ and /o/.

The output of the matrix calculation unit MC0 is input to the maximum value determination unit MX0, which determines which of the components Va, Vi, Vu, Ve, Vo, Vh, Vl, Vf, Vb, and Vw is the largest. The code of that largest component is input to the symbolization processing unit MY0. When the code output from the comparison means C0 is V, the symbolization processing unit MY0 outputs one of the codes for Va, Vi, Vu, Ve, Vo, Vh, Vl, Vf, Vb, and Vw supplied by the maximum value determination unit MX0. When the code output from the comparison means C0 is U or S, it outputs that code unchanged.

The codes output from the symbolization processing unit MY0 are stored in the standard pattern storage unit when a voice message is registered and in the input pattern storage unit when a voice message is recognized. During matching, the pre-registered standard pattern that most closely approximates the input pattern is identified as the input message.

In the conventional example of Figure 5, when the power balance between VH and VL is adjusted, the vowels (e, a, o) fall on the positive side of the zero point of the difference signal and the vowels (i, u) fall on the negative side. The VH/VL difference signal can therefore be called a Veao/Viu difference signal. Next, when the power balance between VF and VB is adjusted, the vowels (i, e) fall on the positive side of the zero point and the vowels (a, o, u) fall on the negative side, so the VF/VB difference signal can be called a Vie/Vaou difference signal. Likewise, when the balance of the VB/VL difference signal is adjusted, the vowel (a) falls on the positive side of the zero point and the vowel (o) falls on the negative side, so the VB/VL difference signal can be called a Va/Vo difference signal.

Figures 6(a) and 6(b) show the frequency characteristics of the filters used in the conventional example of Figure 5 to extract the short-time average powers of the voiced-sound components VH, VL, VF, and VB. Figure 6(a) is drawn with a linear frequency scale on the horizontal axis, and Figure 6(b) with a logarithmic frequency scale. In Figure 6, AP denotes the characteristic of the adjustment amplifier described later.

Figure 7 shows another means of realizing the same function as the matrix calculation unit MC0 and the maximum value determination unit MX0 in the conventional example of Figure 5. It is a pseudo-phoneme discrimination flow for the case where the levels of the difference signals Veao/Viu, Va/Vo, and Vie/Vaou are each expressed as one of three values: high (H), medium (M), or low (L). In this flow, the first stage discriminates on the Veao/Viu difference signal, which corresponds to the first formant F1. The second stage discriminates on the Vie/Vaou difference signal, which corresponds to the second formant F2. The third stage discriminates on the Va/Vo difference signal. Voiced sounds are thereby symbolized into the 11 types (i, e, a, o, u, h, l, f, b, w, m).

Figures 8 to 11 show the output signal waveforms of the differential amplification means S0 to S3 in the conventional example of Figure 5 when the five vowels /i/, /e/, /a/, /o/, and /u/ are input. In each figure, the U/V, H/L, F/B, and A/O signals are the outputs of the differential amplification means S0 to S3, respectively. SYM shows the classification of each voiced sound. In Figure 8, for example, the symbols l, i, f, e, ... denote the voiced sounds Vl, Vi, Vf, Ve, ... respectively. The symbol m denotes a voiced sound Vm that matches none of the voiced sounds Va, Vi, Vu, Ve, Vo, Vh, Vl, Vf, Vb, and Vw.

Figures 8 and 9 show measurements for two different male subjects, and Figures 10 and 11 show measurements for two female subjects. The figures show that nearly the same features are extracted regardless of the speaker. However, symbolization of the vowels /e/, /u/, and /o/ depends on know-how in adjusting the filters and remains somewhat incomplete.

Comparing Figure 3, which shows the distribution of the first formant F1 and second formant F2 of the vowels described above, with Figure 4, which shows the articulation position of the tongue, the first formant F1 takes a high frequency when the jaw is wide open, as in /a/, and a low frequency when the jaw is narrowly open, as in /i/. F1 therefore corresponds roughly to how widely the jaw is opened.

The second formant F2, in the same way, corresponds roughly to the front-back position of the tongue. Figures 2 and 3 also show that the second formant of the vowels varies widely between men and women. The conventional example nevertheless separates the second formant using only the VF/VB difference signal, so the separation of the vowel (e) from (o, u) in particular was incomplete. In Figures 8 to 11, the part of the VF/VB difference signal corresponding to /u/ should be detected more strongly on the negative side but is weak, and the part corresponding to /e/ and the latter half of the part corresponding to /o/ are also weak. This makes symbolization unreliable.

To remove this imperfection, we previously proposed determining, for each individual, the shift of the difference signals when the five vowels are uttered and correcting it as an offset. Even so, such offset adjustment is best avoided where possible, and a smaller offset is obviously preferable. Because the conventional example separates the widely varying second formant using only the VF/VB difference signal, the offset needed to correct the zero point of the filter pair's difference signal output becomes quite large, and in some cases complete correction may not be possible. If the zero point of the difference signal output is not corrected, the mismatch between the actual utterance and the pseudo-phoneme symbols lowers the identification rate. For speaker-independent use, this also imposes severe limits on, for example, the number of words.

[Object of the invention]

The present invention has been made in view of the above points. Its object is to provide a voice message identification system that reliably extracts the features of the second formant of vowels, thereby enabling more complete symbolization of the five vowels, and that reduces the amount of speaker-dependent zero-point correction required for the difference signal outputs of the filter pairs.

[Disclosure of the invention]

Figure 12 is a so-called claim correspondence diagram that shows, as functional blocks, the configuration described in claim 1 of the present invention. In the figure, Fv is a filter that extracts the short-time average power of the low-frequency component of the speech input, and Fu is a filter that extracts the high-frequency component of the speech input. The outputs of the filters Fv and Fu are input to the differential amplification means S0, which extracts the difference signal component. C0 is a comparison means. It assigns the code for voiced sound V when the difference signal component output from the differential amplification means S0 is smaller than the reference value Rv, the code for unvoiced sound UV when it is larger than the reference value Ru, and the code for silence S otherwise, where Ru > 0 > Rv.

Fa1 is a filter that extracts the short-time average power of narrow-jaw voiced sounds, those produced with a narrow jaw opening (the vowels i, u, and so on). Fa2 is a filter that extracts the short-time average power of wide-jaw voiced sounds, those produced with a wide jaw opening (the vowels e, a, o, and so on). Fb1 is a filter that extracts the short-time average power of wide-jaw voiced sounds with a low first formant, such as the vowels e and o. Fb2 is a filter that extracts the short-time average power of wide-jaw voiced sounds with a high first formant, such as the vowel a. Fc1 is a filter that extracts the short-time average power of wide-jaw voiced sounds with a low first formant and a low second formant, such as the vowel o. Fc2 is a filter that extracts the short-time average power of wide-jaw voiced sounds with a low first formant and a high second formant, such as the vowel e. Fd1 is a filter that extracts the short-time average power of narrow-jaw voiced sounds with a low second formant, such as the vowel u. Fd2 is a filter that extracts the short-time average power of narrow-jaw voiced sounds with a high second formant, such as the vowel i.

S0 to S4 are differential amplification means that calculate the difference signals V/UV, Veao/Viu, Va/Veo, Ve/Vo, and Vi/Vu, respectively. The output of the differential amplification means S0 is compared in the comparison means C0 with the reference values Rv and Ru (Rv < 0 < Ru). If the difference signal output is smaller than Rv, the sound is judged to be voiced (V). If it is larger than Ru, the sound is judged to be unvoiced (U). If it lies between Ru and Rv, it is judged to be silence (S). One of the codes S, V, or U is then input to the symbolization processing unit MY0.

MC0 is a matrix calculation unit that takes the outputs of the differential amplification means S1 to S4 as inputs. It multiplies the difference signal outputs Veao/Viu, Va/Veo, Ve/Vo, and Vi/Vu by a predetermined matrix Tv to calculate the short-time average powers of the vowels i, e, a, o, and u contained in the speech input. The configuration of Figure 12 also includes a differential amplification means S5, which obtains the ratio between the wide-jaw voiced sound VH and the narrow-jaw voiced sound VL, and a differential amplification means S6, which obtains the ratio between the front-tongue voiced sound VF and the back-tongue voiced sound VB. The matrix calculation unit MC0 multiplies the outputs VH/VL and VF/VB of the differential amplification means S5 and S6 by a predetermined matrix Tc to calculate the powers of the wide-jaw voiced sound (h), the narrow-jaw voiced sound (l), the front-tongue voiced sound (f), the back-tongue voiced sound (b), and the other wide-jaw, back-tongue voiced sound (w) contained in the speech input.

The matrix calculation unit MC0 uses the matrices Tv and Tc. The output of the matrix calculation unit MC0 is input to the maximum value determination unit MX0, which determines which of the components i, e, a, o, u, h, l, f, b, and w is the largest. The code of that largest component is input to the symbolization processing unit MY0. However, when the difference between the largest component and the second largest component is small, the code m is output instead.

When the code output from the comparison means C0 is V, the symbolization processing unit MY0 outputs one of the codes i, e, a, o, u, h, l, f, b, w, or m supplied by the maximum value determination unit MX0. When the code output from the comparison means C0 is U or S, it outputs that code unchanged. The composite codes output from the symbolization processing unit MY0 are stored in the standard pattern storage unit when a voice message is registered and in the input pattern storage unit when a voice message is recognized. During matching, the pre-registered standard pattern that most closely approximates the input pattern is identified as the input message.
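The selection rule described above (take the largest component, fall back to m when the top two are too close, and pass U or S through unchanged) can be sketched as follows. This is a minimal illustration, not the patent's circuit. The function names and the `margin` threshold are assumptions, since the text only says the difference is "small".

```python
def pick_symbol(scores, labels, margin):
    """Return the label of the largest score, or 'm' when the two
    largest scores differ by less than `margin` (assumed threshold)."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    best, second = order[0], order[1]
    if scores[best] - scores[second] < margin:
        return "m"
    return labels[best]


def symbolize(frame_class, scores, labels, margin=1.0):
    """Combine the V/U/S decision with the maximum-value decision.

    frame_class is 'V', 'U' or 'S'. For 'U' and 'S' the code is passed
    through unchanged; only voiced frames get a vowel-like symbol.
    """
    if frame_class != "V":
        return frame_class
    return pick_symbol(scores, labels, margin)
```

For example, `symbolize("V", [0.1, 3.0, 0.5], ["i", "e", "a"])` selects `"e"`, while two nearly equal top scores give `"m"`.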

In the configuration of Figure 12, the VH/VL difference signal and the VF/VB difference signal may be replaced by the Veao/Viu difference signal and the Ve/Vo difference signal, respectively.

Figure 13 is a block diagram showing the configuration of an embodiment of the 8-filter system. The configuration of Figure 12 described above requires a total of 10 filters: Fv, Fu, Fa1, Fa2, Fb1, Fb2, Fc1, Fc2, Fd1, and Fd2. In the configuration of Figure 13, two of these filters are shared so that the speech features can be extracted with 8 filters.

In Figure 13, VFh, VF, VB, VHh, VHl, and VL are the outputs of filters that extract, respectively, the high-band component of the front-tongue voiced sound, the front-tongue voiced sound component, the back-tongue voiced sound component, the high-band component of the wide-jaw voiced sound, the low-band component of the wide-jaw voiced sound, and the narrow-jaw voiced sound component. In the embodiment of Figure 13, VL is shared between the Veao/Viu difference signal and the Va/Veo difference signal, and VB (or VL) is shared between the Ve/Vo difference signal and the Vi/Vu difference signal.

This sharing is possible because the zero point of a filter pair's difference signal corresponds to the crossing point of the two filter bands (the crossover frequency). Even if the band of one filter in a pair is fixed, providing two different bands for the other filter yields two different crossover points.
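The crossover argument can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the bell-shaped `band_response` and all centre frequencies and widths are hypothetical and are not the patent's filter characteristics. It locates the frequency where the two band responses are equal, which is where the difference signal changes sign, and shows that swapping the second filter moves that point while the first filter stays fixed.

```python
import math


def band_response(f, center, width):
    """Hypothetical bell-shaped band response (illustrative only)."""
    return math.exp(-((f - center) / width) ** 2)


def crossover(center_lo, center_hi, width, steps=60):
    """Bisect for the frequency between the two band centres where
    both responses are equal, i.e. the zero of their difference."""
    lo, hi = center_lo, center_hi
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        diff = band_response(mid, center_hi, width) - band_response(mid, center_lo, width)
        if diff < 0:
            lo = mid   # still dominated by the lower band
        else:
            hi = mid   # already dominated by the upper band
    return 0.5 * (lo + hi)


# The lower filter is fixed at 500 Hz; only the upper filter changes.
shared_with_first_partner = crossover(500.0, 1500.0, 400.0)
shared_with_second_partner = crossover(500.0, 2500.0, 400.0)
```

With equal widths the crossover sits midway between the two centres, so one shared filter can serve two difference signals with different zero points.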

In the embodiment of Figure 13, VHh and VB are nearly the same, so it would be desirable to merge them into a single VB. With Figure 13 as it stands, however, three difference signals would then be taken from VB, which makes it difficult to balance the filter pairs. Figure 14 therefore shows an example in which VB is divided into a high-band component VBh and a full-band component VB, and VF is merged into one, giving a 7-filter system. In this case only two difference signals are taken from VB, so balancing the filters becomes simple. Seen another way, the embodiment of Figure 14 is the conventional example of Figure 5 with VBh added.

Figures 15(a) and 15(b) show the frequency characteristics of the filters used in the 7-filter system of Figure 14 to extract the short-time average powers of the voiced-sound components VL, VH, VB, VBh, and VF. Figure 15(a) is drawn with a linear frequency scale on the horizontal axis, and Figure 15(b) with a logarithmic frequency scale. In Figure 15, AP denotes the characteristic of the adjustment amplifier described later.

Figure 16 shows an embodiment of the 6-filter system. In the embodiment of Figure 14 described above, the vowels i and u can still be distinguished if VH is used in place of VBh, so the frequency component vector can be composed of the six components UV, V, VF, VB, VH, and VL (six filters). Seen another way, the embodiment of Figure 16 is the conventional example of Figure 5 with a VF/VH difference signal added, and nearly the same filter bands can be used.

However, the VF/VB difference signal is adjusted so that the vowels e and o can be reliably distinguished. Figures 17(a) and 17(b) show the frequency characteristics of the filters used in the 6-filter system of Figure 16 to extract the short-time average powers of the voiced-sound components VL, VH, VB, and VF. Figure 17(a) is drawn with a linear frequency scale on the horizontal axis, and Figure 17(b) with a logarithmic frequency scale. In Figure 17, AP denotes the characteristic of the adjustment amplifier described later.

For the transformation matrix Tm of the matrix calculation unit MC0 in the 8-filter system of the Figure 13 embodiment, the 7-filter system of the Figure 14 embodiment, and the 6-filter system of the Figure 16 embodiment, matrices such as those of expressions (1) to (3) can be used. The matrix Tm of expression (1) sets every element other than the minimum needed for identification to 0 so that the calculation can be performed quickly. Expression (2) adds redundancy in the elements with absolute value 8 so that the five vowels can still be symbolized over a wide range when detection of a difference signal is weak. Expression (3) gives all five vowels a weight of the same magnitude (absolute value 14) for the difference signal related to the first formant F1, and, for the two difference signals related to the second formant F2, gives each of the five vowels the weight needed for identification on one signal or the other. Expression (3) can thus be said to place more importance on the first formant F1 than on the second formant F2. The transformation matrix Tm can be set freely according to, for example, the words to be identified.

In particular, when the components of the symbol vector are limited to the five vowels (i, e, a, o, u), each element of the matrix corresponding to Tv in Figure 12 need only be +1, 0, or -1. No multiplication is required, and simple symbolization is possible by sign conversion alone. For the transformation matrix of the symbols (h, l, f, b, w), corresponding to Tc in Figure 12, the elements are chosen so that either the norm of each row vector equals the norm of the row vectors of Tv, or the norm of the row vectors of Tc is smaller than the norm of the row vectors of Tv and larger than the absolute value of the elements of Tv.

Otherwise, the components for the other voiced sounds (h, l, f, b, w) would end up smaller than the components for the five vowels (i, e, a, o, u).
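The remark that a matrix of +1, 0, and -1 elements needs only sign changes can be sketched as below. The matrix `TV` is a hypothetical example, not one of the patent's expressions. It is filled in only from the sign conventions stated in the text, namely that each difference signal is positive for the vowels named before the slash and negative for those named after it.

```python
VOWELS = ["i", "e", "a", "o", "u"]

# Hypothetical ternary matrix (an assumption for illustration).
# Columns: Veao/Viu, Va/Veo, Ve/Vo, Vi/Vu.
TV = [
    [-1,  0,  0, +1],  # i: narrow jaw, high second formant
    [+1, -1, +1,  0],  # e: wide jaw, low F1, high F2
    [+1, +1,  0,  0],  # a: wide jaw, high F1
    [+1, -1, -1,  0],  # o: wide jaw, low F1, low F2
    [-1,  0,  0, -1],  # u: narrow jaw, low second formant
]


def vowel_scores(diff):
    """Apply TV to the four difference signals using only add and
    subtract, since every matrix element is +1, 0 or -1."""
    scores = []
    for row in TV:
        total = 0
        for weight, value in zip(row, diff):
            if weight > 0:
                total += value
            elif weight < 0:
                total -= value
        scores.append(total)
    return scores


# Difference signals that look like /e/: wide jaw, low F1, high F2.
scores = vowel_scores([5, -4, 6, 0])
best = VOWELS[scores.index(max(scores))]
```

Because no products are formed, the same computation maps directly onto adders and sign inverters, which is the low-cost property the text points to.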

More specific embodiments are described next. Figure 18 shows a more concrete embodiment of the 7-filter system of Figure 14, and Figure 19 shows a more concrete embodiment of the 6-filter system of Figure 16. The only difference between the two is the presence or absence of the filter FBh.

In each of these embodiments, speech is input from the microphone (1), amplified by the preamplifier (2), and adjusted in gain and offset by the adjustment amplifier (3). The level adjuster (5) then balances the input power between the V/UV difference signal and the other difference signals. (In general, the other difference signals are emphasized over the V/UV difference signal.) Next, the V/UV balance adjuster (4) balances the input of the filter Fv against the input of the filter Fu. The VB/VL balance adjuster (6) is first set to its midpoint. The VH/VL balance adjuster (7) then balances the inputs of the filters FH and FL, and the VF/VB balance adjuster (8) balances the filters FF and FB (FBh). Finally, the VB/VL balance adjuster (6) is used to balance VB and VL. In the configuration of Figure 19, adjusting the VB/VL balance adjuster (6) balances VF/VH at the same time.

The outputs of the filters are switched in sequence by the multiplexer (9), converted to a logarithmic power scale by the logarithmic converter (10), and digitized into 8-bit binary numbers by the A/D converter (11). When the filters are implemented as digital filters, the A/D converter (11) instead follows the adjustment amplifier (3), and the filter calculations are performed one after another in pipeline fashion, so that the filter outputs are calculated in turn much as with the multiplexer (9). The differences between these digital values are then calculated, and the difference signal vector extraction unit (12) calculates the five components of the difference signal vector (UV/V, Veao/Viu, Va/Veo, Ve/Vo, Vi/Vu).
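The log conversion followed by subtraction can be sketched as follows. This is a minimal illustration under assumptions: the dynamic range (`floor`, `full_scale`) and the helper names are hypothetical, and the text specifies only a logarithmic scale and 8-bit digitization. Because a difference of logarithms is the logarithm of a ratio, each difference signal depends on the power ratio of a filter pair rather than on the overall loudness.

```python
import math


def log_power_code(power, floor=1e-6, full_scale=1.0, bits=8):
    """Map a short-time average power to an integer code on a log
    scale. `floor` and `full_scale` are assumed range limits."""
    levels = (1 << bits) - 1
    p = min(max(power, floor), full_scale)
    span = math.log10(full_scale) - math.log10(floor)
    frac = (math.log10(p) - math.log10(floor)) / span
    return round(frac * levels)


def difference_signal(code_a, code_b):
    """Difference of two log-scale codes: proportional to log(Pa / Pb)."""
    return code_a - code_b


# The same 10:1 power ratio at two loudness levels gives nearly the
# same difference signal (up to quantization).
loud = difference_signal(log_power_code(0.1), log_power_code(0.01))
quiet = difference_signal(log_power_code(0.01), log_power_code(0.001))
```

In this sketch a pair such as (VH, VL) would be passed as `code_a` and `code_b`. Which filter pairs feed which difference signal follows the block diagrams, which are not reproduced here.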

Figures 20 to 23 show the results of speech feature extraction with the embodiment of Figure 18 when the same speech as in Figures 8 to 11 was input from a recording tape. Figures 24 to 27 show the corresponding results for the embodiment of Figure 19, again using the same recorded speech as in Figures 8 to 11. In Figures 20 to 27, the VF/VB difference signal of the conventional example is replaced by two signals, the Ve/Vo difference signal and the Vi/Vu difference signal, and the VA/VO difference signal of the conventional example is replaced by the Va/Veo difference signal (abbreviated a/o in the figures).

In Figures 20 to 27, a/i denotes the Veao/Viu difference signal. In the conventional example, detection of e, u, and o by the VF/VB difference signal weakened as the signal approached its zero point, so e, u, and o were harder to symbolize than i and a. In Figures 20 to 27, by contrast, the Ve/Vo difference signal detects e and o reliably and the Vi/Vu difference signal detects u reliably, so the five vowels can be symbolized more reliably. In particular, the Vi/Vu difference signal distinguishes i and u more clearly in Figures 24 to 27 than in Figures 20 to 23. As far as these embodiments are concerned, the embodiment of Figure 19 can be said to symbolize the five vowels more reliably than that of Figure 18.

Next, in Figures 18 and 19, the V/UV judgement unit (13) judges the input to be UV when the V/UV difference signal is more positive than a set value Ruv, judges it to be V when the signal is more negative than a set value Rv, and judges everything in between to be S. The start and end detection unit (14) detects the start of speech from a V or UV judgement and detects the end when silence continues for at least a set number of samples. The symbol vector conversion unit (15) converts the input, by the matrix operation shown in Figures 14 and 16, into the symbol vector V (i, e, a, o, u, h, l, f, b, w). The matrix operation is carried out only in V intervals. In V intervals the symbolization processing unit (16) outputs a symbol when the largest component of the symbol vector is at or above a set value. In UV and S intervals it outputs UV and S respectively.
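The judgement and symbolization steps described above can be sketched roughly as follows. This is only an illustration: the threshold values, the matrix entries, and the names `judge_voicing`, `symbolize` and `encode_frame` are assumptions, since the text gives no numeric values, and only the five vowel components of the symbol vector are modelled.

```python
# Hypothetical values; the patent text does not state any of these numbers.
R_UV = 0.5           # assumed set value Ruv (UV if the signal is above it)
R_V = -0.5           # assumed set value Rv (V if the signal is below it)
MIN_COMPONENT = 1.0  # assumed set value for the largest vector component

SYMBOLS = ["i", "e", "a", "o", "u"]  # vowels only, to keep the sketch short

# Assumed matrix. Rows follow SYMBOLS; columns are the difference signals
# (Veao/Viu, Va/Veo, Ve/Vo, Vi/Vu), positive toward the first-named group.
MATRIX = [
    [-1, 0, 0, 1],    # i
    [1, -1, 1, 0],    # e
    [1, 1, 0, 0],     # a
    [1, -1, -1, 0],   # o
    [-1, 0, 0, -1],   # u
]


def judge_voicing(uv_v_diff):
    """Return 'UV', 'V' or 'S' from the V/UV difference signal."""
    if uv_v_diff > R_UV:
        return "UV"
    if uv_v_diff < R_V:
        return "V"
    return "S"


def symbolize(diff_vector):
    """Multiply by the matrix and return the symbol of the largest component.

    Returns None when the largest component is below MIN_COMPONENT.
    """
    scores = [sum(m * x for m, x in zip(row, diff_vector)) for row in MATRIX]
    best = max(range(len(scores)), key=lambda k: scores[k])
    if scores[best] >= MIN_COMPONENT:
        return SYMBOLS[best]
    return None


def encode_frame(uv_v_diff, diff_vector):
    """Output UV or S directly; run the matrix step only in V intervals."""
    voicing = judge_voicing(uv_v_diff)
    if voicing != "V":
        return voicing
    return symbolize(diff_vector)
```

With these assumed values, a voiced frame whose Veao/Viu and Va/Veo signals are both strongly positive is encoded as `a`, while a voiced frame with no clearly dominant component produces no symbol.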

The shaping unit (17) rewrites each run of identical symbols as a list entry consisting of one symbol and its duration. Entries whose duration is shorter than a set value are then eliminated: if the symbols before and after such an entry are the same, the entries are combined into one, and if they differ, the short entry is absorbed into the preceding symbol.
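A minimal sketch of this shaping step, assuming a list of per-sample symbols as input. The function names and the default `min_duration` are hypothetical, and the handling of a short run at the very start of the list (it is kept, because there is no preceding symbol to absorb it) is an assumption the text does not cover.

```python
def run_length(symbols):
    """Collapse consecutive repeats into [symbol, duration] entries."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return runs


def absorb_short_runs(runs, min_duration=2):
    """Remove runs shorter than min_duration.

    A short run gives its duration to the preceding entry. If the entry
    that follows has the same symbol as the preceding one, the two are
    then merged, which yields a single combined entry.
    """
    out = []
    for sym, dur in runs:
        if dur < min_duration and out:
            out[-1][1] += dur      # absorbed into the preceding symbol
        elif out and out[-1][0] == sym:
            out[-1][1] += dur      # same symbol on both sides: merge
        else:
            out.append([sym, dur])
    return out


def shape(symbols, min_duration=2):
    """Run-length encode, then eliminate the short runs."""
    return absorb_short_runs(run_length(symbols), min_duration)
```

For example, the sequence `a a a e a a a` becomes a single `a` entry of duration 7, while `a a a e o o o` becomes `a` with duration 4 followed by `o` with duration 3.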

The time-linear normalization unit (18) normalizes the durations so that the durations of all list entries add up to a fixed value such as 200 (or 1000). As in the conventional example, this is done by multiplying each duration by the ratio of the total sample count, 200 (or 1000), to the total duration. Because the list is short (10 to 20 entries), the multiplications and divisions take little time.

The above process produces the voice pattern used in the present method.

In registration mode this voice pattern is stored in the standard pattern memory (19). In recognition mode the distance calculation unit (20) matches it against the standard patterns, but first a primary classification is made, for example by the number of UV segments, to limit the candidates for matching.

Next, the distance table (21) gives the distance (correlation value) between the symbols that correspond to each other on the time axis, and the sum of these values over all samples is taken as the distance between the patterns. A table such as the one shown in Table 1 is used as the distance table.

In Table 1 the columns and rows correspond to the code of the standard pattern and the code of the input pattern respectively. For example, when the code of the standard pattern is a and the code of the input pattern is also a, the output of the distance table (21) is 2, indicating a high degree of similarity. When the code of the standard pattern is UV and the code of the input pattern is a, the output of the distance table (21) is -2, indicating a low degree of similarity. The distance calculation unit (20) can therefore easily calculate the similarity of the input pattern and the standard pattern as whole patterns simply by adding up the outputs of the distance table (21) in sequence.
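The table lookup and summation can be illustrated as follows. Only two table values are taken from the text (a against a gives 2, UV against a gives -2). Every other value returned by the hypothetical `table_score` helper is an assumption, and both patterns are assumed to have been normalized to the same total duration.

```python
def table_score(std_sym, in_sym):
    """Stand-in for the distance table (21); larger means more similar."""
    if std_sym == in_sym:
        return 2                   # stated for a/a, assumed for other pairs
    if "UV" in (std_sym, in_sym):
        return -2                  # stated for UV/a, assumed for other pairs
    return 0                       # assumed value for differing voiced codes


def expand(runs):
    """Turn (symbol, duration) entries back into one symbol per sample."""
    return [sym for sym, dur in runs for _ in range(dur)]


def pattern_score(std_runs, in_runs):
    """Add up the table outputs for symbols aligned on the time axis."""
    return sum(table_score(s, t)
               for s, t in zip(expand(std_runs), expand(in_runs)))
```

Because the table values grow with similarity, the best candidate under this sketch is the standard pattern with the largest total.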

The reliability judgement unit (22) regards the input pattern as the same as the closest standard pattern when that pattern is closer than a set value and is also separated from the second-closest pattern by at least a set value. In all other cases the input is rejected as a recognition failure. The recognition result is output from the result output unit (23).

Figure 28 is a so-called claim correspondence diagram showing the configuration of the merged invention in functional block form, and Figure 29 is a block diagram showing an embodiment in which the configuration of Figure 28 is made more concrete. In these figures, S0, S1, S2, S3 and S4 are differential amplification means for extracting the UV/V, Veao/Viu, Va/Veo, Ve/Vo and Vi/Vu difference signals respectively. The outputs of the differential amplification means S0 to S4 are compared with predetermined reference levels in comparators (24) to (33), and a separate code is assigned according to the magnitude relation with each reference level. Comparators (24) and (25) assign the code UV when the output of S0 is at or above a positive fixed value, the code V when it is at or below a negative fixed value, and the code S otherwise. In the same way, comparators (26) and (27) assign Veao, Viu or S to the output of S1, comparators (28) and (29) assign Va, Veo or S to the output of S2, comparators (30) and (31) assign Ve, Vo or S to the output of S3, and comparators (32) and (33) assign Vi, Vu or S to the output of S4.

The outputs of comparators (24) to (33) are held temporarily in the input bit pattern register (34). The V symbolization processing unit (35) then refers to the symbolization table (36) and, as in the case of Figure 12, converts them into one of the codes a, e, o, i, u, h, l, f, b, w and m. An example of the symbolization table (36) is shown in Table 2.
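The comparator stage described above amounts to a three-level quantization of each difference signal. A rough sketch, assuming symmetric reference levels of plus and minus 0.3 (the text gives no values) and hypothetical function names:

```python
# Code pairs in the order S0..S4: (code at or above the positive level,
# code at or below the negative level).
CODE_PAIRS = [
    ("UV", "V"),      # S0: UV/V difference signal
    ("Veao", "Viu"),  # S1: Veao/Viu difference signal
    ("Va", "Veo"),    # S2: Va/Veo difference signal
    ("Ve", "Vo"),     # S3: Ve/Vo difference signal
    ("Vi", "Vu"),     # S4: Vi/Vu difference signal
]


def three_level(value, high_code, low_code, pos_ref=0.3, neg_ref=-0.3):
    """Model one comparator pair: high code, low code, or S in between."""
    if value >= pos_ref:
        return high_code
    if value <= neg_ref:
        return low_code
    return "S"


def comparator_bank(outputs, pos_ref=0.3, neg_ref=-0.3):
    """Assign one code to each of the five differential amplifier outputs."""
    return [three_level(v, hi, lo, pos_ref, neg_ref)
            for v, (hi, lo) in zip(outputs, CODE_PAIRS)]
```

The five resulting codes together play the role of the contents of the input bit pattern register (34) in this simplified model.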

In Table 2, * indicates that the bit may be either 0 or 1, and 0/1 indicates both the case of 0 and the case of 1. The symbolization table (36) is built with a ROM, for example. Either the contents of the input bit pattern register (34) are used as the address input to access the ROM, so that the code for a, e, o and so on is obtained as the data output, or, as shown in Figure 29, the output of the input bit pattern register (34) is compared with the output of the symbolization table (36) by exclusive OR and the code for which they match is output.
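The exclusive-OR comparison with "either 0 or 1" entries can be sketched with a value and a care mask per table row. Table 2 itself is not reproduced in this text, so the bit layout and the three rows below are purely hypothetical, as is the name `lookup`.

```python
# Assumed 8-bit layout, most significant bit first:
# Veao, Viu, Va, Veo, Ve, Vo, Vi, Vu (one bit per comparator).
# Each row is (value, care_mask, code). A 0 bit in care_mask stands for
# a "*" entry, meaning that bit may be either 0 or 1.
TABLE = [
    (0b10100000, 0b11110000, "a"),  # Veao and Va set, lower bits ignored
    (0b10011000, 0b11111100, "e"),  # Veao, Veo and Ve set
    (0b01000010, 0b11000011, "i"),  # Viu and Vi set
]


def lookup(register):
    """Return the code of the first row that matches the register.

    A row matches when the XOR of register and value is zero in every
    bit position that the care mask marks as significant.
    """
    for value, care_mask, code in TABLE:
        if (register ^ value) & care_mask == 0:
            return code
    return None
```

The direct-addressing alternative mentioned in the text would instead use the register contents as an index into a fully populated array, trading memory for the comparison loop.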

Figure 30 is a flowchart showing how the function of the V symbolization processing unit (35) in the configuration of Figure 28 can be realized by a sequential discrimination program on a microcomputer. In this flowchart, the first stage divides the input into three groups according to whether the Veao/Viu difference signal is at the high level H, the middle level M or the low level L. In the second stage, when the first stage gave H, the symbol /a/ is output if the Va/Veo difference signal is H and the symbol /w/ is output if it is M. If it is L, processing moves to the third stage, where the Ve/Vo difference signal is examined: /e/ is output for H, /h/ for M and /o/ for L. When the first stage gave M, the second stage outputs /f/ if the Ve/Vo difference signal is H, /m/ if it is M and /b/ if it is L. When the first stage gave L, the second stage outputs /i/ if the Vi/Vu difference signal is H, /l/ if it is M and /u/ if it is L.

The flowchart of Figure 30 differs from the flowchart of the conventional example in Figure 7 as follows. In Figure 7 the five vowels were symbolized from the eight combinations of H and L of three difference signals, so the sign of each difference signal had to serve to discriminate several of the five vowels at once. In Figure 30 it is enough for the positive and negative sides of each difference signal to identify one vowel each, so the filter bands for speech analysis and the balance of the difference signals can be set optimally.
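The sequential discrimination described for Figure 30 maps directly onto a small decision tree. The sketch below follows the branches as stated in the text. The function name and the use of the strings 'H', 'M' and 'L' for the three levels are assumptions.

```python
def v_symbol(veao_viu, va_veo, ve_vo, vi_vu):
    """Return the symbol for one voiced frame.

    Each argument is the level of one difference signal: 'H', 'M' or 'L'.
    """
    if veao_viu == "H":
        # Second stage for the (e, a, o) side: examine Va/Veo.
        if va_veo == "H":
            return "a"
        if va_veo == "M":
            return "w"
        # Third stage: Va/Veo is L, so examine Ve/Vo.
        return {"H": "e", "M": "h", "L": "o"}[ve_vo]
    if veao_viu == "M":
        # Second stage for the middle group: examine Ve/Vo.
        return {"H": "f", "M": "m", "L": "b"}[ve_vo]
    # First stage gave L, the (i, u) side: examine Vi/Vu.
    return {"H": "i", "M": "l", "L": "u"}[vi_vu]
```

At most three comparisons are made per frame, and each vowel is reached through the positive or negative side of exactly one final difference signal, which is the property the text contrasts with Figure 7.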

The balance of the difference signals of the filter pairs is adjusted as follows.

(1) The Veao/Viu difference signal is adjusted, as in the conventional example, so that the first formant F1 divides the five vowels into (e, a, o) and (i, u).

(2) The Va/Veo difference signal is adjusted so that the first formant F1 likewise divides (e, a, o) into (a) and (e, o).

(3) The Ve/Vo difference signal is adjusted so that the second formant F2 divides (e, o) into (e) and (o).

(4) The Vi/Vu difference signal is adjusted so that the second formant F2 divides (i, u) into (i) and (u).

Figure 2 shows that the first formant F1 and the second formant F2 vary greatly among the five vowels. Figure 3 shows that the difference between men and women is small on the F1 axis and large on the F2 axis. To make symbolization reliable with the present method, it is therefore desirable to use separate filter bands for men and women. A convenient way to switch the filter bands is to use digital filters, store the filter coefficients for men, women, children and so on in memory, and switch between them as needed.

On the other hand, Figure 3 shows that, if some errors are tolerated, male and female speakers can be treated together: the vowels can be divided into the three groups (a), (e, o) and (i, u) on the F1 axis, and (e) can be separated from (o), and (i) from (u), on the F2 axis. Separating (i) and (u) with a filter common to men and women is somewhat forced, but it is easier than dividing the vowels into (i, e) and (a, o, u) on the F2 axis as in the conventional example. In fact, the boundary between (e) and (o) on the second formant F2 lies near 1.5 kHz, at the centre of the distribution of (u), while the boundary between (e) and (u) on F2 lies near [value illegible] kHz, where the distributions overlap considerably. Dividing the vowels into (i, e) and (a, o, u) by the second formant F2, as in the conventional example, therefore requires a large offset correction of the zero point of the Vie/Vaou difference signal to absorb differences between the sexes and between individuals, and correct symbolization may not be possible. Judging from the distribution in Figure 3, the present method can be regarded as the most reliable and simplest configuration for discrimination by difference signals, and if the filter band for distinguishing (i) and (u) is switched between men and women, almost complete symbolization should be possible.

[Effect of the invention]

The present invention is configured as described above. It provides a first filter pair that obtains, from the voice input, the ratio between wide-jaw voiced sounds with a wide jaw opening, such as the vowels a, e and o, and narrow-jaw voiced sounds with a narrow jaw opening, such as the vowels i and u; a second filter pair that obtains, among the wide-jaw voiced sounds, the ratio between sounds with a high first formant, such as the vowel a, and sounds with a low first formant, such as the vowels e and o; a third filter pair that obtains, among the wide-jaw voiced sounds with a low first formant, the ratio between sounds with a high second formant, such as the vowel e, and sounds with a low second formant, such as the vowel o; and a fourth filter pair that obtains, among the narrow-jaw voiced sounds, the ratio between sounds with a high second formant, such as the vowel i, and sounds with a low second formant, such as the vowel u. Features of the speech are extracted from the difference signal outputs of the first to fourth filter pairs. As a result, the features of the second formant of the vowels can be extracted reliably, and among the five Japanese vowels, e, u and o in particular, whose discrimination was previously incomplete, can be identified reliably, making a more complete symbolization of the five vowels possible. In addition, because the second formant is not extracted in the forced manner of the conventional example, the amount of speaker-dependent zero-point correction of the difference signal outputs of the filter pairs can be reduced.

Furthermore, the present invention provides a matrix calculation unit that takes as input a four-dimensional vector whose components are the difference signal outputs of the filter pairs and multiplies this vector by a transformation matrix to calculate a vector whose components are the short-time average powers of the five Japanese vowels and other voiced sounds, and a maximum value judgement unit that outputs the code corresponding to the largest component of the vector output from the matrix calculation unit. The input pattern is formed from the output of the comparison means and the output of the maximum value judgement unit. The codes of the five vowels and other voiced sounds can thus be obtained with comparatively general-purpose means such as a matrix calculation unit and a maximum value judgement unit, which also simplifies the configuration of the apparatus.

In the merged invention, the difference signal outputs of the first to fourth filter pairs are compared with a plurality of reference values, separate codes are assigned according to the magnitude relation with these reference values, and a voiced sound discrimination means assigns and outputs one of the codes of the five Japanese vowels and other voiced sounds according to the combination of the codes assigned for each filter pair. Voiced sounds can therefore be discriminated at high speed with a simple configuration using a ROM table or the like, so that the response when electronic equipment is operated by voice messages is faster and the apparatus can be built at low cost.

[Brief explanation of the drawing]

Figure 1 shows the standard spectra of the five Japanese vowels. Figure 2 shows the difference between men and women in the vowel formants. Figure 3 shows the distribution of the first and second formants of the vowels. Figure 4 shows the relationship between the five Japanese vowels and the position of the tongue. Figure 5 is a block diagram showing the configuration of the conventional example. Figures 6(a) and 6(b) show the frequency characteristics of the filters used in it. Figure 7 is a flowchart showing the pseudo-phoneme symbolization procedure of the conventional example. Figures 8 to 11 are diagrams explaining its operation. Figure 12 is a claim correspondence block diagram showing the configuration that forms the gist of the present invention. Figure 13 is a block diagram of one embodiment of the present invention. Figure 14 is a block diagram of another embodiment. Figures 15(a) and 15(b) show the frequency characteristics of the filters used in it. Figure 16 is a block diagram of a further embodiment. Figures 17(a) and 17(b) show the frequency characteristics of the filters used in it. Figure 18 is a block diagram showing another embodiment. Figure 19 is a block diagram showing yet another embodiment. Figures 20 to 23 are diagrams explaining the operation of the embodiment of Figure 18. Figures 24 to 27 are diagrams explaining the operation of the embodiment of Figure 19. Figure 28 is a claim correspondence block diagram showing the configuration that forms the gist of the merged invention. Figure 29 is a block diagram of one embodiment of it. Figure 30 is a flowchart showing its pseudo-phoneme symbolization procedure.

Fv, Fuv, Fa1, Fa2, Fb1, Fb2, Fc1, Fc2, Fd1 and Fd2 are filters, S0 to S4 are differential amplification means, Co is the comparison means, MC is the matrix calculation unit, MX is the maximum value judgement unit, (24) to (33) are comparators, and (36) is the symbolization table.

Agent: Patent Attorney Aburi Ishiyama

Procedural amendment (voluntary)
1. Indication of the case: Patent Application No. 67261 of 1983 (Showa 58)
2. Title of the invention: Voice message identification system
3. Person making the amendment: patent applicant. Address: 1048 Oaza Kadoma, Kadoma City, Osaka. Name: (583) Matsushita Electric Works, Ltd. Representative: Iku Kobayashi
4. Agent: postal code 530
5. Date of amendment order: [illegible]

Claims

[Claims] (1) A voice message identification system characterized in that: a comparison means is provided which takes as input the difference signal output of a pair of filters that obtain the short-time average power of the high-frequency component and of the low-frequency component of the voice input, and which outputs the code of an unvoiced sound when the high-frequency component is stronger, the code of a voiced sound when the low-frequency component is stronger, and the code of silence when the high-frequency and low-frequency components are approximately equal; there are provided a first filter pair that obtains, from the voice input, the ratio between wide-jaw voiced sounds with a wide jaw opening, such as the vowels a, e and o, and narrow-jaw voiced sounds with a narrow jaw opening, such as the vowels i and u, a second filter pair that obtains, among the wide-jaw voiced sounds, the ratio between sounds with a high first formant, such as the vowel a, and sounds with a low first formant, such as the vowels e and o, a third filter pair that obtains, among the wide-jaw voiced sounds with a low first formant, the ratio between sounds with a high second formant, such as the vowel e, and sounds with a low second formant, such as the vowel o, and a fourth filter pair that obtains, among the narrow-jaw voiced sounds, the ratio between sounds with a high second formant, such as the vowel i, and sounds with a low second formant, such as the vowel u; a matrix calculation unit is provided which takes as input a four-dimensional vector whose components are the difference signal outputs of the first to fourth filter pairs and multiplies this four-dimensional vector by a transformation matrix to calculate a vector whose components are the short-time average powers of the five Japanese vowels and other voiced sounds; a maximum value judgement unit is provided which outputs the code corresponding to the largest component of the vector output from the matrix calculation unit; an input pattern is formed from the output of the comparison means and the output of the maximum value judgement unit; and this input pattern is compared with a plurality of standard patterns stored in advance, the standard pattern closest to the input pattern being identified as the input message.

(2) A voice message identification system characterized in that: a comparison means is provided which takes as input the difference signal output of a pair of filters that obtain the short-time average power of the high-frequency component and of the low-frequency component of the voice input, and which outputs the code of an unvoiced sound when the high-frequency component is stronger, the code of a voiced sound when the low-frequency component is stronger, and the code of silence when the high-frequency and low-frequency components are approximately equal; there are provided a first filter pair that obtains, from the voice input, the ratio between wide-jaw voiced sounds with a wide jaw opening, such as the vowels a, e and o, and narrow-jaw voiced sounds with a narrow jaw opening, such as the vowels i and u, a second filter pair that obtains, among the wide-jaw voiced sounds, the ratio between sounds with a high first formant, such as the vowel a, and sounds with a low first formant, such as the vowels e and o, a third filter pair that obtains, among the wide-jaw voiced sounds with a low first formant, the ratio between sounds with a high second formant, such as the vowel e, and sounds with a low second formant, such as the vowel o, and a fourth filter pair that obtains, among the narrow-jaw voiced sounds, the ratio between sounds with a high second formant, such as the vowel i, and sounds with a low second formant, such as the vowel u; a voiced sound discrimination means is provided which compares the difference signal outputs of the first to fourth filter pairs with a plurality of reference values, assigns separate codes according to the magnitude relation with these reference values, and assigns and outputs one of the codes of the five Japanese vowels and other voiced sounds according to the combination of the codes assigned for each filter pair; an input pattern is formed from the output of the voiced sound discrimination means and the output of the comparison means; and this input pattern is compared with a plurality of standard patterns stored in advance, the standard pattern closest to the input pattern being identified as the input message.