JPS6024595A - Voice recognition equipment - Google Patents

Voice recognition equipment

Info

Publication number
JPS6024595A
Authority
JP
Japan
Prior art keywords
voice
time series
parameter
standard deviation
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP58132508A
Other languages
Japanese (ja)
Other versions
JPH0459638B2 (en)
Inventor
正典 宮武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sanyo Electric Co Ltd
Sanyo Denki Co Ltd
Original Assignee
Sanyo Electric Co Ltd
Sanyo Denki Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sanyo Electric Co Ltd, Sanyo Denki Co Ltd
Priority to JP58132508A
Publication of JPS6024595A
Publication of JPH0459638B2
Legal status: Granted

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION
(a) Industrial field of application
The present invention relates to a speech recognition device capable of recognizing speech.

(b) Prior art
A conventional speech recognition device extracts in advance, for each of a plurality of predetermined voices, a speech pattern $A = (a(1), a(2), \ldots)$ consisting of a time series of feature parameters that characterize the voice, obtained from the speech signal. The unknown pattern $X = (x(1), x(2), \ldots)$, consisting of the time series of feature parameters of an unknown voice S, is compared with each time series $A = (a(1), a(2), \ldots)$. The voice whose time series A minimizes the distance between the two patterns, $D = |A - X| = \sum_i |a(i) - x(i)|$, is judged to be the unknown voice.
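The conventional minimum-distance matching described above can be sketched as follows. This is only an illustration under stated assumptions: the function names `l1_distance` and `nearest_template`, the dictionary layout, and the sample values are hypothetical and do not come from the patent, and the patterns are assumed to be already aligned to the same length.

```python
def l1_distance(a, x):
    """Distance D = sum_i |a(i) - x(i)| between two equal-length patterns."""
    if len(a) != len(x):
        raise ValueError("patterns must have the same length")
    return sum(abs(ai - xi) for ai, xi in zip(a, x))


def nearest_template(templates, x):
    """Return the label of the registered pattern with the smallest distance to x."""
    return min(templates, key=lambda label: l1_distance(templates[label], x))


# Hypothetical registered patterns and an unknown pattern.
templates = {"A": [1.0, 2.0, 3.0], "B": [4.0, 4.0, 4.0]}
unknown = [1.5, 2.0, 2.5]
print(nearest_template(templates, unknown))
```

Because every parameter contributes its raw deviation to D, ordinary utterance-to-utterance variation inflates the distance, which is the weakness the invention addresses.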

However, the feature parameters used for this purpose, such as a time series of speech spectrum values or a time series of autocorrelation coefficients, may vary to a greater or lesser degree depending on the speaker's utterance conditions. As a result, even for the same voice, a large distance can arise between the pattern of the previously registered voice and that of the unknown input voice, which has the drawback of causing misrecognition.

(c) Object of the invention
The present invention provides a speech recognition device that reduces the occurrence of misrecognition, that is, improves the recognition rate.

(d) Structure of the invention
The speech recognition device of the present invention applies statistical processing to the time series of feature parameters of an unknown input voice, using a prestored time series of mean-value parameters (the mean values of the feature parameters of each registered voice) and the time series of their standard deviations. The input is converted into a time series of binary signals "1" and "0" that indicate whether the difference between each feature parameter of the input voice and the corresponding mean-value parameter is larger or smaller than a constant multiple of the standard deviation. A similarity is computed from this binary time series, and the registered voice with the greatest similarity is judged to be the input voice.

(e) Embodiment
Fig. 1 shows an embodiment of the speech recognition device of the present invention. In the figure, (1) is a microphone that converts voice into an electrical speech signal. (2) is a parameter extraction circuit that extracts, from the speech signal obtained from the microphone (1), frequency spectrum values as the feature parameters that characterize the voice. For example, an 8-channel bandpass filter bank is used, and a speech pattern is obtained as a time series of 8 samples for each of the frequency spectrum values $f_1, f_2, \ldots, f_8$ that divide the speech band (100 to 4000 Hz) into eight. That is, with filter number n and sample number t, each feature parameter is written $f_n(t)$, and the speech pattern F is $F = \{ f_n(t) \}$, $n = 1, \ldots, 8$, $t = 1, \ldots, 8$.

(3) is a mode selection switch that switches between the registration mode and the recognition mode. Connecting it to the Q side selects the registration mode, and connecting it to the P side selects the recognition mode.

(4) is a statistical processing circuit that receives the speech patterns from the parameter extraction circuit (2) in the registration mode, with the mode selection switch (3) connected to the Q side. Based on the plurality of speech patterns obtained by inputting the same voice a small number of times in succession, and assuming that each feature parameter $f_n(t)$ follows a normal distribution as shown in Fig. 2, it calculates a mean-value pattern consisting of the mean-value parameters $\bar{f}_n(t)$ and the corresponding standard deviation pattern.

(5) is a memory circuit. For a plurality of registered voices, for example A, B, C, D, and E, it consists of a mean-value pattern memory section (51) that stores the mean-value patterns $\bar{A}, \bar{B}, \bar{C}, \bar{D}, \bar{E}$ obtained from the statistical processing circuit (4), and a standard deviation memory section (52) that stores the corresponding standard deviation patterns $\hat{A}, \hat{B}, \hat{C}, \hat{D}, \hat{E}$.

(6) is a buffer memory that temporarily stores the speech pattern X of the unknown input voice x obtained from the parameter extraction circuit (2) in the recognition mode, with the mode selection switch (3) connected to the P side.

(7) is a comparison means. It consists of a subtractor (71), a multiplier (72), and a comparator (75). The subtractor (71) subtracts, from each parameter $x_n(t)$ of the input speech pattern X in the buffer memory (6), the respective parameters $\bar{a}_n(t), \bar{b}_n(t), \ldots$ of the mean-value patterns $\bar{A}, \bar{B}, \ldots$ held in the mean-value pattern memory section (51) of the memory circuit (5). The multiplier (72) multiplies the respective standard deviations $\hat{a}_n(t), \hat{b}_n(t), \ldots$ of the standard deviation patterns $\hat{A}, \hat{B}, \ldots$ held in the standard deviation memory section (52) by a constant K, for example 1 or 2. The comparator (75) compares the subtracted values $x_n(t) - \bar{a}_n(t), x_n(t) - \bar{b}_n(t), \ldots$ from the subtractor (71) with the multiplied values $K\hat{a}_n(t), K\hat{b}_n(t), \ldots$ from the multiplier (72). It outputs "1" when $|x_n(t) - \bar{a}_n(t)| \le K\hat{a}_n(t)$ and "0" when $|x_n(t) - \bar{a}_n(t)| > K\hat{a}_n(t)$.

That is, with K = 1 and taking registered voice A as an example, when $|x_n(t) - \bar{a}_n(t)| \le \hat{a}_n(t)$, $x_n(t)$ is regarded, on the basis of the normal distribution shown in Fig. 2, as similar to $\bar{a}_n(t)$ with a probability of about 68.3%, so "1" is assigned. In the opposite case "0" is assigned. The unknown pattern is thus converted, for each registered voice, into a matrix pattern $\Delta = (\delta_{ij})$ of binary signals $\delta$ taking the values "1" and "0".
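The registration statistics and the binary conversion described in this embodiment can be sketched in software as follows. This is a minimal illustration, not the patented circuit: the function names `enroll` and `binarize`, the nested-list layout (`utterance[n][t]` standing for $f_n(t)$), and the use of the population standard deviation are all assumptions, since the source does not state which estimator the statistical processing circuit (4) uses.

```python
from statistics import mean, pstdev


def enroll(utterances):
    """Build (mean_pattern, std_pattern) from repeated utterances of one word.

    Each utterance is a list of channels, and each channel is a list of
    samples, so utterance[n][t] plays the role of f_n(t).
    """
    n_channels = len(utterances[0])
    n_samples = len(utterances[0][0])
    mean_pattern, std_pattern = [], []
    for n in range(n_channels):
        mean_row, std_row = [], []
        for t in range(n_samples):
            values = [u[n][t] for u in utterances]
            mean_row.append(mean(values))
            # Assumption: population standard deviation (not stated in the source).
            std_row.append(pstdev(values))
        mean_pattern.append(mean_row)
        std_pattern.append(std_row)
    return mean_pattern, std_pattern


def binarize(x, mean_pattern, std_pattern, k=1.0):
    """Return 1 where |x_n(t) - mean_n(t)| <= k * std_n(t), otherwise 0."""
    return [
        [1 if abs(xv - m) <= k * s else 0 for xv, m, s in zip(x_row, m_row, s_row)]
        for x_row, m_row, s_row in zip(x, mean_pattern, std_pattern)
    ]


# Hypothetical data: three repetitions of one word, 2 channels x 2 samples.
repetitions = [
    [[1.0, 4.0], [2.0, 6.0]],
    [[2.0, 4.0], [2.0, 8.0]],
    [[3.0, 4.0], [2.0, 10.0]],
]
mean_pattern, std_pattern = enroll(repetitions)
unknown = [[2.5, 4.0], [3.0, 12.0]]
print(binarize(unknown, mean_pattern, std_pattern, k=1.0))
```

Note that a parameter that never varied during registration has a standard deviation of zero, so only an exact match yields "1" for it. A practical implementation would probably need a floor on the standard deviation, which the source does not discuss.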

(8) is a recognition processing circuit. Based on the matrix pattern $\Delta$ of binary signals obtained from the comparison means (7), it calculates the sum $\sum \delta_{ij}$ of its 16 constituent elements, that is, the number of "1" values present, as the similarity. If the similarities for the registered voices A, B, C, D, and E are, for example, 11, 2, 8, 7, and 3, the input voice is judged to have been A.
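The similarity count and the final decision can be sketched as below. The names `similarity` and `recognize` and the small example matrices are hypothetical, and the tie-breaking behaviour (the first label with the maximum count wins) is an assumption, since the source does not say how ties are resolved.

```python
def similarity(binary_pattern):
    """Count the 1s in a binary matrix pattern (the sum of all delta_ij)."""
    return sum(sum(row) for row in binary_pattern)


def recognize(binary_patterns):
    """Return (best_label, scores) given one binary pattern per registered voice."""
    scores = {label: similarity(p) for label, p in binary_patterns.items()}
    # Assumption: on a tie, the first label in insertion order is returned.
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical binary patterns for two registered voices.
patterns = {
    "A": [[1, 1], [0, 1]],
    "B": [[0, 1], [0, 0]],
}
best, scores = recognize(patterns)
print(best, scores)
```

In the source's example, the counts 11, 2, 8, 7, and 3 for voices A to E would make `recognize` return A.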

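In the registration mode, with the mode selection switch (3) connected to Q, each of a plurality of predetermined voices is spoken and input several times, for example three times each, and the results are stored in memory. Then, in the recognition mode, with the mode selection switch (3) connected to P, an unknown voice is input, and its speech pattern X is converted by the comparison means (7), using the mean-value patterns $\bar{A}, \bar{B}, \ldots$ and the standard deviations $\hat{A}, \hat{B}, \ldots$, into a binary signal pattern from which the variation that occurs when the voice is uttered has been removed. If the value of K, which determines the permissible variation of the voice, is set to about 0.5 to 2, the binary signal pattern best expresses the similarity between the unknown speech pattern and the registered speech pattern, and highly reliable pattern recognition can be carried out.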
An unknown voice is input, and its voice pattern It is converted into a binary signal S pattern in which the fluctuation component of the sound is removed.At this time, if the value of K for determining the permissible fluctuation of the audio is set to about 015 to 2, the 2fT signal pattern This optimally shows the similarity between the unknown speech pattern and the registered speech pattern, and this allows highly reliable pattern execution!

In the above description, a speech pattern consisting of a time series of frequency spectrum values was used as the time series of feature parameters, but time series of various other feature parameters, such as PARCOR (partial autocorrelation) coefficients, can also be used.
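As an illustration of an alternative feature parameter, the sketch below computes normalized autocorrelation coefficients for one frame of samples, which the source mentions as another possible feature time series. The function name, the normalization by frame energy, and the sample frame are assumptions. PARCOR coefficients would require a further step (for example the Levinson-Durbin recursion) that is not shown here.

```python
def autocorrelation_coefficients(frame, max_lag):
    """Return normalized autocorrelation r(k)/r(0) for k = 1..max_lag."""
    energy = sum(s * s for s in frame)
    if energy == 0:
        # A silent frame has no defined correlation; return zeros by convention.
        return [0.0] * max_lag
    return [
        sum(frame[i] * frame[i + k] for i in range(len(frame) - k)) / energy
        for k in range(1, max_lag + 1)
    ]


# Hypothetical frame: an alternating signal.
frame = [1.0, -1.0, 1.0, -1.0]
print(autocorrelation_coefficients(frame, 2))
```

A sequence of such coefficient vectors, one per frame, could take the place of $f_n(t)$ in the registration and comparison steps without changing the rest of the method.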

(f) Effects of the invention
As is clear from the above description, the speech recognition device of the present invention applies statistical processing to the time series of feature parameters of an unknown input voice, using a prestored time series of mean-value parameters (the mean values of the feature parameters of each registered voice) and the time series of their standard deviations. It converts the input into a time series of binary signals "1" and "0", computes a similarity, and judges the registered voice with the greatest similarity to be the input voice. The variation of each feature parameter that depends on the utterance conditions can therefore be removed, an optimal similarity can be derived, and a substantial improvement in the recognition rate can be expected.

[Brief description of the drawings]

Fig. 1 is a block diagram showing an embodiment of the speech recognition device of the present invention, and Fig. 2 is a normal distribution diagram. (1) is the microphone, (2) the parameter extraction circuit, (4) the statistical processing circuit, (5) the memory circuit, (7) the comparison means, and (8) the recognition processing circuit.

Claims (1)

[Claims]
1. A speech recognition device comprising:
voice input means for converting voice into an electrical signal;
parameter extraction means for extracting, from the electrical signal of the input voice, feature parameters that characterize the voice;
storage means for storing in advance, for each of a plurality of registered voices, a time series of mean-value parameters indicating the mean values of the feature parameters of that voice, together with a time series of standard deviations corresponding to the time series of mean-value parameters;
comparison means which, in response to the time series of feature parameters of an unknown input voice obtained from the parameter extraction means, reads out from the storage means the time series of mean-value parameters and the time series of standard deviations for each registered voice, compares the error value between each feature parameter of the input voice and the mean-value parameter with the standard deviation multiplied by a constant, and, based on the result of this comparison, outputs binary signals "1" and "0" indicating whether the error value is larger or smaller than the standard deviation multiplied by the constant, thereby converting the time series of feature parameters of the unknown input voice into a time series of binary signals; and
recognition processing means which, based on the time series of binary signals obtained from the comparison means, calculates the similarity of each registered voice to the unknown voice and judges the registered voice with the greatest similarity to be the unknown voice that was input.
JP58132508A 1983-07-20 1983-07-20 Voice recognition equipment Granted JPS6024595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP58132508A JPS6024595A (en) 1983-07-20 1983-07-20 Voice recognition equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP58132508A JPS6024595A (en) 1983-07-20 1983-07-20 Voice recognition equipment

Publications (2)

Publication Number Publication Date
JPS6024595A true JPS6024595A (en) 1985-02-07
JPH0459638B2 JPH0459638B2 (en) 1992-09-22

Family

ID=15082989

Family Applications (1)

Application Number Title Priority Date Filing Date
JP58132508A Granted JPS6024595A (en) 1983-07-20 1983-07-20 Voice recognition equipment

Country Status (1)

Country Link
JP (1) JPS6024595A (en)

Also Published As

Publication number Publication date
JPH0459638B2 (en) 1992-09-22

Similar Documents

Publication Publication Date Title
EP0077558B1 (en) Method and apparatus for speech recognition and reproduction
KR0123934B1 (en) Low cost speech recognition system and method
US5091947A (en) Speech recognition method and apparatus
EP0077194B1 (en) Speech recognition system
JPS634200B2 (en)
US4426551A (en) Speech recognition method and device
JPS6024595A (en) Voice recognition equipment
JPH0461359B2 (en)
JP2001083978A (en) Speech recognition device
JPS63213899A (en) Speaker collation system
JPS59124397A (en) Non-voice section detecting circuit
JPH0221598B2 (en)
JP2666296B2 (en) Voice recognition device
JP2975808B2 (en) Voice recognition device
JPS5999500A (en) Voice recognition method
JP3065088B2 (en) Voice recognition device
JPS60166993A (en) Word voice recognition equipment
JPS63266497A (en) Voice recognition equipment
JPS6273299A (en) Voice recognition system
JPS61290496A (en) Voice recognition equipment
JPS6227798A (en) Voice recognition equipment
JPS6273298A (en) Voice recognition system
JPS61292695A (en) Voice recognition equipment
JPS625298A (en) Voice recognition equipment
JPH03206500A (en) Voice recognition device