JPS61183696A

JPS61183696A - Specified speaker voice recognition equipment

Info

Publication number: JPS61183696A
Application number: JP60022841A
Authority: JP
Inventors: 敏雄吉原
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1985-02-08
Filing date: 1985-02-08
Publication date: 1986-08-16

Abstract

(57) [Summary] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

[Field of industrial application]

The present invention relates to a speech recognition device based on matching of acoustic speech signals, and more particularly to a specific-speaker speech recognition device.

[Conventional technology]

Conventionally, in this type of specific-speaker speech recognition device, speech analysis patterns of the specific speaker are registered in advance, and at recognition time the recognition operation is performed against these fixed speech data patterns.

[Problems to be solved by the invention]

Because the conventional specific-speaker speech recognition device described above performs recognition against fixed speech data patterns, it has the drawback that the recognition rate drops when the uttered speech changes due to changes in the speaker's emotional state, changes in the environment, and so on.

[Means for solving problems]

The specific-speaker speech recognition device of the present invention has means which, when a registered speech data pattern satisfying the recognition criterion is found during the recognition operation, generates a new speech data pattern from the latest input speech data, the recognized registered speech data pattern, and the time-frame correspondence obtained by dynamic pattern matching, and re-registers it.

[Example]

Next, the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram of one embodiment of the present invention.

A speech input terminal 1 is connected to a speech analysis section 2.

For each time frame, the speech analysis section 2 divides the signal into a plurality of frequency components with a bandpass filter bank and outputs them on the analysis signal output line 3.
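As a rough illustration of the per-frame analysis performed by the speech analysis section 2, the following Python sketch splits a sampled signal into fixed-length time frames and computes one energy value per frequency band. It is not the circuit described here: the description specifies a bandpass filter bank, whereas this sketch substitutes a naive DFT whose bins are grouped into contiguous bands. The function name `band_energies`, the frame length, and the number of bands are all assumptions made for illustration.

```python
import cmath
import math


def band_energies(samples, frame_len, n_bands):
    """Return one list of n_bands energies per non-overlapping time frame.

    Stand-in for a bandpass filter bank (assumption): DFT bins 1..frame_len//2
    are grouped into n_bands contiguous bands of roughly equal width.
    """
    half = frame_len // 2
    frames = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Power of each positive-frequency DFT bin (naive O(N^2) transform).
        power = []
        for k in range(1, half + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            power.append(abs(acc) ** 2)
        # Sum the bin powers that fall inside each band.
        width = len(power) / n_bands
        bands = [
            sum(power[int(b * width):int((b + 1) * width)])
            for b in range(n_bands)
        ]
        frames.append(bands)
    return frames
```

Each inner list plays the role of one time frame on the analysis signal output line 3, with one value per frequency component.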

The analysis signal output line 3 carries the analyzed signal to the analysis signal buffer 4 and the distance calculation section 7. At initial speech registration, the analyzed signal is registered in the storage section 6 from the analysis signal buffer 4 via the buffer output line 5.

Next, during the recognition operation, the distance calculation section 7 performs a dynamic pattern matching calculation between the signal input from the analysis signal output line 3 and the registered speech data patterns obtained via the storage section output line 8, computes the distance between each registered speech data pattern and the analysis signal of the input speech, and outputs the distances to the comparison/determination section 10 via the distance output line 9. The comparison/determination section 10 determines the registered speech data pattern with the smallest distance, provided that this distance is at or below a preset recognition threshold, to be the recognized speech data pattern, and outputs it on the recognition result output line 11.
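The recognition step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the device's actual calculation: dynamic pattern matching is modeled as textbook dynamic time warping with the three steps (1,0), (0,1), and (1,1), the per-frame distance is assumed to be Euclidean, and the names `frame_distance`, `dtw_distance`, and `recognize` are hypothetical. The description does not specify the local distance or the path constraints.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two frames (assumed local distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(pattern, signal):
    """Accumulated distance of the best time alignment of two frame sequences."""
    n, m = len(pattern), len(signal)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(pattern[i - 1], signal[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(registered, signal, threshold):
    """Return (label, distance) of the closest registered pattern.

    The label is None when even the smallest distance exceeds the preset
    recognition threshold, mirroring the comparison/determination section 10.
    """
    best_label, best = None, float("inf")
    for label, pattern in registered.items():
        d = dtw_distance(pattern, signal)
        if d < best:
            best_label, best = label, d
    if best <= threshold:
        return best_label, best
    return None, best
```

Here `registered` stands in for the contents of the storage section 6 as a mapping from a label to a list of frames.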

Next, during the re-registration operation, the distance calculation section 7 performs the dynamic pattern matching calculation again, taking as input the recognized speech data pattern designated via the recognition result output line 11 and the analysis signal held in the analysis signal buffer, and outputs the time-frame correspondence between the two compared signals to the new speech data pattern calculation section 12 via the time-frame correspondence output line 13. The new speech data pattern calculation section 12 generates a new speech data pattern from the time-frame correspondence data on output line 13, the recognized speech data pattern, and the analysis signal, and re-registers it via the new speech data pattern registration line 14.
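One way to obtain the time-frame correspondence is to backtrack through the accumulated-cost table of a dynamic time warping calculation. The Python sketch below assumes the same textbook formulation (steps (1,0), (0,1), (1,1) and a Euclidean frame distance); the function name `dtw_alignment` and the tie-breaking rule are assumptions, since the description does not say how the correspondence is extracted.

```python
import math


def dtw_alignment(pattern, signal):
    """Return the list of (pattern_frame, signal_frame) index pairs on the best path."""
    n, m = len(pattern), len(signal)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.sqrt(
                sum((x - y) ** 2 for x, y in zip(pattern[i - 1], signal[j - 1]))
            )
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Walk back from the last cell, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        predecessors = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(predecessors)
    path.reverse()
    return path
```

Each returned pair corresponds to one correspondence line of the kind drawn between the two frame sequences in FIG. 2; a frame index that appears in more than one pair has multiple correspondences.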

Thereafter, the same processing is repeated on each recognition operation.

FIG. 2 is a diagram showing the time-frame relationship between the speech data pattern and the analysis signal in the embodiment of FIG. 1. Reference numeral 15 denotes the registered speech data pattern, and 16 to 27 denote its time frames. Reference numeral 41 denotes the analysis signal, and 42 to 49 denote its time frames. Lines 28 to 40 show the correspondence between the two.

The procedure for generating a new speech data pattern is as follows.

First, for all time frames that are in correspondence, the arithmetic mean is computed for each identical frequency component, and the result is taken as a provisional new speech data pattern.

Next, the number of time frames in this provisional new speech data pattern is determined.

Next, time frames are thinned out of the provisional new speech data pattern so that its number of time frames equals the arithmetic mean of the numbers of time frames of the two original data. The time frames thinned out at this point are those that have multiple correspondences.
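The three steps above can be sketched in Python as shown below. Several details are assumptions because the description leaves them open: each correspondence pair is taken to yield one provisional frame, a non-integer target length is rounded half up, and when more candidates exist than need removing, the frames to drop are spread evenly over the candidates. The function name `make_new_pattern` is hypothetical. Read this way, the reference numerals of FIG. 2 (12 pattern frames 16 to 27, 8 analysis frames 42 to 49, 13 correspondence lines 28 to 40) would give 13 provisional frames and a target of (12 + 8) / 2 = 10, so 3 frames would be thinned out.

```python
from collections import Counter


def make_new_pattern(registered, analysis, path):
    """Build a new pattern from two frame sequences and their correspondence.

    path is a list of (registered_index, analysis_index) pairs.
    """
    # Step 1: average each corresponding frame pair, component by component.
    provisional = [
        [(r + a) / 2.0 for r, a in zip(registered[i], analysis[j])]
        for i, j in path
    ]

    # Steps 2 and 3: compare the frame count with the mean of the originals.
    target = (len(registered) + len(analysis) + 1) // 2  # round half up (assumed)
    excess = len(provisional) - target
    if excess <= 0:
        return provisional

    # Only frames whose source frame has multiple correspondences may be dropped.
    reg_links = Counter(i for i, _ in path)
    ana_links = Counter(j for _, j in path)
    candidates = [
        k for k, (i, j) in enumerate(path)
        if reg_links[i] > 1 or ana_links[j] > 1
    ]
    if not candidates:
        return provisional
    excess = min(excess, len(candidates))

    # Spread the dropped frames evenly over the candidates (assumed policy).
    step = len(candidates) / excess
    drop = {candidates[int((t + 0.5) * step)] for t in range(excess)}
    return [frame for k, frame in enumerate(provisional) if k not in drop]
```

The returned list would then replace the recognized pattern in the storage section 6 via the new speech data pattern registration line 14.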

[Effect of the invention]

As described above, by re-registering the recognized input speech signal as a new speech data pattern at recognition time, the present invention always holds the latest speech data pattern and can follow changes in the speaker's utterance over time, which has the effect of improving the recognition rate.

[Brief explanation of drawings]

FIG. 1 is a block diagram of one embodiment of the present invention, and FIG. 2 is a diagram showing the time-frame correspondence.

1: speech input terminal
2: speech analysis section
3: analysis signal output line
4: buffer
5: buffer output line
6: storage section
7: distance calculation section
8: storage section output line
9: distance output line
10: comparison/determination section
11: recognition result output line
12: new speech data pattern calculation section
13: time-frame correspondence output line
14: new speech data pattern registration line
15: speech data pattern
16 to 27: time frames
28 to 40: lines showing correspondence
41: analysis signal
42 to 49: time frames

Claims

[Claims]

1. A speech recognition device comprising: an analysis section that divides an input speech signal into a plurality of frequency components and into time frames of a fixed period; a storage section that registers a plurality of speech data patterns of a specific speaker; an inter-pattern distance calculation section; a determination section that determines the speech data with the highest degree of similarity; means for generating a new speech data pattern from the recognized speech data pattern, the signal obtained by analyzing the input speech signal, and the time-frame correspondence between the two obtained by the inter-pattern distance calculation section; and means for re-registering the new speech data pattern.