JPS5852694A

JPS5852694A - Monosyllabic voice recognition system

Info

Publication number: JPS5852694A
Application number: JP56150370A
Authority: JP
Inventors: 佐藤　泰雄; 杉田　忠靖
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1981-09-22
Filing date: 1981-09-22
Publication date: 1983-03-28
Also published as: JPH0145920B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed description of the invention]

The present invention relates to a monosyllabic speech recognition system, and in particular to one that performs recognition by extracting a feature parameter time series from the frequency analysis of the speech signal. The input feature parameter time series of the input speech is divided into a small number of intervals; a reduced feature parameter time series, consisting for example of the average of the parameter values in each interval, is extracted; candidate monosyllables for recognition are selected using this reduced time series; and matching is then performed only against those candidates, which greatly increases processing speed.

In a monosyllabic speech recognition system, feature parameters representing the characteristics of each phoneme are extracted from the frequency analysis of the monosyllabic speech signal, and the extracted feature parameters are matched against pre-registered feature parameters corresponding to the registered monosyllables in order to recognize the unknown input monosyllable. For example, the first and second formant frequencies are sampled and used as these feature parameters.

However, this matching involves a large amount of data processing, and the time required for matching grows as the number of recognition categories increases.

For this reason, a method has been considered that exploits the fact that the feature parameters have intervals in which they change abruptly over time and intervals in which they change gently. By sampling densely in the former and sparsely in the latter, that is, by sampling at non-uniform sampling points, this method raises the recognition rate with a smaller number of samples (Japanese Patent Application No. Sho 52-43972).

In contrast to this method, various schemes have been proposed that first narrow down the recognition candidates using various feature quantities and then perform a more detailed matching process, thereby improving processing speed (Japanese Patent Applications No. Sho 53-53965, No. Sho 53-53966, and No. Sho 53-53967). However, these schemes have drawbacks: the algorithms for extracting the feature quantities are complex, and it is difficult to narrow down the candidates substantially.

The present invention aims to remedy these points. Its object is to determine the candidate monosyllables for recognition efficiently with a relatively simple algorithm, improving the monosyllabic speech recognition rate while reducing recognition processing time. To that end, the monosyllabic speech recognition system of the present invention, which analyzes the speech signal of an unknown input monosyllable and recognizes it by matching the input feature parameter time series extracted from that signal against pre-registered feature parameter time series, is characterized in that it divides the input feature parameter time series from the beginning of the unknown input monosyllable to the vowel contained in it into at most 10 intervals, extracts an input reduced parameter time series consisting of the average of the parameter values within each interval or of the interval boundary values, and determines the candidate monosyllables for recognition by matching this against registered reduced parameter time series that were extracted by the same method and registered in advance. The invention is described below with reference to the drawings.

Fig. 1 is an explanatory diagram illustrating the concept of one embodiment of the present invention; Fig. 2 is an explanatory diagram illustrating the concept of another embodiment; Fig. 3 shows the configuration of an embodiment that performs the above processing; and Fig. 4 is a flowchart of the interval determination process in the embodiment corresponding to Fig. 2.

As shown in Fig. 1, suppose the sampled feature parameter P exists between time T_0 and time T_E. In the first embodiment of the present invention, the time from T_0 to T_E is divided, for example, into five equal parts, which determines the points T_E/5, 2T_E/5, 3T_E/5, 4T_E/5, and T_E.

The feature parameter values between T_0 and T_E/5 are then averaged, those between T_E/5 and 2T_E/5 are averaged, and so on up to those between 4T_E/5 and T_E, so that a reduced parameter time series consisting of, for example, five average-value parameters is extracted.

As a simplification, instead of averaging the feature parameter values, a reduced parameter time series consisting of the interval boundary values may be extracted.
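The first embodiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reduce_time_series`, the `mode` argument, and the use of frame indices in place of times are hypothetical, and the series is assumed to have already been truncated at T_E.

```python
from statistics import fmean


def reduce_time_series(values, n_intervals=5, mode="mean"):
    """Reduce a parameter series covering T_0..T_E to n_intervals values.

    values: samples of one feature parameter P (hypothetical input format).
    mode:   "mean" averages each interval; "boundary" keeps the last
            sample of each interval as its boundary value.
    """
    n = len(values)
    if n < n_intervals:
        raise ValueError("need at least one sample per interval")
    # Integer boundaries that split [0, n) into nearly equal parts.
    bounds = [(k * n) // n_intervals for k in range(n_intervals + 1)]
    reduced = []
    for start, end in zip(bounds, bounds[1:]):
        segment = values[start:end]
        if mode == "mean":
            reduced.append(fmean(segment))
        else:
            reduced.append(segment[-1])
    return reduced
```

For example, a ten-sample series is reduced to five values, one per pair of samples.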

When extracting the reduced parameter time series for a monosyllabic sound such as "ma", the start time T_0 may be taken at the beginning of the monosyllable, that is, at the start of the "m" sound. The end time T_E could be taken at the very end of the monosyllable, but as the "a" portion of Fig. 1 shows, a vowel has a relatively stable, stationary part in which the feature parameter values change little, and it is desirable to take this point as the vowel representative point and use it as T_E. Doing so excludes the later part of the vowel, which contains unstable elements, and improves the recognition rate.

Whereas the first embodiment divides the time axis into equal intervals, the second embodiment of the present invention makes the intervals shorter where the rate of change of the feature parameter is relatively large. That is, with the feature parameter P as shown in Fig. 1, the accumulated variation of P, the cumulative variation, is plotted against time on the horizontal axis as shown in Fig. 2. On this curve, the time points T_1, T_2, ..., T_E at which the cumulative variation reaches (1/5)TAV, (2/5)TAV, ... are extracted, where TAV is the total cumulative variation. The feature parameter values of Fig. 1 between T_0 and T_1 are averaged, those between T_1 and T_2 are averaged, and so on up to those between T_4 and T_E, so that a reduced parameter time series consisting of, for example, five average-value parameters is extracted.

The average-value parameters of the reduced parameter time series may be understood as follows. Suppose, for example, that the parameters M_j(t_n) and X_j(t_n) are used as the feature parameter P of Fig. 1. As is well known, each parameter is expressed as follows.

[Equations (1) and (2) are not reproduced in this text.]

(where j = 1, 2.) The feature parameter M_j(t_n) of equation (1) is a parameter based on the method of moments: M_1(t_n) corresponds to the first formant and M_2(t_n) to the second formant. The feature parameter X_j(t_n) of equation (2) is a parameter corresponding to band power: X_1(t_n) corresponds to the normalized low-band power and X_2(t_n) to the normalized high-band power.

With the span divided into intervals, the average-value parameters corresponding to the parameter M_j(t_n) of equation (1) and the parameter X_j(t_n) of equation (2) are each expressed as follows.

[Equations (3) to (5) are not recoverable from this text.]

The time points T_1, T_2, ... in the case described with reference to Fig. 2 are expressed as follows.

[Equations (6) and (7) are not reproduced in this text.]

Here V(t_n) is the amount of variation. The amount of variation V(t_n) in equation (7) is given by

[Equation (8) is not reproduced in this text.]

where P is the output of the band-pass filter bank.

The intervals determined by the cumulative variation in this way may be obtained independently for each parameter, for example separately for the parameters based on the method of moments and for the parameters corresponding to band power.

Fig. 3 shows the configuration of an embodiment of the present invention. In the figure, 1 is a band-pass filter bank; 2 is a parameter extraction circuit; 3 is a vowel representative point determination circuit; 4 is an input feature parameter time series buffer; 5 is a parameter averaging interval determination circuit; 6 is a parameter averaging circuit; 7 and 8 are switching circuits that switch between registration mode and recognition mode; 9 is a registered monosyllable reduced parameter time series registration unit, implemented as a memory; 10 is a reduced parameter time series matching and candidate determination unit; 11 is a registered monosyllable feature parameter time series registration unit, implemented as a memory that stores the feature parameter time series of each registered monosyllable; 12 is a candidate selection circuit that selects, from the feature parameter time series read out of registration unit 11, only those corresponding to the candidate monosyllables for recognition referred to in the present invention; 13 is a feature parameter time series matching and decision unit; and 14 is an output circuit.

The input monosyllabic speech signal is fed to the band-pass filter bank 1, and the parameter extraction circuit 2 extracts the input feature parameters corresponding to that signal. The extracted input feature parameters are passed to the vowel representative point determination circuit 3, which detects the stationary part of the vowel so that the vowel representative point can be used as the time T_E, as explained with reference to Fig. 1. The input feature parameters up to the vowel representative point determined here are temporarily held in buffer 4 as the input feature parameter time series. In the case of the time points T_E/5, 2T_E/5, ... of Fig. 1, the parameter averaging interval determination circuit 5 extracts the time T_E and then determines the points T_E/5, 2T_E/5, ..., T_E that divide the span from T_0 to T_E into five equal parts. The time points T_1, T_2, ... of Fig. 2 are described later with reference to Fig. 4. Once the intervals have been determined from these time points, the parameter averaging circuit 6 computes the average of the parameter values in each interval from the contents of buffer 4.

In registration mode, switching circuits 7 and 8 take the upper route in the figure. The reduced parameter time series extracted by the parameter averaging circuit 6 (in this case the registered monosyllable reduced parameter time series) is registered in registration unit 9, and the feature parameter time series held in buffer 4 is registered in registration unit 11.

In recognition mode, switching circuits 7 and 8 take the lower route in the figure. The reduced parameter time series extracted by the parameter averaging circuit 6 (in this case the input reduced parameter time series) is passed to the matching and candidate determination unit 10. The registered monosyllable reduced parameter time series are then read out one after another from registration unit 9 and matched against the input reduced parameter time series to determine the candidate monosyllables for recognition referred to in the present invention. The matching and candidate determination unit 10 computes, for example, the inter-monosyllable distance S_r, that is, the Chebyshev distance between a registered reduced parameter time series and the input reduced parameter time series. The distance S_r is given by the following equation.

[Equation (9) is not reproduced in this text.]
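The preselection step can be sketched as follows. Because equation (9) is not available in this text, the sketch assumes the standard definition of the Chebyshev distance (the largest absolute difference between corresponding elements). Keeping a fixed number of nearest entries is also an assumption, since the text does not say how many candidates are retained or by what rule. The names `chebyshev_distance`, `select_candidates`, and `max_candidates` are hypothetical.

```python
def chebyshev_distance(a, b):
    """Largest absolute element-wise difference between two series."""
    if len(a) != len(b):
        raise ValueError("reduced series must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


def select_candidates(input_reduced, registered_reduced, max_candidates=3):
    """Return the names of the registered monosyllables nearest the input.

    registered_reduced: dict mapping a monosyllable name to its reduced
    parameter time series (hypothetical storage format for unit 9).
    """
    ranked = sorted(
        registered_reduced.items(),
        key=lambda item: chebyshev_distance(input_reduced, item[1]),
    )
    return [name for name, _ in ranked[:max_candidates]]
```

Since each reduced series has only a handful of values, this comparison is cheap even when it is run against every registered monosyllable.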

When the matching and candidate determination unit 10 has determined several candidate monosyllables on the basis of equation (9), the names of these candidates are reported to the candidate selection circuit 12. Under the control of a control unit (not shown), the feature parameter time series are then read out one after another from registration unit 11. The candidate selection circuit 12 selects only the registered monosyllable feature parameter time series corresponding to the monosyllables designated as candidates and passes them to the matching and decision unit 13. In recognition mode, switching circuit 7 takes the lower route in the figure, so the feature parameter time series held in buffer 4 (here the input feature parameter time series) is passed to the matching and decision unit 13. The input feature parameter time series is thereby matched against each registered monosyllable feature parameter time series corresponding to a candidate monosyllable. This matching may be taken to be the well-known dynamic programming (DP) matching. The single monosyllable category obtained in this way is sent to the output circuit 14.

When the time points T_1, T_2, ... shown in Fig. 2 are to be determined, the parameter averaging interval determination circuit 5 of Fig. 3 may be taken to perform the processing shown in flowchart form in Fig. 4. That is:

(1) Based on the parameters extracted by the parameter extraction circuit 2, the cumulative variation TAV shown in Fig. 2 is extracted independently for each parameter.

(2) The value DTAV is then determined by dividing the cumulative variation TAV into, for example, five equal parts.

(3) To find the first time point T_1, J is set to 1, the value DTAV is set in register AVH, and the value T(I) is set in the timing start register TS(J).

(4) The feature parameter values are then accumulated step by step until the accumulated value AV(I) becomes equal to or greater than the contents of register AVH.

(5) When the accumulated value AV(I) becomes equal to or greater than the contents of register AVH, the timing value T(I) at that moment is set in the register TE(I) for time point T_1, the value T(I+1) is set in register TS(J+1), the value (AVH + DTAV) is set in register AVH, and J is set to 2 in order to find the next time point T_2.

(6) In the same way, the feature parameter values continue to be accumulated until the accumulated value AV(I) becomes equal to or greater than the contents of register AVH. The time points T_2, T_3, and T_4 are found in this manner.

(7) When the accumulation count I reaches the value N, that is, when the accumulation reaches the feature parameter corresponding to the time T_E of Fig. 2, the time point T_E is determined.
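Steps (1) to (7) can be sketched as a single pass over the frames. This is an illustration under stated assumptions. The per-frame variation is taken as an already computed input because equation (8) is not available in this text, frame indices stand in for the timing values T(I), and the function name `find_interval_ends` is hypothetical. The comments map local variables to the register names used in the steps.

```python
def find_interval_ends(variation, n_intervals=5):
    """Return the frame index at which each interval ends.

    variation: per-frame amount of variation of one parameter, up to
               the vowel representative point (assumed precomputed).
    """
    total = sum(variation)          # TAV, step (1)
    step = total / n_intervals      # DTAV, step (2)
    threshold = step                # register AVH, step (3)
    accumulated = 0.0               # AV(I)
    ends = []                       # interval end points T_1, T_2, ...
    for i, v in enumerate(variation):
        accumulated += v            # step (4)
        # Step (5): a boundary is placed once AV(I) reaches AVH, and
        # AVH is then raised by DTAV for the next boundary, step (6).
        if len(ends) < n_intervals - 1 and accumulated >= threshold:
            ends.append(i)
            threshold += step
    last = len(variation) - 1       # step (7): final point T_E
    if not ends or ends[-1] != last:
        ends.append(last)
    return ends
```

With a series that changes quickly at first and slowly afterwards, the early intervals come out short and the later ones long, which is the behaviour the second embodiment aims for.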

As explained above, according to the present invention the candidate monosyllables for recognition can be narrowed down efficiently and relatively simply, so the recognition processing time can be reduced substantially.

[Brief explanation of drawings]

Fig. 1 is an explanatory diagram illustrating the concept of one embodiment of the present invention; Fig. 2 is an explanatory diagram illustrating the concept of another embodiment; Fig. 3 shows the configuration of an embodiment that performs the above processing; and Fig. 4 is a flowchart of the interval determination process in the embodiment corresponding to Fig. 2. In the figures, P is a feature parameter; 2 is a parameter extraction circuit; 3 is a vowel representative point determination circuit; 4 is an input feature parameter time series buffer; 5 is a parameter averaging interval determination circuit; 6 is a parameter averaging circuit; 7 and 8 are switching circuits; 9 is a registered monosyllable reduced parameter time series registration unit; 10 is a reduced parameter time series matching and candidate determination unit; 11 is a registered monosyllable feature parameter time series registration unit; 12 is a candidate selection circuit; and 13 is a feature parameter time series matching and decision unit.

Patent applicant: Fujitsu Ltd. Agent: Patent Attorney 森 [illegible] 寛

Claims

[Claims]

A monosyllabic speech recognition system that analyzes the speech signal of an unknown input monosyllable and recognizes the unknown input monosyllable by matching the input feature parameter time series extracted from that speech signal against pre-registered feature parameter time series, characterized in that the system is configured to divide the input feature parameter time series from the beginning of the unknown input monosyllable to the vowel contained in that monosyllable into at most 10 intervals and to extract an input reduced parameter time series consisting of the average of the parameter values within each interval or of the interval boundary values, and in that it determines the candidate monosyllables for recognition by matching this against registered reduced parameter time series extracted by the same method and registered in advance.