JPS58159594A

JPS58159594A - Polysyllabic word recognition system

Info

Publication number: JPS58159594A
Application number: JP57024980A
Authority: JP
Inventors: 佐藤　泰雄; 大山　隆之
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1982-02-18
Filing date: 1982-02-18
Publication date: 1983-09-21

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

(a) Technical Field of the Invention
The present invention relates to a speech recognition system, and in particular to a multi-syllable word recognition method that needs no mode switching to recognize both multi-syllable words and monosyllabic words and that recognizes multi-syllable words in a short time.

(b) Technical Background
In a speech recognition system that mainly recognizes monosyllabic words, being able to recognize multi-syllable words as well would allow special symbols such as line break, period, comma, and parentheses, and control terms such as "save" and "print", to be entered alongside ordinary input. Mode-switching keys such as word mode and kana mode would then be unnecessary, which simplifies the configuration of the speech recognition system accordingly and improves operability. A speech recognition system that recognizes multi-syllable words at high speed is therefore desired.

(c) Object of the Invention
In response to the above demand, the object of the present invention is to provide a speech recognition system that requires no mode switching, recognizes both monosyllabic words and multi-syllable words, and recognizes multi-syllable words at high speed.

(d) Structure of the Invention
To detect the vowel parts that correspond to the mutually different syllables in a speech signal, the present invention provides means that samples the feature parameters of the speech signal non-uniformly and, using the stationarity of the speech features, extracts syllable representative points. This means extracts sample points indicating the syllables that correspond to the vowels of a word. If there are multiple sample points, the input is judged to be a multi-syllable word and is recognized by a recognition circuit dedicated to multi-syllable words; if there is a single sample point, it is recognized by a recognition circuit dedicated to monosyllabic words.

The present invention starts from the observation that a speech signal contains intervals in which its feature parameters change abruptly over time and intervals in which they change gradually. It selects multi-syllable words by exploiting the fact that the points where the feature parameters change gradually, that is, the points where the speech features are stationary, are the syllables corresponding to the vowels of a word. Sample points representing those syllables must therefore be extracted; the means for extracting them is described in detail in Japanese Unexamined Patent Application Publication No. S54-145408. Briefly, non-uniform sampling is performed using the cumulative variation of the feature parameters, and each non-uniform sampling point is given a weight that represents the portion omitted by the non-uniform sampling. Points with high weights correspond to stationary intervals and therefore to vowel positions.
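The non-uniform sampling summarized above can be illustrated with a minimal Python sketch. It is not the procedure of the cited publication: the function name `nonuniform_sample`, the city-block distance between consecutive frames, and the reset of the accumulator after each sampling point are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def nonuniform_sample(
    frames: Sequence[Sequence[float]], threshold: float
) -> List[Tuple[int, int]]:
    """Return (frame_index, weight) for each non-uniform sampling point.

    A sampling point is emitted whenever the accumulated frame-to-frame
    variation reaches `threshold`. Its weight is the number of frames
    consumed since the previous sampling point, so slowly changing
    (stationary) stretches produce high weights.
    """
    points: List[Tuple[int, int]] = []
    accumulated = 0.0
    count = 0
    for i in range(1, len(frames)):
        # Assumed variation measure: city-block distance between frames.
        accumulated += sum(abs(a - b) for a, b in zip(frames[i], frames[i - 1]))
        count += 1
        if accumulated >= threshold:
            points.append((i, count))
            accumulated = 0.0  # assumed: restart accumulation at each point
            count = 0
    return points


# Toy one-dimensional "feature parameter": fast change, a flat stretch, fast change.
frames = [[0], [10], [20], [21], [22], [23], [24], [35], [45]]
points = nonuniform_sample(frames, threshold=10)
# -> [(1, 1), (2, 1), (7, 5), (8, 1)]
```

In this toy input the flat stretch (frames 2 to 6) is covered by the single point with weight 5, which is the kind of high-weight point the text associates with a vowel.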

(e) Embodiment of the Invention
The figure is a block diagram of a circuit showing one embodiment of the present invention.

First, to register speech in advance, the speaker has the control unit 11 connect the switching unit 3 to the monosyllabic word recognition unit 9 and the multi-syllable word recognition unit 10, and supplies monosyllabic words and specific multi-syllable words at the input. The preprocessing unit 1 performs speech level adjustment, analog-to-digital conversion, and the like, and passes the result to the parameter extraction unit 2. The parameter extraction unit 2 extracts the feature parameters of each word and, via the switching unit 3, stores the feature parameters of monosyllabic words in the monosyllabic word recognition unit 9 and those of multi-syllable words in the multi-syllable word recognition unit 10.

Next, to carry out speech recognition, the speaker has the control unit 11 connect the switching unit 3 to the sampling time determination unit 4 and the storage unit 7, and utters a word. As before, the speech signal of the word passes through the preprocessing unit 1, the parameter extraction unit 2, and the switching unit 3. It then enters the sampling time determination unit 4, where the non-uniform sampling times are determined, and proceeds to the syllable representative point extraction unit 5. While the non-uniform sampling points are being determined, the syllable representative point extraction unit 5 counts the cumulative number of accumulation steps, holds the count value of each non-uniform sampling point as its weight, and determines a predetermined number of sample points in order, starting from the non-uniform sampling point with the largest weight. In other words, it determines the sample points that correspond to syllables.

The vowel count detection unit 6 counts these sample points. If the vowel count is two or more, it judges the input to be a multi-syllable word, controls the selection unit 8 to send the word held in the storage unit 7 to the multi-syllable word recognition unit 10, and the recognition result is sent to the output via the control unit 11. If the vowel count is 1, it judges the input to be a monosyllabic word, controls the selection unit 8 to send the word in the storage unit 7 to the monosyllabic word recognition unit 9, and the recognition result is sent to the output via the control unit 11.
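The routing decision made by the vowel count detection unit 6 and the selection unit 8 can be sketched in Python as follows. The function `route_word`, the `Recognizer` type, and the two callables standing in for recognition units 9 and 10 are hypothetical names, and raising an error when no sample point is found is an assumption, since the description does not cover that case.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for a recognition unit: features in, recognized word out.
Recognizer = Callable[[Sequence[Sequence[float]]], str]


def route_word(
    sample_points: Sequence[int],
    stored_features: Sequence[Sequence[float]],
    recognize_mono: Recognizer,
    recognize_multi: Recognizer,
) -> str:
    """Send the stored word to the recognizer chosen by the vowel count."""
    vowel_count = len(sample_points)  # role of vowel count detection unit 6
    if vowel_count == 0:
        # Assumption: the source does not say what happens without a vowel.
        raise ValueError("no sample point detected")
    if vowel_count > 1:
        return recognize_multi(stored_features)  # multi-syllable unit 10
    return recognize_mono(stored_features)  # monosyllabic unit 9
```

Because the same stored features are handed to exactly one recognizer, the speaker never has to select a word mode or kana mode by key, which is the point of the embodiment.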

(f) Effects of the Invention
As described in detail above, the present invention uses the stationarity of speech features to detect the vowel parts that correspond to the mutually different syllables in a speech signal, counts the vowels to judge whether the unknown input is a monosyllabic word or a multi-syllable word, and sorts the input word accordingly. It can therefore provide a speech recognition system that requires no mode switching and recognizes multi-syllable words at high speed.

[Brief explanation of drawings]

The figure is a block diagram of a circuit showing one embodiment of the present invention. 1 is the preprocessing unit, 2 is the parameter extraction unit, 4 is the sampling time determination unit, 5 is the syllable representative point extraction unit, 6 is the vowel count detection unit, 8 is the selection unit, and 11 is the control unit.

Claims

[Claims]
1) A multi-syllable word recognition method for a speech recognition system that recognizes multi-syllable words and monosyllabic words, characterized in that means is provided for detecting, by using the stationarity of speech features, the vowel parts corresponding to mutually different syllables in a speech signal, and an unknown input word is recognized as a multi-syllable word when the detecting means detects a plurality of vowel parts.
2) The multi-syllable word recognition method according to claim 1, characterized in that the means for detecting the vowel parts successively computes and accumulates the variation of the feature parameters of the speech signal, determines a non-uniform sampling point where the cumulative variation reaches a predetermined threshold, and uses sample points extracted from the non-uniform sampling points in descending order of a weight corresponding to the cumulative count taken in determining each non-uniform sampling point, wherein the sample points are made to correspond to the non-uniform sampling points extracted in descending order of weight, other non-uniform sampling points within a predetermined neighborhood of an extracted non-uniform sampling point are excluded from the subsequent extraction, and the non-uniform sampling points with the next largest weights are then selected.
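For illustration only, the extraction order described in claim 2 (descending weight, with nearby points excluded) can be sketched in Python. The function name `select_sample_points`, the measurement of the neighborhood in frames, and the tie-breaking by time order are assumptions; the claim itself fixes none of them.

```python
from typing import List, Sequence, Tuple


def select_sample_points(
    points: Sequence[Tuple[int, int]], max_points: int, neighborhood: int
) -> List[int]:
    """Pick frame indices in descending weight order, skipping near neighbours.

    `points` holds (frame_index, weight) pairs for the non-uniform sampling
    points. A point is skipped if it lies within `neighborhood` frames of a
    point that has already been selected.
    """
    selected: List[int] = []
    # sorted() is stable, so equal weights keep their original time order.
    for index, _weight in sorted(points, key=lambda p: -p[1]):
        if len(selected) == max_points:
            break
        if all(abs(index - s) > neighborhood for s in selected):
            selected.append(index)
    return sorted(selected)


candidates = [(3, 1), (12, 6), (14, 5), (30, 7), (33, 2)]
vowel_frames = select_sample_points(candidates, max_points=2, neighborhood=5)
# -> [12, 30]
```

Here frame 14 has the third-largest weight but lies within 5 frames of frame 12, so it would be treated as part of the same vowel rather than as a separate syllable.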