JPS60154300A

JPS60154300A - Syllable standard pattern delivering system

Info

Publication number: JPS60154300A
Application number: JP59011429A
Authority: JP
Inventors: 伸神谷; 和彦松尾
Original assignee: Computer Basic Technology Research Association Corp
Current assignee: Computer Basic Technology Research Association Corp
Priority date: 1984-01-24
Filing date: 1984-01-24
Publication date: 1985-08-13

Abstract

(57) [Summary] This publication contains application data that predates electronic filing, so no abstract data is recorded.

Description

[Detailed Description of the Invention]

<Technical Field> The present invention relates to an improvement in a method for extracting syllable standard patterns in a voice input device.

<Prior Art> In general, in a method that extracts syllable portions from continuously uttered speech and registers them as syllable standard patterns, an inaccurately extracted syllable portion produces an erroneous standard pattern, which has a large effect on recognition performance.

Conventional syllable extraction methods have the problem that the number of syllable extraction errors changes as the speech rate changes. This is attributable to the syllable extraction algorithm being fixed regardless of the speech rate.

<Purpose> The present invention has been made in view of the above points. Its purpose is to provide a syllable standard pattern extraction method that detects the speech rate of the registration speech, selects syllable boundaries from among the syllable boundary candidates output by the syllable boundary detection unit on the basis of the detected speech rate, cuts out the syllables, and determines the syllable boundaries together with the operator while echoing back the cut-out syllable sections as audio.

<Example> Hereinafter, the present invention will be described in detail with reference to the drawings.

FIG. 1 is a block diagram showing the overall configuration of a voice input device embodying the present invention. In FIG. 1, the voice analysis unit 1 extracts feature parameters such as power p(t) and spectrum y(t) from the speech signal of the input registration speech at input time t.

The feature parameters extracted by the voice analysis unit 1 are input to the speech rate detection unit 2. There, the silent section detection unit 21 and the voiced section detection unit 22 distinguish voiced sections from silent sections on the basis of, for example, the strength of the power p(t) of the input parameters.
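The voiced/silent distinction described above can be illustrated with a simple power threshold. This is only a minimal sketch: the text says the distinction is based on the strength of the power and similar cues, but gives no threshold or frame format, so the function name split_sections, the fixed threshold, and the per-frame power list are all assumptions made for illustration.

```python
from typing import List, Tuple

Section = Tuple[int, int]  # half-open frame range [start, end)


def split_sections(power: List[float], threshold: float) -> Tuple[List[Section], List[Section]]:
    """Split a per-frame power sequence into voiced and silent runs.

    A frame counts as voiced when its power is at or above `threshold`
    (an assumed rule; the source does not specify one).
    """
    voiced: List[Section] = []
    silent: List[Section] = []
    start = 0
    for t in range(1, len(power) + 1):
        # Close the current run at the end of input or when the state flips.
        if t == len(power) or (power[t] >= threshold) != (power[start] >= threshold):
            target = voiced if power[start] >= threshold else silent
            target.append((start, t))
            start = t
    return voiced, silent
```

The durations of the voiced runs feed the speech rate calculation, and the silent runs feed the phrase boundary detection.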

In addition, the speech rate calculation unit 23 in the speech rate detection unit 2 calculates and outputs the average syllable length L on the basis of the durations of the voiced sections in the spoken input of a registration sentence whose number of syllables is known.

That is, when syllable standard patterns are registered, the user utters a registration sentence whose number of syllables is known, and the speech rate calculation unit 23 calculates the average syllable length L (1 / average speech rate).

Now, let l(i) be the duration of the i-th voiced section detected by the voiced section detection unit 22 when a sentence containing n syllables is uttered (where i = 1, 2, ..., m). The speech rate calculation unit 23 then calculates and outputs the average syllable length L.
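The equation for this step does not survive in the text. A reading consistent with the surrounding definitions is that the total voiced duration is divided by the known syllable count, that is, L = (l(1) + l(2) + ... + l(m)) / n. The sketch below assumes exactly that; the function name and the choice of milliseconds as the unit are illustrative only.

```python
from typing import Sequence


def average_syllable_length(voiced_durations: Sequence[float], n_syllables: int) -> float:
    """Assumed formula: total voiced time divided by the known syllable count."""
    if n_syllables <= 0:
        raise ValueError("n_syllables must be positive")
    return sum(voiced_durations) / n_syllables


# Three voiced sections (in ms) from a registration sentence known to hold 10 syllables.
L = average_syllable_length([600, 900, 500], 10)
```

Because L is the reciprocal of the average speech rate, a faster speaker yields a smaller L for the same sentence.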

The average syllable length L may be calculated in advance, by uttering a registration sentence whose number of syllables is known before the registration speech for the syllable standard patterns is input. Alternatively, because the number of syllables in the registration sentences used to register the syllable standard patterns is known in advance, L may be calculated during the utterances made for registering the syllable standard patterns. Furthermore, the mean of the successively calculated average syllable lengths may be computed and used when cutting out syllables as described later.
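The last option, averaging the successively calculated values, can be kept up to date without storing every earlier estimate. The following is a minimal sketch of an incremental mean; the class name RunningMean is hypothetical, and the source does not say how the averaging is implemented.

```python
class RunningMean:
    """Incrementally averages successive average-syllable-length estimates."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0

    def update(self, value: float) -> float:
        """Fold one new estimate into the mean and return the updated mean."""
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean
```

Each registration sentence would contribute one call to update, and the current mean would be used as L for the next cut-out.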

Based on the durations of the silent sections detected by the silent section detection unit 21, the phrase (bunsetsu) boundary detection unit 3 detects any silent section whose duration exceeds a predetermined length, regards that silent section as a phrase boundary, and outputs that fact.
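As a minimal sketch of this rule, a silent section can be kept as a phrase boundary only when it is longer than a set minimum. The function name, the (start, end) tuple format, and the example threshold are assumptions; the text only says "a predetermined length".

```python
from typing import List, Tuple

Section = Tuple[float, float]  # (start, end) in any consistent time unit


def phrase_boundaries(silent_sections: List[Section], min_silence: float) -> List[Section]:
    """Keep only the silent sections whose duration exceeds `min_silence`."""
    return [(start, end) for start, end in silent_sections if end - start > min_silence]


# With an assumed 150 ms minimum, only the 300 ms pause counts as a phrase boundary.
boundaries = phrase_boundaries([(0, 50), (400, 700), (900, 1000)], 150)
```

Shorter silences, such as the closure before a plosive, stay inside a phrase and are left to the syllable boundary detection.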

The syllable boundary detection unit 4 takes the speech divided into phrases by the phrase boundary detection unit 3 as its unit and outputs syllable boundary candidates using the feature parameters extracted by the voice analysis unit 1 (the interval between syllable boundaries is the syllable length). As shown in FIG. 2, the syllable boundary detection unit 4 sometimes cannot easily decide whether a syllable boundary exists in a certain time region T1 < t < T2. In such cases, the final determination of the syllable boundary is made by the syllable boundary selection unit 5.

The syllable boundary selection unit 5 determines the syllable boundaries by comparing the syllable lengths of the syllable boundary candidates detected by the syllable boundary detection unit 4 with the average syllable length L calculated by the speech rate calculation unit 23.

Now, as shown in FIG. 2, suppose that syllable boundaries are difficult to determine in a certain time region T1 < t < T2, so the syllable boundary detection unit 4 creates and outputs several syllable candidate sequences A, B, C, ... (where syllable candidate sequence A consists of a syllable candidates of lengths A(1), A(2), ..., A(a), and likewise for sequences B, C, ...). These syllable candidate sequences A, B, C, ... are input to the syllable boundary selection unit 5, which calculates the deviations DA, DB, DC, ... of each sequence from the average syllable length L.

Here, because the syllable at the beginning of a phrase and plosives are often shorter than the average syllable length L, the coefficient k1 is set so that 0 < k1 < 1, and because the syllable at the end of a phrase is often longer, the coefficient k2 is set so that k2 > 1.

From among the deviations DA, DB, DC, ... calculated as described above, the syllable boundary selection unit 5 selects the syllable candidate sequence that has the smallest deviation from the average syllable length L and that contains the same number of syllables as the registration sentence, and outputs it as the syllable sequence.
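The selection rule can be sketched as follows. The deviation equation itself is not reproduced in the text, so the form used here is an assumption: the sum of absolute differences between each candidate length and a target of k1 * L for the first syllable, k2 * L for the last, and L otherwise. The default values k1 = 0.8 and k2 = 1.2 are placeholders that merely satisfy 0 < k1 < 1 and k2 > 1, and the function names are hypothetical.

```python
from typing import List, Sequence


def deviation(lengths: Sequence[float], L: float, k1: float = 0.8, k2: float = 1.2) -> float:
    """Assumed deviation: summed |length - target| with scaled first/last targets."""
    last = len(lengths) - 1
    total = 0.0
    for i, length in enumerate(lengths):
        if i == 0:
            target = k1 * L   # phrase-initial syllables tend to be shorter
        elif i == last:
            target = k2 * L   # phrase-final syllables tend to be longer
        else:
            target = L
        total += abs(length - target)
    return total


def rank_candidates(candidates: List[Sequence[float]], L: float, n_syllables: int,
                    k1: float = 0.8, k2: float = 1.2) -> List[Sequence[float]]:
    """Drop sequences with the wrong syllable count, then sort by deviation."""
    eligible = [c for c in candidates if len(c) == n_syllables]
    return sorted(eligible, key=lambda c: deviation(c, L, k1, k2))
```

Returning the whole ranking rather than only the best sequence leaves the next-best candidates available if the first choice is rejected.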

The speech synthesis unit 9 divides the input speech stored in the voice output buffer memory 8 into the syllable sections selected by the syllable boundary selection unit 5 and echoes it back to the operator. From this echoed-back audio output, the operator judges whether the cut-out syllable sections are appropriate. If they are, the operator operates the key input means 10 to notify the voice registration unit 6, which registers the syllable standard patterns in the syllable standard pattern memory 7 on the basis of the cut-out syllables.

On the other hand, if the operator judges from the echoed-back audio output that the cut-out syllable sections are inappropriate, the operator operates the key input means 10 to notify the syllable boundary selection unit 5. The syllable boundary selection unit 5 then selects the syllable candidate sequence with the next smallest deviation from the average syllable length L, and the input speech is again divided into the selected syllable sections and echoed back to the operator in the same manner.

This operation is repeated until the syllable sections are cut out appropriately, and the syllable standard patterns are then registered on the basis of the speech input to be registered.
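The confirm-or-retry loop described above can be sketched as below. The two callbacks stand in for the speech synthesis unit 9 (playback) and the key input means 10 (operator decision); their names and signatures are hypothetical, as is the assumption that the candidates arrive already ordered by increasing deviation.

```python
from typing import Callable, Iterable, Optional, Sequence

Candidate = Sequence[float]


def confirm_segmentation(
    ranked_candidates: Iterable[Candidate],
    play_segments: Callable[[Candidate], None],
    operator_accepts: Callable[[Candidate], bool],
) -> Optional[Candidate]:
    """Echo back each candidate in rank order until the operator accepts one."""
    for candidate in ranked_candidates:
        play_segments(candidate)          # echo back, one syllable section at a time
        if operator_accepts(candidate):   # operator presses the "appropriate" key
            return candidate              # caller registers the standard patterns
    return None                           # every candidate was rejected
```

A None result means no candidate was accepted; the source does not describe what happens in that case, so handling it is left to the caller.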

<Effect> As explained above, according to the present invention, the speech rate of the registration speech is detected, syllable boundaries are selected on the basis of the detected speech rate to cut out the syllables, and the cut-out syllable sections are echoed back as audio to determine the syllable boundaries. Syllable boundaries can therefore be detected and determined accurately regardless of differences in the speech rate of the input speech caused by, for example, the characteristics of the operator, and syllable portions can be cut out more reliably. In addition, because the operator can listen to the echo-back of the speech divided into the cut-out syllable sections and confirm whether the syllables were cut out correctly, the registration of erroneous syllable standard patterns can be prevented at the voice registration stage.

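[Brief Description of the Drawings] FIG. 1 is a block diagram showing the overall configuration of a voice input device embodying the present invention, and FIG. 2 is a diagram showing an example of detected syllable boundaries.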

1: voice analysis unit; 2: speech rate detection unit; 21: silent section detection unit; 22: voiced section detection unit; 23: speech rate calculation unit; 3: phrase boundary detection unit; 4: syllable boundary detection unit; 5: syllable boundary selection unit; 6: voice registration unit; 7: syllable standard pattern memory; 8: voice output buffer memory; 9: speech synthesis unit; 10: key input means.

Agent: Patent Attorney 福士愛彦 (and two others)

Claims

[Claims]

1. A syllable standard pattern extraction method characterized in that the speech rate of registration speech is detected, syllable boundaries are selected from among syllable boundary candidates detected by a syllable boundary detection unit on the basis of the detected speech rate, syllables are cut out, and the syllable boundaries are determined by echoing back the cut-out syllable sections as audio.