JPS6347799A - Prosody control system

Info

Publication number: JPS6347799A
Application number: JP61191395A
Authority: JP
Inventors: 哲也酒寄; 佐々部　昭一; 博雄北川
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1986-08-15
Filing date: 1986-08-15
Publication date: 1988-02-29
Anticipated expiration: 2013-05-13
Also published as: JP2749804B2

Abstract

(57) [Abstract] This publication contains application data from before electronic filing, so no abstract data is recorded.

Description

DETAILED DESCRIPTION OF THE INVENTION

Technical Field

The present invention relates to a prosody control method for speech synthesis by rule.

Prior Art

To add natural prosody in speech synthesis, prosody control rules that govern pitch, amplitude, rhythm, and so on are essential. A method that controls the quantities constituting prosody by multivariate statistical analysis has already been proposed, but it has the following drawbacks.

(a) The quantity constituting the prosody to be controlled is expressed as a linear sum of the individual factors, and no other model can be handled.

(b) Qualitative and quantitative parameters cannot be handled at the same time.

Purpose

The present invention was made in view of the circumstances described above. In particular, its purpose is to generate highly natural prosodic patterns in speech synthesis by rule.

Configuration

To achieve the above purpose, the present invention is a rule-based speech synthesis method in which parameter sequences of speech segments prepared in advance are read out according to an input character string, connected by concatenation rules, and given prosody by prosody rules, characterized in that various parameters are processed by multivariate statistical analysis to obtain optimal control values. The invention is described below with reference to embodiments.

FIGS. 1 and 2 are diagrams for explaining embodiments of the prosody control method according to the present invention. The present invention was made to remedy the drawbacks of the prior art described above. To that end, qualitative parameters, converted to numerical values by quantification type I analysis, are added to the various quantitative parameters thought to affect the quantities constituting prosody; a model is constructed as an arbitrary function of these; and prosody is controlled by that model.

In FIG. 1, 11 is the quantitative parameter section, 12 is the qualitative parameter section, 13 is the quantification type I analysis section, 14 is the arbitrary function section, and 15 is the section for the quantity constituting prosody. With Z_i the predicted value and z_i the measured value of the quantity constituting prosody, the quantities {a_jk} and the coefficients of f are determined so as to satisfy equations (1), (2), and (3) below.

$$ Z_i = f(x_1, x_2, \ldots, x_{m_1}, y_1, y_2, \ldots, y_{m_2}) \qquad (1) $$

(2) [equation not legible in the source text]

$$ \sum_{i=1}^{n} (Z_i - z_i)^2 \rightarrow \min \qquad (3) $$

Here, $\delta_i(jk)$ is a function that takes the value 1 when individual $i$ responds to category $k$ of factor item $j$ and 0 otherwise; $x_j$ are the $m_1$ quantities obtained by quantification type I analysis; $y_l$ are the $m_2$ quantitative parameters that contribute directly to $Z$; $R$ is the number of items; $c_j$ is the number of categories of item $j$; $a_{jk}$ is the quantity assigned to a category; and $n$ is the number of data points.
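
Equation (2) did not survive in the text above. The surviving symbol definitions ($\delta_i(jk)$, $R$, $c_j$, $a_{jk}$) match the standard form of quantification type I analysis, in which a quantity is the sum of the scores of the categories an individual responds to. Assuming that standard form, and writing it for a single quantity $x$ evaluated for individual $i$, equation (2) would read roughly as follows. This is a reconstruction under that assumption, not the verified text of the patent.

```latex
x_i = \sum_{j=1}^{R} \sum_{k=1}^{c_j} \delta_i(jk)\, a_{jk}
```

Because each individual responds to exactly one category per item, exactly $R$ terms of the double sum are nonzero for any given $i$.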

FIG. 2 is a diagram for explaining one embodiment of the present invention. In the figure, 21 is the reciprocal of the mora count section, 22 is the coefficient b section, 23 is the coefficient c section, 24 is the adder, and 25 is the section for qualitative parameters such as the preceding and following phonemes, long vowels, and geminate consonants.

26 is the quantification type I analysis section, and 27 is the phoneme duration section. Here we consider predicting phoneme duration as the quantity Z constituting prosody. As the model, consider a first-order linear combination of two terms: the quantity x_1, obtained by quantification type I analysis with qualitative parameters as factor items (the preceding and following phonemes, whether the phoneme is a special phoneme such as a long vowel or geminate consonant, and so on), and y_1, the reciprocal of the number of morae in the utterance unit. The function f is then expressed as

$$ f(x_1, y_1) = x_1 + b\,y_1 + c \qquad (4) $$

and equations (1) and (2) above take the form of equation (5) [not legible in the source text]. By the principle of least squares, the unknown constants {a_jk}, b, and c can be obtained easily by solving the simultaneous equations that result from partially differentiating equation (5) with respect to each constant and setting the result to 0.
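
The least-squares step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the toy data, the function names (`solve_linear`, `design_row`, `fit`), and the category layout are all invented here. It also assumes that the first category of each item is dropped as a reference (score fixed at 0), a common way to keep the simultaneous equations non-singular that the source does not mention.

```python
# Minimal sketch: fit category scores a_jk, coefficient b, and constant c
# for Z = x1 + b * (1 / morae) + c by solving the normal equations.
# All data and names below are hypothetical.

def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting.
    Assumes A is square and non-singular."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def design_row(categories, n_categories, morae):
    """One row of the design matrix: dummy variables delta_i(jk),
    then y1 = 1 / morae, then 1 for the constant c."""
    row = []
    for cat, n_cat in zip(categories, n_categories):
        # Category 0 of each item is the reference and gets no column (assumption).
        row.extend(1.0 if cat == k else 0.0 for k in range(1, n_cat))
    row.append(1.0 / morae)
    row.append(1.0)
    return row

def fit(samples, n_categories):
    """Least squares: set the partial derivatives of sum (Z_i - z_i)^2 to zero,
    which gives the normal equations (X^T X) theta = X^T z."""
    X = [design_row(cats, n_categories, morae) for cats, morae, _ in samples]
    z = [duration for _, _, duration in samples]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xtz = [sum(r[i] * zi for r, zi in zip(X, z)) for i in range(p)]
    return solve_linear(XtX, Xtz)

# Hypothetical toy data generated from known values so the fit can be checked.
N_CATEGORIES = [3, 2]                          # two factor items with 3 and 2 categories
TRUE_SCORES = [[0.0, 10.0, -5.0], [0.0, 20.0]]
TRUE_B, TRUE_C = 60.0, 80.0
samples = [
    ((k0, k1), morae,
     TRUE_SCORES[0][k0] + TRUE_SCORES[1][k1] + TRUE_B / morae + TRUE_C)
    for k0 in range(3) for k1 in range(2) for morae in (2, 4, 8)
]

# theta holds [a_01, a_02, a_11, b, c] in that order.
theta = fit(samples, N_CATEGORIES)
```

Because the unknowns enter the model linearly, the normal equations are a linear system and a direct solver is enough. With real, noisy measurements the recovered values would only approximate the underlying ones.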

With the model constructed in this way, the phoneme duration can be obtained by inputting the qualitative parameters of the target phoneme and the mora count.
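
Once the constants are known, prediction is a lookup and a sum. The sketch below assumes the model Z = x_1 + b·y_1 + c described above. The category names, score values, `B`, `C`, and the function name `predict_duration` are all hypothetical, chosen only to make the arithmetic visible.

```python
# Minimal sketch: predict a phoneme duration from already-fitted constants.
# Every value here is invented for illustration (units: milliseconds).

CATEGORY_SCORES = {
    "preceding": {"vowel": 0.0, "plosive": 8.0, "nasal": -4.0},
    "following": {"vowel": 0.0, "plosive": 5.0, "pause": 12.0},
    "special": {"none": 0.0, "long_vowel": 30.0, "geminate": 25.0},
}
B = 60.0   # coefficient on the reciprocal of the mora count
C = 70.0   # constant term

def predict_duration(qualitative, morae):
    """Return x1 + B * y1 + C, where x1 sums the scores of the selected
    categories and y1 is the reciprocal of the mora count."""
    x1 = sum(CATEGORY_SCORES[item][cat] for item, cat in qualitative.items())
    y1 = 1.0 / morae
    return x1 + B * y1 + C

duration = predict_duration(
    {"preceding": "plosive", "following": "vowel", "special": "long_vowel"},
    morae=4,
)
# x1 = 8 + 0 + 30 = 38, B * y1 = 60 / 4 = 15, so duration = 38 + 15 + 70 = 123.0
```

Using the reciprocal of the mora count means the contribution of the quantitative term shrinks as the utterance unit gets longer, which is consistent with phonemes being shorter in longer units.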

Here the mora count was used as the quantitative parameter that contributes directly to Z, but the mora position, the average power of the phoneme, the pitch, and so on can also be used. The function f can likewise take various forms, such as a quadratic function of y or a product of x and y. However, if the simultaneous equations that determine the unknown constants become nonlinear, a numerical analysis method must be introduced. Other elements constituting prosody, such as pitch and amplitude, can also be predicted as Z in the same way.
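
The distinction between linear and nonlinear cases can be made concrete with two example forms of f. Both forms below are illustrative choices, not forms given in the source, and $x_i$ is assumed to be the usual sum of category scores $\sum_j \sum_k \delta_i(jk)\, a_{jk}$.

```latex
% Case A: quadratic in y. The unknowns a_{jk}, b, c, d still enter linearly.
Z_i = x_i + b\,y_i + c\,y_i^2 + d
\quad\Longrightarrow\quad
\frac{\partial}{\partial b} \sum_{i=1}^{n} (Z_i - z_i)^2
  = 2 \sum_{i=1}^{n} (Z_i - z_i)\, y_i = 0

% Case B: an added product term. The unknowns b and a_{jk} now multiply each other.
Z_i = x_i + b\,x_i\,y_i + c
\quad\Longrightarrow\quad
\frac{\partial}{\partial b} \sum_{i=1}^{n} (Z_i - z_i)^2
  = 2 \sum_{i=1}^{n} (Z_i - z_i)\, x_i\, y_i = 0
```

In case A every condition is linear in the unknowns, so the constants follow from a single linear solve. In case B the factor $x_i y_i$ itself contains the unknown $a_{jk}$, and $Z_i$ contains the product $b\,a_{jk}$, so the conditions are nonlinear and an iterative numerical method (for example Gauss-Newton) would be needed.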

Effect

As is clear from the above description, according to the present invention, qualitative parameters are quantified using quantification type I analysis, a model is constructed over the quantitative parameters together with these quantified values, and prosody can be controlled so as to obtain highly natural synthesized speech.

[Brief Description of the Drawings]

FIGS. 1 and 2 are block diagrams for explaining embodiments of the prosody control method according to the present invention. 11: quantitative parameter section; 12: qualitative parameter section; 13: quantification type I analysis section; 14: arbitrary function section; 15: section for the quantity constituting prosody; 21: reciprocal of mora count section; 22: coefficient b section; 23: coefficient c section; 24: adder; 25: qualitative parameter section; 26: quantification type I analysis section; 27: phoneme duration section.

Claims

(1) A prosody control method for rule-based speech synthesis, in which parameter sequences of speech segments prepared in advance are read out according to an input character string, connected by concatenation rules, and given prosody by prosody rules, characterized in that various parameters are processed by multivariate statistical analysis to obtain optimal control values.

(2) The prosody control method according to claim (1), characterized in that a quantity constituting prosody (phoneme duration, pitch pattern, amplitude, etc.) is expressed as a function of several quantitative parameters; at least one of these is taken as an external criterion; a prediction model is set up by performing quantification type I analysis with the qualitative parameters considered to contribute to that quantitative parameter as factor items; and the quantity constituting prosody is predicted by that model.