JP2941168B2

JP2941168B2 - Speech synthesis system

Info

Publication number: JP2941168B2
Application number: JP6127423A
Authority: JP
Inventors: 茂藤尾; 芳典匂坂
Original assignee: Ei Tei Aaru Onsei Honyaku Tsushin Kenkyusho Kk
Current assignee: Ei Tei Aaru Onsei Honyaku Tsushin Kenkyusho Kk
Priority date: 1994-06-09
Filing date: 1994-06-09
Publication date: 1999-08-25
Anticipated expiration: 2014-08-25
Also published as: JPH07334188A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Industrial Field of Application] The present invention relates to a speech synthesis system that, in order to obtain natural synthesized speech, detects prosodic phrase boundaries according to a stochastic context-free grammar and controls the fundamental frequency accordingly.

[0002]

[Prior Art] To obtain natural synthesized speech, it is important to estimate prosodic phrase boundaries, that is, the boundaries at which the fundamental frequency is reset. Research on estimating prosodic phrase boundaries has therefore been under way.

[0003] FIG. 3 is a graph showing how the fundamental frequency changes over time when the fundamental frequency Fo is not reset in an input speech signal, while FIG. 4 is a graph showing how it changes over time when the fundamental frequency Fo is reset. When a sentence consisting of a sequence of words is uttered without any reset of Fo, the fundamental frequency falls steadily as the utterance proceeds, as shown in FIG. 3. When Fo is reset at a prosodic phrase boundary in the sentence, on the other hand, the fundamental frequency does not keep falling as the utterance proceeds but rises at the boundary, as shown in FIG. 4; that is, the fundamental frequency Fo is reset.

[0004] For example, a study of prosody control that includes prosodic phrase boundary estimation, aimed at synthesizing natural speech (hereinafter referred to as the conventional example), is disclosed in Kazuo Hakoda et al., "Study of Tone Coupling Type Derivation Rule for Sentence Speech", IEICE Technical Report, SP89-5, pp. 33-38, May 1989. The conventional example estimates prosodic phrase boundaries from information such as dependency relations, using heuristic prosody control rules based on statistical analysis, that is, rules created empirically on the basis of human intuition.

[0005]

[Problems to Be Solved by the Invention] However, dependency relations reflect syntactic structure and the semantic relations between words. They are difficult to formalize accurately and must be supplied by hand. Obtaining more natural synthesized speech therefore requires an enormous amount of information for speech synthesis, which makes the manual processing cumbersome.

[0006] An object of the present invention is to solve the above problem and to provide a speech synthesis system that needs less input information for speech synthesis than the conventional example and still produces natural synthesized speech.

[0007]

[Means for Solving the Problems] The speech synthesis system according to claim 1 of the present invention is a speech synthesis system comprising speech synthesis means for controlling a fundamental frequency on the basis of an input word string and for synthesizing and outputting speech for that word string. The system is characterized in that it comprises first learning means for training a predetermined stochastic context-free grammar, in accordance with a predetermined algorithm, on a corpus prepared in advance and bracketed with data on the positions of fundamental frequency resets, thereby generating a stochastic context-free grammar trained so as to contain information on the structure of prosodic phrases; and in that the speech synthesis means comprises control means for detecting, in the input word string, prosodic phrase boundaries, which are the boundaries at which the fundamental frequency is reset, in accordance with prosody control rules that include a rule using the stochastic context-free grammar generated by the first learning means, and for controlling the fundamental frequency accordingly.

The speech synthesis system according to claim 2 is the speech synthesis system according to claim 1, further comprising second learning means for training a stochastic context-free grammar having predetermined initial values, in accordance with a predetermined algorithm, on a corpus prepared in advance and bracketed with dependency structure, thereby generating a trained stochastic context-free grammar, wherein the first learning means retrains the stochastic context-free grammar generated by the second learning means.

The speech synthesis system according to claim 3 is the speech synthesis system according to claim 1 or 2, wherein the algorithm is the inside-outside algorithm.

[0008]

[Operation] In the speech synthesis system configured as described above, the first learning means trains a predetermined stochastic context-free grammar, in accordance with a predetermined algorithm, on a corpus prepared in advance and bracketed with data on the positions of fundamental frequency resets, and thereby generates a stochastic context-free grammar trained so as to contain information on the structure of prosodic phrases. The speech synthesis means then controls the fundamental frequency on the basis of the input word string and synthesizes and outputs speech for that word string. In doing so, the control means of the speech synthesis means detects prosodic phrase boundaries, which are the boundaries at which the fundamental frequency is reset, in the input word string in accordance with prosody control rules that include a rule using the stochastic context-free grammar generated by the first learning means, and controls the fundamental frequency accordingly. Prosodic phrase boundaries can thus be detected more accurately, and more natural speech can be synthesized and output.

The system may further comprise second learning means for training a stochastic context-free grammar having predetermined initial values, in accordance with a predetermined algorithm, on a corpus prepared in advance and bracketed with dependency structure, thereby generating a trained stochastic context-free grammar; the first learning means may then retrain the stochastic context-free grammar generated by the second learning means. Prosodic phrase boundaries can then be detected still more accurately, and even more natural speech can be synthesized and output. The algorithm is preferably the inside-outside algorithm.

[0009]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. As noted above, dependency structure, the words before and after a boundary, and similar factors have conventionally been used to estimate prosodic phrase boundaries. Dependency structure, an important factor among these, reflects syntactic structure and the semantic relations between words, and is difficult to formalize accurately. In this embodiment, by contrast, a stochastic context-free grammar (SCFG) is trained on the basis of dependency structure supplied in advance by humans and of actual fundamental frequency reset characteristics; prosodic phrase boundaries are detected and estimated according to the prosody control rules obtained in this way, and the fundamental frequency is controlled to perform speech synthesis.

Specifically, the SCFG probability learning unit 30 first learns the structure of prosodic phrases by applying the inside-outside algorithm to a stochastic context-free grammar (SCFG) 31 having predetermined initial values. From the trained stochastic context-free grammar (SCFG) 32, the prosodic phrase boundary estimation rule creation unit 34 then creates rules for estimating prosodic phrase boundaries, for example by using a neural network, and these rules are included in the prosody control rules 33. The speech synthesis control unit 10 then estimates prosodic phrase boundaries from parameters and from the words before and after each boundary on the basis of the prosody control rules 33, and performs speech synthesis processing.

[0010] The inside-outside algorithm used in the SCFG probability learning unit 30 of this embodiment, proposed by K. Lari et al. in 1990, is described below (see, for example, K. Lari et al., "The estimation of stochastic context-free grammars using the Inside-Outside Algorithm", Computer Speech and Language, Vol. 4, pp. 35-56, Academic Press Limited, 1990). The inside-outside algorithm assumes that the input source can be modeled as a context-free hidden Markov process, as proposed by Baker in 1979. The algorithm allows the estimated grammar to have an arbitrary degree of ambiguity. Now let $O_{\mathrm{m}} = O_1, O_2, \ldots, O_T$ be an observation sequence generated by a stochastic context-free grammar (SCFG) G whose rules have the forms given in Equation 1.

[0011]

[Equation 1]
$$i \to jk, \quad \text{and} \quad i \to m$$

[0012] Here i, j, and k are integers, each uniquely identifying one nonterminal symbol, and m is an integer identifying a terminal symbol. The parameters describing this stochastic context-free grammar (SCFG) are collected in two matrices, A and B, whose elements are defined as follows.

[0013]

[Equation 2]
$$a[i,j,k] = P(i \Rightarrow jk \mid G)$$

[Equation 3]
$$b[i,m] = P(i \Rightarrow m \mid G)$$

[0014] Thus a[i,j,k] is the probability that nonterminal i produces the pair of nonterminals j and k. Likewise, b[i,m] is the probability that nonterminal i produces the single terminal m. Any context-free grammar, as proposed by Chomsky in 1956, can be converted into Chomsky normal form, proposed by Chomsky in 1959, so these parameters are sufficient to describe any stochastic context-free language. For consistency, the constraint expressed by Equation 4 must always be satisfied. In this specification, a sum Σ over, for example, i = 1 to n is written with a subscript and a superscript as $\sum_{i=1}^{n}$.

[0015]

[Equation 4]
$$\sum_{j,k} a[i,j,k] + \sum_{m} b[i,m] = 1, \quad \text{for all } i$$
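The constraint in Equation 4 can be checked mechanically. The following is a minimal sketch, assuming the A and B matrices are stored as Python dictionaries keyed by rule; the function name `is_consistent`, the dictionary layout, and the toy grammar are illustrative assumptions, not part of the patent.

```python
def is_consistent(a, b, tol=1e-9):
    """Check Equation 4 for every nonterminal that has at least one rule.

    a: dict mapping (i, j, k) -> probability of the rule i -> j k
    b: dict mapping (i, m)    -> probability of the rule i -> m
    """
    totals = {}
    for (i, _j, _k), prob in a.items():
        totals[i] = totals.get(i, 0.0) + prob
    for (i, _m), prob in b.items():
        totals[i] = totals.get(i, 0.0) + prob
    # Each nonterminal's rule probabilities must sum to 1 (within tolerance).
    return all(abs(total - 1.0) <= tol for total in totals.values())


# Hypothetical toy grammar with a single nonterminal 0:
#   0 -> 0 0 with probability 0.3, and 0 -> "x" with probability 0.7
a = {(0, 0, 0): 0.3}
b = {(0, "x"): 0.7}
print(is_consistent(a, b))
```

A nonterminal that appears only on the right-hand side of rules has no entry in `totals` and is not checked here; a stricter version would take the full list of nonterminals as an argument.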

[0016] Put simply, this constraint means that every nonterminal must produce either a pair of nonterminals or a single terminal. Applying a stochastic context-free grammar (SCFG) requires solving two specific problems: the recognition problem and the training problem. The recognition problem concerns computing the probability that the start symbol S generates the observation sequence O given the grammar G, as shown in Equation 5. In Equation 5, the m appended to O indicates a matrix; the same applies below.

[0017]

[Equation 5]
$$P(S \Rightarrow^{*} O_{\mathrm{m}} \mid G)$$

[0018] Here * denotes a derivation sequence of one or more steps. In the original paper the * is placed above the symbol ⇒; in this specification it is written after the arrow, as ⇒*, because of the constraints of online filing. The training problem concerns determining a set of grammar rules G given training sequences $O^{(1)}, O^{(2)}, \ldots, O^{(Q)}$. By analogy with the forward probability (α) and backward probability (β) of conventional Markov model algorithms, an inside probability (e) and an outside probability (f) are defined to simplify the analysis of stochastic context-free Markov grammars. The quantity e(s,t,i) is defined as the probability that nonterminal i generates the observation sequence O(s), …, O(t), as follows.

[0019]

[Equation 6]
$$e(s,t,i) = P(i \Rightarrow^{*} O(s) \ldots O(t) \mid G)$$

[0020] In deriving an iterative procedure for computing the quantity e, the following two cases arise.

[0021] (A) Case 1 (s = t): Only one observation is emitted, so a rule of the form i → m applies, and the quantity is given by the following equation.

[0022]

[Equation 7]
$$e(s,s,i) = P(i \Rightarrow O(s) \mid G) = b[i, O(s)]$$

[0023] (B) Case 2 (s ≠ t): In this case more than one observation is involved, so rules of the form i → jk must be applied. Referring to FIG. 7, which illustrates the computation of the inside probability, the quantity e(s,t,i) is clearly given by the following equation.

[0024]

[Equation 8]
$$e(s,t,i) = \sum_{j,k} \sum_{r=s}^{t-1} a[i,j,k]\, e(s,r,j)\, e(r+1,t,k), \quad \text{for all } i$$
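Equations 7 and 8 together describe a bottom-up dynamic program over span lengths. The following is a minimal sketch, assuming the grammar is stored in dictionaries keyed by rule and positions are 1-based as in the text; the function name `inside`, the dictionary layout, and the toy grammar are illustrative assumptions.

```python
from collections import defaultdict


def inside(obs, a, b):
    """Return inside probabilities e[(s, t, i)] for 1-based spans s <= t.

    obs: list of terminal symbols O(1)..O(T)
    a:   dict (i, j, k) -> probability of the rule i -> j k
    b:   dict (i, m)    -> probability of the rule i -> m
    """
    T = len(obs)
    e = defaultdict(float)
    # Equation 7: a span of length 1 is produced by a terminal rule i -> m.
    for s in range(1, T + 1):
        for (i, m), prob in b.items():
            if m == obs[s - 1]:
                e[(s, s, i)] = prob
    # Equation 8: build longer spans from shorter ones, shortest first.
    for length in range(2, T + 1):
        for s in range(1, T - length + 2):
            t = s + length - 1
            for (i, j, k), prob in a.items():
                for r in range(s, t):  # split point: j covers s..r, k covers r+1..t
                    e[(s, t, i)] += prob * e[(s, r, j)] * e[(r + 1, t, k)]
    return e


# Hypothetical toy grammar with one nonterminal 0: 0 -> 0 0 (0.3) | "x" (0.7)
a = {(0, 0, 0): 0.3}
b = {(0, "x"): 0.7}
e = inside(["x", "x", "x"], a, b)
print(e[(1, 3, 0)])  # probability that nonterminal 0 generates "x x x"
```

For this toy grammar the three-word string has two parses, each with probability 0.3 × 0.3 × 0.7³, so the printed value is approximately 0.06174.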

[0025] The quantity e is therefore computed recursively: e is first determined for all sequences of length 1, then for all sequences of length 2, and so on. Next, the outside probability is defined by the following equation.

[0026]

[Equation 9]
$$f(s,t,i) = P(S \Rightarrow^{*} O(1) \ldots O(s-1),\; i,\; O(t+1) \ldots O(T) \mid G)$$

[0027] Here f(s,t,i) is the probability that i is generated in the rewriting process and that the strings not dominated by i are O(1)…O(s−1) to its left and O(t+1)…O(T) to its right (see FIG. 8). In this case, as illustrated in FIG. 9, the nonterminal i can arise in one of two possible configurations, j → ik or j → ki. The outside probability can therefore be expressed by the following equations.

[0028]

[Equation 10]
$$f(s,t,i) = \sum_{j,k} \left[ \sum_{r=1}^{s-1} f(r,t,j)\, a[j,k,i]\, e(r,s-1,k) + \sum_{r=t+1}^{T} f(s,r,j)\, a[j,i,k]\, e(t+1,r,k) \right], \quad \text{and}$$

[Equation 11]
$$f(1,T,i) = \begin{cases} 1 & \text{if } i = S \\ 0 & \text{otherwise} \end{cases}$$
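Equations 10 and 11 can be sketched as a top-down pass that starts from the whole sentence and works toward shorter spans, reusing the inside probabilities of Equations 7 and 8. The code below is an illustrative sketch only; the function names, the dictionary layout, and the toy grammar are assumptions.

```python
from collections import defaultdict


def inside(obs, a, b):
    """Inside probabilities e[(s, t, i)] (Equations 7 and 8), 1-based spans."""
    T = len(obs)
    e = defaultdict(float)
    for s in range(1, T + 1):
        for (i, m), prob in b.items():
            if m == obs[s - 1]:
                e[(s, s, i)] = prob
    for length in range(2, T + 1):
        for s in range(1, T - length + 2):
            t = s + length - 1
            for (i, j, k), prob in a.items():
                for r in range(s, t):
                    e[(s, t, i)] += prob * e[(s, r, j)] * e[(r + 1, t, k)]
    return e


def outside(obs, a, e, start):
    """Outside probabilities f[(s, t, i)] (Equations 10 and 11)."""
    T = len(obs)
    f = defaultdict(float)
    f[(1, T, start)] = 1.0  # Equation 11
    # Longer spans first: f(s, t, .) only depends on spans that contain (s, t).
    for length in range(T - 1, 0, -1):
        for s in range(1, T - length + 2):
            t = s + length - 1
            for (j, left, right), prob in a.items():
                # i is the right child (j -> k i): sibling k covers r..s-1.
                for r in range(1, s):
                    f[(s, t, right)] += f[(r, t, j)] * prob * e[(r, s - 1, left)]
                # i is the left child (j -> i k): sibling k covers t+1..r.
                for r in range(t + 1, T + 1):
                    f[(s, t, left)] += f[(s, r, j)] * prob * e[(t + 1, r, right)]
    return f


# Hypothetical toy grammar with one nonterminal 0: 0 -> 0 0 (0.3) | "x" (0.7)
a = {(0, 0, 0): 0.3}
b = {(0, "x"): 0.7}
obs = ["x", "x", "x"]
e = inside(obs, a, b)
f = outside(obs, a, e, start=0)
```

As a sanity check on this toy example, the product e(s,s,0) · f(s,s,0) is the same for every position s and equals the sentence probability e(1,T,0).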

[0029] After the inside probabilities have been computed bottom-up, the outside probabilities are computed top-down. For the recognition process, the values e and f can be used to compute the probability of the sentence as follows.

[0030]

[Equation 12]
$$P(S \Rightarrow^{*} O_{\mathrm{m}} \mid G) = \sum_{i} e(s,t,i)\, f(s,t,i)$$

[0031] Equation 12 holds for any s such that s ≤ t. Setting s = 1 and t = T in Equation 12 gives the following equation.

[0032]

[Equation 13]
$$P(S \Rightarrow^{*} O_{\mathrm{m}} \mid G) = \sum_{i} e(1,T,i)\, f(1,T,i) = e(1,T,S)$$

[0033] The left-hand side of Equation 13, $P(S \Rightarrow^{*} O \mid G)$, can therefore be computed from the inside probabilities alone. A similar expression in terms of the outside probabilities is obtained by setting s = t in Equation 12.

[0034]

[Equation 14]
$$P(S \Rightarrow^{*} O_{\mathrm{m}} \mid G) = \sum_{i} e(s,s,i)\, f(s,s,i) = \sum_{i} b[i, O(s)]\, f(s,s,i)$$

[0035] The problem of training a stochastic context-free grammar (SCFG) is more complex. The discussion starts from the product given by the following equation.

[0036]

[Equation 15]
$$e(s,t,i)\, f(s,t,i) = P(S \Rightarrow^{*} O_{\mathrm{m}},\; i \Rightarrow^{*} O(s) \ldots O(t) \mid G) = P(S \Rightarrow^{*} O_{\mathrm{m}} \mid G) \cdot P(i \Rightarrow^{*} O(s) \ldots O(t) \mid S \Rightarrow^{*} O_{\mathrm{m}}, G)$$

[0037] The last step in Equation 15 follows from applying Bayes' theorem. Now let

[0038]

[Equation 16]
$$P = P(S \Rightarrow^{*} O_{\mathrm{m}} \mid G)$$

[0039] Equation 15 then gives the following equation.

[0040]

[Equation 17]
$$P(i \Rightarrow^{*} O(s) \ldots O(t) \mid S \Rightarrow^{*} O_{\mathrm{m}}, G) = \frac{1}{P}\, e(s,t,i)\, f(s,t,i)$$

[0041] The following equation is therefore obtained.

[0042]

[Equation 18]
$$P(i \text{ is used in the derivation}) = \sum_{s=1}^{T} \sum_{t=s}^{T} \frac{1}{P}\, e(s,t,i)\, f(s,t,i)$$

[0043] Now consider the case where the rule i → jk is applied in a derivation. Substituting Equation 8 into Equation 17 gives Equation 19.

[0044]

[Equation 19]
$$P(i \Rightarrow jk \Rightarrow^{*} O(s) \ldots O(t) \mid S \Rightarrow^{*} O_{\mathrm{m}}, G) = \frac{1}{P} \sum_{r=s}^{t-1} a[i,j,k]\, e(s,r,j)\, e(r+1,t,k)\, f(s,t,i), \quad \text{for all } j, k \text{ and } t > s$$

[0045] Equation 20 therefore follows from Equations 18 and 19.

[0046]

[Equation 20]
$$P(i \to jk,\; i \text{ is used}) = \sum_{s=1}^{T-1} \sum_{t=s+1}^{T} \frac{1}{P} \sum_{r=s}^{t-1} a[i,j,k]\, e(s,r,j)\, e(r+1,t,k)\, f(s,t,i)$$

[0047] Next, the definition given in Equation 21 is used.

[0048]

[Equation 21]
$$a[i,j,k] = P(i \to jk \mid i \text{ is used}) = \frac{P(i \to jk,\; i \text{ is used})}{P(i \text{ is used})}$$

[0049] The re-estimation formula for a[i,j,k] is therefore given by Equation 22, which follows from Equations 18 and 20.

[0050]

[Equation 22]
$$\hat{a}[i,j,k] = \frac{\dfrac{1}{P} \displaystyle\sum_{s=1}^{T-1} \sum_{t=s+1}^{T} \sum_{r=s}^{t-1} a[i,j,k]\, e(s,r,j)\, e(r+1,t,k)\, f(s,t,i)}{\dfrac{1}{P} \displaystyle\sum_{s=1}^{T} \sum_{t=s}^{T} e(s,t,i)\, f(s,t,i)}, \quad \text{for all } i, j, k$$

[0051] Here the h in ah[i,j,k] stands in for the hat symbol placed over the symbol a (that is, $\hat{a}$), which the plain-text notation of this specification cannot represent; h is used in the same way below. By a similar argument, the re-estimation formula for b[i,m] can be expressed by Equation 23.

[0052]

[Equation 23]
$$\hat{b}[i,m] = \frac{\dfrac{1}{P} \displaystyle\sum_{t:\, O(t)=m} e(t,t,i)\, f(t,t,i)}{\dfrac{1}{P} \displaystyle\sum_{s=1}^{T} \sum_{t=s}^{T} e(s,t,i)\, f(s,t,i)}$$

[0053] In practice, a single observation is not enough to estimate the parameters of a stochastic context-free grammar (SCFG) accurately. The equations above must therefore be extended to handle an arbitrary number of observations. Assume that a set of Q observations, expressed by Equation 24, is available.

[0054]

[Equation 24]
$$O \equiv [O^{(1)}, O^{(2)}, \ldots, O^{(Q)}]$$

[0055] Further, define the quantities given in Equations 25 and 26.

[0056]

[Equation 25]
$$w_q(s,t,i,j,k) = \frac{1}{P_q} \sum_{r=s}^{t-1} a[i,j,k]\, e_q(s,r,j)\, e_q(r+1,t,k)\, f_q(s,t,i)$$

[Equation 26]
$$v_q(s,t,i) = \frac{1}{P_q}\, e_q(s,t,i)\, f_q(s,t,i)$$

[0057] Assuming that the observations are independent, Equations 27 and 28 are obtained by summing the contributions of each $w_q$ and $v_q$ to the numerators and denominators of Equations 22 and 23.

[0058]

[Equation 27]
$$\hat{a}[i,j,k] = \frac{\displaystyle\sum_{q=1}^{Q} \sum_{s=1}^{T_q-1} \sum_{t=s+1}^{T_q} w_q(s,t,i,j,k)}{\displaystyle\sum_{q=1}^{Q} \sum_{s=1}^{T_q} \sum_{t=s}^{T_q} v_q(s,t,i)}$$

[Equation 28]
$$\hat{b}[i,m] = \frac{\displaystyle\sum_{q=1}^{Q} \sum_{t:\, O(t)=m} v_q(t,t,i)}{\displaystyle\sum_{q=1}^{Q} \sum_{s=1}^{T_q} \sum_{t=s}^{T_q} v_q(s,t,i)}$$

[0059] The inside-outside algorithm uses Equations 13, 27, and 28 in the following iterative procedure.
(1) Choose suitable initial values for the A matrix and the B matrix, subject to the constraint defined by Equation 4.
(2) Repeat the following computation until the change in P becomes smaller than a predetermined threshold: A = ... {Equation 27}; B = ... {Equation 28}; P = ... {Equation 13}.
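The iterative procedure can be sketched in code. To keep the example short, the sketch below re-estimates from a single observation sequence, using Equations 22 and 23 rather than the multi-sequence forms of Equations 27 and 28; all function names, the dictionary layout, the threshold value, and the toy grammar are illustrative assumptions rather than the patent's implementation.

```python
from collections import defaultdict


def inside(obs, a, b):
    """Inside probabilities (Equations 7 and 8), 1-based spans."""
    T, e = len(obs), defaultdict(float)
    for s in range(1, T + 1):
        for (i, m), prob in b.items():
            if m == obs[s - 1]:
                e[(s, s, i)] = prob
    for length in range(2, T + 1):
        for s in range(1, T - length + 2):
            t = s + length - 1
            for (i, j, k), prob in a.items():
                for r in range(s, t):
                    e[(s, t, i)] += prob * e[(s, r, j)] * e[(r + 1, t, k)]
    return e


def outside(obs, a, e, start):
    """Outside probabilities (Equations 10 and 11)."""
    T, f = len(obs), defaultdict(float)
    f[(1, T, start)] = 1.0
    for length in range(T - 1, 0, -1):
        for s in range(1, T - length + 2):
            t = s + length - 1
            for (j, left, right), prob in a.items():
                for r in range(1, s):
                    f[(s, t, right)] += f[(r, t, j)] * prob * e[(r, s - 1, left)]
                for r in range(t + 1, T + 1):
                    f[(s, t, left)] += f[(s, r, j)] * prob * e[(t + 1, r, right)]
    return f


def reestimate(obs, a, b, start):
    """One re-estimation step; returns (new_a, new_b, P under the old parameters)."""
    T = len(obs)
    e = inside(obs, a, b)
    f = outside(obs, a, e, start)
    P = e[(1, T, start)]  # Equation 13
    # Shared denominator of Equations 22 and 23 (the factor 1/P cancels).
    used = defaultdict(float)
    for i in {i for (i, _, _) in a} | {i for (i, _) in b}:
        for s in range(1, T + 1):
            for t in range(s, T + 1):
                used[i] += e[(s, t, i)] * f[(s, t, i)]
    new_a, new_b = {}, {}
    for (i, j, k), prob in a.items():  # Equation 22
        num = sum(prob * e[(s, r, j)] * e[(r + 1, t, k)] * f[(s, t, i)]
                  for s in range(1, T) for t in range(s + 1, T + 1)
                  for r in range(s, t))
        new_a[(i, j, k)] = num / used[i] if used[i] > 0 else prob
    for (i, m), prob in b.items():  # Equation 23
        num = sum(e[(t, t, i)] * f[(t, t, i)]
                  for t in range(1, T + 1) if obs[t - 1] == m)
        new_b[(i, m)] = num / used[i] if used[i] > 0 else prob
    return new_a, new_b, P


def train(obs, a, b, start, threshold=1e-6, max_iter=100):
    """Repeat re-estimation until the change in P falls below the threshold."""
    previous = None
    for _ in range(max_iter):
        a, b, P = reestimate(obs, a, b, start)
        if previous is not None and abs(P - previous) < threshold:
            break
        previous = P
    return a, b, P


# Hypothetical toy grammar with one nonterminal 0: 0 -> 0 0 (0.3) | "x" (0.7)
a0 = {(0, 0, 0): 0.3}
b0 = {(0, "x"): 0.7}
a1, b1, P0 = reestimate(["x", "x", "x"], a0, b0, start=0)
```

In this sketch `reestimate` reports P for the parameters it was given, so `train` compares the likelihoods of successive parameter sets; this differs slightly in bookkeeping from the order A, B, P listed in step (2). Rules whose left-hand nonterminal never occurs in any derivation keep their old probability, which is a choice made for this sketch.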

[0060] The inside-outside algorithm itself has been described above. Training with the inside-outside algorithm when bracketed text is used as the training corpus is described next (see, for example, Fernando Pereira et al., "Inside-Outside Reestimation From Partially Bracketed Corpora", Proceedings of ACL, 1992). The basic idea of the inside-outside algorithm is to use the current rule probabilities and the training set W to estimate the expected frequencies of certain types of derivation step, and then to compute new rule probability estimates as appropriate ratios of those expected frequency estimates. Because these estimates are most conveniently expressed as relative frequencies, they are referred to, somewhat loosely, as inside probabilities and outside probabilities. More precisely, for each w ∈ W, the inside probability $I_p^w(i,j)$ estimates the likelihood that $A_p$ derives ${}_i w_j$, while the outside probability $O_p^w(i,j)$ estimates the likelihood of deriving the sentential form ${}_0 w_i\, A_p\, {}_j w$ from the start symbol $A_1$.

[0061] When the inside-outside algorithm is applied to partially bracketed training text, the constraints that the bracketing places on possible derivations, and hence on possible phrases, must be taken into account. Clearly, nonzero values of the inside probability $I_p^w(i,j)$ and the outside probability $O_p^w(i,j)$ should be possible only if ${}_i w_j$ is compatible with the bracketing of w, or equivalently, only if (i, j) is valid for the bracketing of w. In what follows, therefore, a corpus C of bracketed strings c = (w, B) is assumed, and the standard formulas for the inside and outside probabilities and for the re-estimation of rule probabilities (Baker, 1979; Lari and Young, 1990; Jelinek et al., 1990) are modified so that a constituent is included only when its span is compatible with the bracketing of the string. For this purpose, the auxiliary function given by Equation 29 is defined for each bracketed string c = (w, B).

[0062]

[Equation 29]
$$\hat{c}(i,j) = \begin{cases} 1 & \text{if } (i,j) \text{ is valid for } b \in B \\ 0 & \text{otherwise} \end{cases}$$
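The auxiliary function of Equation 29 can be sketched as a simple predicate. The text does not spell out what "valid" means; the sketch below assumes the definition used in the cited Pereira et al. paper, namely that a span is valid when it does not cross any bracket (two spans cross when they share words but neither contains the other). Spans are given as pairs of positions between words, from 0 to the sentence length. The function names are illustrative.

```python
def crosses(span, bracket):
    """True when the two spans share words but neither contains the other."""
    i, j = span
    k, l = bracket
    return (i < k < j < l) or (k < i < l < j)


def c_hat(span, brackets):
    """Equation 29: 1 if the span is valid for the bracketing, 0 otherwise."""
    return 0 if any(crosses(span, b) for b in brackets) else 1


# Hypothetical five-word sentence bracketed as ((w1 w2) (w3 w4 w5)).
brackets = [(0, 2), (2, 5), (0, 5)]
print(c_hat((1, 3), brackets))  # (1, 3) straddles the boundary at position 2
print(c_hat((2, 4), brackets))  # (2, 4) lies inside the bracket (2, 5)
```

Under this assumed definition, a partially bracketed sentence with an empty bracket list makes every span valid, so the constrained algorithm reduces to the ordinary one.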

[0063] The re-estimation formulas for the extended algorithm are given below.

[0064]

[Equation 30]
$$I_p^c(i-1,i) = U_{p,m}, \quad \text{where } c = (w,B) \text{ and } b_m = w_i$$

[Equation 31]
$$I_p^c(i,k) = \hat{c}(i,k) \sum_{q,r} \sum_{i<j<k} B_{p,q,r}\, I_q^c(i,j)\, I_r^c(j,k)$$

[Equation 32]
$$O_p^c(0,|c|) = \begin{cases} 1 & \text{if } p = 1 \\ 0 & \text{otherwise} \end{cases}$$

[Equation 33]
$$O_p^c(i,k) = \hat{c}(i,k) \sum_{q,r} \left\{ \sum_{j=0}^{i-1} O_q^c(j,k)\, I_r^c(j,i)\, B_{q,r,p} + \sum_{j=k+1}^{|c|} O_q^c(i,j)\, B_{q,p,r}\, I_r^c(k,j) \right\}$$

[Equation 34]
$$\hat{B}_{p,q,r} = \frac{\displaystyle\sum_{c \in C} \frac{1}{P^c} \sum_{0 \le i<j<k \le |w|} B_{p,q,r}\, I_q^c(i,j)\, I_r^c(j,k)\, O_p^c(i,k)}{\displaystyle\sum_{c \in C} P_p^c / P^c}$$

[Equation 35]
$$\hat{U}_{p,m} = \frac{\displaystyle\sum_{c \in C} \frac{1}{P^c} \sum_{1 \le i \le |c|,\; c=(w,B),\; w_i = b_m} U_{p,m}\, O_p^c(i-1,i)}{\displaystyle\sum_{c \in C} P_p^c / P^c}$$

[Equation 36]
$$P^c = I_1^c(0,|c|)$$

[Equation 37]
$$P_p^c = \sum_{0 \le i<j \le |c|} I_p^c(i,j)\, O_p^c(i,j)$$

[0065] For each bracketed sentence c in the training corpus, the inside probabilities of longer spans of c are computed from the inside probabilities of shorter spans using the recurrences given by Equations 30 and 31. Equation 31 computes the expected relative frequency of derivations of ${}_i w_k$ from $A_p$ that are compatible with the bracketing B of the sentence c = (w, B). The multiplier ĉ(i,k) is 1 exactly when (i, k) is valid for B, that is, when $A_p$ can derive ${}_i w_k$ compatibly with B.
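The constrained recurrences of Equations 30 and 31 can be sketched as follows. The validity test assumes the non-crossing definition from the cited Pereira et al. paper; the function names, the dictionary layout for the binary rules $B_{p,q,r}$ and unary rules $U_{p,m}$, and the toy grammar are illustrative assumptions.

```python
from collections import defaultdict


def is_valid(span, brackets):
    """Assumed meaning of 'valid': the span crosses no bracket in the bracketing."""
    i, k = span
    return not any((i < s < k < t) or (s < i < t < k) for (s, t) in brackets)


def constrained_inside(words, brackets, binary, unary):
    """Inside probabilities I[(p, i, k)] over positions 0..len(words).

    binary: dict (p, q, r) -> probability of the rule A_p -> A_q A_r
    unary:  dict (p, m)    -> probability of the rule A_p -> b_m
    """
    n = len(words)
    I = defaultdict(float)
    # Equation 30: single-word spans come from the unary (terminal) rules.
    for i in range(1, n + 1):
        for (p, m), prob in unary.items():
            if m == words[i - 1]:
                I[(p, i - 1, i)] = prob
    # Equation 31: longer spans, but only those compatible with the bracketing.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            k = i + length
            if not is_valid((i, k), brackets):
                continue  # c_hat(i, k) = 0, so the inside probability stays 0
            for (p, q, r), prob in binary.items():
                for j in range(i + 1, k):
                    I[(p, i, k)] += prob * I[(q, i, j)] * I[(r, j, k)]
    return I


# Hypothetical toy grammar with start symbol 1: A_1 -> A_1 A_1 (0.3) | "x" (0.7)
binary = {(1, 1, 1): 0.3}
unary = {(1, "x"): 0.7}
words = ["x", "x", "x"]
free = constrained_inside(words, [], binary, unary)
bracketed = constrained_inside(words, [(0, 2)], binary, unary)
print(free[(1, 0, 3)], bracketed[(1, 0, 3)])  # P^c from Equation 36, without and with brackets
```

With the bracket (0, 2) the right-branching parse is excluded because the span (1, 3) crosses the bracket, so only the left-branching parse contributes and the bracketed value is half of the unconstrained one in this toy example.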

[0066] Similarly, the outside probabilities of shorter spans of the sentence c are computed from the inside and outside probabilities of longer spans, using the recurrences given by Equations 32 and 33. Once the inside and outside probabilities have been computed for each sentence in the corpus, the re-estimated probability of a binary rule, $\hat{B}_{p,q,r}$, and the re-estimated probability of a unary rule, $\hat{U}_{p,m}$, are computed with the re-estimation formulas of Equations 34 and 35. These are the same as the original formulas (Baker, 1979; Lari and Young, 1990; Jelinek et al., 1990) except that they use bracketed strings instead of unbracketed ones.

[0067] The denominator of the ratios in Equations 34 and 35 estimates the probability that a compatible derivation of a bracketed sentence in C contains at least one expansion of the nonterminal A_p. The numerator of Equation 34 estimates the probability that a compatible derivation of a bracketed sentence in C contains the rule A_p → A_q A_r, while the numerator of Equation 35 estimates the probability that a compatible derivation of a sentence in C rewrites A_p as b_m. Equation 34 thus estimates the probability that a rewriting of A_p in a compatible derivation of a bracketed sentence in C uses the rule A_p → A_q A_r, and Equation 35 estimates the probability that an occurrence of A_p in a compatible derivation of a sentence in C is rewritten as b_m. These are the best current estimates of the binary rule probabilities and the unary rule probabilities.

[0068] The process is then repeated using the reestimated probabilities until the increase in the estimated probability of the training text given the model becomes negligible or, what amounts to the same thing, until the decrease in the cross-entropy estimate (negative log probability) given by Equation 38 below becomes negligible.

【００６９】[0069]

[Equation 38] $\hat{H}(C, G) = -\dfrac{\sum_{c \in C} \log P^c}{\sum_{c \in C} |c|}$
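The stopping test built on Equation 38 can be sketched as follows. This is a minimal illustration, assuming the per-sentence log probabilities log P^c have already been obtained from an inside pass; the function names, the tolerance value, and the toy numbers are hypothetical and do not come from the specification.

```python
import math

def cross_entropy(log_probs, lengths):
    """Equation 38: -(sum of log P^c) / (sum of |c|) over the corpus C."""
    if len(log_probs) != len(lengths) or not lengths:
        raise ValueError("need one log probability and one length per sentence")
    return -sum(log_probs) / sum(lengths)

def has_converged(previous, current, tolerance=1e-4):
    """True when the decrease in the cross-entropy estimate is negligible.

    The tolerance is an assumed value; the specification only says
    that the decrease must become negligible.
    """
    return (previous - current) < tolerance

# Toy corpus of two sentences with 5 and 3 words (illustrative numbers only).
log_probs = [math.log(1e-6), math.log(1e-4)]
lengths = [5, 3]
h = cross_entropy(log_probs, lengths)
```

Dividing by the total number of words makes the estimate comparable across corpora of different sizes, which is why the iteration can be stopped on a fixed tolerance.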

[0070] For comparison with the original algorithm, it is necessary to use, instead of Equation 38, the cross-entropy estimate Ĥ(W, G) of the unbracketed text W with respect to the grammar G.

[0071] In this embodiment, in order to learn prosodic phrase structure with a stochastic context-free grammar (SCFG) using the inside-outside algorithm described in detail above, the SCFG was first trained on text that had been morphologically analyzed and bracketed according to its dependency structure. The resulting grammar was then used as the initial grammar and trained further on text bracketed according to the positions at which the fundamental frequency is reset in natural speech.

[0072] To train an SCFG with the inside-outside algorithm, the numbers of terminal and nonterminal symbols must be decided. Ideally the terminal symbols of the SCFG would be words, but a corpus containing every word is difficult to obtain and the training time would be enormous, so this is not practical. In this embodiment, therefore, the terminal symbols were limited to roughly the number of part-of-speech categories plus a few, by subdividing the particles. Twenty-three parts of speech were used, of which only the case particles were split into seven classes (ga, no, ni, wo, de, to, and others), giving 29 terminal symbols in total, as listed in Table 1 below. The number of nonterminal symbols is 20, and the numbers 1 through 20 are used as the nonterminal symbols.

【００７３】[0073]

[Table 1]
───────────────────────────────
Terminal symbol    Part of speech
───────────────────────────────
t1                 Adjective
t4                 Common noun
t5                 Sahen (verbal) noun
t6                 Pronoun
t7                 Numeral
t8                 Adverb
t9                 Adnominal
t10                Conjunction
t11                Interjection
t12                Auxiliary verb
t13                Adverbial particle
t14                Conjunctive particle
t16                Sentence-final particle
t17                Suffix
t18                Prefix
t19                Subsidiary verb
t30                Proper noun
t31                Adjectival noun
t32                Main verb
t34                Nominalizing particle
t35                Parallel particle
t36                Binding particle
t50                Case particle "ga"
t51                Case particle "no"
t52                Case particle "ni"
t53                Case particle "wo"
t54                Case particle "de"
t55                Case particle "to"
t56                Case particle "others"
───────────────────────────────

[0074] To use the probabilistic syntactic structure captured by the SCFG for estimating prosodic phrase boundaries, the following parameters, which can be computed from the SCFG, are proposed. As shown in FIG. 5, for each word, the probability of occurrence of sentence structures that contain a left-branching structure of dependency depth m (hereinafter, left-branching structure probability m) and the probability of occurrence of sentence structures that contain a right-branching structure of dependency depth n (hereinafter, right-branching structure probability n) are computed from the SCFG, and these probabilities are used as parameters for estimating prosodic phrase boundaries.

[0075] The method of computing the left-branching structure probability m (symbol: PLeftm) and the right-branching structure probability n (symbol: PRightn) is now described in detail using a concrete example. When a word string that can be expressed by five terminal symbols such as t1, t4, t5, t6, t7 is input, 14 tree structures, that is, syntactic structures, are possible, as shown in (a) through (n) of FIG. 6.
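The count of 14 is the number of distinct binary bracketings of five leaves (the Catalan number C_4). A short sketch that enumerates them is given below; the nested-tuple representation and the function name are illustrative choices, not part of the specification.

```python
def binary_trees(leaves):
    """Return every binary bracketing of a sequence as nested tuples."""
    if len(leaves) == 1:
        return [leaves[0]]
    trees = []
    # Choose where the root splits the sequence, then combine all
    # bracketings of the left part with all bracketings of the right part.
    for split in range(1, len(leaves)):
        for left in binary_trees(leaves[:split]):
            for right in binary_trees(leaves[split:]):
                trees.append((left, right))
    return trees

terminals = ["t1", "t4", "t5", "t6", "t7"]
trees = binary_trees(terminals)
print(len(trees))  # 14
```

The fully left-branching tree `((((t1, t4), t5), t6), t7)` is one of the 14 results.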

[0076] In example (a) of FIG. 6, terminal symbols t1 and t4 form a left-branching structure of dependency depth 1; that structure and terminal symbol t5 form a left-branching structure of depth 2; that structure and terminal symbol t6 form a left-branching structure of depth 3; and that structure and terminal symbol t7 form a branching structure of depth 4. In example (b) of FIG. 6, terminal symbols t4 and t5 form a left-branching structure of depth 1; that structure and terminal symbol t6 form a left-branching structure of depth 2; that structure and terminal symbol t7 form a right-branching structure of depth 3; and that right-branching structure and terminal symbol t1 form a branching structure of depth 4. In example (c) of FIG. 6, terminal symbols t4 and t5 form a right-branching structure of depth 1; that structure and terminal symbol t1 form a left-branching structure of depth 2; that structure and terminal symbol t6 form a left-branching structure of depth 3; and that structure and terminal symbol t7 form a branching structure of depth 4. Examples (d) through (n) of FIG. 6 likewise have the branching structures shown in the figure.

[0077] The probabilities of occurrence of all the tree structures shown in FIG. 6 are computed from the SCFG, and their sum is taken as the total probability Pall. As an example, consider the left-branching structure probability 1 (PLeft1) of terminal symbol t5. In FIG. 6, the structures in which t5 forms a left-branching structure that includes the one preceding terminal symbol are the three cases (b), (e), and (j). Letting P be the sum of the probabilities of occurrence of those structures, PLeft1 is given by Equation 39 below.

【００７８】[0078]

[Equation 39] $P_{Left1} = P / P_{all}$
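The ratio in Equation 39 can be illustrated with a small sketch that scores every tree under a toy grammar and normalizes by the total. Everything here is an assumption made for illustration: the two-nonterminal grammar with uniform probabilities, the function names, and in particular the membership test, which simply asks whether the word and its predecessor form a constituent. That test is a stand-in and selects five trees here, not the three cases (b), (e), and (j) that the specification reads off FIG. 6.

```python
def trees(i, j):
    """All binary trees over word positions i..j-1, as nested tuples of indices."""
    if j - i == 1:
        return [i]
    return [(left, right)
            for k in range(i + 1, j)
            for left in trees(i, k)
            for right in trees(k, j)]

def inside(tree, words, binary, unary):
    """Inside probability of one fixed tree, one value per nonterminal."""
    n = len(unary)
    if isinstance(tree, int):
        return [unary[p].get(words[tree], 0.0) for p in range(n)]
    left = inside(tree[0], words, binary, unary)
    right = inside(tree[1], words, binary, unary)
    return [sum(binary[p][q][r] * left[q] * right[r]
                for q in range(n) for r in range(n))
            for p in range(n)]

def spans(tree):
    """Return (start, end, set of constituent spans) for a tree."""
    if isinstance(tree, int):
        return tree, tree + 1, set()
    start, _, left_spans = spans(tree[0])
    _, end, right_spans = spans(tree[1])
    return start, end, left_spans | right_spans | {(start, end)}

# Toy grammar (hypothetical): 2 nonterminals, uniform rule probabilities
# that sum to 1 for each nonterminal (4 * 0.2 + 5 * 0.04).
words = ["t1", "t4", "t5", "t6", "t7"]
n_nt = 2
binary = [[[0.2] * n_nt for _ in range(n_nt)] for _ in range(n_nt)]
unary = [{t: 0.04 for t in words} for _ in range(n_nt)]

all_trees = trees(0, len(words))
prob = [inside(t, words, binary, unary)[0] for t in all_trees]  # start symbol = index 0
p_all = sum(prob)

k = 2  # position of t5
# Stand-in membership test: words k-1 and k form a constituent.
p = sum(pr for t, pr in zip(all_trees, prob) if (k - 1, k + 1) in spans(t)[2])
p_left1 = p / p_all
```

Because the toy grammar is uniform, every tree receives the same probability and the ratio reduces to a count of selected trees over 14; with a trained grammar the same code would weight each tree by its own probability.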

[0079] In this embodiment, to train the probabilities of the SCFG, two corpora were prepared from 503 sentences in the applicant's database: a corpus in which the sentences were segmented into morphemes and bracketed by hand according to their dependency structure (hereinafter, the dependency-annotated corpus), and a text bracketed according to the positions at which the fundamental frequency is reset in natural speech (hereinafter, the reset-annotated corpus). Concrete examples of these SCFG training corpora are shown below.

[0080] (A) Corpus bracketed by dependency structure
(A1) Example sentence: あらゆる現実をすべて自分の方へねじ曲げたのだ。 ("He twisted every reality entirely toward himself.")
Dependency-annotated corpus: ( ( ( t9 ) ( ( t4 ) ( t53 ) ) ) ( ( t4 ) ( ( ( t4 ) ( t51 ) ( t4 ) ( t56 ) ) ( ( t32 ) ( t32 ) ( t12 ) ( t34 ) ( t12 ) ) ) ) )
(A2) Example sentence: 一週間ばかりニューヨークを取材した。 ("I spent about a week reporting on New York.")
Dependency-annotated corpus: ( ( ( t7 ) ( t17 ) ( t17 ) ( t13 ) ) ( ( ( t30 ) ( t53 ) ) ( ( t5 ) ( t19 ) ( t12 ) ) ) )
(A3) Example sentence: テレビゲームやパソコンでゲームをして遊ぶ。 ("They play games on video game consoles and personal computers.")
Dependency-annotated corpus: ( ( ( ( t4 ) ( t35 ) ) ( ( t4 ) ( t54 ) ) ) ( ( ( ( t4 ) ( t53 ) ) ( ( t32 ) ( t14 ) ) ) ( t32 ) ) )
(A4) Example sentence: 物価の変動を考慮して給付水準を決める必要がある。 ("Benefit levels need to be set taking price fluctuations into account.")
Dependency-annotated corpus: ( ( ( ( ( ( ( t4 ) ( t51 ) ) ( ( t4 ) ( t53 ) ) ) ( ( t5 ) ( t19 ) ( t14 ) ) ) ( ( ( t4 ) ( t4 ) ( t53 ) ) ( t32 ) ) ) ( ( t4 ) ( t50 ) ) ) ( t32 ) )

[0081] (B) Corpus bracketed by the data on the reset positions of the fundamental frequency Fo
In the corpus below, (↑) marks a position at which the fundamental frequency Fo is reset and indicates a prosodic phrase boundary.
(B1) Example sentence: あらゆる現実を（↑）すべて（↑）自分の方へねじ曲げたのだ。
Reset-annotated corpus: ( t9 ( t4 t53 ) ) t4 ( ( t4 t51 t4 t56 ) ( t32 t32 t12 t34 t12 ) )
(B2) Example sentence: 一週間ばかり（↑）ニューヨークを取材した。
Reset-annotated corpus: ( t7 t17 t17 t13 ) ( ( t30 t53 ) ( t5 t19 t12 ) )
(B3) Example sentence: テレビゲームやパソコンで（↑）ゲームをして（↑）遊ぶ。
Reset-annotated corpus: ( ( t4 t35 ) ( t4 t54 ) ) ( ( t4 t53 ) ( t32 t14 ) ) t32
(B4) Example sentence: 物価の変動を考慮して（↑）給付水準を決める必要がある。
Reset-annotated corpus: ( ( t4 t51 ) ( t4 t53 ) ( t5 t19 t14 ) ) ( ( t4 t4 t53 ) t32 ( t4 t50 ) t32 )
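To make the bracketed notation concrete, the following sketch reads one reset-annotated line into its terminal sequence, its bracket spans, and the word positions at which a new top-level item begins. It assumes, as the four examples suggest but the specification does not state outright, that each top-level item corresponds to one prosodic phrase, so that those positions coincide with the (↑) marks. The function name and return layout are hypothetical.

```python
import re

def parse_bracketed(text):
    """Parse a bracketed terminal string.

    Returns (terminals, spans, breaks): the terminal symbols in order,
    the set of (start, end) word spans enclosed by brackets, and the word
    positions where a new top-level item starts (assumed Fo reset points).
    """
    tokens = re.findall(r"\(|\)|t\d+", text)
    terminals, spans, stack, breaks = [], set(), [], []
    for tok in tokens:
        if tok == ")":
            if not stack:
                raise ValueError("unbalanced brackets")
            spans.add((stack.pop(), len(terminals)))
            continue
        # An opening bracket or a terminal at depth 0 starts a new top-level item.
        if not stack and terminals:
            breaks.append(len(terminals))
        if tok == "(":
            stack.append(len(terminals))
        else:
            terminals.append(tok)
    if stack:
        raise ValueError("unbalanced brackets")
    return terminals, spans, breaks

# Example (B3) from the corpus above.
b3 = "( ( t4 t35 ) ( t4 t54 ) ) ( ( t4 t53 ) ( t32 t14 ) ) t32"
terminals, spans, breaks = parse_bracketed(b3)
print(breaks)  # [4, 8]
```

For (B3) the two positions fall after the fourth and eighth words, matching the two (↑) marks in the example sentence.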

[0082] FIG. 2 is a flowchart of the SCFG probability training process executed by the SCFG learning unit 30 of FIG. 1. To create the SCFG from which the above parameters, such as the probabilities of occurrence used for estimating prosodic phrase boundaries, are obtained, training was carried out by the following procedure.

[0083] As shown in FIG. 2, first, in step S1, the SCFG probability learning unit 30 trains the initial SCFG 31, whose probabilities of occurrence have been assigned at random, by the inside-outside algorithm described in detail above, using the corpus bracketed by dependency structure that was prepared in advance as described above. Specifically, the process of step S1 takes that dependency-bracketed corpus as input and estimates new probabilities according to Equations 34 and 35. This is repeated until the decrease in the value given by Equation 38 becomes negligible.

[0084] Next, in step S2, the SCFG probability learning unit 30 trains the SCFG obtained in step S1 by the inside-outside algorithm described in detail above, using the corpus bracketed by the data on the reset positions of the fundamental frequency Fo that was prepared in advance as described above, and thereby obtains the data of the trained SCFG 32. As shown in FIG. 1, the data of the trained SCFG 32 are then incorporated into the prosody control rules 33. Specifically, the process of step S2 takes that reset-bracketed corpus as input and estimates new probabilities according to Equations 34 and 35. This is repeated until the decrease in the value given by Equation 38 becomes negligible.

[0085] The prosodic phrase boundary estimation rules in the prosodic phrase boundary estimation rule creation unit 34 can be created using, for example, a known neural network, as described in detail later, or discriminant analysis (see, for example, Yutaka Tanaka and Kazumasa Wakimoto, "Multivariate Statistical Analysis Methods" (多変量統計解析法), Gendai Sugakusha). Discriminant analysis is a method that, based on samples of past data obtained for each group on a number of variables, predicts from the values of those variables which group an individual belongs to. In applying discriminant analysis to this embodiment, the variables are the left-branching structure probability m, the right-branching structure probability n, and the part-of-speech type described above, and the groups are chosen so as to separate prosodic phrase boundaries from non-boundaries.

[0086] Further, in accordance with the prosody control rules 33, which incorporate the SCFG containing prosodic phrase structure as described above, the speech synthesis control unit 10 computes and outputs the following data needed for speech synthesis, as described in detail later.
(a) Pitch data corresponding to the fundamental frequency.
(b) Voiced/unvoiced switching data.
(c) Amplitude data.
(d) Filter coefficient data.
An example of the data of the trained SCFG 32 is shown in Table 2 below.

【００８７】[0087]

[Table 2]
───────────────────────────────
Contents of the stochastic context-free grammar (SCFG)
───────────────────────────────
1 → 1 1       5.677289731860455×10^-5
1 → 1 2       0.003216677708041557
1 → 1 3       1.000394189802561×10^-15
   :             :
1 → 1 19      1.015469740215695×10^-15
1 → 1 20      1.148333794722905×10^-15
1 → 2 1       0.0001832178839199974
1 → 2 2       0.000521748447310258
   :             :
1 → 20 19     1.374865835459389×10^-15
1 → 20 20     0.001676333237389523
2 → 1 1       1.542003529882383×10^-15
2 → 1 2       1.783061119126052×10^-15
   :             :
20 → 20 19    1.586308114265936×10^-10
20 → 20 20    2.291887552593505×10^-6
1 → t1        1.155866936964327×10^-15
1 → t4        1.004501712835847×10^-15
1 → t5        1.000076431187449×10^-15
1 → t6        1.00213816472346×10^-15
   :             :
1 → t55       1.000679862632174×10^-15
1 → t56       1.000019916972904×10^-15
   :             :
20 → t55      1.196873334230712×10^-15
20 → t56      1.128862755279862×10^-15
───────────────────────────────
(Note) 1 to 20: nonterminal symbols; t1 to t56: terminal symbols; final number: probability

[0088] In Table 2, for example, the first row, "1 → 1 1 5.677289731860455×10^-5", indicates that the rewrite rule by which nonterminal symbol 1 branches into nonterminal symbol 1 and nonterminal symbol 1 has a probability of occurrence of 5.677289731860455×10^-5. The other rows are read in the same way.

[0089] Next, with reference to FIG. 1, a block diagram of a speech synthesis system according to an embodiment of the present invention, the configuration and operation from the input of uttered speech at microphone 1 to the output of synthesized speech from loudspeaker 25 are described.

[0090] The speaker's utterance is input to microphone 1, converted into a speech signal, and fed to feature extraction unit 2. Feature extraction unit 2 A/D-converts the input speech signal and then performs, for example, LPC analysis to extract 34-dimensional feature parameters comprising log power, 16th-order cepstrum coefficients, Δ log power, and 16th-order Δ cepstrum coefficients. The time series of extracted feature parameters is input to speech synthesis control unit 10 via buffer memory 3. Based on the input feature parameters, speech synthesis control unit 10 detects and determines the prosodic phrase boundaries, that is, the prosodic phrases of the speech units, in accordance with the prosody control rules, which include rules that use the trained SCFG 32. The speech units are then stretched, compressed, and concatenated in the known manner on the basis of the determined prosodic phrases. The spectral feature parameter values of the resulting speech units are converted by a known method into pitch, voiced/unvoiced switching, amplitude, and filter coefficient data for speech synthesis, which are output to pulse generation circuit 21, switch SW, variable-amplitude amplifier 23, and filter 24, respectively. When speech synthesis control unit 10 detects a prosodic phrase boundary, it controls the fundamental frequency Fo so that Fo is reset as shown in FIG. 4, and outputs the result to pulse generation circuit 21 as pitch information.

[0091] Speech synthesis unit 20 comprises pulse generation circuit 21, noise generation circuit 22, switch SW, variable-amplitude amplifier 23, and filter 24. Pulse generation circuit 21 is the excitation source for voiced sounds; it generates an impulse of unit magnitude at the start of each pitch period and outputs it to variable-amplitude amplifier 23 via switch SW. Noise generation circuit 22 is the excitation source for unvoiced sounds; it generates uncorrelated, uniformly distributed random noise with mean 0 and standard deviation 1 and outputs it to variable-amplitude amplifier 23 via switch SW. Switch SW is accordingly set to the pulse generation circuit 21 side when a voiced sound is produced and to the noise generation circuit 22 side when an unvoiced sound is produced. Variable-amplitude amplifier 23 changes and amplifies the amplitude of the input signal according to the input amplitude information and outputs it to filter 24. Filter 24 sets the coefficients of its transfer function to the input filter coefficients, filters the input signal with those coefficients, and outputs the result through loudspeaker 25.
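The signal path of speech synthesis unit 20 can be sketched in a few lines. The sketch assumes an all-pole recursive form for filter 24, which the specification does not state, and all function names and numeric values are hypothetical. The noise range follows from the stated statistics: a uniform distribution on [-√3, √3] has mean 0 and standard deviation 1.

```python
import math
import random

def excitation(n_samples, voiced, pitch_period, rng):
    """Unit impulses at the start of each pitch period (voiced), or
    zero-mean, unit-variance uniform noise (unvoiced)."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    limit = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has variance 1
    return [rng.uniform(-limit, limit) for _ in range(n_samples)]

def synthesize(source, gain, coeffs):
    """Scale the source, then apply an assumed all-pole filter:
    y[n] = gain * x[n] - sum_k coeffs[k] * y[n - 1 - k]."""
    out = []
    for n, x in enumerate(source):
        y = gain * x
        for k, a in enumerate(coeffs):
            if n - 1 - k >= 0:
                y -= a * out[n - 1 - k]
        out.append(y)
    return out

rng = random.Random(0)
# Voiced frame: pitch period of 4 samples, gain 2, one filter coefficient.
voiced_out = synthesize(excitation(8, True, 4, rng), gain=2.0, coeffs=[-0.5])
# Unvoiced frame: same gain and filter, noise excitation.
noise = excitation(1000, False, 4, rng)
unvoiced_out = synthesize(noise, gain=2.0, coeffs=[-0.5])
```

The `voiced` flag plays the role of switch SW, and shortening `pitch_period` raises the fundamental frequency, which is how a reset of Fo at a prosodic phrase boundary would reach the output.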

[0092] In this embodiment, the SCFG probability training process of FIG. 2 includes both step S1 and step S2. The present invention is not limited to this, however, and the SCFG may be trained by the training process of step S1 alone.

[0093] To examine the effectiveness of the parameters proposed in this embodiment, the inventors further estimated prosodic phrase boundaries using a known neural network. The estimation method described below can be applied to the prosodic phrase boundary estimation rule creation unit 34 and to the speech synthesis control unit 10. The neural network is a four-layer feedforward network consisting of an input layer, a first intermediate layer, a second intermediate layer, and an output layer. The input layer consists of 50 units and one threshold unit, the first intermediate layer of 25 units, the second intermediate layer of 25 units, and the output layer of 2 units. For training, the output data consist of teacher data set to (0, 1) for a prosodic phrase boundary and (1, 0) for a non-boundary, while the input data are sets of 50 input parameters constructed as listed below. The teacher data were determined from the boundary state of the single speaker used, that is, from whether or not each position was a prosodic phrase boundary for that speaker. The neural network was trained on these teacher data and sets of 50 input parameters, and the trained network was used as the prosodic phrase boundary estimation rule.

[0094] (a) The left-branching structure probability m and the right-branching structure probability n of each of the following words, where m, n = 1, 2, 3, 4, and 5 or more, giving 10 parameters per word:
(a-1) the independent word immediately preceding the word immediately before the prosodic phrase boundary;
(a-2) the word immediately before the prosodic phrase boundary;
(a-3) the word immediately after the prosodic phrase boundary;
(a-4) the independent word immediately following the word immediately after the prosodic phrase boundary.
This gives 10 parameters × 4 words = 40 parameters in total.
(b) The terminal symbol types of the five words immediately before the prosodic phrase boundary: 5 parameters.
(c) The terminal symbol types of the five words immediately after the prosodic phrase boundary: 5 parameters.
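The 50-parameter input and the 50-25-25-2 network can be sketched as below. Several points are assumptions made only so the example runs: terminal symbol types are encoded by their numeric index, every layer uses a sigmoid with its own bias weight (the specification mentions a threshold unit only for the input layer), and the weights are random rather than trained. All names are hypothetical.

```python
import math
import random

def feature_vector(branch_probs, before_types, after_types):
    """Concatenate 4 words x 10 branching probabilities with the terminal
    types of the 5 words before and the 5 words after a candidate boundary."""
    if len(branch_probs) != 4 or any(len(w) != 10 for w in branch_probs):
        raise ValueError("expected 4 words with 10 probabilities each")
    if len(before_types) != 5 or len(after_types) != 5:
        raise ValueError("expected 5 terminal types on each side")
    flat = [p for word in branch_probs for p in word]
    return flat + list(before_types) + list(after_types)

def forward(x, layers):
    """Feed x through fully connected sigmoid layers. Each row holds one
    unit's input weights followed by its bias (threshold) weight."""
    for layer in layers:
        x = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + row[-1])))
             for row in layer]
    return x

rng = random.Random(0)
sizes = [50, 25, 25, 2]
# Untrained random weights, for shape checking only.
layers = [[[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]
          for n_in, n_out in zip(sizes, sizes[1:])]

x = feature_vector([[0.1] * 10] * 4, [4, 51, 4, 53, 32], [4, 50, 32, 12, 34])
out = forward(x, layers)
# Teacher coding: (0, 1) is a boundary, (1, 0) is not, so the larger output decides.
is_boundary = out[1] > out[0]
```

In practice a one-hot encoding of the 29 terminal types might be preferable to raw indices, but that would change the input size from the 50 units stated in the specification.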

[0095] The 50 input parameters were fed to the trained neural network to estimate prosodic phrase boundaries. In the estimation, the two output values were compared and the larger was taken as the detection result, that is, as the decision on whether or not the position is a prosodic phrase boundary. Prosodic phrase boundaries have a degree of freedom: besides boundaries at which every speaker resets the fundamental frequency Fo and boundaries at which no speaker resets Fo, there are boundaries at which only some speakers reset Fo. The estimation results were therefore evaluated on the boundaries for which all speakers agree. The results are shown in Table 3 below. As Table 3 shows, the error rate is 0.6% and 7.6% on the data used for training and 7.1% and 16.9% on data not used for training, which confirms that prosodic phrase boundaries can be estimated using the SCFG. Moreover, because prosodic phrase structure is used in training the SCFG, prosodic phrase boundaries can be learned inductively, for example by evaluating and training the data of the trained SCFG 32 against data whose prosodic phrase structure is known, using a neural network as described above.

【００９６】[0096]

[Table 3] Results of prosodic phrase boundary estimation
─────────────────────────────────────────────────────────────
Data evaluated               Boundary estimation error rate [%] (errors / total)
                             ──────────────────────────────────────────────
                             Boundaries where all        Boundaries where no
                             speakers reset Fo           speaker resets Fo
─────────────────────────────────────────────────────────────
Data used in training        0.6 (4/680)                 7.6 (435/5715)
─────────────────────────────────────────────────────────────
Data not used in training    7.1 (48/680)                16.9 (964/5715)
─────────────────────────────────────────────────────────────
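The percentages in Table 3 follow directly from the error counts and totals given in parentheses, as the short check below shows. The dictionary keys are descriptive labels chosen for this sketch.

```python
# (errors, total) pairs as printed in Table 3.
counts = {
    ("used in training", "all speakers reset"): (4, 680),
    ("used in training", "no speaker resets"): (435, 5715),
    ("not used in training", "all speakers reset"): (48, 680),
    ("not used in training", "no speaker resets"): (964, 5715),
}

# Error rate in percent, rounded to one decimal place as in the table.
rates = {key: round(100.0 * errors / total, 1)
         for key, (errors, total) in counts.items()}

for key, rate in rates.items():
    print(key, rate)
```

The recomputed values are 0.6, 7.6, 7.1, and 16.9, in agreement with the table.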

[0097] As described above, this embodiment uses parameters derived from the SCFG as the input parameters for estimating prosodic phrase boundaries. The results show that the SCFG is effective for estimating prosodic phrase boundaries.

[0098] In the above embodiment, a neural network is used as the means for detecting and estimating prosodic phrase boundaries from the input parameters. However, the present invention is not limited to this. It may instead use any method that predicts the attribute of an event from several factors (continuous values) related to that event, such as the well-known discriminant analysis method.
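As one possible reading of "discriminant analysis", the sketch below fits a two-class Fisher linear discriminant on two continuous input parameters and uses it to decide whether a position is a prosodic phrase boundary. The choice of Fisher's method, the two features, the training values, and all function names are assumptions made for illustration; the embodiment itself uses a neural network and does not specify this procedure.

```python
def mean(rows):
    """Component-wise mean of a list of 2-D points."""
    n = len(rows)
    return [sum(r[d] for r in rows) / n for d in range(2)]


def scatter(rows, mu):
    """2x2 within-class scatter matrix around the class mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s


def fit_fisher(boundary, non_boundary):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0), threshold at the midpoint."""
    m1, m0 = mean(boundary), mean(non_boundary)
    s1, s0 = scatter(boundary, m1), scatter(non_boundary, m0)
    sw = [[s1[a][b] + s0[a][b] for b in range(2)] for a in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    threshold = 0.5 * (w[0] * (m1[0] + m0[0]) + w[1] * (m1[1] + m0[1]))
    return w, threshold


def is_boundary(x, w, threshold):
    """Classify a 2-D parameter vector by projecting it onto w."""
    return w[0] * x[0] + w[1] * x[1] > threshold


# Hypothetical training data: two continuous parameters per position.
boundary = [(0.8, 0.1), (0.9, 0.3), (0.7, 0.2)]
non_boundary = [(0.2, 0.7), (0.1, 0.9), (0.3, 0.8)]
w, threshold = fit_fisher(boundary, non_boundary)
```

With more than two parameters the same idea applies, but the 2x2 closed-form inverse used here would need to be replaced by a general linear solver.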

[0099] In the above embodiment, the terminal symbols of the stochastic context-free grammar (SCFG) were 23 parts of speech, of which only the case particles were further divided into seven classes (が ga, の no, に ni, を o, で de, と to, and others), giving 29 terminal symbols in total, and 20 non-terminal symbols were used. However, the present invention is not limited to this, and the SCFG can be used with any number of terminal and non-terminal symbols.
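The count of 29 terminal symbols can be checked by replacing the single case-particle category with its seven subclasses (23 - 1 + 7 = 29). In the sketch below the seven particle classes come from the text, while the names of the other 22 parts of speech and of the 20 non-terminal symbols are placeholders, since the text does not list them.

```python
# Seven case-particle classes named in the embodiment.
CASE_PARTICLE_CLASSES = ["が", "の", "に", "を", "で", "と", "other"]

# 23 part-of-speech categories; all names here are hypothetical placeholders.
pos_categories = [f"POS{i}" for i in range(1, 23)] + ["case_particle"]

# Replace the single case-particle category by its seven subclasses.
terminals = [p for p in pos_categories if p != "case_particle"]
terminals += [f"case_particle:{c}" for c in CASE_PARTICLE_CLASSES]

# 20 non-terminal symbols; names are again placeholders.
nonterminals = [f"N{i}" for i in range(1, 21)]
```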

[0100] As described in detail above, according to the present invention, a speech synthesis system having speech synthesis means that performs speech synthesis while controlling the fundamental frequency in order to obtain natural synthesized speech comprises control means that uses a stochastic context-free grammar (SCFG) prepared for this purpose to detect prosodic phrase boundaries, that is, boundaries at which the fundamental frequency is reset, and controls the fundamental frequency accordingly. The control means performs this control using an SCFG created by learning prosodic phrase structure with a learning method that uses linguistic information and prosodic information. The embodiment according to the present invention therefore has the following specific effects.
(1) Unlike the conventional method, which controls prosodic phrase boundaries using a dependency structure or the like, dependency-structure information need not be added to the input text, so the input information can be reduced.
(2) Because prosodic phrase boundaries are detected and the fundamental frequency is controlled based on prosody control rules that include rules using the SCFG, more natural synthesized speech can be obtained.

[0101]

[Effect of the Invention] As described in detail above, the speech synthesis system according to claim 1 of the present invention has speech synthesis means that controls the fundamental frequency based on an input word string and synthesizes and outputs speech of that word string. The system comprises first learning means that trains a predetermined stochastic context-free grammar, according to a predetermined algorithm, on a corpus bracketed with previously created data on the positions at which the fundamental frequency is reset, thereby generating a stochastic context-free grammar that has learned to include information on prosodic phrase structure. The speech synthesis means comprises control means that, following prosody control rules that include rules using the stochastic context-free grammar generated by the first learning means, detects in the input word string the prosodic phrase boundaries, that is, the boundaries at which the fundamental frequency is reset, and controls the fundamental frequency. Prosodic phrase boundaries can therefore be detected more accurately, and more natural speech can be synthesized and output. In addition, unlike the conventional method, which controls prosodic phrase boundaries using a dependency structure or the like, dependency-structure information need not be added to the input text, so the input information can be reduced.
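To illustrate how a bracketed corpus can constrain grammar learning, the following is a minimal sketch. It assumes that the "predetermined algorithm" is the inside-outside algorithm (as claim 3 specifies) and that brackets are applied in the manner of the Pereira and Schabes reference cited on the front page: spans that cross a bracket contribute no probability. Only the inside pass is shown, for a grammar in Chomsky normal form; the toy grammar, its probabilities, and the function names are hypothetical and are not taken from the embodiment.

```python
from collections import defaultdict


def crosses(span, brackets):
    """True if half-open span (i, j) partially overlaps any bracket (a, b)."""
    i, j = span
    return any(a < i < b < j or i < a < j < b for a, b in brackets)


def inside(words, unary, binary, brackets=()):
    """Inside probabilities beta[(i, j, A)] = P(A derives words[i:j]).

    unary maps (A, terminal) to a probability; binary maps (A, B, C) to the
    probability of rule A -> B C. Spans crossing a bracket are left at zero.
    """
    n = len(words)
    beta = defaultdict(float)
    for i, word in enumerate(words):
        for (a, terminal), p in unary.items():
            if terminal == word:
                beta[(i, i + 1, a)] += p
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            if crosses((i, j), brackets):
                continue  # incompatible with the bracketing
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    beta[(i, j, a)] += p * beta[(i, k, b)] * beta[(k, j, c)]
    return beta


# Hypothetical toy grammar: S -> S S (0.4), S -> "n" (0.6).
unary = {("S", "n"): 0.6}
binary = {("S", "S", "S"): 0.4}
words = ["n", "n", "n"]

unbracketed = inside(words, unary, binary)[(0, 3, "S")]
# A bracket over the first two words allows only the left-branching tree.
left_bracketed = inside(words, unary, binary, brackets=[(0, 2)])[(0, 3, "S")]
```

In this toy case the unbracketed sentence sums over both binary trees, whereas the bracket around the first two words removes the right-branching tree and halves the inside probability. A full re-estimation step would also need the outside probabilities, which are omitted here.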

[0102] The speech synthesis system according to claim 2 is the speech synthesis system according to claim 1, further comprising second learning means that trains a stochastic context-free grammar with predetermined initial values, according to a predetermined algorithm, on a corpus bracketed with a previously created dependency structure, thereby generating a learned stochastic context-free grammar. The first learning means re-trains the stochastic context-free grammar generated by the second learning means. Prosodic phrase boundaries can therefore be detected still more accurately, and even more natural speech can be synthesized and output.

[Brief description of the drawings]

FIG. 1 is a block diagram of a speech synthesis system according to an embodiment of the present invention.

FIG. 2 is a flowchart of the SCFG probability learning process executed by the SCFG learning unit 30 of FIG. 1.

FIG. 3 is a graph showing the change over time of the fundamental frequency when the fundamental frequency Fo is not reset in the input speech signal.

FIG. 4 is a graph showing the change over time of the fundamental frequency when the fundamental frequency Fo is reset in the input speech signal.

FIG. 5 is a diagram showing an example of a sentence composed of a plurality of word strings to be processed in the speech synthesis system of FIG. 1, showing a left-branching structure with dependency depth m and a right-branching structure with dependency depth m.

FIG. 6 is a diagram showing the tree structure of an example sentence composed of five word strings to be processed in the speech synthesis system of FIG. 1, and showing how the left-branching structure probability and the right-branching structure probability are calculated.

FIG. 7 is a diagram showing how the inside probability is calculated in the inside-outside algorithm used in the speech synthesis system of FIG. 1.

FIG. 8 is a diagram showing the definition of the outside probability used in the inside-outside algorithm used in the speech synthesis system of FIG. 1.

FIG. 9 is a diagram showing how the outside probability is calculated in the inside-outside algorithm used in the speech synthesis system of FIG. 1.

[Explanation of symbols]

1: microphone, 2: feature extraction unit, 3: buffer memory, 10: speech synthesis control unit, 20: speech synthesis unit, 21: pulse generation circuit, 22: noise generation circuit, 23: variable-gain amplifier, 24: filter, 25: speaker, 30: SCFG probability learning unit, 31: initial-value SCFG, 32: learned SCFG, 33: prosody control rules, 34: prosodic phrase boundary estimation rule creation unit, SW: switch.

Continuation of the front page

(56) References
JP-A-5-134692 (JP, A)
JP-A-3-119395 (JP, A)
JP-A-60-195596 (JP, A)
K. Lari, S. J. Young, "Application of stochastic context-free grammars using the Inside-Outside algorithm", Computer Speech and Language, Vol. 5, No. 3 (1991), pp. 237-257
F. Pereira, Y. Schabes, "Inside-Outside Reestimation from Partially Bracketed Corpora", Proc. 30th Annual Meeting of the Association for Computational Linguistics (1992), pp. 128-135

(58) Fields searched (Int. Cl.6, DB name)
G10L 3/00 - 9/20
JICST file (JOIS)

Claims

(57) [Claims]

1. A speech synthesis system comprising speech synthesis means for controlling a fundamental frequency based on an input word string to synthesize and output speech of the word string, wherein the speech synthesis system comprises first learning means for generating a stochastic context-free grammar learned so as to include information on the structure of prosodic phrases, by training a predetermined stochastic context-free grammar, according to a predetermined algorithm, on a corpus bracketed with previously created data on the positions at which the fundamental frequency is reset, and wherein the speech synthesis means comprises control means for detecting, in the input word string, boundaries of prosodic phrases, which are boundaries at which the fundamental frequency is reset, according to prosody control rules including rules that use the stochastic context-free grammar generated by the first learning means, and for controlling the fundamental frequency.

2. The speech synthesis system according to claim 1, further comprising second learning means for generating a learned stochastic context-free grammar by training a stochastic context-free grammar with predetermined initial values, according to a predetermined algorithm, on a corpus bracketed with a previously created dependency structure, wherein the first learning means re-trains the stochastic context-free grammar generated by the second learning means.

3. The speech synthesis system according to claim 1, wherein said algorithm is an inside / outside algorithm.