JPH0887296A

JPH0887296A - Voice synthesizer

Info

Publication number: JPH0887296A
Application number: JP6221683A
Authority: JP
Inventors: Tomoki Hamagami; 知樹濱上; Mitsuo Furumura; 光夫古村
Original assignee: Secom Co Ltd
Current assignee: Secom Co Ltd
Priority date: 1994-09-16
Filing date: 1994-09-16
Publication date: 1996-04-02

Abstract

PURPOSE: To correct the shortcomings of the point pitch model, generate natural pitch patterns, and produce high-quality synthesized speech in a voice synthesizer. CONSTITUTION: The voice synthesizer is provided with a first database section 21 that stores point pitch patterns by accent type, a second database section 24 that stores rules corresponding to the combination of accent type and phoneme concatenation, and a pitch pattern generation unit 8 that retrieves from section 21 the point pitch pattern corresponding to the phrase that is the object of voice synthesis and generates a new point pitch pattern from the retrieved pattern based on the rules of section 24. Pitches are thereby assigned not only at vowel centroid points but also at phoneme boundaries, which improves the naturalness of word beginnings and of vowel sequences at the accent nucleus in the synthesized speech.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a speech synthesizer, and in particular to a technique for generating pitch patterns in a speech synthesizer. The speech synthesizer of the present invention is used, for example, as the speech synthesizer in security equipment.

[0002]

[Prior Art] Conventionally, pitch control in rule-based speech synthesis has used the "point pitch model" (see Japanese Patent Laid-Open No. 50-12840 and Japanese Patent No. 1087848), which assigns a pitch frequency at the center position of each vowel and interpolates linearly between vowels. In speech synthesis using the point pitch model, several patterns corresponding to the Japanese accent types are quantified in advance and stored in a database. The quantified point pitch pattern is then retrieved on the basis of the accent type of the phrase to be synthesized and used as a synthesis parameter.
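As a rough illustration of the conventional point pitch model described above, the sketch below holds one pitch value per vowel center and interpolates linearly between them. The function name `interpolate_pitch`, the millisecond and hertz units, and the choice to hold the first and last values outside the vowel centers are assumptions made for this example and are not taken from the cited patents.

```python
def interpolate_pitch(points, t):
    """Return the pitch at time t by linear interpolation between vowel centers.

    points: list of (time_ms, pitch_hz) pairs at vowel centers, sorted by time.
    Outside the first and last vowel center the nearest value is held
    (an assumption of this sketch).
    """
    if t <= points[0][0]:
        return points[0][1]
    if t >= points[-1][0]:
        return points[-1][1]
    # Find the pair of neighboring vowel centers that brackets t.
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            return f0 + (f1 - f0) * (t - t0) / (t1 - t0)


# Hypothetical three-vowel phrase: vowel centers at 50, 150, and 250 ms.
vowel_centers = [(50, 120.0), (150, 140.0), (250, 100.0)]
midpoint_pitch = interpolate_pitch(vowel_centers, 100)
```

In this model the contour between two vowels is fully determined by the two neighboring vowel-center values, which is the limitation the invention addresses.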

[0003] The point pitch model exploits the fact that the pitch frequency at the energy center of a vowel is perceptually dominant for human hearing, and it is an efficient approach to pitch control.

[0004]

[Problem to Be Solved by the Invention] In natural speech, however, even phrases with the same accent type and the same number of morae may, depending on the sequence of phonemes that make up the phrase, take a pitch pattern that clearly differs from the point pitch pattern. This is because, owing to the interaction between pitch and the articulatory mechanism in the natural vocal system, the perceived accent type does not necessarily coincide with the shape of the point pitch pattern.

[0005] As a result, for some phrases the retrieved point pitch pattern does not match the phrase's inherent accent, and unnaturalness, such as oddly emphasized accents and strange intonation, has been a problem. An object of the present invention is to remedy these drawbacks of the conventional point pitch model, generate natural pitch patterns, and produce high-quality synthesized speech.

[0006]

[Means for Solving the Problem] To solve the above problem, the present invention broadens the conventional interpretation of the point pitch pattern into a model in which pitches can be assigned not only at vowel centroid points but also at phoneme boundaries. This is called the extended point pitch model, and the pitches it assigns are called extended point pitches. For the extended point pitches, the pitch at a phoneme boundary (hereinafter "boundary pitch") is determined by rule from the point pitches assigned at the vowel centroid positions and from the sequence of phonemes, and is used as a synthesis parameter.

[0007] To realize this, the present invention provides, in a speech synthesizer, a first database section that stores point pitch patterns by accent type, a second database section that stores rules corresponding to the accent type and the phoneme concatenation, and a pitch pattern generation unit that retrieves from the first database section the point pitch pattern corresponding to the phrase to be synthesized and generates a new point pitch pattern from the retrieved pattern based on the rules of the second database section.

[0008] In the present invention, the rules stored in the second database section may also correspond to the combination of accent type, phoneme concatenation, and number of morae. Furthermore, the new point pitch pattern may be generated by applying the rules of the second database section after a descending component has been added to the point pitch pattern stored in the first database section.

[0009]

[Operation] By applying the rules of the second database section to the point pitch pattern retrieved from the first database section, boundary pitches are determined by rule from the given point pitches and the phoneme sequence and are used as synthesis parameters. Pitches are thereby assigned not only at the vowel centroid points of the point pitch pattern but also at phoneme boundaries, which improves the naturalness of word beginnings and of vowel sequences at the accent nucleus in the synthesized speech.

[0010] Further, by making the rules stored in the second database correspond to the combination of accent type, phoneme concatenation, and number of morae, and additionally by adding a descending component to the point pitch pattern, the naturalness of vowel sequences for each accent type can be improved further.

[0011]

[Embodiment] An embodiment of the present invention is described below with reference to the drawings. FIG. 2 shows the configuration of a Japanese speech synthesis system according to the embodiment. In the figure, the text to be synthesized is input to the text input device 1.

[0012] The text analysis unit 2 analyzes this text using the pronunciation information generation rules 3 and converts it into a phoneme symbol string annotated with the pronunciation information needed for speech synthesis, such as accent information, pauses, and vowel devoicing. The phoneme duration generation unit 4 determines the phoneme durations for the phoneme symbol string generated by the text analysis unit 2 according to the rhythm rules 5. The output of the phoneme duration generation unit 4 is input to the source amplitude pattern generation unit 6, the pitch pattern generation unit 8, and the spectrum pattern generation unit 11.

[0013] The source amplitude pattern generation unit 6 determines the power envelope of the speech according to the power rules 7. Details of the source amplitude pattern generation unit 6 and the power rules 7 are given in Japanese Patent Application No. 5-247994, previously filed by the present applicant. The pitch pattern generation unit 8 determines a point pitch pattern for each accent phrase from the prosody control rules 9 and interpolates them to generate a continuous point pitch pattern.

[0014] The sound source generation unit 10 generates a sound source based on the power pattern and the pitch pattern. The spectrum pattern generation unit 11 uses the phonemic quality improvement rules 12 to search the speech synthesis basic unit database 13 according to phoneme type (vowel, consonant, and so on), and joins the spectra of the phonemes according to the phoneme concatenation rules 14 to create a formant pattern.

[0015] The speech synthesizer 15 creates synthesized speech from the sound source information obtained from the sound source generation unit 10 and the formant pattern obtained from the spectrum pattern generation unit 11. The synthesized speech is output through the speaker 16. Next, the pitch pattern generation unit 8 and the prosody control rules 9, which are the characteristic parts of the present invention, are described with reference to FIG. 1.

[0016] The prosody control rules 9 comprise a first database 21 storing normalized point pitch patterns, intra-accent-phrase descending component rules 22, inter-accent-phrase descending component rules 23, and a second database 24 storing transformation rules. First, the operation of the pitch pattern generation unit 8 is outlined. From the normalized point pitch patterns stored in the first database 21 of the prosody control rules 9, the pitch pattern generation unit 8 retrieves the one normalized point pitch pattern determined by the number of morae and the accent type of the phrase to be synthesized (step S1).

[0017] Next, the pitch pattern generation unit 8 adds an intra-accent-phrase descending component to the retrieved point pitch pattern using the intra-accent-phrase descending component rules 22 (step S2), and then adds an inter-accent-phrase descending component using the inter-accent-phrase descending component rules 23 (step S3). Finally, the pitch pattern generation unit 8 applies the rules stored in the second database 24 to transform the point pitch pattern according to its accent type, phoneme concatenation, and number of morae (step S4).
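The four steps S1 to S4 can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions: the `Phrase` class, the dictionary keyed by (mora count, accent type), and the callable rule stages are all hypothetical names introduced for illustration, and the rule bodies themselves are supplied by the caller rather than taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class Phrase:
    mora_count: int   # M
    accent_type: int  # N (0 means no accent nucleus)


Pattern = List[float]
Stage = Callable[[Pattern, Phrase], Pattern]


def generate_pitch_pattern(
    phrase: Phrase,
    first_db: Dict[Tuple[int, int], Pattern],
    intra_rule: Stage,
    inter_rule: Stage,
    transform_rules: Sequence[Stage],
) -> Pattern:
    # S1: look up the normalized pattern for this mora count and accent type.
    pattern = list(first_db[(phrase.mora_count, phrase.accent_type)])
    # S2: add the descending component within the accent phrase.
    pattern = intra_rule(pattern, phrase)
    # S3: add the descending component between accent phrases.
    pattern = inter_rule(pattern, phrase)
    # S4: apply the transformation rules of the second database in order.
    for rule in transform_rules:
        pattern = rule(pattern, phrase)
    return pattern
```

Passing the stages in as callables keeps the sketch independent of how each rule is actually defined, which the text only specifies in part.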

[0018] Each of the processes outlined above is described in detail below. First, the normalized point pitch patterns stored in the first database 21 are described. The normalized point pitch patterns are obtained by analyzing the point pitches of natural utterances in advance, normalizing them, and storing them in a database. FIGS. 3 to 6 show examples of the normalized point pitch patterns stored in the first database 21, grouped by accent type N. FIG. 3 shows type-0 normalized point pitch patterns, FIG. 4 type-1 patterns, FIG. 5 type-2 patterns, and FIG. 6 type-3 patterns. Although not illustrated, patterns up to type 9 are prepared in this example.

[0019] Here, N of the accent type denotes the position of the accent nucleus. The accent nucleus is the mora immediately before the frequency falls when an accent is present. That is, type 1 in FIG. 4 means that the accent nucleus is the first mora, type 2 in FIG. 5 means that it is the second mora, type 3 in FIG. 6 means that it is the third mora, and type 0 in FIG. 3 means that there is no accent nucleus.

[0020] FIGS. 3 to 6 also show, for each accent type N, several patterns with different numbers of morae M. Each pattern is usually called "M-mora type N". As is clear from the above, the number of pattern types is the number of combinations of mora count M and accent type N (where M > N >= 0 always holds). Since M is usually at most about 1 < M < 10, the total number of patterns does not exceed the sum of 1 through 10 (42).

[0021] In step S1 of FIG. 1, the pitch pattern generation unit 8 retrieves, from the stored normalized point pitch patterns, the one pattern determined by the number of morae and the accent type of the phrase to be synthesized. Next, the addition of the pitch descending components in steps S2 and S3 of FIG. 1 is described.

[0022] Point pitch patterns analyzed from natural utterances have superimposed on them a gentle pitch descending component caused by the decrease in expiratory pressure. At synthesis time, therefore, a descending component is added to the point pitch pattern to bring it closer to natural speech. The pitch pattern generation unit 8 uses the intra-accent-phrase descending component rules 22 to add a descending component within the accent phrase to the retrieved point pitch pattern.

[0023] For patterns other than type 1, the descending component is shaped so that the pattern slopes down to the right with the first mora as the reference, as shown in FIG. 7. For type 1, the pattern is shifted so that the average pitch becomes 0 at approximately M/2, the middle of the M morae. For every type, the dynamic range is held at a constant value. The pitch pattern generation unit 8 then uses the inter-accent-phrase descending component rules 23 to add a descending component between accent phrases, as shown in FIG. 8, to the point pitch pattern to which the intra-phrase descending component has been added.
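One possible reading of the intra-phrase descending component is a linear tilt whose zero point depends on the accent type. The sketch below follows that reading only: the linear shape, the `slope` parameter, and the placement of the zero point at mora 1 (types other than 1) or at M/2 (type 1) are assumptions, since the text refers to FIG. 7 for the actual shape. The constant dynamic range and the inter-phrase component of FIG. 8 are not modeled.

```python
def add_intra_phrase_descent(pattern, accent_type, slope):
    """Return a copy of `pattern` with a linear descending component added.

    pattern: normalized centroid pitches, one per mora (mora 1 first).
    slope: hypothetical pitch drop per mora (not a value from the source).
    """
    mora_count = len(pattern)
    # Pivot: the mora position at which the added component is zero.
    # Mora 1 for types other than 1, M/2 for type 1 (assumed reading).
    pivot = mora_count / 2 if accent_type == 1 else 1
    return [
        value - slope * (position - pivot)
        for position, value in enumerate(pattern, start=1)
    ]


# Hypothetical 4-mora type-0 pattern with a drop of 0.25 per mora.
tilted = add_intra_phrase_descent([0.0, 1.0, 1.0, 1.0], 0, 0.25)
```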

[0024] Next, the application of the transformation rules in step S4 of FIG. 1 is described. Using the transformation rules stored in the second database 24, the pitch pattern generation unit 8 transforms the point pitch pattern produced through step S3, either by assigning new pitches at phoneme boundaries based on the obtained point pitch values or by deleting certain pitches from the point pitch pattern.

[0025] The transformation method is described with reference to FIGS. 9 to 11. In the following description, V denotes a vowel and N denotes the moraic nasal.

1. An example of the extension method at the beginning of a word is described with reference to FIG. 9. The rule of this example applies when the accent type, phoneme concatenation, and number of morae of the phrase satisfy the following conditions.

[0026] Condition: (other than type 1) & (VV, VN, or NV at the beginning of the word) & (4 morae or more)

That is, when three conditions are met, namely that the accent type is other than type 1, that the beginning of the word (the first and second morae) is joined as V-V, V-N, or N-V, and that the number of morae is 4 or more, the point pitch pattern is manipulated by the following rules.

[0027] Operation rules
a. The vowel centroid pitch of the first mora is set to the accent nucleus pitch of that accent phrase. Here, an accent phrase is the minimum unit represented by one point pitch pattern.
b. For type 0, however, it is set equal to the pitch of the final mora.

[0028]
c. The centroid pitch of the second mora is moved to the boundary point between the first and second morae.
d. For type 2, however, the average with the original first-mora pitch is used as the boundary pitch between the first and second morae.
e. The centroid pitch of the second mora is discarded.

As a result of these operations, the point pitch pattern of FIG. 9(a) is transformed into the extended point pitch pattern of FIG. 9(b). In the figure, C denotes a consonant.
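The word-initial conditions and operation rules a to e can be sketched as follows. The data layout is an assumption made for illustration: centroid pitches are a list with one entry per mora (`None` marks a discarded pitch), boundary pitches are a dictionary keyed by the 0-based index of the mora on the left of the boundary, and each mora is written as a string such as `"CV"`, `"V"`, or `"N"`. Rule d is read here as the average of the second-mora centroid pitch and the original first-mora pitch, which is an interpretation of the text.

```python
JOINS = {"VV", "VN", "NV"}


def word_initial_rule(centroids, accent_type, morae):
    """Apply the word-initial extension rule; return (centroids, boundaries)."""
    count = len(centroids)
    # Conditions: not type 1, at least 4 morae, and morae 1 and 2 meet as
    # VV, VN, or NV (last segment of mora 1 plus first segment of mora 2).
    if accent_type == 1 or count < 4 or morae[0][-1] + morae[1][0] not in JOINS:
        return list(centroids), {}
    result = list(centroids)
    first, second = centroids[0], centroids[1]
    # a/b: the first mora takes the accent nucleus pitch, or the final
    # mora's pitch when there is no nucleus (type 0).
    source = count - 1 if accent_type == 0 else accent_type - 1
    result[0] = centroids[source]
    # c/d: the second-mora pitch moves to the boundary; for type 2 it is
    # averaged with the original first-mora pitch.
    boundary = (first + second) / 2 if accent_type == 2 else second
    # e: the second mora's centroid pitch is discarded.
    result[1] = None
    return result, {0: boundary}


# Hypothetical 5-mora type-3 phrase whose first two morae are CV followed by V.
new_centroids, boundaries = word_initial_rule(
    [100, 120, 130, 125, 110], 3, ["CV", "V", "CV", "CV", "CV"]
)
```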

[0029] 2. An example of the extension method at the accent nucleus is described with reference to FIG. 10. The rule of this example applies when the accent type and phoneme concatenation of the phrase satisfy the following conditions.

[0030] Condition: (other than type 1) & (the accent nucleus and the following mora are VV, VN, or NV)

That is, when two conditions are met, namely that the accent type is other than type 1 and that the accent nucleus and the mora immediately after it are joined as VV, VN, or NV, the point pitch pattern is manipulated by the following rules.

[0031] Operation rules
a. The centroid pitch of the accent nucleus is used unchanged as the boundary pitch between the accent nucleus mora and the following mora.
b. The centroid pitch of the following mora is discarded, with the following exceptions.
b1. The mora after the following mora is devoiced.

[0032]
b2. The accent type equals (number of morae in the accent phrase - 1).

As a result of these operations, the point pitch pattern of FIG. 10(a) is transformed into the extended point pitch pattern of FIG. 10(b).

3. An example of the extension method for type-1 accent phrases is described with reference to FIG. 11. The rule of this example applies when the accent type and phoneme concatenation of the phrase satisfy the following conditions.

[0033] Condition: (type 1) & (the first and second morae are VV, VN, or NV)

That is, when two conditions are met, namely that the accent type is type 1 and that the first and second morae are joined as VV, VN, or NV, the point pitch pattern is manipulated by the following rules.

[0034] Operation rules
a. The centroid pitch of the first mora is moved to the boundary between the first and second morae.
b. The centroid pitch of the second mora is discarded.
c. For a two-mora phrase, however, the centroid pitch of the second mora is retained.

As a result of these operations, the point pitch pattern of FIG. 11(a) is transformed into the extended point pitch pattern of FIG. 11(b).
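The accent-nucleus rule (item 2) and the type-1 rule (item 3) can be sketched in the same style. The data layout is again an assumption for illustration: a list of centroid pitches with `None` for discarded values, a boundary dictionary keyed by the 0-based index of the left-hand mora, mora strings such as `"CV"`, `"V"`, and `"N"`, and a hypothetical `devoiced` set of 0-based mora indices. "Moved" is read as relocation, so the source centroid is cleared.

```python
JOINS = {"VV", "VN", "NV"}


def _joined(morae, i):
    """True if mora i and mora i+1 (0-based) meet as VV, VN, or NV."""
    return morae[i][-1] + morae[i + 1][0] in JOINS


def accent_nucleus_rule(centroids, accent_type, morae, devoiced=frozenset()):
    """Item 2: extension at the accent nucleus (types other than 1)."""
    count = len(centroids)
    nucleus = accent_type - 1  # 0-based index; type 0 has no nucleus
    if accent_type < 2 or nucleus + 1 >= count or not _joined(morae, nucleus):
        return list(centroids), {}
    result = list(centroids)
    # a: the nucleus centroid pitch is reused as the boundary pitch.
    boundaries = {nucleus: centroids[nucleus]}
    # b1/b2: keep the following mora's pitch if the mora after it is
    # devoiced, or if the accent type equals (mora count - 1).
    keep = (nucleus + 2) in devoiced or accent_type == count - 1
    if not keep:
        result[nucleus + 1] = None  # b: discard the following mora's pitch
    return result, boundaries


def type1_rule(centroids, accent_type, morae):
    """Item 3: extension for type-1 accent phrases."""
    count = len(centroids)
    if accent_type != 1 or count < 2 or not _joined(morae, 0):
        return list(centroids), {}
    result = list(centroids)
    boundaries = {0: centroids[0]}  # a: first-mora pitch moves to the boundary
    result[0] = None
    if count > 2:
        result[1] = None  # b: discard; c: retained for two-mora phrases
    return result, boundaries
```

For example, `type1_rule([150, 130, 110], 1, ["CV", "V", "CV"])` leaves only the third centroid pitch and places 150 on the boundary between the first and second morae.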

[0035] These rules were obtained empirically from observation of analyzed natural speech. The extended point pitch patterns they generate compare well with those of natural speech, and the synthesized speech is more natural than with the conventional point pitch model. The above rules may also be made to correspond only to the accent type and the phoneme concatenation.

[0036] In this way, the extended point pitch model uses the same pitch pattern database as the prior-art point pitch model while increasing the degrees of freedom of control to match the phenomena of natural speech, and realizes this by rules. The transformation rules given above concern VV concatenation at word beginnings and accent nuclei, which strongly affects quality, but the model can also represent the various pitch patterns that change with prominence control, voice quality control, gemination, devoicing, and changes in speaking rate.

[0037]

[Effect of the Invention] As described above, according to the present invention, a pitch pattern closer to natural speech can be obtained by using the point pitch patterns of the conventional method and setting boundary pitches by rule. As a result, the naturalness of the synthesized speech is improved.

[Brief description of drawings]

FIG. 1 is a block diagram showing details of the pitch pattern generation unit and the prosody control rules of the embodiment of the present invention.

FIG. 2 is a diagram showing the overall configuration of the Japanese speech synthesis system of the embodiment of the present invention.

FIG. 3 is a diagram showing normalized point pitch patterns (part 1).

FIG. 4 is a diagram showing normalized point pitch patterns (part 2).

FIG. 5 is a diagram showing normalized point pitch patterns (part 3).

FIG. 6 is a diagram showing normalized point pitch patterns (part 4).

FIG. 7 is a diagram illustrating the method of adding the descending component within an accent phrase.

FIG. 8 is a diagram illustrating the method of adding the descending component between accent phrases.

FIG. 9 is a diagram for explaining the extension method at the beginning of a word, executed by the pitch pattern generation unit of FIG. 1.

FIG. 10 is a diagram for explaining the extension method at the accent nucleus, executed by the pitch pattern generation unit of FIG. 1.

FIG. 11 is a diagram for explaining the extension method for type-1 accent phrases, executed by the pitch pattern generation unit of FIG. 1.

[Explanation of symbols]

8: pitch pattern generation unit
9: prosody control rules
21: first database (normalized point pitch patterns)
22: intra-accent-phrase descending component rules
23: inter-accent-phrase descending component rules
24: second database (transformation rules)

Claims

[Claims]

1. A speech synthesizer comprising: a first database section storing point pitch patterns by accent type; a second database section storing rules corresponding to the accent type and the phoneme concatenation; and a pitch pattern generation unit that retrieves from the first database section the point pitch pattern corresponding to the phrase that is the object of speech synthesis and generates a new point pitch pattern from the retrieved point pitch pattern based on the rules of the second database section.

2. The speech synthesizer as described, wherein the rules stored in the second database section correspond to combinations of accent type, phoneme concatenation, and number of morae.

3. The speech synthesizer as described, wherein the new point pitch pattern is generated by applying the rules after a descending component has been added to the point pitch pattern stored in the first database section.