JPH0279097A

JPH0279097A - Pitch control system

Info

Publication number: JPH0279097A
Application number: JP23121588A
Authority: JP
Inventors: Tetsuya Sakayori; 哲也酒寄
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1988-09-14
Filing date: 1988-09-14
Publication date: 1990-03-19

Abstract

PURPOSE: To generate a highly natural prosody pattern suited to the situation by expanding or reducing the variation range of the fundamental frequency calculated from prosody control rules, according to information obtained from an input symbol sequence, a control signal supplied from outside, and the like.

CONSTITUTION: A prosody control part 5 uses the symbols in the input symbol sequence that indicate phonemes and prosody, referring as needed to the prosody control parameters recorded in a prosody parameter file 6, to set phoneme durations, a fundamental frequency pattern, an amplitude pattern, and so on. A mean fundamental frequency changing part 8 alters the fundamental frequency pattern set by the prosody control part 5. A dynamic range changing part 9 then expands or reduces the variation range of the fundamental frequency calculated from the prosody control rules according to instruction codes in the input symbol sequence 3, other information obtained from the input symbol sequence 3, a control signal supplied from outside, and the like. Consequently, a natural prosody pattern suited to the situation is obtained.

Description

[Detailed Description of the Invention]

Technical Field

The present invention relates to a prosody control method, and more particularly to a prosody control method for adding natural prosody in speech synthesis.

Prior Art

In speech synthesis, prosody control rules that control the fundamental frequency, the amplitude, and the phoneme durations are indispensable for adding natural prosody. The fundamental frequency pattern judged to be most natural varies with many factors, such as the content of the text being uttered, the purpose and manner of use, and the user's preferences. For this reason, a method that adjusts the fundamental frequency pattern to suit the situation by changing the average fundamental frequency is known, for example, from Japanese Unexamined Patent Publication No. 61-70597.

Depending on the type of text to be uttered, adding natural prosody in speech synthesis requires changing not the average fundamental frequency but the range over which the fundamental frequency varies, that is, its dynamic range. For example, when reading out stock market reports, a wide dynamic range is actually a hindrance. When reading out novels and the like, a narrow dynamic range cannot produce rich intonation. Even within the same novel, narrative passages and dialogue require different dynamic ranges.
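The dependence of the desired dynamic range on the kind of text can be illustrated with a small lookup. This is only a sketch under stated assumptions: the style names, the scaling factors, and the rule that double-quoted spans are dialogue are all hypothetical and do not come from the patent, which does not say how the text type is detected.

```python
import re

# Hypothetical range factors: below 1.0 narrows the F0 range, above 1.0 widens it.
RANGE_FACTOR_BY_STYLE = {
    "stock_report": 0.5,  # flat delivery; a wide range would be a hindrance
    "narration": 1.0,     # leave the rule-generated range unchanged
    "dialogue": 1.5,      # richer intonation for spoken lines
}

def label_spans(text):
    """Split text into narration and dialogue spans and attach a range factor.

    Assumption: dialogue is whatever sits between double quotes.
    """
    spans = []
    # The capturing group makes re.split keep the quoted spans in the result.
    for part in re.split(r'("[^"]*")', text):
        if not part:
            continue
        style = "dialogue" if part.startswith('"') else "narration"
        spans.append((part, RANGE_FACTOR_BY_STYLE[style]))
    return spans

spans = label_spans('He said, "Let us go!" and walked on.')
```

Here `spans` pairs each piece of text with the factor that a later stage could use to widen or narrow the fundamental frequency range for that piece.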

For this reason, the method described above, which changes only the average fundamental frequency, cannot produce a natural prosodic pattern suited to the situation.

Purpose

The present invention has been made in view of the circumstances described above. In particular, its object is to make it possible to expand or reduce the variation range of the fundamental frequency calculated from the prosody control rules, in accordance with instruction codes embedded in the input symbol string, other information obtained from the input symbol string, a control signal supplied to the rule-based speech synthesizer from outside when speech synthesis is executed, and the like.

Constitution

To achieve the above object, the present invention concerns a rule-based speech synthesizer that reads out parameter sequences of speech segments prepared in advance in accordance with an input symbol string expressing phonemes and prosody, concatenates the speech segment parameter sequences according to speech parameter concatenation rules, and adds prosody by using prosody control rules to calculate the phoneme durations, the fundamental frequency pattern, the amplitude pattern, and the like that correspond to the input symbol string. The invention is characterized in that the variation range of the fundamental frequency calculated from the prosody control rules is expanded or reduced in accordance with instruction codes embedded in the input symbol string, other information obtained from the input symbol string, a control signal supplied to the rule-based speech synthesizer from outside when speech synthesis is executed, and the like. The invention is described below on the basis of embodiments.

FIG. 1 is a block diagram for explaining one embodiment of the present invention. In the figure, 1 is a speech segment file, 2 is a phoneme generation section, 3 is an input symbol string, 4 is a speech synthesizer, 5 is a prosody control section, 6 is a prosody parameter file, 7 is a speech speed changing section, 8 is an average fundamental frequency changing section, and 9 is a dynamic range changing section. The input symbol string contains symbols representing phonemes and prosody mixed with instruction codes for changing control parameters inside the speech synthesizer.

The phoneme generation section 2 mainly uses the symbols in the input symbol string 3 that represent phonemes to select the necessary speech segment parameters from the speech segment parameter file, and smoothly concatenates the individual speech segment parameters to generate the phonemic parameters of the synthesized speech. The prosody control section 5 uses the symbols in the input symbol string that represent phonemes and prosody, referring as necessary to the prosody control parameters recorded in the prosody parameter file 6, to set the phoneme durations, the fundamental frequency pattern, the amplitude pattern, and the like.
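The mixing of instruction codes with phoneme and prosody symbols can be sketched as follows. The patent does not specify a syntax for the instruction codes, so the `{NAME=value}` form, the three code names, and the default values are hypothetical, chosen only to make the idea concrete.

```python
import re

# Hypothetical instruction-code syntax: {SPEED=1.2}, {MEAN=130}, {RANGE=0.5}
CODE = re.compile(r"\{(SPEED|MEAN|RANGE)=(\d+(?:\.\d+)?)\}")

# Hypothetical defaults: unit speed, 120 Hz mean F0, unchanged range.
DEFAULTS = {"SPEED": 1.0, "MEAN": 120.0, "RANGE": 1.0}

def parse_symbol_string(symbols):
    """Split an input symbol string into (symbols, settings) segments.

    Each instruction code updates the settings that apply to the
    phoneme/prosody symbols following it.
    """
    settings = dict(DEFAULTS)
    segments = []
    pos = 0
    for match in CODE.finditer(symbols):
        text = symbols[pos:match.start()]
        if text:
            # Snapshot the settings in force for this run of symbols.
            segments.append((text, dict(settings)))
        settings[match.group(1)] = float(match.group(2))
        pos = match.end()
    tail = symbols[pos:]
    if tail:
        segments.append((tail, dict(settings)))
    return segments

segments = parse_symbol_string("kabu'ka{RANGE=0.5}ni'sen/en")
```

In this sketch the first segment keeps the default range, while the symbols after `{RANGE=0.5}` carry a halved range setting for the dynamic range changing section to apply.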

The prosodic pattern set here is, in effect, an average of the prosodic patterns found in various types of text. The speech speed changing section 7 changes the phoneme durations set by the prosody control section 5, so that the speech speed specified by an instruction code in the input symbol string is achieved. The average fundamental frequency changing section 8 changes the fundamental frequency pattern set by the prosody control section 5, so that the average fundamental frequency specified by an instruction code in the input symbol string is achieved. The dynamic range changing section 9 changes the fundamental frequency pattern set by the prosody control section 5, so that the dynamic range specified by an instruction code in the input symbol string is achieved.
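The three changing sections can be sketched as simple transformations of the values set by the prosody control section 5. The patent gives no formulas, so the ones below are assumptions: speed divides the durations, the mean is changed by a constant offset in Hz, and the dynamic range is changed by scaling the deviation of each value from the mean. The function names and sample values are hypothetical, and unvoiced frames are ignored.

```python
def change_speed(durations_ms, speed):
    """Speech speed changing (section 7): shorten durations as speed rises."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return [d / speed for d in durations_ms]

def change_mean_f0(f0_hz, target_mean_hz):
    """Average F0 changing (section 8): shift the contour, keeping its range."""
    mean = sum(f0_hz) / len(f0_hz)
    return [f + (target_mean_hz - mean) for f in f0_hz]

def change_dynamic_range(f0_hz, factor):
    """Dynamic range changing (section 9): scale deviations from the mean.

    factor < 1.0 narrows the range, factor > 1.0 widens it; the mean is kept.
    """
    mean = sum(f0_hz) / len(f0_hz)
    return [mean + factor * (f - mean) for f in f0_hz]

# Hypothetical rule-generated values for one short utterance.
durations = [80.0, 120.0, 100.0]
f0 = [100.0, 140.0, 120.0, 90.0, 150.0]

faster = change_speed(durations, 2.0)
raised = change_mean_f0(f0, 140.0)
narrowed = change_dynamic_range(raised, 0.5)
```

With these inputs, `raised` keeps the original 60 Hz spread at a higher mean, while `narrowed` halves the spread to 30 Hz without moving the mean. Scaling in a logarithmic frequency domain would be an equally plausible design; the linear form is used here only for simplicity.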

FIG. 2 is a block diagram for explaining another embodiment of the present invention. In the figure, 10 denotes a control signal and 11 denotes a control signal input section. In this embodiment, the input symbol string is a string of symbols representing phonemes and prosody.

The phoneme generation section and the prosody control section operate as in the embodiment shown in FIG. 1. The speech speed changing section 7 changes the phoneme durations set by the prosody control section 5, so that the speech speed specified by the control signal entered through the control signal input section at run time is achieved.

The average fundamental frequency changing section 8 changes the fundamental frequency pattern set by the prosody control section 5, so that the average fundamental frequency specified by the control signal entered through the control signal input section at run time is achieved. The dynamic range changing section 9 changes the fundamental frequency pattern set by the prosody control section 5, so that the dynamic range specified by the control signal entered through the control signal input section at run time is achieved.
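In this second embodiment the settings come from outside at run time instead of from the symbol string. A minimal sketch of that arrangement is given below, assuming a hypothetical `ControlSignal` message and a hypothetical `ControlSignalInput` class that remembers the latest settings; neither name nor the linear scaling formula is taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ControlSignal:
    """Hypothetical run-time message; None means 'leave this setting as is'."""
    mean_f0_hz: Optional[float] = None
    range_factor: Optional[float] = None

class ControlSignalInput:
    """Holds the most recent settings and applies them to an F0 contour."""

    def __init__(self) -> None:
        self.mean_f0_hz: Optional[float] = None  # None keeps the rule-generated mean
        self.range_factor: float = 1.0           # 1.0 keeps the rule-generated range

    def receive(self, signal: ControlSignal) -> None:
        if signal.mean_f0_hz is not None:
            self.mean_f0_hz = signal.mean_f0_hz
        if signal.range_factor is not None:
            self.range_factor = signal.range_factor

    def apply(self, f0_hz: List[float]) -> List[float]:
        mean = sum(f0_hz) / len(f0_hz)
        target = mean if self.mean_f0_hz is None else self.mean_f0_hz
        # Re-centre on the target mean and scale the deviations from the mean.
        return [target + self.range_factor * (f - mean) for f in f0_hz]

unit = ControlSignalInput()
contour = [100.0, 140.0, 120.0]           # hypothetical rule-generated F0 values
unchanged = unit.apply(contour)

unit.receive(ControlSignal(range_factor=0.5))
narrowed = unit.apply(contour)

unit.receive(ControlSignal(mean_f0_hz=200.0))
narrowed_and_raised = unit.apply(contour)
```

The same contour yields different output after each control signal, which mirrors the idea that the input symbol string stays purely phonemic and prosodic while an external signal selects the dynamic range.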

Effects

As is clear from the above description, according to the present invention, a highly natural prosodic pattern suited to the situation can be generated by expanding or reducing the variation range of the fundamental frequency calculated from the prosody control rules, in accordance with instruction codes embedded in the input symbol string, other information obtained from the input symbol string, or a control signal supplied to the rule-based speech synthesizer from outside when speech synthesis is executed.

[Brief Description of the Drawings]

FIG. 1 and FIG. 2 are block diagrams, each for explaining the present invention in detail.

1... speech segment file, 2... phoneme generation section, 3... input symbol string, 4... speech synthesizer, 5... prosody control section, 6... prosody parameter file, 7... speech speed changing section, 8... average fundamental frequency changing section, 9... dynamic range changing section, 10... control signal, 11... control signal input section.

Claims

[Claims]

1. A prosody control method for a rule-based speech synthesizer that reads out parameter sequences of speech segments prepared in advance in accordance with an input symbol string expressing phonemes and prosody, concatenates the speech segment parameter sequences according to speech parameter concatenation rules, and adds prosody by using prosody control rules to calculate the phoneme durations, the fundamental frequency pattern, the amplitude pattern, and the like that correspond to the input symbol string expressing phonemes and prosody, the method being characterized in that the variation range of the fundamental frequency calculated from the prosody control rules is expanded or reduced in accordance with instruction codes embedded in the input symbol string, other information obtained from the input symbol string, or a control signal supplied to the rule-based speech synthesizer from outside when speech synthesis is executed.