JPH04214600A - Sound synthesizing method - Google Patents

Sound synthesizing method

Info

Publication number
JPH04214600A
Authority
JP
Japan
Prior art keywords
pitch
section
data
pattern
fluctuation data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP2401799A
Other languages
Japanese (ja)
Inventor
Yoshimasa Sawada
沢田 喜正
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Meidensha Corp
Meidensha Electric Manufacturing Co Ltd
Original Assignee
Meidensha Corp
Meidensha Electric Manufacturing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
1990-12-13
Filing date
1990-12-13
Publication date
1992-08-05
Application filed by Meidensha Corp, Meidensha Electric Manufacturing Co Ltd
Priority to JP2401799A
Publication of JPH04214600A

Abstract

PURPOSE: To enhance the naturalness and clarity of synthesized speech by creating pitch fluctuation data of the original sounds in the synthesizing section in advance and superposing this data on the pitch target values at the time of synthesis. CONSTITUTION: A fluctuation data section 21 is created in advance in the speech synthesis section 9. For the pitch pattern of the original sound of each synthesized sound, the average pitch value over each n-point interval is computed, and the differences between those averages and the individual pitches are stored in the fluctuation data section 21 as fluctuation data. At the time of speech synthesis, the data in the fluctuation data section 21 is superposed in a superposing section 22 on the pitch pattern generated by the pitch pattern generation section 5, which reduces the monotony of the pitch.

Description

[Detailed description of the invention]

[0001]

[Field of industrial application] This invention relates to a speech synthesis method for synthesizing speech from text containing a mixture of kanji and kana.

[0002]

[Prior art] Rule-based speech synthesis is a means of synthesizing arbitrary words, sentences, and the like as speech from text containing a mixture of kanji and kana. FIG. 3 is an explanatory diagram showing an outline of a general speech synthesis device. First, the text entered in the text input section 1 is converted into a phoneme symbol string by the Japanese language processing section 2. Next, prosodic patterns (a duration pattern, a pitch pattern, and an energy pattern) are generated from this phoneme symbol string. That is, the duration pattern generation section 3 refers to the duration pattern database 4 and generates a duration pattern indicating how long each sound lasts.

[0003] Similarly, the pitch pattern generation section 5 refers to the pitch pattern database 6 and generates a pitch pattern indicating the pitch of the speech. Likewise, the energy pattern generation section 7 refers to the energy pattern database 8 and generates an energy pattern indicating the intensity of the speech. Based on the patterns obtained in this way, the speech synthesis section 9 refers to the speech database 10 and synthesizes a speech waveform. Reference numeral 11 denotes a speech output section that outputs the synthesized speech.

[0004] The speech synthesis device configured as described above outputs synthesized speech through the processing shown in FIG. 3. The intonation (pitch) pattern used in that case is the one shown in FIG. 4. That is, in FIG. 4, four pitch target values per mora (P10 to P34) are given, the intervals between the target values are interpolated with straight lines, and the resulting pitch of each frame is supplied to the speech synthesis section 9.
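The straight-line interpolation between pitch targets described in [0004] can be sketched as follows. This is a minimal illustration, not the device's actual implementation: the function name `interpolate_pitch`, the representation of targets as (frame index, pitch in Hz) pairs, and the numeric values are all assumptions, since the source does not specify where the four targets of a mora fall in time.

```python
from bisect import bisect_right


def interpolate_pitch(targets, n_frames):
    """Return one pitch value per frame by joining the targets with straight lines.

    targets: (frame_index, pitch_hz) pairs sorted by strictly increasing
    frame_index (an assumed representation, not taken from the patent).
    """
    times = [t for t, _ in targets]
    frames = []
    for frame in range(n_frames):
        if frame <= times[0]:
            # Before the first target: hold the first target value.
            frames.append(targets[0][1])
        elif frame >= times[-1]:
            # After the last target: hold the last target value.
            frames.append(targets[-1][1])
        else:
            # Find the two targets that bracket this frame.
            i = bisect_right(times, frame)
            (t0, p0), (t1, p1) = targets[i - 1], targets[i]
            frames.append(p0 + (p1 - p0) * (frame - t0) / (t1 - t0))
    return frames


# Hypothetical targets for one mora: four points, as in FIG. 4.
targets = [(0, 100.0), (4, 120.0), (8, 110.0), (12, 110.0)]
contour = interpolate_pitch(targets, 13)
print(contour[2], contour[6], contour[10])  # 110.0 115.0 110.0
```

Note how the last segment (frames 8 to 12) is perfectly flat. When a mora is long, such segments are the source of the monotony that the invention addresses.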

[0005]

[Problem to be solved by the invention] With the intonation control method described above, the effect is small when each mora is short, but as the mora duration grows, the time during which speech is synthesized at nearly the same pitch also grows. In other words, the pitch changes become monotonous, the synthesized speech tends to sound mechanical, and its clarity also decreases. Furthermore, even when the mora duration is relatively short, the pitch pattern varies little and sounds monotonous compared with the pitch pattern of a real human voice.

[0006] This invention has been made in view of the above circumstances. Its object is to provide a speech synthesis method that improves the naturalness and clarity of synthesized speech by giving natural fluctuation to the pitch of the synthesized speech.

[0007]

[Means for solving the problem] To achieve the above object, this invention is a method in which text input of mixed kanji and kana is analyzed by a Japanese language processing section and converted into a phoneme string, a duration pattern, a pitch pattern, and an energy pattern are generated from this phoneme string by referring to the respective databases, and synthesized speech is generated by a speech synthesis section based on these generated patterns, characterized in that pitch fluctuation data of the original sound is created in advance within the speech synthesis section and this data is superposed on the pitch target values of the pitch pattern at the time of synthesis.

[0008]

[Operation] When the fluctuation data is superposed on the pitch pattern, the monotony of the pitch is reduced and the naturalness of the synthesized speech improves.

[0009]

[Embodiment] An embodiment of this invention is described below with reference to the drawings. Parts identical to those in FIG. 3 carry the same reference numerals, and their description is omitted. In FIG. 1, 21 is a fluctuation data section created in advance within the speech synthesis section 9. This fluctuation data section 21 is created as follows. For example, when the pitch pattern of the original sound of the synthesized sound "a" is as shown in FIG. 2, the average pitch value over each n-point interval is computed, and the difference between each average pitch value and each individual pitch is stored in the fluctuation data section 21 as the fluctuation data of each frame. Fluctuation data is then obtained for every other sound in the same way and stored in the fluctuation data section 21.
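The construction of the fluctuation data in [0009] can be sketched as follows. This is a minimal illustration under stated assumptions: the name `extract_fluctuation` is hypothetical, the contour is split into consecutive non-overlapping blocks of n frames, and the difference is taken as pitch minus block average. The source specifies neither the sign of the difference nor whether the intervals overlap.

```python
def extract_fluctuation(pitch, n):
    """Return per-frame deviations of an original pitch contour from its
    n-point block averages.

    pitch: per-frame pitch values of the original sound (e.g. in Hz).
    n: number of frames per averaging interval (assumed non-overlapping).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    fluctuation = []
    for start in range(0, len(pitch), n):
        block = pitch[start:start + n]  # the last block may be shorter than n
        mean = sum(block) / len(block)
        # Assumed sign convention: deviation = pitch - block average.
        fluctuation.extend(p - mean for p in block)
    return fluctuation


# Hypothetical pitch contour of an original sound, averaged over n = 4 frames.
original = [100.0, 104.0, 98.0, 102.0, 110.0, 106.0]
print(extract_fluctuation(original, 4))  # [-1.0, 3.0, -3.0, 1.0, 2.0, -2.0]
```

With this convention the deviations within each block sum to zero, so they carry only the local fluctuation and not the overall pitch level of the original sound.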

[0010] At the time of speech synthesis, the data in the fluctuation data section 21 obtained as described above is superposed in the superposing section 22 on the pitch pattern generated by the pitch pattern generation section 5. This reduces the monotony of the pitch, which is the drawback of the conventional method, and improves the naturalness of the synthesized speech.
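The superposition step in [0010] can be sketched as a per-frame addition. This is only an illustration: the name `superpose` is hypothetical, and repeating the stored fluctuation data cyclically when the generated pitch pattern is longer is an assumption, since the source does not say how differing lengths are handled.

```python
def superpose(pitch_pattern, fluctuation):
    """Add stored fluctuation data to a generated per-frame pitch pattern.

    pitch_pattern: per-frame pitch values from the pitch pattern generator.
    fluctuation: per-frame deviations stored for the sound being synthesized.
    """
    if not fluctuation:
        # No stored data: return the pattern unchanged.
        return list(pitch_pattern)
    # Assumption: the fluctuation data is cycled if the pattern is longer.
    return [
        p + fluctuation[i % len(fluctuation)]
        for i, p in enumerate(pitch_pattern)
    ]


# A flat generated segment picks up the stored fluctuation.
flat_segment = [110.0, 110.0, 110.0, 110.0, 110.0]
stored = [-1.0, 3.0, -3.0, 1.0]
print(superpose(flat_segment, stored))  # [109.0, 113.0, 107.0, 111.0, 109.0]
```

Because the stored deviations are small relative to the pitch targets, the overall intonation contour is kept while the flat stretches between targets gain frame-to-frame variation.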

[0011]

[Effects of the invention] As described above, superposing the fluctuation data on the pitch pattern according to this invention has the advantage of improving both the naturalness and the clarity of the synthesized speech.

[Brief description of the drawings]

[FIG. 1] Schematic configuration diagram showing an embodiment of this invention.

[FIG. 2] Pitch pattern diagram of the original sound of a synthesized sound.

[FIG. 3] Schematic explanatory diagram of a general speech synthesis device.

[FIG. 4] Pitch pattern diagram of a synthesized sound.

[Explanation of symbols]

5: Pitch pattern generation section
9: Speech synthesis section
11: Speech output section
21: Fluctuation data section
22: Superposing section

Claims (1)

[Claims]
[Claim 1] A speech synthesis method in which text input of mixed kanji and kana is analyzed by a Japanese language processing section and converted into a phoneme string, a duration pattern, a pitch pattern, and an energy pattern are generated from this phoneme string by referring to the respective databases, and synthesized speech is generated by a speech synthesis section based on these generated patterns, characterized in that pitch fluctuation data of the original sound is created in advance within the synthesis section and this data is superposed on the pitch target values at the time of synthesis.
JP2401799A 1990-12-13 1990-12-13 Sound synthesizing method Pending JPH04214600A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2401799A JPH04214600A (en) 1990-12-13 1990-12-13 Sound synthesizing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2401799A JPH04214600A (en) 1990-12-13 1990-12-13 Sound synthesizing method

Publications (1)

Publication Number Publication Date
JPH04214600A true JPH04214600A (en) 1992-08-05

Family

ID=18511628

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2401799A Pending JPH04214600A (en) 1990-12-13 1990-12-13 Sound synthesizing method

Country Status (1)

Country Link
JP (1) JPH04214600A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008010413A1 (en) * 2006-07-21 2008-01-24 Nec Corporation Audio synthesis device, method, and program
US8271284B2 (en) 2006-07-21 2012-09-18 Nec Corporation Speech synthesis device, method, and program
JP5093108B2 (en) * 2006-07-21 2012-12-05 日本電気株式会社 Speech synthesizer, method, and program

Similar Documents

Publication Publication Date Title
JP2000163088A (en) Speech synthesis method and device
JPH0833744B2 (en) Speech synthesizer
JPH08335096A (en) Text voice synthesizer
JPH04214600A (en) Sound synthesizing method
JP2001134283A (en) Device and method for synthesizing speech
JPH06318094A (en) Speech rule synthesizing device
JP2740510B2 (en) Text-to-speech synthesis method
JP2573586B2 (en) Rule-based speech synthesizer
JP2703253B2 (en) Speech synthesizer
JPH037995A (en) Generating device for singing voice synthetic data
JP3218639B2 (en) Energy control method in rule speech synthesizer
JPH06161490A (en) Rhythm processing system of speech synthesizing device
JPH0511794A (en) Sound synthesizer device
JP3034911B2 (en) Text-to-speech synthesizer
JP2910587B2 (en) Speech synthesizer
JPH08171394A (en) Speech synthesizer
JPH0227397A (en) Voice synthesizing and singing device
JP2573585B2 (en) Speech spectrum pattern generator
JPH06250685A (en) Voice synthesis system and rule synthesis device
JP2573587B2 (en) Pitch pattern generator
KR100300570B1 (en) Pitch controlling method for voice synthesizer
JPH05232980A (en) Intonation modifying method for voice synthesizer
Hwang et al. An RNN-Based Spectral Information Generation for Mandarin Text-To-Speech
Ding et al. Natural tone contours in a mandarin chinese speech synthesizer
JPH06161491A (en) Continuance time length processing system of speech synthesizing device