JP3089715B2 - Speech synthesizer

Info

Publication number: JP3089715B2
Application number: JP03184467A
Authority: JP
Inventors: 由里子駿河; 紀代原
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1991-07-24
Filing date: 1991-07-24
Publication date: 2000-09-18
Anticipated expiration: 2015-09-18
Also published as: JPH0527789A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

【Field of Industrial Application】 The present invention relates to a speech synthesizer that combines a plurality of different synthesis means.

[0002]

【Prior Art】 FIG. 11 shows an example of a conventional speech synthesizer that provides voice messages by combining rule synthesis with analysis-synthesis or recording/playback. Reference numeral 1 denotes a sentence generation means, to which the content to be synthesized is input. Reference numeral 2 denotes a synthesis means selection means, which selects a synthesis means according to the content of the sentence. Reference numeral 3 denotes a rule synthesis means, composed of a language processing means 2a, a prosody control means 2b, a parameter creation means 2c, and a synthesis processing means 2d. Reference numeral 4 denotes a recording/playback means, composed of a waveform data storage means 4a and a read control means 4b. Reference numeral 5 denotes a D/A section, which produces synthesized speech from the speech waveforms obtained by 3 and 4. Reference numeral 6 denotes a synthesized sound output terminal that outputs the speech, such as a speaker, headphones, or a telephone receiver.

[0003]

【Problems to Be Solved by the Invention】 Synthesized sounds produced by different synthesis means, such as rule synthesis, analysis-synthesis, and recording/playback, still differ considerably in sound quality. In addition, recorded speech must be used exactly as it was recorded; its speaking rate, accent, and so on cannot be changed. Therefore, when speech is synthesized using a plurality of synthesis means, some processing is needed to make the joins between the synthesized sounds smoother. Conventional speech synthesizers join the sounds without any processing, so the difference in sound quality at the joins tends to feel jarring, differences in volume make the speech hard to hear, and differences in speaking rate and intonation give an unnatural impression.

[0004] The present invention has been made in view of these points. Its object is to provide high-quality synthesized sound with smoother joins by overlapping the portions where synthesized sounds are joined and by adjusting parameters such as speaking rate, volume, and intonation.

[0005]

【Means for Solving the Problems】 (1) A speech synthesizer is configured comprising: sentence generation means for generating a sentence; a plurality of different synthesis means for synthesizing speech waveforms; synthesis means selection means for selecting among the synthesis means according to the content of the sentence; and overlap means for adding together the waveforms of the plurality of synthesized sounds output from the synthesis means selected by the synthesis means selection means.

[0006]

[0007] (2) The synthesizer further comprises parameter adjustment means that, when any of the plurality of synthesized sounds comes from a parametric synthesis means, adjusts the synthesis parameters created by that synthesis means to match the other speech, and overlap means for adding together the waveforms of the plurality of synthesized sounds.

[0008] (3) The synthesizer further comprises overlap means that, when adding the overlapping speech waveforms together, outputs them superimposed while adjusting the respective amplifiers by means of a control means.

[0009]

【Operation】 According to the above configuration of the present invention: (1) The synthesis means selection means segments the input sentence so that the segments overlap where the synthesis method switches, and each selected synthesis means creates speech for its segment. The overlapping portions of the speech are then overlap-processed, providing synthesized sound with smoother joins.

[0010] (2) The synthesis means selection means segments the input sentence, and each selected synthesis means creates speech for its segment. At this point, if any of the plurality of synthesized sounds comes from a parametric synthesis means, the parameter adjustment means adjusts the synthesis parameters created by that synthesis means to match the other speech, providing synthesized sound with smoother joins.

[0011] (2) The synthesis means selection means segments the input sentence so that the segments overlap where the synthesis method switches, and each selected synthesis means creates speech for its segment. At this point, if any of the plurality of synthesized sounds comes from a parametric synthesis means, the parameter adjustment means adjusts the synthesis parameters created by that synthesis means to match the other speech; the overlapping portions of the speech are then overlap-processed, providing synthesized sound with smoother joins.
(3) In the overlap processing of the first invention, when the overlapping speech waveforms are added together, they are output superimposed while the respective amplifiers are adjusted by means of the control means, providing synthesized sound with smoother joins.
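Item (3) describes summing the overlapping waveforms while the gain of each amplifier is adjusted. One way to read this is as a crossfade over the overlap region. The sketch below is only an illustration under that assumption: the patent does not specify the gain curve, so the linear ramps, the function name `crossfade`, and the list-of-samples representation are all hypothetical.

```python
def crossfade(first, second, overlap_len):
    """Join two waveforms (lists of samples) with gain-controlled overlap.

    The last `overlap_len` samples of `first` are faded out while the first
    `overlap_len` samples of `second` are faded in. Linear gains are an
    assumption made for this sketch, not something stated in the patent.
    """
    n = overlap_len
    if n < 1 or n > min(len(first), len(second)):
        raise ValueError("overlap_len must fit inside both waveforms")

    start = len(first) - n
    head = list(first[:start])      # part of `first` before the overlap
    tail = list(second[n:])         # part of `second` after the overlap

    mixed = []
    for i in range(n):
        gain_in = (i + 1) / (n + 1)   # rises across the overlap
        gain_out = 1.0 - gain_in      # falls across the overlap
        mixed.append(gain_out * first[start + i] + gain_in * second[i])

    return head + mixed + tail


# Hypothetical sample values: a constant signal fading into silence.
joined = crossfade([3.0, 3.0, 3.0], [0.0, 0.0, 0.0], overlap_len=2)
```

Because the two gains always sum to 1, two segments at the same level keep that level through the join, which is one plausible way to avoid the volume jump described in the problem statement.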

[0012]

【Embodiments】

(Embodiment 1) FIG. 1 is a block diagram of an embodiment of the present invention according to claim 1. The description uses, as an example, a bank transfer confirmation message produced by combining two synthesis methods. The same kind of message, produced by a combination of two synthesis methods, is used in all of the following embodiments. In the figure, 1 denotes a sentence generation unit, such as a keyboard. 2 denotes a synthesis means selection unit that selects a synthesis method according to the content of the sentence; it segments the sentence so that a proper noun and the two or three characters following it go to rule synthesis and the portion other than the proper noun goes to recording/playback, with the two segments partly overlapping. 3 denotes a rule synthesis unit composed of a language processing unit 3a, which divides the input character string into words and phrases and determines readings, accents, parts of speech, and so on; a prosody control unit 3b, which determines pause positions and lengths and the intonation; a parameter creation unit 3c, which creates synthesis parameters according to the results of the language processing unit and the prosody control unit; and a synthesis processing unit 3d, which creates synthesized sound from the parameters created by the parameter creation unit. 4 denotes a recording/playback unit composed of a waveform data storage unit 4a and a read control unit 4b. 5 denotes an overlap unit that adds together the overlapping portions of the rule-synthesis waveform data and the recording/playback waveform data, and 6 denotes a D/A unit that produces synthesized sound from the resulting waveform data. 7 denotes a synthesized speech output terminal that outputs the speech, such as a speaker, headphones, or a telephone receiver.

[0013] Next, each process is described in detail using a concrete example. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. When the sentence to be synthesized arrives from the sentence input unit 1, the synthesis means selection unit 2 divides it so that the proper noun 「松下電器の伊藤」 and the following 「様から」 go to rule synthesis, and the portion other than the proper noun, 「様から振込がありました。」, goes to recording/playback. The character string sent to the rule synthesis unit 3 is divided as shown below by the language processing unit 3a, the prosody control unit 3b, and the parameter creation unit 3c; once synthesis parameter information such as accent type, part of speech, and reading has been obtained, the synthesis processing unit 3d creates the synthesized sound.

(Input sentence) 「松下電器の伊藤様から」
(Word segmentation) 松下電器 / の / 伊藤 / 様 / から
(Reading) マツシタデンキノイトウサマカラ
(Accent type) 5 / B / 0 / 2 / A
(Part of speech) proper noun / case particle / proper noun / noun / conjunctive particle

Here, the accent type B given to 「の」 and the accent type A given to 「から」 are those listed in the explanatory appendix of the NHK Accent Dictionary (Japan Broadcasting Corporation, 1985); they indicate the syllable position of the compound accent nucleus when the particle combines with an independent word to form a phrase.
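The segmentation step above can be sketched as follows. This is only an illustration: the patent does not say how the proper noun boundary is detected, so the sketch assumes the boundary index is already known, and the function name `split_with_overlap` and its parameters are hypothetical.

```python
def split_with_overlap(sentence, proper_noun_end, overlap_chars):
    """Split a sentence into two segments that share `overlap_chars` characters.

    `proper_noun_end` is the index just past the proper noun. How this index
    is found is not covered here (assumed to come from language processing).
    """
    if not 0 <= proper_noun_end <= len(sentence):
        raise ValueError("proper_noun_end is outside the sentence")

    overlap_end = min(proper_noun_end + overlap_chars, len(sentence))
    rule_part = sentence[:overlap_end]          # proper noun + following characters
    recorded_part = sentence[proper_noun_end:]  # everything after the proper noun
    shared = sentence[proper_noun_end:overlap_end]
    return rule_part, recorded_part, shared


sentence = "松下電器の伊藤様から振込がありました。"
# Assumption for this example: the proper noun ends right before "様".
boundary = sentence.index("様")
rule_part, recorded_part, shared = split_with_overlap(sentence, boundary, overlap_chars=3)
# rule_part     -> "松下電器の伊藤様から"      (sent to rule synthesis)
# recorded_part -> "様から振込がありました。"  (sent to recording/playback)
# shared        -> "様から"                    (rendered by both, later overlapped)
```

The shared characters are synthesized by both methods, which is what gives the overlap unit two waveforms to add at the join.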

[0014] For the character string sent to the recording/playback unit 4, the matching speech is retrieved from the waveform data storage unit 4a and played back under the read control unit 4b. The overlap unit 5 adds together the overlapping portions of the rule-synthesis waveform data and the recording/playback waveform data; the D/A unit 6 then produces synthesized sound from the resulting waveform, and the speech is delivered from the synthesized sound output terminal 7. By overlap-processing the overlapping portions of the two sets of waveform data in this way, speech with more natural, less jarring joins can be provided. In this embodiment proper nouns are assigned to rule synthesis as one example, but this does not limit the present invention in any way.
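The addition performed by the overlap unit 5 can be pictured as a plain overlap-add of two sample sequences. The sketch below assumes both waveforms are lists of samples at the same sampling rate and that the overlap length in samples is already known; the patent does not state how the overlap region is aligned, and the name `overlap_add` is hypothetical.

```python
def overlap_add(first, second, overlap_len):
    """Join two waveforms, summing the samples in the region where they overlap.

    The last `overlap_len` samples of `first` are added to the first
    `overlap_len` samples of `second`; the rest is copied unchanged.
    """
    if overlap_len < 0 or overlap_len > min(len(first), len(second)):
        raise ValueError("overlap_len must fit inside both waveforms")

    start = len(first) - overlap_len
    head = list(first[:start])
    mixed = [a + b for a, b in zip(first[start:], second[:overlap_len])]
    tail = list(second[overlap_len:])
    return head + mixed + tail


# Hypothetical sample values standing in for the two waveform data sets.
rule_wave = [1, 2, 3, 4]
recorded_wave = [10, 20, 30]
joined = overlap_add(rule_wave, recorded_wave, overlap_len=2)
# joined -> [1, 2, 13, 24, 30]
```

The output is shorter than the two inputs concatenated by exactly the overlap length, since the shared region is emitted once.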

[0015] (Embodiment 2) FIG. 2 is a block diagram of an embodiment of the present invention according to claim 1. It replaces the recording/playback unit 4 of Embodiment 1 with an analysis-synthesis unit. In the figure, the sentence generation unit 1, synthesis means selection unit 2, rule synthesis unit 3, overlap unit 5, D/A unit 6, and synthesized speech output terminal 7 perform the same processing as in Embodiment 1. 4 denotes an analysis-synthesis unit composed of a parameter storage unit 4a and a parameter control unit 4b, which create parameters from the input sentence, and a synthesis processing unit 4c, which creates synthesized sound from the parameters created by the parameter storage unit and the parameter control unit.

[0016] Next, each process is described in detail using a concrete example. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. When the sentence to be synthesized arrives from the sentence input unit 1, the synthesis means selection unit 2 divides it so that the proper noun 「松下電器の伊藤」 and the following 「様から」 go to rule synthesis, and the portion other than the proper noun, 「様から振込がありました。」, goes to analysis-synthesis. The character string sent to the rule synthesis unit 3 is divided as shown below by the language processing unit 3a, the prosody control unit 3b, and the parameter creation unit 3c; once synthesis parameter information such as accent type, part of speech, and reading has been obtained, the synthesis processing unit 3d creates the synthesized sound.

(Input sentence) 「松下電器の伊藤様から」
(Word segmentation) 松下電器 / の / 伊藤 / 様 / から
(Reading) マツシタデンキノイトウサマカラ
(Accent type) 5 / B / 0 / 2 / A
(Part of speech) proper noun / case particle / proper noun / noun / conjunctive particle

Here, the accent type B given to 「の」 and the accent type A given to 「から」 are those listed in the explanatory appendix of the NHK Accent Dictionary (Japan Broadcasting Corporation, 1985); they indicate the syllable position of the compound accent nucleus when the particle combines with an independent word to form a phrase.

[0017] The character string sent to the analysis-synthesis unit 4 is rendered as synthesized sound by the parameter storage unit 4a, the parameter control unit 4b, and the synthesis processing unit 4c. The overlap unit 5 adds together the overlapping portions of the rule-synthesis waveform data and the analysis-synthesis waveform data; the D/A unit 6 then produces synthesized sound from the resulting waveform, and the speech is delivered from the synthesized sound output terminal 7. By overlap-processing the overlapping portions of the two sets of waveform data in this way, speech with more natural, less jarring joins can be provided. In this embodiment proper nouns are assigned to rule synthesis as one example, but this does not limit the present invention in any way.

[0018] (Embodiment 3) FIG. 3 is a block diagram of an embodiment of the present invention according to claim 1. It replaces the rule synthesis unit 3 of Embodiment 2 with a recording/playback unit. In the figure, the sentence generation unit 1, synthesis means selection unit 2, analysis-synthesis unit 4, overlap unit 5, D/A unit 6, and synthesized speech output terminal 7 perform the same processing as in Embodiment 2. 3 denotes a recording/playback unit composed of a waveform data storage unit 3a and a read control unit 3b.

[0019] Next, each process is described in detail using a concrete example. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. When the sentence to be synthesized arrives from the sentence input unit 1, the synthesis means selection unit 2 divides it so that the proper noun 「松下電器の伊藤」 and the following 「様から」 go to analysis-synthesis, and the portion other than the proper noun, 「様から振込がありました。」, goes to recording/playback. For the character string sent to the recording/playback unit 3, the matching speech is retrieved from the waveform data storage unit 3a and played back under the read control unit 3b. The character string sent to the analysis-synthesis unit 4 is rendered as synthesized sound by the parameter storage unit 4a, the parameter control unit 4b, and the synthesis processing unit 4c. The overlap unit 5 adds together the overlapping portions of the recording/playback waveform data and the analysis-synthesis waveform data; the D/A unit 6 then produces synthesized sound from the resulting waveform, and the speech is delivered from the synthesized sound output terminal 7. By overlap-processing the overlapping portions of the recording/playback waveform data and the analysis-synthesis waveform data in this way, speech with more natural, less jarring joins can be provided. In this embodiment proper nouns are assigned to analysis-synthesis as one example, but this does not limit the present invention in any way.

[0020] (Reference Example 1) FIG. 4 is a block diagram of Reference Example 1 of the present invention. In place of the overlap processing of the embodiment according to claim 1, parameters are adjusted in the parametric synthesis means. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. Assume here that pitch is the only parameter that differs between the two voices. When the sentence to be synthesized arrives from the sentence input unit 1, the synthesis means selection unit 2 divides it so that the proper noun 「松下電器の伊藤」 goes to rule synthesis and the remainder, 「様から振込がありました。」, goes to recording/playback. The text sent to the rule synthesis unit is divided as shown below by the language processing unit 3a, the prosody control unit 3b, and the parameter creation unit 3c, yielding synthesis parameter information such as accent type, part of speech, and reading.

(Input sentence) 「松下電器の伊藤」
(Word segmentation) 松下電器 / の / 伊藤
(Reading) マツシタデンキノイトウ
(Accent type) 5 / B / 0
(Part of speech) proper noun / case particle / proper noun

Here, the accent type B given to 「の」 is the one listed in the explanatory appendix of the NHK Accent Dictionary (Japan Broadcasting Corporation, 1985); it indicates the syllable position of the compound accent nucleus when the particle combines with an independent word to form a phrase.

[0021] For the character string sent to the recording/playback unit 4, the matching speech is retrieved from the waveform data storage unit 4a and played back under the read control unit 4b. The parameter adjustment unit 5 refers to the recording/playback waveform data and adjusts the pitch parameter of the rule synthesis, which differed from it, by lowering it slightly; the synthesis processing unit 6 then creates synthesized sound from the adjusted parameters. The D/A unit 7 produces synthesized sound from the rule-synthesis waveform data and the recording/playback waveform data, and the speech is delivered from the synthesized sound output terminal 8. By adjusting the acoustic parameters for rule synthesis to match the recording/playback waveform data in this way, speech with more natural, less jarring joins can be provided. In this example proper nouns are assigned to rule synthesis, but this does not limit the present invention in any way. Likewise, the parameter adjustment unit is described as adjusting only pitch, but this does not limit the present invention either; all parameters used for synthesis are assumed to be adjustable.
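As a rough illustration of the pitch adjustment performed by the parameter adjustment unit 5, the sketch below rescales a rule-synthesis pitch contour so that its mean matches a reference pitch taken from the recorded speech. The patent does not describe how the reference pitch is measured or what form the adjustment takes, so the mean-matching rule, the use of 0 to mark unvoiced frames, and the name `match_mean_pitch` are assumptions made for this sketch.

```python
def match_mean_pitch(f0_contour, reference_mean_hz):
    """Scale a pitch contour (Hz per frame) so its voiced mean equals the reference.

    Frames with a value of 0 are treated as unvoiced and left at 0.
    `reference_mean_hz` is assumed to have been estimated from the recorded
    waveform by some pitch tracker that is not shown here.
    """
    voiced = [f for f in f0_contour if f > 0]
    if not voiced:
        return list(f0_contour)  # nothing to adjust

    scale = reference_mean_hz / (sum(voiced) / len(voiced))
    return [f * scale if f > 0 else 0.0 for f in f0_contour]


# Hypothetical values: the rule-synthesis contour averages 220 Hz,
# while the recorded speech averages 110 Hz.
adjusted = match_mean_pitch([0.0, 200.0, 240.0], reference_mean_hz=110.0)
# adjusted -> [0.0, 100.0, 120.0]
```

Scaling rather than shifting keeps the relative shape of the accent pattern, which seems in keeping with adjusting only the overall pitch level; other parameters named in the text (volume, speaking rate) could be handled in a similar way.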

[0022] (Reference Example 2) FIG. 5 is a block diagram of Reference Example 2 of the present invention. It uses analysis-synthesis in place of the recording/playback of Reference Example 1. In the figure, the sentence generation unit 1, synthesis means selection unit 2, and rule synthesis unit 3 perform the same processing as in Reference Example 1. 4 denotes an analysis-synthesis unit composed of a parameter storage unit 4a and a parameter control unit 4b. 5 denotes a parameter adjustment unit that compares the parameters created by the rule synthesis unit 3 and the analysis-synthesis unit 4 and adjusts those that differ. 6 denotes a synthesis processing unit that creates synthesized sound from the adjusted parameters, and 7 denotes a D/A unit. 8 denotes a synthesized speech output terminal that outputs the speech, such as a speaker, headphones, or a telephone receiver.

[0023] Next, each process is described in detail using a concrete example. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. Assume here that pitch is the only parameter that differs between the two voices. When the sentence to be synthesized arrives from the sentence input unit 1, the synthesis means selection unit 2 divides it so that the proper noun 「松下電器の伊藤」 goes to rule synthesis and the portion other than the proper noun, 「様から振込がありました。」, goes to analysis-synthesis. The text sent to the rule synthesis unit is divided as shown below by the language processing unit 3a, the prosody control unit 3b, and the parameter creation unit 3c, yielding synthesis parameter information such as accent type, part of speech, and reading.

(Input sentence) 「松下電器の伊藤」
(Word segmentation) 松下電器 / の / 伊藤
(Reading) マツシタデンキノイトウ
(Accent type) 5 / B / 0
(Part of speech) proper noun / case particle / proper noun

Here, the accent type B given to 「の」 is the one listed in the explanatory appendix of the NHK Accent Dictionary (Japan Broadcasting Corporation, 1985); it indicates the syllable position of the compound accent nucleus when the particle combines with an independent word to form a phrase.

[0024] For the character string sent to the analysis-synthesis unit 4, parameters are created by the parameter storage unit 4a and the parameter control unit 4b. The parameter adjustment unit 5 compares the rule-synthesis parameters with the analysis-synthesis parameters and adjusts both sets so that their pitch matches; the synthesis processing unit 6 and the D/A unit 7 then produce synthesized sound from these synthesis parameters, and the speech is delivered from the synthesized sound output terminal 8. By adjusting parameters in this way when two synthesis means, rule synthesis and analysis-synthesis, are used together, less jarring speech can be provided. In this example proper nouns are assigned to rule synthesis, but this does not limit the present invention in any way. Likewise, the parameter adjustment unit is described as adjusting only pitch, but this does not limit the present invention either; all parameters used for synthesis are assumed to be adjustable.
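When both synthesis means are parametric, the text says both parameter sets are adjusted so that their pitch matches. One simple way to picture this is to move both contours to a common target. In the sketch below the target is the midpoint of the two mean pitches; the patent does not say how the target is chosen, so that choice, the helper names, and the assumption that every frame is voiced (non-zero) are all hypothetical.

```python
def mean_pitch(f0_contour):
    """Mean of a pitch contour in Hz; assumes a non-empty list of voiced frames."""
    return sum(f0_contour) / len(f0_contour)


def match_both(f0_rule, f0_analysis):
    """Scale both contours so that each has the same mean pitch.

    The shared target is the midpoint of the two original means, which is
    an assumption of this sketch rather than a rule taken from the patent.
    """
    mean_rule = mean_pitch(f0_rule)
    mean_analysis = mean_pitch(f0_analysis)
    target = (mean_rule + mean_analysis) / 2.0

    adjusted_rule = [f * target / mean_rule for f in f0_rule]
    adjusted_analysis = [f * target / mean_analysis for f in f0_analysis]
    return adjusted_rule, adjusted_analysis


# Hypothetical contours: rule synthesis at 100 Hz, analysis-synthesis at 200 Hz.
rule_adj, analysis_adj = match_both([100.0, 100.0], [200.0, 200.0])
# both contours now sit at 150 Hz
```

Adjusting both sides splits the change between the two voices, so neither is moved as far from its original pitch as it would be if only one were adjusted.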

[0025] (Reference Example 3) FIG. 6 is a block diagram of Reference Example 3 of the present invention. It uses recording/playback in place of the rule synthesis of Reference Example 2. In the figure, the sentence generation unit 1, synthesis means selection unit 2, and analysis-synthesis unit 4 perform the same processing as in Reference Example 2. 3 denotes a recording/playback unit composed of a waveform data storage unit 3a and a read control unit 3b. 5 denotes a parameter adjustment unit that refers to the waveform data obtained by the recording/playback unit 3 and adjusts, within the analysis-synthesis parameters, any parameter created by the analysis-synthesis unit 4 that differs from it. 6 denotes a synthesis processing unit that creates synthesized sound from the adjusted parameters, and 7 denotes a D/A unit that produces synthesized sound from the recording/playback waveform data and the analysis-synthesis waveform data. 8 denotes a synthesized speech output terminal that outputs the speech, such as a speaker, headphones, or a telephone receiver.

[0026] Next, each process is described in detail using a concrete example. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. Assume here that pitch is the only parameter that differs between the two voices. When the sentence to be synthesized arrives from the sentence input unit 1, the synthesis means selection unit 2 divides it so that the proper noun 「松下電器の伊藤」 goes to analysis-synthesis and the portion other than the proper noun, 「様から振込がありました。」, goes to recording/playback. The text sent to the recording/playback unit is played back by the waveform data storage unit 3a and the read control unit 3b. For the character string sent to the analysis-synthesis unit 4, parameters are created by the parameter storage unit 4a and the parameter control unit 4b. The parameter adjustment unit 5 refers to the recording/playback waveform data and adjusts the analysis-synthesis parameters to match its pitch; the synthesis processing unit 6 then creates analysis-synthesis waveform data from these synthesis parameters. The analysis-synthesis waveform data and the recording/playback waveform data pass through the D/A unit 7, and the speech is delivered from the synthesized sound output terminal 8. By adjusting the parameters of analysis-synthesis, a parametric synthesis means, to match the recording/playback waveform data in this way, speech with more natural, less jarring joins can be provided. In this example proper nouns are assigned to analysis-synthesis, but this does not limit the present invention in any way. Likewise, the parameter adjustment unit is described as adjusting only pitch, but this does not limit the present invention either; all parameters used for synthesis are assumed to be adjustable.

【００２７】[0027] (Embodiment 4) FIG. 7 is a block diagram of an embodiment of the present invention according to claim 3. In place of the D/A unit in the configuration of Embodiment 3, it performs overlap processing. In the figure, the sentence generation unit 1, the recording/playback unit 4, the parameter adjustment unit 5, and the synthesis processing unit 6 perform the same processing as in Embodiment 3. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. Here, pitch is assumed to be the only parameter that differs between the two voices. The synthesis means selection unit 2 divides the input sentence: the proper noun 「松下電器の伊藤」 together with the 「様から」 that follows it goes to rule synthesis, and the part other than the proper noun, 「様から振込がありました。」, goes to recording/playback. The character string sent to the rule synthesis unit 3 is segmented as shown below by the language processing unit 3a, the prosody control unit 3b, and the parameter creation unit 3c; once synthesis parameter information such as accent type, part of speech, and reading has been obtained, the synthesis processing unit 3d creates the synthesized sound.

(Input sentence) 「松下電器の伊藤様から」
(Word segmentation) 松下電器／の／伊藤／様／から
(Reading) マツシタデンキノイトウサマカラ
(Accent type) 5／B／0／2／A
(Part of speech) proper noun／case particle／proper noun／noun／conjunctive particle

Here, the accent type B given to 「の」 and the accent type A given to 「から」 are those listed in the NHK accent dictionary, commentary appendix (Japan Broadcasting Corporation, 1985); they indicate the syllable position of the binding accent nucleus when the word combines with an independent word to form a phrase.
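The division made by the synthesis means selection unit 2 can be sketched as follows. The example deliberately gives both synthesis means the shared portion 「様から」, which is what later lets the two waveforms be overlapped. The function name `split_for_synthesis`, its index-based interface, and the segmentation of the second half of the sentence (振込／が／ありました) are assumptions made for illustration; the patent only shows the segmentation of the first half.

```python
def split_for_synthesis(words, proper_noun_end, overlap_len):
    """Split a segmented sentence between two synthesis means with a shared overlap.

    words           -- the word-segmented sentence
    proper_noun_end -- index just past the last word of the proper-noun phrase
    overlap_len     -- number of following words that both means will speak
    """
    first = words[:proper_noun_end + overlap_len]   # e.g. sent to rule synthesis
    second = words[proper_noun_end:]                # e.g. sent to recording/playback
    overlap = words[proper_noun_end:proper_noun_end + overlap_len]
    return first, second, overlap


# Segmentation of the second half of the sentence is assumed for this example.
words = ["松下電器", "の", "伊藤", "様", "から", "振込", "が", "ありました"]
rule_part, recorded_part, shared = split_for_synthesis(words, 3, 2)
```

Here `rule_part` joins to 「松下電器の伊藤様から」, `recorded_part` to 「様から振込がありました」, and `shared` holds the two words spoken by both means.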

【００２８】[0028] For the character string sent to the recording/playback unit 4, the matching speech is retrieved from the waveform data storage unit 4a, and the read control unit 4b reproduces the synthesized sound. The parameter adjustment unit 5 refers to the recording/playback waveform data and adjusts the pitch parameter, the one in which the rule synthesis parameters differed from the recording; the synthesis processing unit 6 then creates the synthesized sound from the adjusted parameters. Further, the overlap unit 7 adds together the overlapping portions of the rule synthesis waveform data and the recording/playback waveform data to produce the synthesized sound, which is output from the synthesized sound output terminal 8. In this way, by adjusting the acoustic parameters for rule synthesis to match the recording/playback waveform data, and by overlapping the shared portion of the rule synthesis waveform data and the recording/playback waveform data on output, speech whose joints sound more natural and less jarring can be provided. In this embodiment the proper noun is assigned to rule synthesis by way of example, but this in no way limits the invention. Likewise, although the parameter adjustment unit is set to adjust only pitch, this too in no way limits the invention; any parameter used in synthesis may be adjusted.
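The adding of overlapping portions performed by the overlap unit 7 can be sketched as a plain overlap-add of two sample sequences. This is a minimal illustration, assuming both waveforms are lists of samples at the same sampling rate and that the length of the shared portion (`overlap`, in samples) is already known; the function name and interface are hypothetical.

```python
def overlap_add(first, second, overlap):
    """Join two waveforms whose last/first `overlap` samples cover the same speech.

    The shared samples are summed; everything else is passed through unchanged.
    """
    if not 0 <= overlap <= min(len(first), len(second)):
        raise ValueError("overlap must fit inside both waveforms")
    split = len(first) - overlap
    head = list(first[:split])                       # only in the first waveform
    mixed = [a + b for a, b in zip(first[split:], second[:overlap])]
    tail = list(second[overlap:])                    # only in the second waveform
    return head + mixed + tail


joined = overlap_add([1, 2, 3, 4], [10, 20, 30], 2)
```

With these inputs the result is `[1, 2, 13, 24, 30]`: the last two samples of the first waveform are summed with the first two of the second. A plain sum can raise the level in the shared portion, which is the issue the amplifier control described for Embodiment 7 addresses.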

【００２９】[0029] (Embodiment 5) FIG. 8 is a block diagram of an embodiment of the present invention according to claim 3. In place of the D/A unit in the configuration of Embodiment 3, it performs overlap processing. In the figure, the sentence generation unit 1, the analysis synthesis unit 4, the parameter adjustment unit 5, and the synthesis processing unit 6 perform the same processing as in Embodiment 4. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. Here, pitch is assumed to be the only parameter that differs between the two voices. The synthesis means selection unit 2 divides the input sentence: the proper noun 「松下電器の伊藤」 together with the 「様から」 that follows it goes to rule synthesis, and the part other than the proper noun, 「様から振込がありました。」, goes to analysis synthesis. The character string sent to the rule synthesis unit 3 is segmented as shown below by the language processing unit 3a, the prosody control unit 3b, and the parameter creation unit 3c; once synthesis parameter information such as accent type, part of speech, and reading has been obtained, the synthesis processing unit 3d creates the synthesized sound.

(Input sentence) 「松下電器の伊藤様から」
(Word segmentation) 松下電器／の／伊藤／様／から
(Reading) マツシタデンキノイトウサマカラ
(Accent type) 5／B／0／2／A
(Part of speech) proper noun／case particle／proper noun／noun／conjunctive particle

Here, the accent type B given to 「の」 and the accent type A given to 「から」 are those listed in the NHK accent dictionary, commentary appendix (Japan Broadcasting Corporation, 1985); they indicate the syllable position of the binding accent nucleus when the word combines with an independent word to form a phrase.
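The analysis result for 「松下電器の伊藤様から」 can be held in a simple per-word record. The sketch below is only one possible representation: the `Morpheme` class and its field names are hypothetical, the English part-of-speech labels are glosses of the abbreviations in the example, and the one-code-per-word alignment of the accent string is read off the example, in which B belongs to 「の」 and A to 「から」.

```python
from dataclasses import dataclass


@dataclass
class Morpheme:
    surface: str  # the word as written
    accent: str   # accent type code taken from the worked example
    pos: str      # part of speech (English gloss, assumed wording)


def analyze_example():
    """Return the worked example as a list of per-word records."""
    surfaces = ["松下電器", "の", "伊藤", "様", "から"]
    accents = ["5", "B", "0", "2", "A"]
    parts = ["proper noun", "case particle", "proper noun",
             "noun", "conjunctive particle"]
    return [Morpheme(s, a, p) for s, a, p in zip(surfaces, accents, parts)]


def binding_accent_words(morphemes):
    """Words whose accent code is a letter, i.e. a binding accent type."""
    return [m.surface for m in morphemes if m.accent.isalpha()]


example = analyze_example()
particles = binding_accent_words(example)
```

Keeping the accent code next to each word makes it straightforward for a later prosody stage to treat the letter-coded particles (here 「の」 and 「から」) differently from the numerically coded independent words.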

【００３０】[0030] For the character string sent to the recording/playback unit 4, the matching speech is retrieved from the waveform data storage unit 4a, and the read control unit 4b reproduces the synthesized sound. The parameter adjustment unit 5 refers to the recording/playback waveform data and adjusts the pitch parameter, the one in which the rule synthesis parameters differed; the synthesis processing unit 6 then creates the synthesized sound from the adjusted parameters. Further, the overlap unit 7 adds together the overlapping portions of the rule synthesis waveform data and the analysis synthesis waveform data to produce the synthesized sound, which is output from the synthesized sound output terminal 8. In this way, by comparing and adjusting the rule synthesis parameters and the analysis synthesis parameters, and by overlapping the shared portion of the rule synthesis waveform data and the analysis synthesis waveform data on output, speech whose joints sound more natural and less jarring can be provided. In this embodiment the proper noun is assigned to rule synthesis by way of example, but this in no way limits the invention. Likewise, although the parameter adjustment unit is set to adjust only pitch, this too in no way limits the invention; any parameter used in synthesis may be adjusted.

【００３１】[0031] (Embodiment 6) FIG. 9 is a block diagram of an embodiment of the present invention according to claim 2. In place of the D/A unit in the configuration of Embodiment 3, it performs overlap processing. In the figure, the sentence generation unit 1, the analysis synthesis unit 4, the parameter adjustment unit 5, and the synthesis processing unit 6 perform the same processing as in Embodiment 5. Reference numeral 3 denotes a recording/playback unit consisting of a waveform data storage unit 3a and a read control unit 3b. Reference numeral 7 denotes an overlap unit that adds together the overlapping portions of the recording/playback waveform data and the analysis synthesis waveform data, and 8 denotes the synthesized speech output terminal, which outputs the speech and stands for a speaker, headphones, a telephone receiver, or the like.

【００３２】[0032] Next, the details of each process are described using a concrete example. Consider the case where the sentence 「松下電器の伊藤様から振込がありました。」 ("There was a transfer from Mr. Ito of Matsushita Electric.") is input. Here, pitch is assumed to be the only parameter that differs between the two voices. The synthesis means selection unit 2 divides the input sentence: the proper noun 「松下電器の伊藤」 together with the 「様から」 that follows it goes to recording/playback, and the part other than the proper noun, 「様から振込がありました。」, goes to analysis synthesis. For the text sent to the recording/playback unit 3, the waveform data storage unit 3a and the read control unit 3b reproduce the synthesized sound. For the character string sent to the analysis synthesis unit 4, the parameter storage unit 4a and the parameter control unit 4b create the parameters. The parameter adjustment unit 5 refers to the recording/playback waveform data and adjusts the analysis synthesis parameters to match its pitch; the synthesis processing unit 6 then generates the analysis synthesis waveform data from those synthesis parameters. The overlap unit 7 adds together the overlapping portions of the analysis synthesis waveform data and the recording/playback waveform data, and the speech is output from the synthesized sound output terminal 8. In this way, by adjusting the parameters of the parametric synthesis means (here, analysis synthesis) to match the recording/playback waveform data, and by overlapping the recording/playback waveform data and the analysis synthesis waveform data on output, speech whose joints sound more natural and less jarring can be provided. In this embodiment the proper noun is assigned to analysis synthesis by way of example, but this in no way limits the invention. Likewise, although the parameter adjustment unit is set to adjust only pitch, this too in no way limits the invention; any parameter used in synthesis may be adjusted.

【００３３】[0033] (Embodiment 7) FIG. 10 is a block diagram of an embodiment of the present invention according to claim 3. It shows the details of the overlap unit recited in claim 1. In the figure, reference numerals 1 and 2 denote synthesized sound input terminals A and B, and 3 denotes an overlap unit consisting of a control unit 3a, an amplifier A 3b, and an amplifier B 3c. Reference numeral 4 denotes the synthesized speech output terminal, which outputs the speech and stands for a speaker, headphones, a telephone receiver, or the like.

【００３４】[0034] Next, the details of each process are described using a concrete example. The signals at synthesized sound input terminal A and synthesized sound input terminal B are assumed to share an overlapping portion. The case described is that of overlapping and outputting the speech produced by two different synthesis means. Inputs arriving at terminals A and B enter amplifier A and amplifier B, respectively. For the portion that is output in duplicate, the control unit 3a controls the two amplifiers so that the volume does not rise in that portion alone, fading out the speech that is output first and fading in the speech that is output later so that the two switch over gradually; the speech is then delivered from the synthesized sound output terminal 4. In this way, when the overlap unit outputs the two voices of the overlapping portion, controlling each amplifier so that the voices switch over gradually makes it possible to provide speech whose joints sound more natural and less jarring.
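The amplifier control described here behaves like a crossfade. The sketch below assumes linear gain ramps whose two gains always sum to 1, which is one simple way to keep the overlapped portion from getting louder; the patent does not specify the gain curve, and the function name and sample-list interface are hypothetical.

```python
def crossfade(first, second, overlap):
    """Fade `first` out and `second` in across `overlap` shared samples.

    gain_a plays the role of amplifier A (earlier voice, fading out) and
    gain_b that of amplifier B (later voice, fading in). The linear ramps
    are an assumption; they keep gain_a + gain_b equal to 1.
    """
    if not 0 <= overlap <= min(len(first), len(second)):
        raise ValueError("overlap must fit inside both waveforms")
    split = len(first) - overlap
    mixed = []
    for i, (a, b) in enumerate(zip(first[split:], second[:overlap])):
        gain_b = (i + 1) / (overlap + 1)  # rises toward 1 across the overlap
        gain_a = 1.0 - gain_b             # falls toward 0 across the overlap
        mixed.append(gain_a * a + gain_b * b)
    return list(first[:split]) + mixed + list(second[overlap:])


steady = crossfade([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0], 2)
```

When both inputs carry the same level, as in `steady`, the output level stays constant through the overlap instead of doubling, and the earlier voice gives way to the later one gradually rather than at a hard cut.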

【００３５】[0035]

【発明の効果】[Effects of the invention] As described above, according to the present invention, when speech is provided by combining a plurality of different synthesis methods, providing optional functions such as overlap processing and parameter adjustment at the joints between the speech synthesis methods makes it possible to synthesize more natural speech.

【図面の簡単な説明】[Brief description of the drawings]

【図１】FIG. 1 is a block diagram of a speech synthesizer according to a first embodiment of the present invention.

【図２】FIG. 2 is a block diagram of a speech synthesizer according to the first embodiment of the present invention.

【図３】FIG. 3 is a block diagram of a speech synthesizer according to the first embodiment of the present invention.

【図４】FIG. 4 is an explanatory diagram of a reference example to aid understanding of the present invention.

【図５】FIG. 5 is an explanatory diagram of a reference example to aid understanding of the present invention.

【図６】FIG. 6 is an explanatory diagram of a reference example to aid understanding of the present invention.

【図７】FIG. 7 is a block diagram of a speech synthesizer according to a third embodiment of the present invention.

【図８】FIG. 8 is a block diagram of a speech synthesizer according to the third embodiment of the present invention.

【図９】FIG. 9 is a block diagram of a speech synthesizer according to the third embodiment of the present invention.

【図１０】FIG. 10 is a block diagram of a speech synthesizer according to a fourth embodiment of the present invention.

【図１１】FIG. 11 is a block diagram of a conventional speech synthesizer.

[Explanation of symbols]

1 sentence generation unit
2 synthesis means selection unit
3 rule synthesis unit
3a language processing unit
3b prosody control unit
3c parameter creation unit
3d synthesis processing unit
4 recording/playback unit
4a waveform data storage unit
4b read control unit
5 overlap processing unit
6 D/A unit
7 synthesized sound output terminal

Continuation of the front page

(56) References: JP-A-3-7999 (JP, A); JP-A-62-215299 (JP, A); JP-A-4-263299 (JP, A); JP-A-59-42598 (JP, A); JP-A-4-367000 (JP, A); JP-A-60-63597 (JP, A); JP-A-4-19799 (JP, A); JP-A-1-191900 (JP, A); JP-B-3-73000 (JP, B2); JP-B-3-15759 (JP, B2)

(58) Fields searched (Int. Cl.⁷, DB name): G10L 11/00 - 13/08; G10L 19/00 - 21/06; JICST file (JOIS)

Claims

(57) [Claims]

1. A speech synthesizer comprising: sentence generating means for generating a sentence; a plurality of different synthesizing means for synthesizing speech waveforms; synthesizing means selecting means for selecting among the synthesizing means according to the content of the sentence; and overlap means for adding together the waveforms of the plurality of synthesized sounds output from the synthesizing means selected by the synthesizing means selecting means.

2. A speech synthesizer comprising: sentence generating means for generating a sentence; a plurality of different synthesizing means for synthesizing speech waveforms; synthesizing means selecting means for selecting among the synthesizing means according to the content of the sentence; parameter adjusting means which, where one of the plurality of synthesized sounds is produced by a parametric synthesizing means, adjusts the synthesis parameters created by that synthesizing means to match the other speech; and overlap means for adding together the waveforms of the plurality of synthesized sounds.

3. A speech synthesizer comprising: sentence generating means for generating a sentence; a plurality of different synthesizing means for synthesizing speech waveforms; synthesizing means selecting means for selecting among the synthesizing means according to the content of the sentence; and overlap means which, when joining overlapping speech waveforms, overlaps and outputs them while adjusting the volume of the respective amplifiers by means of control means.