JPH05260082A - Text reader

Info

Publication number: JPH05260082A
Application number: JP4054985A
Authority: JP
Inventors: Yoshinori Shiga (志賀芳則); Yoshiyuki Hara (原義幸); Tsuneo Nitta (新田恒雄)
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1992-03-13
Filing date: 1992-03-13
Publication date: 1993-10-08

Abstract

PURPOSE: To read received text aloud in a voice quality corresponding to its sender when text sent from the transmitting side is received on the receiving side and read in synthetic speech produced by rule-based speech synthesis, so that the recipient does not tire of listening to the synthetic speech and can identify the sender by voice quality. CONSTITUTION: Text entered on an input/output device 1 is transmitted, together with a sender ID, as mail through a transmitting-side mail system 2. The mail is delivered through a message communication system 4 to a receiving-side mail system 5, which receives it. The content of the mail can be output whenever needed. When the received text and the sender ID are passed from an input/output device 7 to a speech synthesizer 6, a voice quality switching unit 10 in the synthesizer 6 causes the text to be output in synthetic speech with the voice quality corresponding to the sender ID.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a text-to-speech device that receives, on the receiving side, text sent from the transmitting side of a communication system such as electronic mail, and reads the received text aloud in synthetic speech produced by rule-based speech synthesis.

[0002]

[Prior Art] In rule-based speech synthesis, speech parameters obtained by observing or analyzing speech actually uttered by a particular person are prepared in advance as speech units, in units such as consonant + vowel (CV), vowel + consonant + vowel (VCV), and consonant + vowel + consonant (CVC). These speech units are interpolated and concatenated according to the input phoneme sequence, and the resulting phoneme parameters, together with prosodic parameters consisting of a separately generated pitch pattern, are sent to a synthesizer to synthesize speech.

[0003] Accordingly, even when text received on the receiving side of a communication system such as electronic mail is read aloud using this rule-based synthesis method, speech has conventionally been synthesized using speech units of the kind described above that are prepared in advance in the reading device.

[0004]

[Problems to Be Solved by the Invention] As described above, a conventional text-to-speech device synthesizes speech using only the speech units prepared in advance in the device, regardless of the sender or other factors, so the voice quality of the synthetic speech never changes. When text is always read aloud in the same voice quality, however, the listener grows bored with the synthetic speech and tires of listening to it.

[0005] A first object of the present invention is therefore to provide a text-to-speech device for a communication system such as electronic mail that automatically switches the voice quality used for reading text according to a sender identification code transmitted and received together with the text, so that the receiving-side user does not grow bored with or tired of the synthetic speech and can, moreover, identify the sender by voice quality.

[0006] A second object of the present invention is to provide a text-to-speech device for a communication system such as electronic mail in which the transmitting side sends parameters related to voice quality together with the text and the receiving side synthesizes speech using these voice-quality-related parameters, so that the text can be read aloud in the voice quality intended by the sender, the receiving-side user does not grow bored with or tired of the synthetic speech, and the sender can be identified by voice quality.

[0007]

[Means for Solving the Problems] To solve the above problems, a first feature of the present invention is as follows. In a text-to-speech device in which the transmitting side of a communication system sends at least text to a communication medium and the receiving side receives the text and reads it aloud in synthetic speech produced by rule-based speech synthesis, the receiving side is provided with voice quality switching means for switching the voice quality of the synthetic speech that reads the received text according to a sender identification code sent from the transmitting side together with the text.

[0008] A second feature of the present invention is that the transmitting side is provided with text and voice-quality-related parameter transmitting means which, when text is sent from the transmitting side to the receiving side, also sends at least the parameters related to voice quality among the parameters required for rule-based speech synthesis, and that the receiving side reads the text aloud by rule-based synthesis using the parameters sent from this transmitting means.

[0009]

[Operation] With the above configuration, when text sent from the transmitting side is received on the receiving side and read aloud by rule-based speech synthesis, it is not always read in synthetic speech of the same voice quality. Instead, the voice quality switching means provided on the receiving side switches the voice quality of the synthetic speech according to the sender identification code sent from the transmitting side together with the text. As a result, the receiving-side user does not grow bored with or tired of the synthetic speech and can also tell who the sender is from the voice quality.

[0010] Also with the above configuration, when the text and voice-quality-related parameter transmitting means on the transmitting side sends text to the receiving side, at least the parameters related to voice quality (voice-quality-related parameters) among those required for rule-based speech synthesis are sent together with the text. The receiving side receives the text and, when reading it aloud, synthesizes speech by rule based on the voice-quality-related parameters sent with the text.

[0011] The transmitting-side user can thereby have the text read aloud on the receiving side in the voice quality the user intends. In addition, because the voice quality of the speech synthesized on the receiving side differs from sender to sender, the receiving-side user does not grow bored with or tired of the synthetic speech and can also tell who the sender is from the voice quality.

[0012]

[Embodiments] Embodiments of the present invention are described below with reference to the drawings.

(First Embodiment) FIG. 1 is a system configuration diagram showing a first embodiment of an electronic mail system to which the text-to-speech device of the present invention is applied.

[0013] In FIG. 1, reference numerals 1 and 7 denote input/output devices that handle text input and output, among other functions, and 2 and 5 denote mail systems connected to the input/output devices 1 and 7, respectively. The mail systems 2 and 5 handle mail processing.

[0014] Reference numeral 4 denotes a message communication system that handles mail transfer between the mail systems 2 and 5, and 3 and 6 denote speech synthesizers for reading text aloud, connected to the input/output devices 1 and 7, respectively. Each of the speech synthesizers 3 and 6 is provided with a voice quality switching unit 10 that switches the voice quality of the synthetic speech reading the text according to a sender identification code (hereinafter, sender ID) that indicates the sender and is sent from the transmitting side together with the text.

[0015] FIG. 2 is a block diagram of the speech synthesizer 6 (3) in FIG. 1. The speech synthesizer 6 (3) has, for example, four speech unit storage units 11 to 14. The speech unit storage unit 11 stores consonant units Cv, each consisting of the speech parameters of a consonant up to partway through the transition into the following vowel, and vowel transition units vc, each covering the transition from a vowel to a consonant. The speech unit storage units 12 to 14 each store vowel units consisting of the speech parameters of vowels, with a different voice quality in each storage unit.

[0016] The speech synthesizer 6 (3) also has a speech unit switching unit 15 that selects one of the speech unit storage units 12 to 14; a sender ID / unit storage number correspondence table 16, which records the correspondence between each sender ID and the number (unit storage number) assigned to the speech unit storage unit that stores the vowel units specific to the sender indicated by that sender ID; and a unit switching control unit 17. The unit switching control unit 17 looks up the sender ID, which originates on the transmitting side and is passed from the input/output device 7 (1) in FIG. 1, in the correspondence table 16 to obtain the corresponding unit storage number, and controls the speech unit switching unit 15 with that number. The speech unit switching unit 15, the correspondence table 16, and the unit switching control unit 17 together constitute the voice quality switching unit 10 in FIG. 1.

[0017] The speech synthesizer 6 (3) further has a language analysis and phoneme duration determination unit 18, which performs language analysis on the text originating on the transmitting side and passed from the input/output device 7 (1) in FIG. 1 to generate a phoneme sequence and accent information and to determine the duration of each phoneme; a prosodic parameter generation unit 19, which generates prosodic parameters from the accent information and phoneme durations produced by the language analysis and phoneme duration determination unit 18; a phoneme parameter generation unit 20; and a synthesizer filter 21.

[0018] The phoneme parameter generation unit 20 generates phoneme parameters according to the phoneme sequence and phoneme durations produced by the language analysis and phoneme duration determination unit 18, using the speech units stored in the speech unit storage unit 11 and the speech units stored in whichever of the speech unit storage units 12 to 14 has been selected by the speech unit switching unit 15. The synthesizer filter 21 generates synthetic speech from the prosodic parameters supplied by the prosodic parameter generation unit 19 and the phoneme parameters supplied by the phoneme parameter generation unit 20.

[0019] The operation of the configuration in FIGS. 1 and 2 is described below, taking as an example the case where mail is sent from the input/output device 1 and mail system 2 side, received on the mail system 5 and input/output device 7 side, and read aloud by the speech synthesizer 6.

[0020] First, when text entered on the input/output device 1 is sent as mail through the transmitting-side mail system 2, the mail is delivered through the message communication system 4 to the receiving-side mail system 5. When sent, the mail carries, together with the text (text information), a sender ID, which is an identification code indicating the sender.

[0021] The mail sent to the receiving-side mail system 5 is received by that system. The content of the received mail can be output on the input/output device 7 whenever needed. In the configuration of FIG. 1, the input/output device 7 at this point passes the sender ID together with the received text to the speech synthesizer 6, and the voice quality switching operation of the voice quality switching unit 10 in the synthesizer 6 allows the text to be output in synthetic speech with the voice quality corresponding to the sender ID.

[0022] The speech synthesis processing in the speech synthesizer 6 is described below. First, the method of creating the speech units stored in the speech unit storage unit 11 and the speech unit storage units 12 to 14 is described in detail.

[0023] To create the speech units, speech data uttered by an announcer or other speaker according to an utterance list is first prepared. A time window of fixed length, about 20 msec, is applied to this speech data, and cepstrum analysis is performed within each window while the window is shifted by a fixed interval of about 10 msec.
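The framing and analysis step can be illustrated with a minimal sketch. It assumes a Hamming window, a plain DFT-based real cepstrum, and caller-supplied sample rate and cepstrum order; the patent states none of these, and the function names `split_frames` and `real_cepstrum` are hypothetical.

```python
import cmath
import math
from typing import List


def split_frames(samples: List[float], sample_rate: int,
                 window_ms: float = 20.0, shift_ms: float = 10.0) -> List[List[float]]:
    """Cut the signal into overlapping fixed-length frames."""
    window = int(sample_rate * window_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    return [samples[start:start + window]
            for start in range(0, len(samples) - window + 1, shift)]


def real_cepstrum(frame: List[float], order: int) -> List[float]:
    """Low-order real cepstrum of one frame (naive DFT; Hamming window is an assumption)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # Log magnitude spectrum; the small constant avoids log(0).
    log_mag = []
    for k in range(n):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        log_mag.append(math.log(abs(s) + 1e-12))
    # Inverse DFT of the log magnitude spectrum; keep coefficients 0..order.
    return [sum(lm * cmath.exp(2j * math.pi * k * q / n)
                for k, lm in enumerate(log_mag)).real / n
            for q in range(order + 1)]
```

With an assumed 8 kHz sample rate, a 20 msec window is 160 samples and a 10 msec shift is 80 samples, so consecutive frames overlap by half.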

[0024] Next, while inspecting the power spectrum and speech power of each frame, the cepstrum parameters corresponding to the range of frames to be cut out as a unit are extracted and used as a speech unit. FIG. 3(a) shows an example in which Cv and vc units are cut out from one phoneme of the speech data. For transition intervals such as these, a relatively wide range (a large number of frames) is cut out. The Cv and vc units are stored in the speech unit storage unit 11, as described above.

[0025] For the vowel portion, which is a steady-state interval, only one frame of cepstrum parameters is cut out. Here, cepstrum parameters are cut out in the manner described above from each vowel (in Japanese, /a/, /i/, /u/, /e/, /o/) in the speech data of three speakers and stored in the speech unit storage units 12 to 14, respectively. Specifically, the speech unit storage units 12 and 13 store speech units (vowel units) created from speech uttered by two specific people who may become senders, and the speech unit storage unit 14 stores speech units (vowel units) created from speech uttered by an announcer.

[0026] Information relating these two specific people to the corresponding speech unit storage units 12 and 13 is registered in the sender ID / unit storage number correspondence table 16 as pairs, each consisting of the sender ID specific to one of the two speakers (for example, sender IDs #1 and #2) and the number assigned to the corresponding speech unit storage unit 12 or 13.

[0027] Returning to FIG. 2, the text in the mail passed from the input/output device 7 in FIG. 1 to the speech synthesizer 6 is routed to the language analysis and phoneme duration determination unit 18 in the speech synthesizer 6. The language analysis and phoneme duration determination unit 18 performs language analysis on this text, generates a phoneme sequence and accent information, and determines the duration of each phoneme.

[0028] The sender ID in the mail passed from the input/output device 7 in FIG. 1 to the speech synthesizer 6 is routed to the unit switching control unit 17 in the voice quality switching unit 10 provided in the speech synthesizer 6. The unit switching control unit 17 looks up this sender ID in the sender ID / unit storage number correspondence table 16 and obtains the unit storage number registered as a pair with it.

[0029] As a result, if the sender ID is ID #1, the unit storage number indicating the speech unit storage unit 12 is obtained, and if it is ID #2, the unit storage number indicating the speech unit storage unit 13 is obtained. The unit switching control unit 17 gives the obtained unit storage number to the speech unit switching unit 15.

[0030] If, on the other hand, the sender ID is neither ID #1 nor ID #2, that is, if the sender ID passed from the input/output device 7 is not registered in the sender ID / unit storage number correspondence table 16, the unit switching control unit 17 gives the speech unit switching unit 15 the unit storage number indicating the speech unit storage unit 14, a number held in advance in the control unit 17.

[0031] Based on the unit storage number given by the unit switching control unit 17, the speech unit switching unit 15 selects the one of the speech unit storage units 12 to 14 designated by that number and switches its connection to the phoneme parameter generation unit 20. Thus the speech unit storage unit 12 is connected to the phoneme parameter generation unit 20 if the sender ID passed from the input/output device 7 is ID #1, the speech unit storage unit 13 if it is ID #2, and the speech unit storage unit 14 otherwise.
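The lookup performed by the unit switching control unit 17, including the fallback for unregistered senders, can be sketched as a table lookup with a default. The string form of the sender IDs and the use of the reference numerals 12 to 14 as unit storage numbers are assumptions made for illustration; the patent only says that numbers are assigned to the storage units.

```python
from typing import Dict

# Default store holding the announcer's vowel units (number is hypothetical).
ANNOUNCER_STORE = 14

# Stand-in for the sender ID / unit storage number correspondence table 16.
SENDER_TO_STORE: Dict[str, int] = {
    "ID#1": 12,
    "ID#2": 13,
}


def select_vowel_store(sender_id: str) -> int:
    """Return the storage number registered for the sender, or the default if none is."""
    return SENDER_TO_STORE.get(sender_id, ANNOUNCER_STORE)
```

Registering a new sender in this sketch only requires adding an entry to the table; the synthesis path itself does not change.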

[0032] The accent information generated by language analysis in the language analysis and phoneme duration determination unit 18 is passed, together with the phoneme durations, to the prosodic parameter generation unit 19, and the phoneme sequence (information on the reading) is passed, together with the phoneme durations, to the phoneme parameter generation unit 20.

[0033] The prosodic parameter generation unit 19 generates prosodic parameters consisting of a pitch pattern according to the accent information and phoneme durations passed from the language analysis and phoneme duration determination unit 18 and a base pitch that it holds itself and that is independent of the sender (for example, the base pitch of the speaker whose speech was the source of the vowel units stored in the speech unit storage unit 14).

[0034] Meanwhile, based on the phoneme sequence (the phoneme sequence of the reading) passed from the language analysis and phoneme duration determination unit 18, the phoneme parameter generation unit 20 reads the necessary speech units: vowel units from the one of the speech unit storage units 12 to 14 connected by the speech unit switching unit 15, and consonant units (here, Cv units and vc units) from the speech unit storage unit 11. It then interpolates and concatenates these units according to the phoneme durations passed from the language analysis and phoneme duration determination unit 18 to generate the phoneme parameters.

[0035] Accordingly, the necessary speech units are read out and used to generate the phoneme parameters from the speech unit storage unit 11 and the speech unit storage unit 12 (which stores vowel units created from speech uttered by the sender with ID #1) if the sender ID passed from the input/output device 7 is ID #1; from the speech unit storage unit 11 and the speech unit storage unit 13 (which stores vowel units created from speech uttered by the sender with ID #2) if it is ID #2; and from the speech unit storage unit 11 and the speech unit storage unit 14 (which stores vowel units created from the announcer's speech) otherwise.
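The split between the shared consonant store and the sender-dependent vowel stores can be sketched as follows. The unit names (`"Cv:ka"`, `"vc:ak"`, `"a"`), the numeric values, and the function `fetch_units` are placeholders invented for illustration, not data or interfaces from the patent.

```python
from typing import Dict, List

Unit = List[List[float]]  # a unit is a list of frames of cepstrum parameters

# Toy stand-in for storage unit 11: Cv and vc units shared by all senders.
COMMON_STORE: Dict[str, Unit] = {
    "Cv:ka": [[0.1, 0.0], [0.2, 0.1]],
    "vc:ak": [[0.3, 0.1], [0.2, 0.0]],
}

# Toy stand-ins for storage units 12 to 14: one frame per vowel, per speaker.
VOWEL_STORES: Dict[int, Dict[str, Unit]] = {
    12: {"a": [[1.0, 0.5]]},
    13: {"a": [[2.0, 0.7]]},
    14: {"a": [[3.0, 0.9]]},
}


def fetch_units(unit_names: List[str], store_number: int) -> List[Unit]:
    """Read vowel units from the selected store and all other units from the common store."""
    vowels = VOWEL_STORES[store_number]
    return [vowels[name] if name in vowels else COMMON_STORE[name]
            for name in unit_names]
```

Only the vowel entries differ between store numbers, which mirrors the point that a new voice needs only a small set of vowel units.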

[0036] The units are concatenated by the phoneme parameter generation unit 20 to generate phoneme parameters as shown in FIG. 3(b). That is, frame repetition intervals and interpolation intervals are inserted and adjusted so that the units are joined while matching each phoneme duration determined by the language analysis and phoneme duration determination unit 18. Here, linear interpolation of each order of the cepstrum parameters is used as the interpolation method.
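Order-by-order linear interpolation between two frames of cepstrum parameters can be sketched as below. How FIG. 3(b) actually arranges the repetition and interpolation intervals is not recoverable from the text, so the layout in `join_units` (hold the last frame of the left unit, then interpolate toward the right unit) is an assumption, as are the function names.

```python
from typing import List

Frame = List[float]  # one frame of cepstrum parameters, one value per order


def interpolate_frames(start: Frame, end: Frame, n_between: int) -> List[Frame]:
    """Return n_between frames linearly interpolated, order by order, between start and end."""
    frames = []
    for step in range(1, n_between + 1):
        t = step / (n_between + 1)
        frames.append([a + (b - a) * t for a, b in zip(start, end)])
    return frames


def join_units(left: List[Frame], right: List[Frame],
               n_repeat: int, n_interp: int) -> List[Frame]:
    """Join two units: repeat the last frame of `left`, then interpolate toward `right`."""
    held = [list(left[-1]) for _ in range(n_repeat)]
    bridge = interpolate_frames(left[-1], right[0], n_interp)
    return left + held + bridge + right
```

Choosing `n_repeat` and `n_interp` so that the total frame count matches the phoneme duration is what adjusts the joined units to the required length.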

[0037] The phoneme parameters generated by the phoneme parameter generation unit 20 are supplied to the synthesizer filter 21. The prosodic parameters generated by the prosodic parameter generation unit 19 are also supplied to the synthesizer filter 21. The synthesizer filter 21 is, for example, an LMA filter (log magnitude approximation filter); it generates excitation pulses based on the prosodic parameters from the prosodic parameter generation unit 19 and produces synthetic speech using the phoneme parameters from the phoneme parameter generation unit 20 as filter coefficients.
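Generating excitation pulses from a pitch pattern can be illustrated with a simple impulse train. This sketch assumes the pitch pattern is given as one value in Hz per 10 msec frame and treats a pitch of 0 as an unvoiced frame left silent; neither convention is stated in the patent, and the LMA filter itself is not modeled here.

```python
from typing import List


def pulse_train(pitch_hz: List[float], sample_rate: int,
                frame_shift_ms: float = 10.0) -> List[float]:
    """Place a unit pulse once per pitch period; a pitch of 0 marks an unvoiced frame."""
    samples_per_frame = int(sample_rate * frame_shift_ms / 1000)
    out: List[float] = []
    next_pulse = 0.0  # position of the next pulse, in samples from the start
    for f0 in pitch_hz:
        frame_start = len(out)
        frame = [0.0] * samples_per_frame
        if f0 > 0:
            period = sample_rate / f0
            while next_pulse < frame_start + samples_per_frame:
                frame[int(next_pulse) - frame_start] = 1.0
                next_pulse += period
        else:
            # Restart the pulse phase at the next frame boundary.
            next_pulse = frame_start + samples_per_frame
        out.extend(frame)
    return out
```

A practical synthesizer would typically drive unvoiced frames with noise rather than silence; that is omitted to keep the sketch short.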

[0038] In speech synthesized in this way, only the vowel portions have the voice quality derived from the selected vowel units (that is, the vowel units read from the one of the speech unit storage units 12 to 14 connected to the phoneme parameter generation unit 20 by the speech unit switching unit 15 in the voice quality switching unit 10). The vowel portions, however, are the part of speech that most strongly affects voice quality.

[0039] Therefore, when mail (the text in it) sent by a specific sender whose sender ID is ID #1 or ID #2 is read aloud, it can be read in that sender's voice quality even though, for everything other than the vowel portions, speech units created from speech uttered by a person unrelated to the sender (the announcer), namely the contents of the speech unit storage unit 11, are used in common. In other words, according to this embodiment, text can be read in the voice quality of a specific sender with ID #1 or ID #2 without preparing, for the non-vowel portions, speech units (or a speech unit storage unit holding them) created from speech uttered by that sender.

[0040] If, on the other hand, the mail is from a sender whose sender ID is other than ID #1 or ID #2, the speech unit storage unit 11 and the speech unit storage unit 14, which store speech units created from the announcer's speech as described above, are used, and the text is read in a fixed voice quality.

[0041] The means for switching voice quality in the speech synthesizer 6 (3) is not limited to that described in the first embodiment. For example, to improve the quality of the synthetic speech, an excitation residual signal may be held together with each speech unit in the speech unit storage units 11 to 14 and passed to the synthesizer filter 21 at synthesis time.

[0042] In the first embodiment, the prosodic parameter generation unit 19 uses a base pitch independent of the sender to generate the prosodic parameters, but this is not a limitation. For example, the base pitches of the three speakers whose speech was the source of the vowel units stored in the speech unit storage units 12 to 14 may be prepared (in three base pitch storage units, one for each), and the base pitch corresponding to the speech unit storage unit selected by the speech unit switching unit 15 may be used. In that case, the voice quality of the synthetic speech comes closer to that of the speaker whose speech was the source of the vowel units.

[0043] (Second Embodiment) A second embodiment of the present invention is described next. FIG. 4 is a system configuration diagram showing a second embodiment of an electronic mail system to which the text-to-speech device of the present invention is applied; parts identical to those in FIG. 1 are given the same reference numerals, and their description is omitted.

[0044] In FIG. 4, reference numerals 41 and 47 denote input/output devices that handle text input and output, among other functions. The input/output devices 41 and 47 differ from the input/output devices 1 and 7 in FIG. 1 in that each has a text and voice-quality-related parameter sending unit 50 which, when sending text, also sends at least the parameters related to voice quality among those required for rule-based speech synthesis, for example the vowel units specific to the sender, the base pitch, and the residual signal (excitation residual signal) of each vowel. Reference numerals 43 and 46 denote speech synthesizers connected to the input/output devices 41 and 47, respectively. The speech synthesizers 43 and 46 read the text sent from the transmitting side aloud by rule-based synthesis, using the voice-quality-related parameters sent together with the text.

[0045] FIG. 5 is a block diagram of the speech synthesizer 46 (43) in FIG. 4. The speech synthesizer 46 (43) has a base pitch storage unit 51 for storing the base pitch that originates on the transmitting side and is passed from the input/output device 47 (41) in FIG. 4; an external speech unit storage unit 52 for storing the vowel units received in the same way; an excitation residual storage unit 53 for storing the excitation residual signal of each vowel, likewise received; a speech unit storage unit 54 (corresponding to the speech unit storage unit 11 in FIG. 2), which stores the consonant units Cv and vowel transition units vc described in the first embodiment; and a control unit 55. The control unit 55 controls the storing of information in the storage units 51 to 53.

[0046] The speech synthesizer 46 (43) also has a language analysis/phoneme length determination unit 58, which generates a phoneme sequence and accent information through language analysis of the text from the transmitting side passed from the input/output device 47 (41) and determines the length of each phoneme; a prosody parameter generation unit 59, which generates prosody parameters based on the accent information and phoneme lengths produced by the language analysis/phoneme length determination unit 58 and on the basic pitch stored in the basic pitch storage unit 51; a phoneme parameter generation unit 60; and a synthesizer filter 61. The phoneme parameter generation unit 60 generates phoneme parameters from the speech units stored in the external speech unit storage unit 52 and the speech unit storage unit 54, in accordance with the phoneme sequence and phoneme lengths produced by the language analysis/phoneme length determination unit 58. The synthesizer filter 61 is, for example, an LMA filter, and generates synthesized speech from the prosody parameters supplied by the prosody parameter generation unit 59, the phoneme parameters supplied by the phoneme parameter generation unit 60, and the sound source residual signals supplied by the sound source residual storage unit 53.

[0047] The operation of the configuration of FIGS. 4 and 5 will now be described, taking as an example the case in which a mail is sent from the input/output device 41 and mail system 2 side, received on the mail system 5 and input/output device 47 side, and read aloud by the speech synthesizer 46.

[0048] First, when a text entered at the input/output device 41 is sent, the text/voice-quality-related parameter sending unit 50 in the device 41 sends to the mail system 2, together with the text, information containing the voice-quality-related parameters unique to the sender of the text, for example the vowel units created from speech uttered by the sender, the basic pitch of the sender's voice, and the residual signal (sound source residual signal) of each vowel.

[0049] The information containing the text and the voice-quality-related parameters (vowel units, basic pitch, and the sound source residual signal of each vowel) sent from the text/voice-quality-related parameter sending unit 50 is transmitted as a mail through the transmitting-side mail system 2 and delivered to the receiving-side mail system 5 via the message communication system 4.
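
As an illustration only (the patent does not specify a message format), the mail described above can be sketched as a text bundled with the three kinds of voice-quality-related parameters. The class names, field names, and the use of JSON are all assumptions made for this sketch.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class VoiceQualityParams:
    # Hypothetical field names; the source only names the three kinds of data.
    basic_pitch_hz: float                      # sender's basic pitch
    vowel_units: Dict[str, List[List[float]]]  # vowel -> frames of spectral parameters
    vowel_residuals: Dict[str, List[float]]    # vowel -> sound source residual samples


@dataclass
class Mail:
    text: str
    voice: VoiceQualityParams


def encode_mail(mail: Mail) -> str:
    """Serialize text and voice parameters into one payload (JSON is an assumption)."""
    return json.dumps(asdict(mail))


def decode_mail(payload: str) -> Mail:
    """Rebuild the mail on the receiving side."""
    raw = json.loads(payload)
    return Mail(text=raw["text"], voice=VoiceQualityParams(**raw["voice"]))
```

Keeping the text and the parameters in one payload mirrors the description, in which both travel through the mail systems as a single mail.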

[0050] The mail delivered to the receiving-side mail system 5 is received by that system. The content of the mail received by the system 5 can be output on the input/output device 47 whenever required. In the configuration of FIG. 4, the input/output device 47 at this point passes to the speech synthesizer 46 the received voice-quality-related parameters (vowel units, basic pitch, and the sound source residual signal of each vowel) together with the received text, so that the synthesizer 46 can output the text in synthesized speech whose voice quality corresponds to the sender.

[0051] The speech synthesis processing in the speech synthesizer 46 is described below. First, the voice-quality-related parameters (vowel units, basic pitch, and the sound source residual signal of each vowel) in the mail passed from the input/output device 47 of FIG. 4 to the speech synthesizer 46 are routed to the control unit 55, shown in FIG. 5, provided in the speech synthesizer 46. Of the (sender-specific) voice-quality-related parameters passed from the input/output device 47, the control unit 55 stores the vowel units in the external speech unit storage unit 52, the basic pitch in the basic pitch storage unit 51, and the sound source residual signal of each vowel in the sound source residual storage unit 53.
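
The routing performed by the control unit 55 can be sketched roughly as follows. This is a minimal sketch, not the patent's implementation: the class name, attribute names, and dictionary keys are hypothetical, and it assumes that each incoming mail simply overwrites the previously stored parameters.

```python
class SynthesizerStores:
    """Hypothetical stand-ins for the storage units 51 to 53."""

    def __init__(self):
        self.basic_pitch = None   # basic pitch storage unit 51
        self.external_units = {}  # external speech unit storage unit 52
        self.residuals = {}       # sound source residual storage unit 53


def store_voice_params(stores, params):
    """Route each kind of parameter to its own store (role of control unit 55)."""
    stores.basic_pitch = params["basic_pitch"]
    stores.external_units = dict(params["vowel_units"])
    stores.residuals = dict(params["vowel_residuals"])
    return stores
```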

[0052] The text in the mail passed from the input/output device 47 of FIG. 4 to the speech synthesizer 46 is routed to the language analysis/phoneme length determination unit 58, shown in FIG. 5, provided in the speech synthesizer 46. The language analysis/phoneme length determination unit 58 performs language analysis on the text and obtains a phoneme sequence, accent information, and the length of each phoneme. The accent information obtained by the language analysis/phoneme length determination unit 58 is passed, together with the phoneme lengths, to the prosody parameter generation unit 59, and the phoneme sequence is passed, together with the phoneme lengths, to the phoneme parameter generation unit 60.

[0053] The prosody parameter generation unit 59 generates prosody parameters consisting of a pitch pattern in accordance with the accent information and phoneme lengths passed from the language analysis/phoneme length determination unit 58 and the sender-specific basic pitch stored in the basic pitch storage unit 51.

[0054] Meanwhile, the phoneme parameter generation unit 60 follows the phoneme sequence (the phoneme sequence of the reading) passed from the language analysis/phoneme length determination unit 58 and reads out the required speech units: vowel units from the external speech unit storage unit 52, and consonant units (here, Cv units and vc units) from the speech unit storage unit 54. It then connects these units by interpolation in accordance with the phoneme lengths passed from the language analysis/phoneme length determination unit 58 to generate the phoneme parameters. The units are connected as described for the first embodiment with reference to FIG. 3(b).
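
A rough sketch of the unit selection and connection just described is given below. It makes several assumptions that are not in the patent: units are lists of parameter frames, unit labels are simple romanized strings, any label that is not a bare vowel is looked up in the built-in store, and the connection is plain linear interpolation over a fixed number of frames (the actual connection follows FIG. 3(b) and the phoneme lengths).

```python
VOWELS = set("aiueo")  # assumed romanized labels for the five Japanese vowels


def select_units(labels, external_store, builtin_store):
    """Take vowel units from the sender's store, all other units from the built-in store."""
    return [
        (external_store if label in VOWELS else builtin_store)[label]
        for label in labels
    ]


def interpolate(a, b, n):
    """Return n frames evenly spaced strictly between frames a and b."""
    return [
        [x + (y - x) * k / (n + 1) for x, y in zip(a, b)]
        for k in range(1, n + 1)
    ]


def connect(units, gap):
    """Concatenate units, bridging each junction with `gap` interpolated frames."""
    out = list(units[0])
    for nxt in units[1:]:
        out += interpolate(out[-1], nxt[0], gap)
        out += nxt
    return out
```

For example, with a built-in unit `"k"` holding the single frame `[0.0]` and a sender-supplied unit `"a"` holding `[1.0]`, connecting them with one bridging frame yields the frames `[0.0]`, `[0.5]`, `[1.0]`.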

[0055] The phoneme parameters generated by the phoneme parameter generation unit 60 are supplied to the synthesizer filter 61, as are the prosody parameters generated by the prosody parameter generation unit 59. The synthesizer filter 61 generates a filter source signal from the prosody parameters supplied by the prosody parameter generation unit 59 and the sound source residual signal of each vowel stored in the sound source residual storage unit 53, and produces synthesized speech using the phoneme parameters from the phoneme parameter generation unit 60 as filter coefficients.
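
The source-filter idea in this step can be illustrated with a simplified sketch. It is not the LMA filter named in the patent: the excitation is built by repeating one residual waveform at every pitch period, and a plain all-pole recursive filter stands in for the synthesizer filter. The function names, a constant pitch, and fixed filter coefficients are all assumptions made for illustration.

```python
def build_excitation(residual, pitch_hz, sample_rate, n_samples):
    """Overlap-add one copy of the residual waveform at every pitch period."""
    period = max(1, round(sample_rate / pitch_hz))
    out = [0.0] * n_samples
    for start in range(0, n_samples, period):
        for i, r in enumerate(residual):
            if start + i < n_samples:
                out[start + i] += r
    return out


def all_pole_filter(excitation, coeffs):
    """Compute y[n] = x[n] - sum_k coeffs[k-1] * y[n-k] (stand-in for the LMA filter)."""
    y = []
    for n, x in enumerate(excitation):
        acc = x
        for k, a in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc -= a * y[n - k]
        y.append(acc)
    return y
```

In this sketch the pitch controls how often the sender's residual is repeated, while the filter coefficients shape the spectrum, which is the division of roles the paragraph above describes.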

[0056] Thus, according to this embodiment (the second embodiment), the vowel portions of the speech, which have the greatest influence on voice quality, are produced from the vowel units sent from the transmitting side and stored in the external speech unit storage unit 52, that is, from vowel units created from the sender's own speech data. Therefore, even though speech units created from speech uttered by a person unrelated to the sender (an announcer), namely the contents of the speech unit storage unit 54, are used in common for everything other than the vowel portions, the receiving side can synthesize speech with the voice quality of the mail sender.

[0057] In the second embodiment, three kinds of voice-quality-related parameters are used: vowel units, basic pitch, and sound source residual signals. However, it is also possible to use, for example, only the vowel units, or only two kinds including the vowel units, and further parameters such as the speaking rate may be added.

[0058] The first and second embodiments of the present invention have been described above, but the invention is not limited to these embodiments. For example, there is no restriction on the type of synthesis parameters or on the method of connecting speech units, and synthesis parameters other than cepstrum parameters, such as LPC (Linear Predictive Coding) parameters, may be used. In short, the invention can be practiced with various modifications without departing from its gist.

【００５９】[0059]

[Effects of the Invention] As described in detail above, the text-to-speech device of the present invention is configured so that, when the receiving side receives a text and reads it aloud in synthesized speech, the voice quality of the synthesized speech is switched according to the transmission source identification code sent together with the text. The user on the receiving side therefore does not grow bored with or tired of the synthesized speech and, moreover, can identify the sender from the voice quality.

[0060] Further, according to the present invention, when a text is sent from the transmitting side to the receiving side, the voice-quality-related parameters required for rule-based speech synthesis are sent along with it, and the receiving side uses those parameters in the rule-based synthesis. The text can therefore be read aloud in the voice quality intended by the sender, and because the voice quality of the reading differs from sender to sender, the receiving-side user does not grow bored with or tired of the synthesized speech and, moreover, can identify the sender from the voice quality.

[Brief description of drawings]

FIG. 1 is a system configuration diagram showing a first embodiment of an electronic mail system to which the text-to-speech device of the present invention is applied.

FIG. 2 is a block diagram showing the configuration of the speech synthesizer 6 (3) in FIG. 1.

FIG. 3 is a diagram for explaining the extraction of speech units and the connection of speech units.

FIG. 4 is a system configuration diagram showing a second embodiment of an electronic mail system to which the text-to-speech device of the present invention is applied.

FIG. 5 is a block diagram showing the configuration of the speech synthesizer 46 (43) in FIG. 4.

[Explanation of symbols]

1, 7, 41, 47 ... input/output device
2, 5 ... mail system
3, 6, 43, 46 ... speech synthesizer
4 ... message communication system
10 ... voice quality switching unit
11 to 14, 54 ... speech unit storage unit
15 ... speech unit switching unit
16 ... sender ID/unit storage unit number correspondence table
17 ... unit switching control unit
18, 58 ... language analysis/phoneme length determination unit
19, 59 ... prosody parameter generation unit
20, 60 ... phoneme parameter generation unit
21, 61 ... synthesizer filter
50 ... text/voice-quality-related parameter sending unit
51 ... basic pitch storage unit
52 ... external speech unit storage unit
53 ... sound source residual storage unit

Continuation of the front page: (51) Int. Cl.⁵ G10L 5/04; identification code F; office reference number 8946-5H

Claims

1. A text-to-speech device in which at least a text and a transmission source identification code are sent to a communication medium on the transmitting side of a communication system, the text and the transmission source identification code are received on the receiving side, and the received text is read aloud in synthesized speech by rule-based speech synthesis, the device comprising a voice quality switching unit for switching the voice quality of the synthesized speech to be read aloud in accordance with the received transmission source identification code.

2. A text-to-speech device in which at least a text is sent to a communication medium on the transmitting side of a communication system, the text is received on the receiving side, and the received text is read aloud in synthesized speech by rule-based speech synthesis, the device comprising text/voice-quality-related parameter sending means for sending, when the text is sent from the transmitting side to the receiving side, at least the parameters related to voice quality among the parameters required for rule-based speech synthesis, wherein the receiving side reads out the text by rule-based synthesis using the parameters sent from the text/voice-quality-related parameter sending means.