JPH09251373A - Sound synthesis method/device

Info

Publication number: JPH09251373A
Application number: JP8058866A
Authority: JP
Inventors: Naoto Iwahashi; 直人岩橋; Keiichi Yamada; 敬一山田; Satoshi Miyazaki; 敏宮崎
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1996-03-15
Filing date: 1996-03-15
Publication date: 1997-09-22

Abstract

PROBLEM TO BE SOLVED: To make it possible to synthesize speech for only the necessary part of a text such as an electronic mail, by recognizing predetermined instruction information contained in an input sentence, extracting from the input sentence the part to be speech-synthesized in accordance with that instruction information, and generating synthesized speech for it. SOLUTION: A command recognition processing unit 1 recognizes predetermined instruction information contained in the input sentence and, in accordance with that information, extracts from the input sentence the part to be speech-synthesized. Specifically, when a text file is input, the command recognition processing unit 1 recognizes (searches for) the speech command '\speech' written at the head of a line. On recognizing the speech command '\speech', it extracts from the text file each line that has the command at its head and supplies the line to the text-to-speech synthesis unit 2 in the subsequent stage.

Description

Detailed Description of the Invention

[0001]

TECHNICAL FIELD The present invention relates to a speech synthesis method and a speech synthesis apparatus, and in particular to a speech synthesis method and apparatus that make it possible to obtain synthesized speech that is easy to understand.

[0002]

DESCRIPTION OF THE RELATED ART FIG. 8 shows the configuration of an example of a conventional text-to-speech synthesizer. This speech synthesizer consists of a text-to-speech synthesis unit 2, an audio output unit 5, and a speaker 6.

[0003] A sentence in mixed kanji and kana (the input sentence), for example Japanese text in the form of a text file, that is to be speech-synthesized is supplied to the text-to-speech synthesis unit 2. The text-to-speech synthesis unit 2 consists of a language processing unit 3 and a speech synthesis unit 4, and the text file is input to the language processing unit 3. The language processing unit 3 first assigns kanji readings to each word or phrase in the input sentence making up the text file. It then analyzes the syntactic structure of the input sentence and, based on the result, adds accent information. After that, the speech synthesis unit 4 generates synthesized speech corresponding to the input sentence while performing prosody control based on the readings and accent information attached to the input sentence.

[0004] That is, the speech synthesis unit 4 concatenates, for example, the phoneme segment data corresponding to the readings attached to the input sentence while performing prosody control, such as emphasis, intonation, and pauses, that suits the content of the input sentence, on the basis of the accent information and other information.

[0005] Specifically, the speech synthesis unit 4 generates, from the accent information and other information, prosodic information for giving the synthesized speech appropriate intonation, emphasis, and pauses, and concatenates the phoneme segment data on the basis of that prosodic information. For example, when the prosodic information includes the pitch pattern of the input sentence, the duration of each phoneme making up the input sentence, and the power of each phoneme, the interval at which the phoneme segment data is concatenated is first adjusted according to the pitch pattern (that is, the pitch period of the phoneme segment data is adjusted), and the number of times the phoneme segment data for a phoneme is repeatedly concatenated is controlled according to that phoneme's duration. In addition, the amplitude of the phoneme segment data for a phoneme is controlled according to that phoneme's power.
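The three controls described above (concatenation interval from the pitch pattern, repetition count from the duration, amplitude from the power) can be illustrated with a minimal sketch. This is not the synthesizer of FIG. 8: the function name `concatenate_unit`, the use of plain sample lists, and the simple overlap-add scheme are all assumptions made for illustration.

```python
from typing import List


def concatenate_unit(unit: List[float], pitch_period: int,
                     duration: int, gain: float) -> List[float]:
    """Overlap-add copies of one phoneme segment to fill `duration` samples.

    pitch_period: spacing in samples between successive copies (pitch pattern)
    duration:     total length of the phoneme in samples (phoneme duration)
    gain:         amplitude scale factor (phoneme power)
    All three parameters are hypothetical stand-ins for the prosodic information.
    """
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    out = [0.0] * duration
    # The number of repetitions follows from the duration and the spacing.
    for start in range(0, duration, pitch_period):
        for i, sample in enumerate(unit):
            if start + i >= duration:
                break  # truncate the last copy at the end of the phoneme
            out[start + i] += gain * sample
    return out


unit = [1.0, 0.5, 0.25]
wave = concatenate_unit(unit, pitch_period=4, duration=10, gain=2.0)
print(wave)
```

A shorter `pitch_period` places the copies closer together (higher pitch), a longer `duration` yields more repetitions, and `gain` scales the amplitude.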

[0006] The speech waveform obtained by concatenating the phoneme segment data on the basis of the prosodic information in this way is supplied to the audio output unit 5. The audio output unit 5 contains, for example, a D/A converter and an amplifier; it D/A-converts the speech data from the text-to-speech synthesis unit 2 (speech synthesis unit 4), adjusts its level appropriately, and supplies it to the speaker 6. The speaker 6 thus outputs synthesized speech corresponding to the input sentence.

[0007] Recently, the Internet has spread rapidly, and messages are increasingly exchanged by electronic mail (e-mail). E-mail can be sent whether or not the recipient is present, and the recipient can read a received e-mail at any time, so, unlike with the telephone, contact does not fail because one party or the other is absent.

[0008] However, reading e-mail requires a terminal such as a computer, so it has been difficult, for example, to check e-mail addressed to oneself while away from home or the office.

[0009] For this reason, services such as NIFTY-Serve (trademark), which provides a so-called personal computer communication service, offer a service that reads e-mail aloud in synthesized speech. With this service, when a user accesses the center station by telephone, the e-mail addressed to that user is read aloud in synthesized speech, so the user can check e-mail even without a computer.

[0010] Such reading of e-mail in synthesized speech is performed by a speech synthesizer like the one shown in FIG. 8.

[0011]

PROBLEM TO BE SOLVED BY THE INVENTION In the speech synthesizer of FIG. 8, speech synthesis is performed with the input text file as the unit of processing. That is, when a text file is input to the speech synthesizer, all of the text data in that file becomes the target of speech synthesis. Consequently, when an e-mail, for example, is input to the speech synthesizer as a text file, the speech synthesis processing described above is applied to all of the text data contained in the e-mail.

[0012] However, as shown in FIG. 9 for example, an e-mail contains, in addition to Japanese sentences in mixed kanji and kana, items such as figures composed of symbols, so-called face marks (for example, (^^)) that express the sender's finer feelings, a mail header containing information such as the route by which the e-mail was delivered, and the sender's so-called signature. If these parts are also turned into synthesized speech, the content becomes difficult to understand.

[0013] The present invention has been made in view of this situation and makes it possible to obtain synthesized speech that is easy to understand.

[0014]

MEANS FOR SOLVING THE PROBLEM The speech synthesis method according to claim 1 is characterized by recognizing predetermined instruction information contained in an input sentence, extracting from the input sentence the part to be speech-synthesized in accordance with that instruction information, and generating only the synthesized speech corresponding to that part.

[0015] The speech synthesis apparatus according to claim 4 is characterized by comprising recognition means for recognizing predetermined instruction information contained in an input sentence, extraction means for extracting from the input sentence the part to be speech-synthesized in accordance with the instruction information recognized by the recognition means, and speech synthesis means for generating synthesized speech corresponding to the part extracted by the extraction means.

[0016] In the speech synthesis method according to claim 1, predetermined instruction information contained in the input sentence is recognized, the part to be speech-synthesized is extracted from the input sentence in accordance with that instruction information, and only the synthesized speech corresponding to that part is generated.

[0017] In the speech synthesis apparatus according to claim 4, the recognition means recognizes predetermined instruction information contained in the input sentence, and the extraction means extracts from the input sentence the part to be speech-synthesized in accordance with the instruction information recognized by the recognition means. The speech synthesis means generates synthesized speech corresponding to the part extracted by the extraction means.

[0018]

DESCRIPTION OF THE PREFERRED EMBODIMENTS Embodiments of the present invention are described below. First, to clarify the correspondence between each means of the invention recited in the claims and the embodiments that follow, the features of the invention are described with the corresponding embodiment (one example only) added in parentheses after each means.

[0019] That is, the speech synthesis apparatus according to claim 4 is a speech synthesis apparatus that generates synthesized speech corresponding to an input sentence, characterized by comprising recognition means (for example, the command recognition processing unit 1 shown in FIGS. 1 and 3) for recognizing predetermined instruction information contained in the input sentence, extraction means (for example, the command recognition processing unit 1 shown in FIGS. 1 and 3) for extracting from the input sentence the part to be speech-synthesized in accordance with the instruction information recognized by the recognition means, and speech synthesis means (for example, the text-to-speech synthesis unit 2 shown in FIGS. 1 and 3) for generating synthesized speech corresponding to the part extracted by the extraction means.

[0020] The speech synthesis apparatus according to claim 5 is characterized by further comprising adding means (for example, the mail header processing unit 31 shown in FIG. 3) for adding instruction information to a predetermined part of the input sentence.

[0021] This description does not, of course, mean that each means is limited to the examples given above.

[0022] FIG. 1 shows the configuration of an embodiment of a speech synthesizer to which the present invention is applied. In the figure, parts corresponding to those in FIG. 8 are given the same reference numerals, and their description is omitted below where appropriate. That is, this speech synthesizer is configured in the same way as the speech synthesizer of FIG. 8, except that a command recognition processing unit 1 is newly provided.

[0023] The command recognition processing unit 1 recognizes predetermined instruction information contained in the input sentence and, in accordance with that instruction information, extracts from the input sentence the part to be speech-synthesized. The command recognition processing unit 1 then supplies only the extracted part to the text-to-speech synthesis unit 2 in the subsequent stage.

[0024] The operation is described next. Here, the instruction information is assumed to designate, line by line, the part of a text file to be speech-synthesized (such instruction information is hereinafter referred to as a speech command where appropriate). A line extends up to a line feed code: the span from the beginning of the text file to the first line feed code is one line, each subsequent span from one line feed code to the next is one line, and the span from the last line feed code to the end of the text file is one line. The speech command is written, for example, as "\speech", and to have a line speech-synthesized, the command is written at the head of that line.
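The line definition and the line-head rule above can be sketched as follows. This is a minimal illustration, assuming that the line feed code is the single character `\n`; the names `split_lines` and `starts_with_command` are hypothetical and do not appear in the publication.

```python
from typing import List

SPEECH_COMMAND = "\\speech"  # the command string "\speech"


def split_lines(text: str) -> List[str]:
    """Split text into lines at each line feed code.

    The span before the first line feed, each span between two line feeds,
    and the span after the last line feed each form one line.
    """
    return text.split("\n")


def starts_with_command(line: str, command: str = SPEECH_COMMAND) -> bool:
    """Return True when the command is written at the head of the line."""
    return line.startswith(command)


sample = "header line\n\\speech read this line\nfooter line"
lines = split_lines(sample)
flags = [starts_with_command(line) for line in lines]
print(flags)
```

A plain prefix test like this would also match a word that merely begins with the command string; a stricter delimiter check is left out of the sketch.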

[0025] Suppose now that the following text file TF1 is input to the command recognition processing unit 1 (the part enclosed between the opening and closing quotation marks represents the content of the text file).

[0026] Text file TF1: "This text is a sample text for explaining an embodiment of the present invention. In conventional text-to-speech synthesis, a whole text file has been processed as one unit, so even when the file contained parts that its author wanted synthesized and parts that the author did not, no selection was possible and all of the text ended up being synthesized. Therefore, even if the author wished, for example, to leave the content written before this line unsynthesized and to synthesize only the content from this line onward, there was no way to select that part for synthesis. With the method of the present invention, however, this becomes possible."

[0027] When text file TF1 is input, the command recognition processing unit 1 recognizes (searches for) the speech command "\speech" written at the head of a line. In this case, however, text file TF1 contains no line with the speech command "\speech" at its head, so the command recognition processing unit 1 does not recognize the command, and none of the lines making up text file TF1 is extracted. As a result, nothing is supplied to the text-to-speech synthesis unit 2 in the subsequent stage, and no synthesized speech is output.

[0028] Next, suppose that the following text file TF2 is input to the command recognition processing unit 1.

[0029] Text file TF2: "This text is a sample text for explaining an embodiment of the present invention. In conventional text-to-speech synthesis, a whole text file has been processed as one unit, so even when the file contained parts that its author wanted synthesized and parts that the author did not, no selection was possible and all of the text ended up being synthesized. Therefore, even if the author wished, for example, to leave the content written before this line unsynthesized and
\speech to synthesize only the content from this line onward, there was
\speech no way to select that part for synthesis. With the
\speech method of the present invention, however, this becomes possible."

[0030] When text file TF2 is input, the command recognition processing unit 1 recognizes (searches for) the speech command "\speech" written at the head of a line. In this case, the speech command "\speech" written at the head of each of the last three lines of text file TF2 is recognized. On recognizing the speech command "\speech", the command recognition processing unit 1 extracts from text file TF2 the lines that have the command at their head and supplies them to the text-to-speech synthesis unit 2 in the subsequent stage. That is, in this case, the text data "to synthesize only the content from this line onward, there was no way to select that part for synthesis. With the method of the present invention, however, this becomes possible." is supplied to the text-to-speech synthesis unit 2 as the input sentence.

[0031] Accordingly, the text-to-speech synthesis unit 2 performs the same speech synthesis processing as in the case of FIG. 8 described above, but only on the lines that have the speech command "\speech" at their head. As a result, the speaker 6 outputs the synthesized speech corresponding to the last three lines of text file TF2: "to synthesize only the content from this line onward, there was no way to select that part for synthesis. With the method of the present invention, however, this becomes possible."

[0032] As described above, the speech command "\speech" written at the head of a line is recognized, and the lines with the command at their head are extracted from text file TF2 and speech-synthesized. The user (the author of the text) can therefore have only the desired part output as synthesized speech simply by writing the speech command "\speech" on the part (here, the lines) to be synthesized.
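The line-based extraction performed by the command recognition processing unit 1 can be sketched as below. This is an illustrative sketch only: the function name `extract_speech_lines`, the `\n` line feed, the removal of the command string from each extracted line, and the joining of extracted lines with a space are assumptions, and the short sample strings are hypothetical rather than the text files of the publication.

```python
def extract_speech_lines(text: str, command: str = "\\speech") -> str:
    """Return only the lines that carry the command at their head.

    The command itself is stripped, and the remaining text of the
    selected lines is joined into one input sentence for synthesis.
    """
    selected = []
    for line in text.split("\n"):
        if line.startswith(command):
            selected.append(line[len(command):].strip())
    return " ".join(selected)


plain = "first line\nsecond line"
marked = "skip me\n\\speech read this\n\\speech and this"

print(repr(extract_speech_lines(plain)))
print(repr(extract_speech_lines(marked)))
```

With no command present the result is empty, so nothing would reach the synthesizer, which mirrors the TF1 case; with commands present only the marked lines survive, which mirrors the TF2 case.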

[0033] In the case described above, the speech command "\speech" is written at the head of every line to be speech-synthesized. When the lines to be synthesized span two or more lines, however, it is also possible to write the speech command "\speech" only at the head of the first line and to enclose the part to be synthesized in braces.

[0034] That is, a text file such as the following TF3, for example, can also be input to the command recognition processing unit 1.

[0035] Text file TF3: "This text is a sample text for explaining an embodiment of the present invention. In conventional text-to-speech synthesis, a whole text file has been processed as one unit, so even when the file contained parts that its author wanted synthesized and parts that the author did not, no selection was possible and all of the text ended up being synthesized. Therefore, even if the author wished, for example, to leave the content written before this line unsynthesized and
\speech{ to synthesize only the content from this line onward, there was
no way to select that part for synthesis. With the
method of the present invention, however, this becomes possible.}"

[0036] In this case, after recognizing the speech command "\speech" and the opening brace "{", the command recognition processing unit 1 supplies only the part from there up to the closing brace "}" to the text-to-speech synthesis unit 2.
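The brace form can be sketched in the same spirit. The following is a minimal illustration under stated assumptions: the function name `extract_braced_speech` is hypothetical, braces are assumed not to nest, the command is accepted only at the head of a line, and line feeds inside the block are collapsed to single spaces.

```python
def extract_braced_speech(text: str, command: str = "\\speech") -> str:
    """Return the text enclosed between 'command{' and the next '}'."""
    opener = command + "{"
    parts = []
    pos = 0
    while True:
        start = text.find(opener, pos)
        if start == -1:
            break
        body_start = start + len(opener)
        # Accept the command only when it is written at the head of a line.
        at_line_head = start == 0 or text[start - 1] == "\n"
        if not at_line_head:
            pos = body_start
            continue
        end = text.find("}", body_start)
        if end == -1:
            break  # unterminated block: ignored in this sketch
        parts.append(text[body_start:end])
        pos = end + 1
    # Collapse line feeds and repeated whitespace into single spaces.
    return " ".join(" ".join(parts).split())


sample = "skip me\n\\speech{ read this\nand this\nand that}"
print(extract_braced_speech(sample))
```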

[0037] Accordingly, in this case too, as with text file TF2, the speaker 6 outputs the synthesized speech corresponding to the last three lines of text file TF3: "to synthesize only the content from this line onward, there was no way to select that part for synthesis. With the method of the present invention, however, this becomes possible."

[0038] In the above, the speech command "\speech", which designates the part of a text file to be speech-synthesized, is used as the instruction information. Alternatively, a command that designates the part of a text file not to be speech-synthesized (hereinafter referred to as a no-speech command where appropriate) can be used as the instruction information. The operation of the speech synthesizer of FIG. 1 is described below on the assumption that this no-speech command is written, for example, as "\mute" and that, when a line is not to be speech-synthesized, the no-speech command "\mute" is written at the head of that line.

[0039] Suppose now that text file TF1 described above is input to the command recognition processing unit 1.

[0040] When text file TF1 is input, the command recognition processing unit 1 recognizes (searches for) the no-speech command "\mute" written at the head of a line. In this case, however, text file TF1 contains no line with the no-speech command "\mute" at its head, so the command recognition processing unit 1 does not recognize the command, and all of the lines making up text file TF1 are therefore extracted.

[0041] As a result, the whole of text file TF1 is supplied to the text-to-speech synthesis unit 2 in the subsequent stage as the input sentence, and the speaker 6 outputs the synthesized speech corresponding to all of the lines making up text file TF1: "This text is a sample text for explaining an embodiment of the present invention. In conventional text-to-speech synthesis, a whole text file has been processed as one unit, so even when the file contained parts that its author wanted synthesized and parts that the author did not, no selection was possible and all of the text ended up being synthesized. Therefore, even if the author wished, for example, to leave the content written before this line unsynthesized and to synthesize only the content from this line onward, there was no way to select that part for synthesis. With the method of the present invention, however, this becomes possible."

[0042] Next, suppose that the following text file TF4 is input to the command recognition processing unit 1.

[0043] Text file TF4: "
\mute This text is a sample text for explaining an embodiment of the present invention.
\mute In conventional text-to-speech synthesis, a whole text file has been processed as one
\mute unit, so even when the file contained parts that its author wanted synthesized
\mute and parts that the author did not, no selection was possible and
\mute all of the text ended up being synthesized.
\mute Therefore, even if the author wished,
\mute for example, to leave the content written before this line unsynthesized and
to synthesize only the content from this line onward, there was
no way to select that part for synthesis. With the
method of the present invention, however, this becomes possible."

[0044] When text file TF4 is input, the command recognition processing unit 1 recognizes (searches for) the no-speech command "\mute" written at the head of a line. In this case, the no-speech command "\mute" written at the head of each of the first through seventh lines of text file TF4 is recognized. On recognizing the no-speech command "\mute", the command recognition processing unit 1 extracts from text file TF4 the part other than the lines that have the command at their head and supplies it to the text-to-speech synthesis unit 2 in the subsequent stage. That is, in this case, the text data "to synthesize only the content from this line onward, there was no way to select that part for synthesis. With the method of the present invention, however, this becomes possible." is supplied to the text-to-speech synthesis unit 2 as the input sentence.

[0045] Accordingly, the text-to-speech synthesis unit 2 performs speech synthesis processing only on the part (lines) other than the lines that have the no-speech command "\mute" at their head. As a result, the speaker 6 outputs the synthesized speech corresponding to the last three lines of text file TF4: "to synthesize only the content from this line onward, there was no way to select that part for synthesis. With the method of the present invention, however, this becomes possible."

[0046] As described above, the no-speech command "\mute" written at the head of a line is recognized, and the lines other than those with the command at their head are extracted from text file TF4 and speech-synthesized. The user can therefore prevent a part from being output as synthesized speech simply by writing the no-speech command "\mute" on the part (here, the lines) that is not to be synthesized. In other words, as with the speech command "\speech" described above, the user can have only the desired part output as synthesized speech simply by writing the no-speech command "\mute" on everything other than the part to be synthesized.
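The complementary behaviour of "\mute" can be sketched as a filter that drops marked lines instead of keeping them. As before, this is only an illustration: the function name `drop_muted_lines`, the `\n` line feed, and the sample strings are assumptions and are not taken from the publication.

```python
def drop_muted_lines(text: str, command: str = "\\mute") -> str:
    """Return the text with every line that starts with the command removed."""
    kept = [line for line in text.split("\n") if not line.startswith(command)]
    return "\n".join(kept)


plain = "first line\nsecond line"
marked = "\\mute header\n\\mute routing info\nread this\nand this"

print(repr(drop_muted_lines(plain)))
print(repr(drop_muted_lines(marked)))
```

When no line carries the command, the whole text passes through unchanged, which mirrors the TF1 case; otherwise only the unmarked lines remain, which mirrors the TF4 case.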

[0047] In the case described above, the no-speech command "\mute" is written at the head of every line that is not to be speech-synthesized. When the lines to be excluded from synthesis span two or more lines, however, it is also possible, as with the speech command "\speech", to write the no-speech command "\mute" only at the head of the first line and to enclose the part to be excluded in braces.

【００４８】即ち、例えば、以下のようなテキストファ
イルＴＦ５を、コマンド認識処理部１に入力するように
することも可能である。That is, for example, the following text file TF5 can be input to the command recognition processing unit 1.

【００４９】テキストファイルＴＦ５：「 \mute{ この文章は、本発明の実施例を説明するため
のテキスト文です。従来テキスト音声合成では、ひとつ
のテキストファイルを単位として合成処理するため、テ
キストファイルの中に、そのテキストの記述者が音声合
成したい部分と音声合成したくない部分があっても、そ
の選択はできず、すべてのテキストを音声合成してしま
っていました。そのため、例えばこの行以前に書かれた
内容は音声合成せず、}この行以降を音声合成したいと
いう要望があったとしても、それを選択して音声合成す
ることはできませんでした。しかし、本発明の方法を使
うとそれが可能になります。」Text file TF5: "\mute{ This is a text for explaining an embodiment of the present invention. Conventional text-to-speech synthesis processes one text file as a unit, so even if the file contained parts that its author wanted synthesized and parts that the author did not, no selection was possible and all of the text was synthesized. Therefore, for example, without synthesizing the content written before this line,} even if there were a request to synthesize speech from this line onward, it was not possible to select that part and synthesize it. However, the method of the present invention makes this possible."

【００５０】この場合、コマンド認識処理部１では、音
声化不可コマンド「\mute」と、括弧「{」が認識された
後、そこから、括弧「}」までの部分を除く部分だけ
が、テキスト音声合成部２に供給される。In this case, after the command recognition processing unit 1 recognizes the non-speech command "\mute" and the opening brace "{", only the text outside the span from there to the closing brace "}" is supplied to the text-to-speech synthesis unit 2.
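The brace form can be illustrated with a short sketch. The function name `strip_muted_span` is hypothetical. Handling only a single span without nesting, and leaving an unclosed span untouched, are assumptions; the text does not specify those cases.

```python
def strip_muted_span(text: str, command: str = "\\mute") -> str:
    opener = command + "{"
    start = text.find(opener)
    if start == -1:
        return text  # no command present: everything is spoken
    end = text.find("}", start + len(opener))
    if end == -1:
        return text  # assumption: an unclosed span is ignored
    # Drop the command, both braces, and everything between them.
    return text[:start] + text[end + 1:]


tf5_like = "\\mute{ Skip this part,} read this part."
print(strip_muted_span(tf5_like))
```

As in the text file TF5, the closing brace may fall in the middle of a line, so the sketch works on the whole text rather than line by line.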

【００５１】従って、この場合も、テキストファイルＴ
Ｆ４における場合と同様に、スピーカ６からは、テキス
トファイルＴＦ５の最後の３行に対応する合成音「この
行以降を音声合成したいという要望があったとしても、
それを選択して音声合成することはできませんでした。
しかし、本発明の方法を使うとそれが可能になりま
す。」が出力される。Therefore, in this case as well, as with the text file TF4, the speaker 6 outputs the synthesized speech corresponding to the last three lines of the text file TF5: "Even if there were a request to synthesize speech from this line onward, it was not possible to select that part and synthesize it. However, the method of the present invention makes this possible."

【００５２】なお、テキストファイルを作成、編集する
ためのエディタには、例えばある行と、他の行とを指定
すると（例えば、マウスなどでクリックすると）、その
指定された行の間にあるすべての行に、音声化コマンド
「\speech」または音声化不可コマンド「\mute」を付さ
せるようにすることが可能である（あるいは、そのすべ
ての行を囲む括弧（{，}）と、音声化コマンド「\speec
h」または音声化不可コマンド「\mute」とをを付させる
ようにすることが可能である）。Note that an editor for creating and editing text files can be designed so that, when one line and another line are designated (for example, by clicking them with a mouse), the speech command "\speech" or the non-speech command "\mute" is attached to every line between the designated lines (or so that braces ({, }) enclosing all of those lines are attached together with "\speech" or "\mute").
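Such an editor function might look like the sketch below. The name `tag_lines`, the 0-based inclusive indices, and the single space placed between the command and the line content are all assumptions made for illustration.

```python
def tag_lines(lines, first, last, command="\\speech"):
    # Accept the two designated lines in either order.
    lo, hi = sorted((first, last))
    # Prefix every line from lo to hi (inclusive) with the command.
    return [
        f"{command} {line}" if lo <= i <= hi else line
        for i, line in enumerate(lines)
    ]


doc = ["header", "first sentence", "second sentence", "signature"]
print(tag_lines(doc, 2, 1))
```

Passing `command="\\mute"` gives the non-speech variant of the same operation.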

【００５３】次に、図２は、本発明を適用したネットワ
ークシステムの一実施例の構成を示している。ユーザ
は、コンピュータ１１を有し、例えばＰＳＴＮ（Public
Switched Telephone Network）やＩＳＤＮ（Integrate
d Service Digital Network）などの公衆網１２、ある
いは図示せぬ専用線を介して、サービスプロバイダ（接
続業者）が有するＳＰ（Service Provider）サーバ１３
に接続されている。そして、ＳＰサーバ１３は、インタ
ーネット１４に接続されている。即ち、コンピュータ１
１は、ＳＰサーバ１３を介して、インターネット１４に
接続されている。Next, FIG. 2 shows the configuration of an embodiment of a network system to which the present invention is applied. A user has a computer 11, which is connected to an SP (Service Provider) server 13 owned by a service provider (connection provider) via a public network 12 such as the PSTN (Public Switched Telephone Network) or ISDN (Integrated Services Digital Network), or via a dedicated line (not shown). The SP server 13 is connected to the Internet 14. That is, the computer 11 is connected to the Internet 14 via the SP server 13.

【００５４】なお、図２の実施例では、コンピュータ１
１のユーザ以外のユーザのコンピュータとして、コンピ
ュータ１５だけが、公衆網１２と、ＳＰサーバ１３と同
様に構成されるＳＰサーバ１６（他のサービスプロバイ
ダが有するサーバ）とを介してインターネット１４に接
続されているが、その他のユーザのコンピュータも同様
にして、ＳＰサーバ１３、あるいは他のサービスプロバ
イダが有するサーバや、大学や企業その他に設置されて
いるサーバ（ホストコンピュータ）を介して、インター
ネット１４に接続されている。In the embodiment of FIG. 2, the only computer shown that belongs to a user other than the user of the computer 11 is the computer 15, which is connected to the Internet 14 via the public network 12 and an SP server 16 (a server owned by another service provider) configured in the same way as the SP server 13. The computers of other users are likewise connected to the Internet 14 via the SP server 13, servers owned by other service providers, or servers (host computers) installed at universities, companies, and elsewhere.

【００５５】また、ユーザは、インターネット１４に直
接接続することも可能であるが、通常は、サービスプロ
バイダと契約し、図２に示したように、公衆網１２を介
して、ＳＰサーバ１３または１６にアクセスすること
で、インターネット１４に接続される。A user can also connect directly to the Internet 14, but normally contracts with a service provider and connects to the Internet 14 by accessing the SP server 13 or 16 via the public network 12, as shown in FIG. 2.

【００５６】インターネット１４においては、ＴＣＰ／
ＩＰ（Transmission Control Protocol/Internet Proto
col）と呼ばれるプロトコルにしたがって、コンピュー
タ相互間で通信を行うようになされている。また、イン
ターネット１４上には、ＷＷＷが構築されており、この
ＷＷＷでは、ＨＴＴＰ（Hyper Text Transfer Protoco
l）と呼ばれるプロトコルにより、データの転送を行
い、ＨＴＭＬ（Hyper TextMarkup Language）で画面を
記述することにより、情報の検索や表示を、簡単に行う
ことができるようになされている。さらに、インターネ
ット１４においては、ＷＷＷの他、例えば、いわゆる電
子メール（Ｅ−ｍａｉｌ）や、パソコン通信でいうとこ
ろの掲示板に相当するネットニュースなどのサービスも
提供されており、端末（コンピュータ）を有するユーザ
どうしは、電子メールのやりとりをしたり、また、特定
のテーマについての記事を書き込み、その記事を読むこ
とができるようになされている。なお、電子メールは、
ＳＭＴＰ（Simple Mail Transfer Protocol）と呼ばれ
るプロトコルで、また、ネットニュースにおける記事
は、ＮＮＴＰ（Network News Transfer Protocol）と呼
ばれるプロトコルで、それぞれ転送されるようになされ
ている。On the Internet 14, computers communicate with one another according to a protocol called TCP/IP (Transmission Control Protocol/Internet Protocol). The WWW is built on the Internet 14. In the WWW, data is transferred by a protocol called HTTP (Hyper Text Transfer Protocol) and screens are described in HTML (Hyper Text Markup Language), so that information can be searched for and displayed easily. In addition to the WWW, the Internet 14 provides services such as electronic mail (e-mail) and netnews, which corresponds to the bulletin boards of personal computer communication services, so that users with terminals (computers) can exchange e-mail, post articles on particular topics, and read those articles. E-mail is transferred by a protocol called SMTP (Simple Mail Transfer Protocol), and netnews articles by a protocol called NNTP (Network News Transfer Protocol).

【００５７】ところで、図２のネットワークシステムに
おいては、コンピュータ１１や１５などの他、例えば電
話機（携帯電話機）１７などによっても公衆網１２を介
して、ＳＰサーバ１３（あるいは１６）にアクセスする
ことができるようになされており、これにより、例えば
コンピュータ１１のユーザは、電話機１７のプッシュボ
タン１７Ａを操作して、ＳＰサーバ１３に所定のコマン
ドを与え、コンピュータ１１のユーザ宛に送信されてき
た電子メールを、合成音で聴くことができるようになさ
れている（このように電子メールを合成音で提供するサ
ービスを、以下、適宜、音声化サービスという）。Incidentally, in the network system of FIG. 2, the SP server 13 (or 16) can be accessed via the public network 12 not only from the computers 11 and 15 but also from, for example, a telephone (mobile telephone) 17. Thus, for example, the user of the computer 11 can operate the push buttons 17A of the telephone 17 to give a predetermined command to the SP server 13 and listen, as synthesized speech, to e-mail sent to that user (a service that provides e-mail as synthesized speech in this way is hereinafter referred to, as appropriate, as a speech service).

【００５８】即ち、例えば、いま、コンピュータ１５の
ユーザが、ソフトウェアである電子メールを作成するた
めの電子メール用アプリケーション１５Ａによって電子
メールを作成し、コンピュータ１１のユーザ宛に送信し
たものとすると、その電子メールは、公衆網１２を介し
て、ＳＰサーバ１６で受信される。ＳＰサーバ１６は、
コンピュータ１１のユーザ宛の電子メールを受信する
と、その電子メールを、インターネット１４を介して、
コンピュータ１１と接続されているＳＰサーバ１３に転
送する。これに対応して、ＳＰサーバ１３では、コンピ
ュータ１１のユーザ宛の電子メールが記憶される。That is, suppose, for example, that the user of the computer 15 creates an e-mail with an e-mail application 15A, which is software for creating e-mail, and sends it to the user of the computer 11. The e-mail is received by the SP server 16 via the public network 12. On receiving the e-mail addressed to the user of the computer 11, the SP server 16 transfers it via the Internet 14 to the SP server 13, to which the computer 11 is connected. The SP server 13 then stores the e-mail addressed to the user of the computer 11.

【００５９】その後、コンピュータ１１のユーザが、コ
ンピュータ１１を操作して、電子メール用アプリケーシ
ョン１１Ａを起動することにより、公衆網１２を介し
て、ＳＰサーバ１３にアクセスし、自身宛の電子メール
を要求すると、ＳＰサーバ１３は、そのユーザ宛の電子
メールを、公衆網１２を介して、コンピュータ１１に送
信する。コンピュータ１１では、ＳＰサーバ１３からの
電子メールが受信され、例えば、図示せぬディスプレイ
に表示される。これにより、ユーザは、自身宛の電子メ
ールを見る（読む）ことができる。Thereafter, when the user of the computer 11 operates the computer 11 to start an e-mail application 11A, thereby accessing the SP server 13 via the public network 12 and requesting e-mail addressed to that user, the SP server 13 transmits the e-mail addressed to that user to the computer 11 via the public network 12. The computer 11 receives the e-mail from the SP server 13 and displays it, for example, on a display (not shown). The user can thus view (read) the e-mail.

【００６０】また、コンピュータ１１のユーザが、電話
機１７のプッシュボタン１７Ａを操作し、音声化サービ
ス専用の電話番号をダイヤルすることにより、公衆網１
２を介して、ＳＰサーバ１３にアクセスすると、ＳＰサ
ーバ１３と電話機１７との間で通信リンクが確立され
る。そして、ユーザが、自身宛の電子メールを要求する
コマンドを、プッシュボタン１７Ａを操作することによ
り入力すると、その操作に対応したプッシュボタン信号
（プッシュトーン信号）（あるいは、ダイヤルパルス）
が、電話機１７から、公衆網１２を介してＳＰサーバ１
３に送信される。Further, when the user of the computer 11 operates the push buttons 17A of the telephone 17 to dial the telephone number dedicated to the speech service and thereby accesses the SP server 13 via the public network 12, a communication link is established between the SP server 13 and the telephone 17. When the user then enters a command requesting e-mail addressed to that user by operating the push buttons 17A, the push-button signal (push-tone signal) (or dial pulses) corresponding to the operation is transmitted from the telephone 17 to the SP server 13 via the public network 12.

【００６１】ＳＰサーバ１３では、電話機１７からプッ
シュボタン信号を受信すると、そのプッシュボタン信号
が解析（解読）される。そして、プッシュボタン信号
が、コンピュータ１１のユーザ宛の電子メールを要求す
るものである場合、ＳＰサーバ１３は、そのユーザ宛の
電子メールを、音声合成処理することにより合成音と
し、公衆網１２を介して、電話機１７に送信する。これ
により、電話機１７のスピーカ１７Ｂからは、電子メー
ルを読み上げた合成音が出力される。On receiving a push-button signal from the telephone 17, the SP server 13 analyzes (decodes) it. If the push-button signal requests e-mail addressed to the user of the computer 11, the SP server 13 converts that e-mail into synthesized speech by speech synthesis processing and transmits it to the telephone 17 via the public network 12. As a result, the speaker 17B of the telephone 17 outputs synthesized speech reading the e-mail aloud.

【００６２】従って、ユーザは、外出先などから、電話
機１７によって自身宛の電子メールを確認することがで
きる。Therefore, the user can check e-mail addressed to them from the telephone 17, for example while away from home or the office.

【００６３】次に、図３は、図２のＳＰサーバ１３の構
成例を示している。通信部２１は、インターネット１４
を介して通信を行ったり、また、公衆網１２を介して、
コンピュータ１１や電話機１７と通信を行うために必要
な通信制御を行うようになされている。コマンド処理部
２２は、ＳＰサーバ１３全体を制御する他、コンピュー
タ１１からの要求や、電話機１７から送信されてくるプ
ッシュボタン信号を解析し、その解析結果に対応した処
理を行うようになされている。テキストメール記憶部２
３は、ＳＰサーバ１３を有する接続業者と契約したユー
ザ宛に送信されてきた電子メールを記憶するようになさ
れている。なお、テキストメール記憶部２３には、接続
業者と契約したユーザ宛の電子メールを記憶する記憶領
域が、各ユーザごとに設けられており（このようにユー
ザごとに設けられた記憶領域を、以下、適宜、メールボ
ックスという）、各ユーザ宛の電子メールは、コマンド
処理部２２の制御の下、そのユーザのメールボックスに
記憶されるようになされている。Next, FIG. 3 shows a configuration example of the SP server 13 of FIG. 2. The communication unit 21 performs the communication control needed to communicate via the Internet 14 and to communicate with the computer 11 and the telephone 17 via the public network 12. The command processing unit 22 controls the entire SP server 13, and also analyzes requests from the computer 11 and push-button signals transmitted from the telephone 17 and performs processing corresponding to the analysis results. The text mail storage unit 23 stores e-mail sent to users who have contracted with the connection provider that owns the SP server 13. The text mail storage unit 23 has a storage area for each such user in which e-mail addressed to that user is stored (a storage area provided for each user in this way is hereinafter referred to, as appropriate, as a mailbox), and e-mail addressed to each user is stored in that user's mailbox under the control of the command processing unit 22.

【００６４】音声合成装置２４は、メールヘッダ処理部
３１が新たに設けられている他は、図１の音声合成装置
と同様に構成されている。メールヘッダ処理部３１は、
電子メールのヘッダ（メールヘッダ）の所定の部分に対
し、指示情報（音声化コマンド「\speech」または音声
化不可コマンド「\mute」）を付加し、コマンド認識処
理部１に出力するようになされている。The speech synthesizer 24 is configured in the same way as the speech synthesizer of FIG. 1, except that a mail header processing unit 31 is newly provided. The mail header processing unit 31 adds instruction information (the speech command "\speech" or the non-speech command "\mute") to predetermined parts of the header of an e-mail (the mail header) and outputs the result to the command recognition processing unit 1.

【００６５】以上のように構成されるＳＰサーバ１３に
おいては、例えばコンピュータ１５のユーザから、イン
ターネット１４を介して、コンピュータ１１のユーザ宛
に電子メールが送信されてくると、その電子メールは、
通信部２１で受信され、コマンド処理部２２の制御の
下、テキストメール記憶部２３に転送されて記憶され
る。In the SP server 13 configured as described above, when, for example, an e-mail addressed to the user of the computer 11 is sent from the user of the computer 15 via the Internet 14, the e-mail is received by the communication unit 21 and, under the control of the command processing unit 22, transferred to and stored in the text mail storage unit 23.

【００６６】その後、コンピュータ１１のユーザが、コ
ンピュータ１１を操作することにより、公衆網１２を介
して、ＳＰサーバ１３にアクセスし、自身宛の電子メー
ルを要求すると、コマンド処理部２２は、そのユーザ用
のメールボックス（テキストメール記憶部２３）から電
子メールを読み出し、通信部２１に送信させる。これに
より、電子メールは、通信部２１から、公衆網１２を介
して、コンピュータ１１に送信され、ユーザは、自身宛
の電子メールを見ることができる。After that, when the user of the computer 11 operates the computer 11 to access the SP server 13 via the public network 12 and requests e-mail addressed to that user, the command processing unit 22 reads the e-mail from that user's mailbox (in the text mail storage unit 23) and has the communication unit 21 transmit it. The e-mail is thus transmitted from the communication unit 21 to the computer 11 via the public network 12, and the user can view it.

【００６７】また、電話機１７のユーザ（ここでは、コ
ンピュータ１１のユーザでもある）が、電話機１７によ
って、公衆網１２を介して、ＳＰサーバ１３にアクセス
し、自身宛の電子メールを要求するように、プッシュボ
タン１７Ａを操作すると、その操作に対応するプッシュ
ボタン信号は、電話機１７からＳＰサーバ１３に送信さ
れ、通信部２１において受信される。コマンド処理部２
２では、通信部２１で受信されたプッシュボタン信号が
解読され、その解読結果に対応して、電話機１７のユー
ザ宛の電子メールが、テキストメール記憶部２３から読
み出され、音声合成装置２４に転送される。Further, when the user of the telephone 17 (here, also the user of the computer 11) accesses the SP server 13 from the telephone 17 via the public network 12 and operates the push buttons 17A so as to request e-mail addressed to that user, the push-button signal corresponding to the operation is transmitted from the telephone 17 to the SP server 13 and received by the communication unit 21. The command processing unit 22 decodes the push-button signal received by the communication unit 21 and, in accordance with the result, e-mail addressed to the user of the telephone 17 is read from the text mail storage unit 23 and transferred to the speech synthesizer 24.

【００６８】ここで、コンピュータ１１のユーザ宛の電
子メール（入力文）は、例えば図４に示すように、音声
合成処理の対象とする行の行頭に、音声化コマンド「\s
peech」が付加されているものとする。Here, it is assumed that the e-mail (input sentence) addressed to the user of the computer 11 has the speech command "\speech" added at the beginning of each line to be subjected to speech synthesis, as shown for example in FIG. 4.

【００６９】従って、この場合、コマンド処理部２２か
ら音声合成装置２４に対して転送された電子メールが、
メールヘッダ処理部３１を介して、そのままコマンド認
識処理部１に供給された場合、音声化コマンド「\speec
h」が付加された行だけが音声合成処理の対象とされる
ため、メールヘッダや、本文中に記述されたフェイスマ
ークおよび図、さらには、署名における電子メールアド
レスなどの部分は合成音とされず、その結果、電子メー
ルの内容を理解し易い合成音を得ることができる。Therefore, in this case, if the e-mail transferred from the command processing unit 22 to the speech synthesizer 24 is supplied as is to the command recognition processing unit 1 via the mail header processing unit 31, only the lines to which the speech command "\speech" has been added are subjected to speech synthesis. The mail header, the face marks and figures in the body, the e-mail address in the signature, and similar parts are therefore not converted into synthesized speech, and as a result synthesized speech that makes the content of the e-mail easy to understand is obtained.

【００７０】ところで、図４に示した電子メールに対す
る音声化コマンド「\speech」の付加は、その電子メー
ルの差出人（ここでは、コンピュータ１５のユーザ）に
よって行われるが、ユーザが音声化コマンド「\speec
h」を付加することができるのは、電子メールの本文と
署名の部分に限られる。即ち、電子メールのメールヘッ
ダの部分については、電子メールの差出人であるユーザ
が、勝手にその形式を変更することはできず、従って、
音声化コマンド「\speech」を付加することはできな
い。Incidentally, the speech command "\speech" is added to the e-mail shown in FIG. 4 by the sender of the e-mail (here, the user of the computer 15), but the user can add "\speech" only to the body and signature of the e-mail. That is, the user who sends the e-mail cannot freely change the format of the mail header, and therefore cannot add "\speech" to it.

【００７１】しかしながら、メールヘッダにおけるメー
ルの送信者（差出人）（From:で始まる部分）や、タイ
トル（Subject:で始まる部分）、日付（Date:で始まる
部分）などは、電子メールによるコミュニケーションを
図る上で重要な情報（以下、適宜、重要情報という）で
あり、このような部分まで、音声合成処理の対象から除
外してしまうのは好ましくない。However, the sender of the mail (the part beginning with From:), the title (the part beginning with Subject:), the date (the part beginning with Date:), and the like in the mail header are important information for communicating by e-mail (hereinafter referred to, as appropriate, as important information), and it is undesirable to exclude even these parts from speech synthesis.

【００７２】そこで、音声合成装置２４のメールヘッダ
処理部３１は、メールヘッダの重要情報が記述された行
の行頭に、音声化コマンド「\speech」を付加するよう
になされている。Therefore, the mail header processing unit 31 of the speech synthesizer 24 adds the speech command "\speech" to the beginning of each line of the mail header in which important information is written.
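The header tagging performed by the mail header processing unit 31 might be sketched as below. The function name `tag_important_headers` and the sample message are hypothetical. The sketch assumes that the header ends at the first blank line and that field names are matched case-sensitively, which is a simplification.

```python
IMPORTANT_PREFIXES = ("From:", "Subject:", "Date:")


def tag_important_headers(mail: str, command: str = "\\speech") -> str:
    # Split at the first blank line: header before it, body after it.
    header, sep, body = mail.partition("\n\n")
    tagged = [
        f"{command} {line}" if line.startswith(IMPORTANT_PREFIXES) else line
        for line in header.split("\n")
    ]
    # The body is passed through unchanged; the sender tagged it already.
    return "\n".join(tagged) + sep + body


mail = (
    "Received: from relay.example\n"
    "From: sender@example.com\n"
    "Subject: sample\n"
    "\n"
    "\\speech Hello.\n"
    ":-)\n"
)
print(tag_important_headers(mail))
```

Because the body is left untouched, commands written by the sender and commands added on the server side can be recognized by the same downstream step.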

【００７３】即ち、コマンド処理部２２から音声合成装
置２４に対して、電子メールが転送されてくると、メー
ルヘッダ処理部３１において、その電子メールのメール
ヘッダから、重要情報が記述された行（例えば、Fro
m:，Subject:，Dateそれぞれで始まる行）が検出され
る。さらに、メールヘッダ処理部３１では、その重要情
報が記述された行の行頭に、音声化コマンド「\speec
h」が付加され、これにより、図４に示した電子メール
は、図５に示すようにされる。そして、図５に示すよう
にされた電子メールは、コマンド認識処理部１に出力さ
れ、以下、図１で説明した場合と同様の音声合成処理が
行われる。That is, when an e-mail is transferred from the command processing unit 22 to the speech synthesizer 24, the mail header processing unit 31 detects, in the mail header of the e-mail, the lines in which important information is written (for example, the lines beginning with From:, Subject:, and Date:). The mail header processing unit 31 then adds the speech command "\speech" to the beginning of each such line, so that the e-mail shown in FIG. 4 becomes as shown in FIG. 5. The e-mail as shown in FIG. 5 is output to the command recognition processing unit 1, and thereafter the same speech synthesis processing as described with reference to FIG. 1 is performed.

【００７４】音声合成装置２４において得られた合成音
（音声出力部３より出力される合成音）は、コマンド処
理部２２を介して、通信部２１に供給され、通信部２１
では、その合成音が、公衆網１２を介して、電話機１７
に送信される。その結果、電話機１７のスピーカ１７Ｂ
からは、図５において、行頭に音声化コマンド「\speec
h」が記述された行に対応する合成音「Subject: mail no example Date: Wed, 6 Dec 1995 17:59:46 From: mail_sender@***.***.sony.co.jp 宮崎＠ソニーです。一通のメールには、上に示すような、メールヘッダ、送
信者の感情を表現するフェイスマーク、記号を組み合わ
せて作った図、それから一番下に示すような、メール送
信者の署名などが含まれています。宮崎敏＠ソニー株
式会社」が出力される。The synthesized speech obtained by the speech synthesizer 24 (the synthesized speech output from the speech output unit 5) is supplied to the communication unit 21 via the command processing unit 22, and the communication unit 21 transmits it to the telephone 17 via the public network 12. As a result, the speaker 17B of the telephone 17 outputs the synthesized speech corresponding to the lines of FIG. 5 that begin with the speech command "\speech": "Subject: mail no example Date: Wed, 6 Dec 1995 17:59:46 From: mail_sender@***.***.sony.co.jp This is Miyazaki@Sony. A single e-mail contains things such as the mail header shown above, face marks expressing the sender's feelings, figures made by combining symbols, and the sender's signature shown at the bottom. Satoshi Miyazaki @ Sony Corporation"

【００７５】従って、この場合、コンピュータ１５のユ
ーザは、重要な部分についてだけ、コンピュータ１１の
ユーザに聴いてもらうことができ、また、コンピュータ
１１のユーザは、不必要な合成音を聴かさせる煩雑さか
ら解放されることになる。即ち、正確で迅速なコミュニ
ケーションを図ることが可能となる。Therefore, in this case, the user of the computer 15 can have the user of the computer 11 listen to only the important parts, and the user of the computer 11 is spared the annoyance of being made to listen to unnecessary synthesized speech. That is, accurate and quick communication becomes possible.

【００７６】次に、上述の場合においては、音声合成処
理の対象とする行の行頭に、音声化コマンド「\speec
h」が付加された電子メールを対象としたが、音声化不
可コマンド「\mute」が付加された電子メールについて
も、同様に処理することが可能である。In the case described above, the e-mail had the speech command "\speech" added to the beginning of each line to be subjected to speech synthesis, but e-mail to which the non-speech command "\mute" has been added can be processed in the same way.

【００７７】即ち、コンピュータ１１のユーザ宛の電子
メールが、例えば図６に示すようなものであった場合、
メールヘッダ処理部３１においては、その電子メールの
メールヘッダから、重要情報が記述された行（ここで
は、上述したように、From:，Subject:，Dateそれぞれ
で始まる行）が検出される。さらに、メールヘッダ処理
部３１では、その重要情報が記述された行を除く行の行
頭に、音声化不可コマンド「\mute」が付加され、これ
により、図６に示した電子メールは、図７に示すように
される。そして、図７に示すようにされた電子メール
は、コマンド認識処理部１に出力され、以下、図１で説
明した場合と同様の音声合成処理が行われる。That is, when the e-mail addressed to the user of the computer 11 is, for example, as shown in FIG. 6, the mail header processing unit 31 detects, in the mail header of the e-mail, the lines in which important information is written (here, as described above, the lines beginning with From:, Subject:, and Date:). The mail header processing unit 31 then adds the non-speech command "\mute" to the beginning of each line other than those in which the important information is written, so that the e-mail shown in FIG. 6 becomes as shown in FIG. 7. The e-mail as shown in FIG. 7 is output to the command recognition processing unit 1, and thereafter the same speech synthesis processing as described with reference to FIG. 1 is performed.
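The "\mute" variant can be sketched end to end: tag the unimportant header lines, then keep whatever is not muted. The names `mute_other_headers` and `spoken_lines` and the sample message are hypothetical. The sketch assumes that only header lines (those before the first blank line) are tagged by the server, since tagging body lines would silence the whole message.

```python
IMPORTANT_PREFIXES = ("From:", "Subject:", "Date:")
MUTE = "\\mute"


def mute_other_headers(mail: str) -> str:
    # Header lines are those before the first blank line.
    header, sep, body = mail.partition("\n\n")
    tagged = [
        line if line.startswith(IMPORTANT_PREFIXES) else f"{MUTE} {line}"
        for line in header.split("\n")
    ]
    return "\n".join(tagged) + sep + body


def spoken_lines(mail: str) -> list[str]:
    # Skip blank lines and lines beginning with the non-speech command.
    return [l for l in mail.splitlines() if l and not l.startswith(MUTE)]


mail = (
    "Received: from relay.example\n"
    "Subject: sample\n"
    "From: sender@example.com\n"
    "\n"
    "Hello.\n"
    "\\mute :-)\n"
)
print(spoken_lines(mute_other_headers(mail)))
```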

【００７８】従って、この場合、音声合成装置２４から
は、図７において、行頭に音声化不可コマンド「\mut
e」が記述された行を除く行に対応する合成音「Subject: mail no example Date: Wed, 6 Dec 1995 17:59:46 From: mail_sender@***.***.sony.co.jp 宮崎＠ソニーです。一通のメールには、上に示すような、メールヘッダ、送
信者の感情を表現するフェイスマーク、記号を組み合わ
せて作った図、それから一番下に示すような、メール送
信者の署名などが含まれています。宮崎敏＠ソニー株
式会社」が出力される。Therefore, in this case, the speech synthesizer 24 outputs the synthesized speech corresponding to the lines of FIG. 7 other than those beginning with the non-speech command "\mute": "Subject: mail no example Date: Wed, 6 Dec 1995 17:59:46 From: mail_sender@***.***.sony.co.jp This is Miyazaki@Sony. A single e-mail contains things such as the mail header shown above, face marks expressing the sender's feelings, figures made by combining symbols, and the sender's signature shown at the bottom. Satoshi Miyazaki @ Sony Corporation"

【００７９】以上、本発明を適用した音声合成装置およ
びネットワークシステムについて説明したが、本発明
は、その他、音声合成を用いるあらゆるシステムに適用
可能である。Although the speech synthesizer and the network system to which the present invention is applied have been described above, the present invention can be applied to all other systems using speech synthesis.

【００８０】なお、本実施例においては、行頭に、音声
化コマンドまたは音声化不可コマンドを記述し、行単位
で、音声合成を行うかどうかを指示するようにしたが、
音声合成を行うかどうかは、その他の任意の単位で指示
するようにすることが可能である。In this embodiment, a speech command or a non-speech command is written at the beginning of a line so that whether to perform speech synthesis is indicated line by line, but this may instead be indicated in any other unit.

【００８１】また、本実施例では、音声化コマンドとし
て、「\speech」を用いるようにしたが、音声化コマン
ドとしては、その他の任意の表記（例えば、「\voice」
や「\音声合成」など）を用いることが可能である。同
様に、音声化不可コマンドについても、「\mute」以外
の任意の表記（例えば、「\del」や「\音声化不可」な
ど）を用いることが可能である。但し、音声化コマンド
および音声化不可コマンドとしては、テキストファイル
において記述されない（あるいは、記述される頻度が少
ない）表記とするのが望ましい。In this embodiment, "\speech" is used as the speech command, but any other notation (for example, "\voice" or "\音声合成" ("speech synthesis")) may be used. Similarly, any notation other than "\mute" (for example, "\del" or "\音声化不可" ("not to be spoken")) may be used for the non-speech command. However, it is desirable that the speech command and the non-speech command use notations that do not appear (or appear only rarely) in text files.

【００８２】さらに、本実施例においては、音声化コマ
ンドまたは音声化不可コマンドのうちのいずれか一方だ
けをテキストファイルに記述するようにしたが、テキス
トファイルには、例えば音声化コマンドおよび音声化不
可コマンドの両方を混在させるようにすることも可能で
ある。但し、この場合、音声化コマンドおよび音声化不
可コマンドのいずれも記述されていない行については、
音声合成処理の対象とするのか、または対象としないの
かを、あらかじめ決めておく必要がある。Further, in this embodiment only one of the speech command and the non-speech command is written in a text file, but both may be mixed in a single text file. In that case, however, it must be decided in advance whether lines that carry neither command are to be subjected to speech synthesis.
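The need for a predetermined default when both commands are mixed can be made concrete with a small sketch. The function name `select_lines` and the flag `speak_untagged` are hypothetical, and stripping the "\speech" prefix before synthesis is an assumption.

```python
SPEECH, MUTE = "\\speech", "\\mute"


def select_lines(text: str, speak_untagged: bool) -> list[str]:
    selected = []
    for line in text.splitlines():
        if line.startswith(SPEECH):
            # Explicitly requested: speak it, without the command itself.
            selected.append(line[len(SPEECH):].strip())
        elif line.startswith(MUTE):
            continue  # explicitly excluded
        elif speak_untagged:
            selected.append(line)  # untagged line: follow the default
    return selected


text = "\\speech Read me.\n\\mute Skip me.\nUntagged line."
print(select_lines(text, speak_untagged=True))
print(select_lines(text, speak_untagged=False))
```

The two calls differ only in how the untagged line is treated, which is exactly the decision the paragraph says must be fixed beforehand.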

【００８３】また、本実施例では特に言及しなかった
が、テキストファイルを記述する言語は、特に限定され
るものではない。即ち、テキストファイルは、例えば日
本語や、英語、フランス語、あるいは、２以上の言語が
混在したものであっても良い。Although not particularly mentioned in this embodiment, the language for describing the text file is not particularly limited. That is, the text file may be, for example, Japanese, English, French, or a mixture of two or more languages.

【００８４】さらに、図２の実施例では、コンピュータ
ネットワークとして、インターネット１４を利用した場
合について説明したが、本発明は、その他のコンピュー
タネットワークを利用したネットワークシステムにも適
用可能である。Further, in the embodiment shown in FIG. 2, the case where the Internet 14 is used as the computer network has been described, but the present invention can be applied to a network system using other computer networks.

【００８５】また、図２の実施例においては、メールヘ
ッダにおけるメールの送信者、タイトル、および日付を
重要情報とするようにしたが、重要情報の設定は、任意
に行うことが可能である。即ち、重要情報として、メー
ルの送信者、タイトル、および日付以外の、例えば電子
メールのコピーの転送先（Cc:で始まる行）などを設定
したり、また、メールヘッダのすべてを重要情報とする
ことも可能である。さらに、メールヘッダのすべてを重
要情報としないようにすることも可能である。In the embodiment of FIG. 2, the sender, title, and date of the mail in the mail header are treated as important information, but important information may be set arbitrarily. For example, items other than the sender, title, and date, such as the destinations of copies of the e-mail (the line beginning with Cc:), may be set as important information, or the entire mail header may be treated as important information. It is also possible to treat none of the mail header as important information.

【００８６】さらに、図２の実施例では、音声合成装置
２４を、ＳＰサーバ１３に設けるようにしたが、音声合
成装置２４をコンピュータ１１に設け、コンピュータ１
１に、受信した電子メールを読み上げさせるようにする
ことなども可能である。Further, in the embodiment of FIG. 2, the speech synthesizer 24 is provided in the SP server 13, but the speech synthesizer 24 may instead be provided in the computer 11 so that the computer 11 reads received e-mail aloud.

【００８７】[0087]

【発明の効果】請求項１に記載の音声合成方法および請
求項４に記載の音声合成装置によれば、入力文に含まれ
る所定の指示情報が認識され、その指示情報に対応し
て、入力文から、音声合成する部分が抽出されて合成音
が生成される。従って、必要な部分についての、理解し
易い合成音を得ることが可能となる。According to the speech synthesis method of claim 1 and the speech synthesis apparatus of claim 4, predetermined instruction information included in an input sentence is recognized, and in accordance with the instruction information the part to be speech-synthesized is extracted from the input sentence and synthesized speech is generated. It is therefore possible to obtain easily understood synthesized speech for the necessary part.

[Brief description of drawings]

【図１】本発明を適用した音声合成装置の一実施例の構
成を示すブロック図である。FIG. 1 is a block diagram showing the configuration of an embodiment of a speech synthesizer to which the present invention is applied.

【図２】本発明を適用したネットワークシステムの一実
施例の構成を示す図である。FIG. 2 is a diagram showing a configuration of an embodiment of a network system to which the present invention is applied.

【図３】図２のＳＰサーバ１３の構成例を示すブロック
図である。FIG. 3 is a block diagram showing a configuration example of an SP server 13 of FIG.

【図４】電子メールを示す図である。FIG. 4 is a diagram showing an electronic mail.

【図５】電子メールを示す図である。FIG. 5 is a diagram showing an electronic mail.

【図６】電子メールを示す図である。FIG. 6 is a diagram showing an electronic mail.

【図７】電子メールを示す図である。FIG. 7 is a diagram showing an electronic mail.

【図８】従来の音声合成装置の一例の構成を示すブロッ
ク図である。FIG. 8 is a block diagram showing a configuration of an example of a conventional speech synthesizer.

【図９】電子メールを示す図である。FIG. 9 is a diagram showing an electronic mail.

[Explanation of symbols]

１コマンド認識処理部，２テキスト音声合成部，
３言語処理部，４音声合成部，５音声出力
部，６スピーカ，１１コンピュータ，１２公
衆網，１３ＳＰサーバ，１４インターネット，
１５コンピュータ，１６ＳＰサーバ，１７
電話機，１７Ａプッシュボタン，１７Ｂスピー
カ，２１通信部，２２コマンド処理部，２３
テキストメール記憶部，３１メールヘッダ処理部1 command recognition processing unit, 2 text-to-speech synthesis unit, 3 language processing unit, 4 speech synthesis unit, 5 speech output unit, 6 speaker, 11 computer, 12 public network, 13 SP server, 14 Internet, 15 computer, 16 SP server, 17 telephone, 17A push buttons, 17B speaker, 21 communication unit, 22 command processing unit, 23 text mail storage unit, 31 mail header processing unit

Continuation of the front page: (51) Int. Cl.⁶ H04L 12/54, 12/58; Office reference number 9466-5K; FI H04L 11/20 101B

Claims

[Claims]

1. A speech synthesis method for generating synthesized speech corresponding to an input sentence, characterized by: recognizing predetermined instruction information included in the input sentence; extracting, in accordance with the instruction information, a part to be speech-synthesized from the input sentence; and generating synthesized speech corresponding only to the part to be speech-synthesized.

2. The speech synthesis method according to claim 1, wherein the instruction information indicates a part of the input sentence that is to be speech-synthesized or a part that is not to be speech-synthesized.

3. The speech synthesis method according to claim 1, wherein the instruction information indicates, line by line, a part of the input sentence that is to be speech-synthesized or a part that is not to be speech-synthesized.

4. A speech synthesis apparatus for generating synthesized speech corresponding to an input sentence, characterized by comprising: recognition means for recognizing predetermined instruction information included in the input sentence; extraction means for extracting, in accordance with the instruction information recognized by the recognition means, a part to be speech-synthesized from the input sentence; and speech synthesis means for generating synthesized speech corresponding to the part to be speech-synthesized extracted by the extraction means.

5. The speech synthesis apparatus according to claim 4, further comprising addition means for adding the instruction information to a predetermined part of the input sentence.

6. The speech synthesis apparatus according to claim 5, wherein the input sentence is an electronic mail transmitted via a computer network, and the addition means adds the instruction information to a predetermined part of a header of the electronic mail.