JP2001350489A - Voice synthesizer

Info

Publication number: JP2001350489A
Application number: JP2000170370A
Authority: JP
Inventors: Eiji Komatsu; 英二小松
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2000-06-07
Filing date: 2000-06-07
Publication date: 2001-12-21
Anticipated expiration: 2020-06-07
Also published as: JP3464435B2

Abstract

PROBLEM TO BE SOLVED: To provide a speech synthesizer that makes few errors in reading heteronyms (words written identically but read differently), reads a mail aloud with prosody and voice quality appropriate to its contents, and achieves high intelligibility. SOLUTION: The synthesizer is provided with means (115, 116 and 117) that group mail addresses and store hierarchical data obtained by arranging the groups into a hierarchy; means (118 and 106) that attach node names of the hierarchical data to words and register those words in a user word dictionary; and a means that generates a list of node names of the hierarchical data from information including the sender, destination, and carbon-copy destination of a mail. When the user word dictionary (106) is searched, the members of the list are collated with the node names attached to the words in the dictionary, and text analysis is conducted using only the words that match.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a speech synthesizer for reading electronic mail aloud.

[0002]

[Prior art] A conventional method for distinguishing the readings of heteronyms in text-to-speech conversion is described in Reference 1 ("Distinguishing the Readings of Heteronyms Using Decision Lists", Yoshiyuki Umemura and Tsukasa Shimizu, Machine Recognition Laboratory, Toyota Central R&D Labs, Proceedings of the 4th Annual Meeting of the Association for Natural Language Processing, March 1998). A heteronym is a word that has a single written form but multiple readings. For example, 「市場」 is a heteronym that, as a common noun, has two readings, 「シジョウ」 (shijō) and 「イ↓チバ」 (i↓chiba), where "↓" marks the position at which the accent falls.

[0003] Natural language analysis using decision lists is described in Reference 2 ("Extraction and Evaluation of Preference Information on Japanese Subordinate Clause Dependencies from a Corpus", Takehito Utsuro et al., Journal of Natural Language Processing, Vol. 6, No. 7, October 1999).

[0004] A method of morphological analysis, which divides a Japanese sentence into words and determines their parts of speech and readings, is described in Reference 3 ("Japanese Morphological Analysis Using Probabilistic Decision Trees", Hideki Kashioka et al., Proceedings of the 3rd Annual Meeting of the Association for Natural Language Processing, March 1997). In this method, attributes describing words are prepared, words are classified according to their attribute values, and learning is performed by building a decision tree with probabilities. At analysis time, the occurrence probability of each word is obtained from the decision tree using the set of values of the word's attributes, and the combination of words with the higher occurrence probability is output as the word segmentation.

[0005] Differences in accent, duration, and pauses between speakers are reported in Reference 4 ("Analysis of Prosodic Features of Japanese Conversational Speech in Round-Table Discussions and Rakugo", Shoichi Takeda et al., Journal of the Acoustical Society of Japan, Vol. 54, No. 3, 1998). Prosody refers not to the segmental features of individual vowels and consonants but to features given to a stretch of speech consisting of multiple phonemes; it covers accent, intonation, stress, emphasis, prominence, rhythm, tempo, pause, and the like (from the entry "prosodic features" in the [Acoustic Term Dictionary]). In a speech synthesizer, selecting a prosody corresponds to selecting prosody prediction tables such as a duration prediction table, an accent prediction table, a pitch prediction table, and a pause length prediction table.

[0006] A method that changes the voice quality for each mail sender when reading mail aloud, so that the sender is easy to identify, is described in Reference 5 (Japanese Unexamined Patent Publication No. H11-102198, "Message processing device, message processing method, and medium recording a message processing program"). Voice quality refers to the auditory characteristics of speech as a whole, perceived from the speech wave, other than the phonemes that carry linguistic information; it conveys information on the speaker's individuality (who is speaking), information related to emotion (in what mental state the speaker is speaking), and the like (from the entry "voice quality" in the [Acoustic Term Dictionary]). In a speech synthesizer, selecting a voice quality corresponds to selecting a unit set (the complete set of speech units created from one speaker).

[0007]

[Problems to be solved by the invention] E-mail covers a wide variety of fields, so when e-mail is read by speech synthesis with the conventional way of using prosody and voice quality, the readings chosen were often far removed from the context of the mail. A first object of the present invention is to provide a speech synthesizer that makes few errors in reading heteronyms by selecting readings suited to the context of the mail, using the sender, destination, copy-destination, and similar information attached to the mail together with a database in the device.

[0008] Likewise, because e-mail covers a wide variety of fields, the conventional way of using prosody and voice quality often produced prosody and voice quality that did not suit the sender or contents of the mail. A second object of the present invention is to provide a highly intelligible speech synthesizer that can read a mail aloud with prosody and voice quality suited to its contents by selecting the reading, prosody, and voice quality suited to the mail, using the sender, destination, copy-destination, and similar information attached to the mail together with a database in the device.

[0009]

[Means for solving the problems] To this end, the speech synthesizer of the first invention comprises means for registering words used in text analysis of received mail, means for storing a database of mail addresses, and means for adding, modifying, and deleting the contents of the mail address database, and further comprises: means for grouping the mail addresses in the database and storing the groups as hierarchical data in which the groups are arranged in a hierarchy; means for attaching a node name of the hierarchical data to a word and registering the word in a user word dictionary; and means for creating a list of node names of the hierarchical data from the sender, destination, and copy-destination information of a mail. When the user word dictionary is searched, the members of the created list are collated with the node names attached to the words in the user word dictionary, and text analysis is performed using only those words whose node name matches one of the members of the list.

[0010] The speech synthesizer of the second invention comprises means for controlling the prosody or voice quality used when reading received mail aloud, means for storing a database of mail addresses, and means for adding, modifying, and deleting the contents of the mail address database, and further comprises: means for grouping the mail addresses in the database and storing the groups as hierarchical data in which the groups are arranged in a hierarchy; and means for selecting a node of the hierarchical data based on the sender, destination, and copy-destination information of a mail. When a mail is read aloud, it is read with the prosody and voice quality associated with the selected node.

[0011] Furthermore, the speech synthesizer of the third invention comprises means for distinguishing the readings of heteronyms using a decision list, means for storing a database of mail addresses, and means for adding, modifying, and deleting the contents of the mail address database, and further comprises: means for grouping the mail addresses in the database and storing the groups as hierarchical data in which the groups are arranged in a hierarchy; and means for selecting a node name of the hierarchical data from the sender, destination, and copy-destination information of a mail. The readings of words are distinguished using a decision list that includes rules whose evidence is an attribute set on the selected node.
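As a rough illustration of the decision-list idea in the third invention, the sketch below tries rules in order and lets the first rule whose evidence holds decide the reading. The attribute name `field`, its value, the context-word rule, and the default reading are all hypothetical; the source states only that an attribute set on the selected node serves as evidence for some rules.

```python
# Minimal decision-list sketch. All rule contents are illustrative assumptions.

def choose_reading(rules, context):
    """Return the reading of the first rule whose evidence test passes."""
    for evidence, reading in rules:
        if evidence(context):
            return reading
    return None  # no rule applied


# Hypothetical decision list for the heteronym 市場.
RULES_FOR_ICHIBA = [
    # Evidence taken from an attribute of the selected hierarchy node (assumed name "field").
    (lambda c: c["node_attrs"].get("field") == "finance", "シジョウ"),
    # Evidence taken from neighbouring words in the sentence.
    (lambda c: "魚" in c["neighbors"], "イ↓チバ"),
    # Default rule, always applicable.
    (lambda c: True, "シジョウ"),
]
```

Because the node-attribute rule is listed first in this toy list, it overrides the neighbouring-word rule whenever both apply; how rules are actually ordered is not specified in this section.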

[0012]

[Embodiments of the invention] Embodiments of the present invention (hereinafter, embodiments) are described in detail below with reference to the drawings. <First Embodiment> <Configuration> FIG. 1 shows the configuration of the first embodiment. Reference numeral 101 denotes a user interface for operating the speech synthesizer. 102 denotes a text-to-speech conversion unit that reads aloud the contents of mail received by the mail transmission/reception unit 112 in the mail management unit 111. 103 denotes a text input unit that reads the input text from a received mail, passes the sender, destination, and copy-destination information attached to the mail to the text-to-speech conversion information setting unit, and passes the two kinds of information returned by that unit to the subsequent processing units. 104 denotes a text analysis unit that generates an intermediate language (reading, accent position, phrase onset position, pause position) from the input text. 105 denotes a system word dictionary that stores the word information used in text analysis (written form, reading, part of speech, accent type, accent combination type, and so on), and 106 denotes a user word dictionary with mail addresses that stores word information added by the user. 107 denotes a synthesis parameter generation unit that generates synthesis parameters (synthesis unit, duration, pitch, pause, amplitude) from the output of text analysis, and 108 denotes a prediction table storage unit that stores the prediction tables used to determine the synthesis parameters. The prediction table storage unit contains duration prediction tables, pitch prediction tables, pause prediction tables, and amplitude prediction tables. To support prosody selection, multiple tables of each type are stored. 109 denotes a speech synthesis unit that generates a waveform from the synthesis parameters, and 110 denotes a speech unit dictionary that stores the speech unit sets used in speech synthesis. To support voice quality selection, multiple unit sets are stored in the speech unit dictionary.

[0013] Reference numeral 111 denotes a mail management unit that sends, receives, and stores mail; 112 denotes a mail transmission/reception unit that sends and receives mail; and 113 denotes a mail storage unit that stores received mail.

[0014] Reference numeral 114 denotes a database management unit that manages the databases related to mail addresses, and 115 denotes a mail address management unit that manages the mail address database 116 and the mail address hierarchy database 117. 116 denotes a mail address database that stores information on mail addresses and groups, and 117 denotes a mail address hierarchy database that stores the hierarchical relationships among mail addresses and groups. 118 denotes a user word dictionary management unit that adds, deletes, and modifies words in the user word dictionary with mail addresses. 119 denotes a prosody/voice quality setting database management unit that adds, deletes, and modifies data in the prosody/voice quality setting database 120. 120 denotes a prosody/voice quality setting database that associates each mail address or group stored in the mail address database 116 with a set of prosody attributes and a set of voice quality attributes. 121 denotes a prosody/prediction table correspondence database that associates the sets of prosody attributes used in the prosody/voice quality setting database 120 with the prediction tables stored in the prediction table storage unit 108. 122 denotes a voice quality/unit set correspondence database that associates the sets of voice quality attributes used in the prosody/voice quality setting database 120 with the unit sets in the speech unit dictionary 110.

[0015] Reference numeral 123 denotes a text-to-speech conversion information setting unit that determines two pieces of information, the "user dictionary search mail address list" and the "prosody/voice quality setting mail address", from the sender, destination, and copy-destination information attached to a mail. The "sender", "destination", and "copy destination" of a mail are defined as follows. The sender is the mail address of the person who sent the mail. The destination is a mail address that the sender specifies as a recipient of the mail; more than one may be specified. The copy destination is a mail address that the sender specifies in addition to the destination for the purpose of sending a copy; more than one may be specified. It is assumed that the destination cannot be omitted with only copy destinations specified. The sender, destination, and copy destination of a received mail are assumed to be the same as those set when the mail was sent. Consequently, for mail that arrives via a mailing list, the user's own mail address may appear in neither the destination nor the copy destination.
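The three fields defined above correspond to the From, To, and Cc headers of an ordinary e-mail message. As a minimal sketch, and assuming the mail is available as raw RFC 5322 text (the source does not say how the device stores it), Python's standard `email` package can extract them. The sample message and the function name `header_addresses` are hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr

# Hypothetical sample message used only for illustration.
RAW_MAIL = """From: Suzuki <suzuki@oki.co.jp>
To: me@example.com, tanaka@example.com
Cc: sato@example.com
Subject: sample

body text
"""


def header_addresses(raw):
    """Return (sender, destinations, copy_destinations) as bare addresses."""
    msg = message_from_string(raw)
    sender = parseaddr(msg.get("From", ""))[1]
    # To and Cc may each hold several addresses.
    destinations = [addr for _, addr in getaddresses(msg.get_all("To", []))]
    copies = [addr for _, addr in getaddresses(msg.get_all("Cc", []))]
    return sender, destinations, copies
```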

[0016] The "user dictionary search mail address list" is information used to pick out, from the words in the user dictionary, only those commonly related to the sender, destination, and copy destination of the mail; it is a list of mail addresses or group names. Each member of the list is given a priority as described below, and the list is sorted in descending order of priority. The priority is 0 for a mail address and -1 for a terminal group; for a non-terminal group, it is the negated minimum number of links traversed when following the tree structure downward until any terminal group in the list is reached. The only exception is when the list contains nothing but the common group: no priority can then be computed, so 0 is set as the priority.
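The priority rule can be sketched as follows. This is a literal reading of the paragraph above, not a confirmed implementation: a non-terminal group gets minus the smallest number of downward links to a terminal group that is itself in the list. The toy hierarchy, the group names other than "common", and the use of "@" to recognise a mail address are assumptions made for the example.

```python
from collections import deque

# Hypothetical hierarchy: group -> groups directly below it.
CHILDREN = {
    "common": ["work", "private"],
    "work": ["speech", "sales"],
    "private": ["friends"],
}
TERMINAL = {"speech", "sales", "friends"}  # groups that hold mail addresses


def priority(member, members):
    """Priority of one list member under the literal reading of the rule."""
    if "@" in member:            # assumption: anything containing "@" is a mail address
        return 0
    if member in TERMINAL:
        return -1
    # Breadth-first search downward finds the nearest listed terminal group first.
    queue = deque([(member, 0)])
    while queue:
        node, links = queue.popleft()
        if node in TERMINAL and node in members:
            return -links
        queue.extend((child, links + 1) for child in CHILDREN.get(node, []))
    return 0                     # e.g. the list holds only the common group


def sort_members(members):
    """Sort in descending order of priority; ties keep their input order."""
    return sorted(members, key=lambda m: priority(m, members), reverse=True)
```

Under this literal reading, a non-terminal group one link above a listed terminal group receives the same value (-1) as the terminal group itself. The paragraph does not say how such ties are ordered, so the sketch simply relies on Python's stable sort.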

[0017] When the user word dictionary with mail addresses 106 is searched, a word is used for analysis only if the mail address or group name attached to it matches a member of this list (no such restriction applies to words in the system dictionary). The "prosody/voice quality setting mail address" is information for selecting the prosody and voice quality used when reading a mail aloud; it is a mail address or a group name. For example, a mail whose destination includes the user is read with the prosody and voice quality set for the sender of the mail. For a mail sent to the user as a copy, the receiving user is a third party, so the mail is read with the prosody and voice quality set for the group to which the sender, copy destinations, and destinations all belong. A mail that arrived via a mailing list is read with the prosody and voice quality set for the mailing list. The prosody/voice quality setting mail address is used by the synthesis parameter generation unit 107 to select the prosody and by the speech synthesis unit 109 to select the voice quality.
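The dictionary restriction described at the start of this paragraph amounts to a membership test. The sketch below is only an illustration: the `UserWord` record layout, the group names, and the sample entries are hypothetical, since the real dictionary fields appear only in Table 1.

```python
from dataclasses import dataclass


@dataclass
class UserWord:
    surface: str   # written form
    reading: str
    node: str      # mail address or group name attached when the word was registered


def usable_words(user_dictionary, search_list):
    """Keep only user-dictionary words whose node is a member of the search list."""
    allowed = set(search_list)
    return [word for word in user_dictionary if word.node in allowed]


# Hypothetical entries: the same written form registered with different readings.
USER_DICTIONARY = [
    UserWord("市場", "シジョウ", "work"),
    UserWord("市場", "イチバ", "friends"),
]
```

With a search list such as `["suzuki@oki.co.jp", "speech", "work", "common"]`, only the entry attached to "work" survives, so the analyser never sees the reading registered for "friends".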

[0018]

[Table 1] Table 1 shows part of the contents of the user word dictionary with mail addresses, which stores word information added by the user. In addition to information such as the written form, part of speech, reading, and accent, the user attaches a mail address or a group name to each word.

[0019]

[Table 2] Table 2 shows part of the contents of the mail address database. The user stores information such as the mail address or group name, type, name, telephone number, and affiliation. The name, telephone number, and affiliation are used by the user to set and identify destinations when sending mail.

[0020]

[Table 3] Table 3 shows part of the contents of the mail address hierarchy database, which stores the hierarchical relationships among mail addresses and groups. If a mail address belongs to multiple groups, it is registered as multiple records. Because suzuki@oki.co.jp belongs to both the in-house group and the speech group, it has two records.

[0021]

[Table 4] Table 4 shows part of the contents of the prosody/voice quality setting database, which sets the prosody and voice quality for the mail addresses or groups stored in the mail address database 116. For each mail address or group, the prosody is specified by a set of attributes prepared in advance, such as "speed" and "tone" (a set may consist of a single attribute). The voice quality is likewise specified by a set of attributes prepared in advance (again, possibly a single attribute). In this way, the attributes used to associate prediction tables and unit sets are set for each mail address and group. The user can select and set only attributes that have been prepared in advance, and only values prepared in advance can be used as attribute values.

[0022]

[Table 5] Table 5 shows part of the contents of the prosody/prediction table correspondence database, which associates the sets of prosody attributes used in the prosody/voice quality setting database with the prediction tables stored in the prediction table storage unit 108. There are four kinds of prediction tables: duration, pitch, pause, and amplitude. The strings "Duration" + number, "Pitch" + number, "Pause" + number, and "Power" + number are identifiers of prediction tables.

[0023]

[Table 6] Table 6 shows part of the contents of the voice quality/unit set correspondence database, which associates the sets of voice quality attributes used in the prosody/voice quality setting database with the unit sets in the speech unit dictionary 110. The strings "Male" + number and "Female" + number are identifiers of unit sets. A unit set is associated with each pattern of voice quality attribute settings.

[0024] <Operation> This embodiment uses the data shown in FIG. 2, called the mail address hierarchy database. In this database, "group", "non-terminal group", "terminal group", and "common group" satisfy the following conditions.

[0025]
(1) A "group" is classified as either a "non-terminal group" or a "terminal group".
(2) A "non-terminal group" may have groups below it, but may not have mail addresses below it.
(3) A "terminal group" may have mail addresses below it, but may not have groups below it.
(4) The root of the hierarchy is always a non-terminal group called the "common group".
(5) Every non-terminal group other than the "common group" comes under exactly one non-terminal group. However, loops in which a group comes under itself or under a non-terminal group below itself are not allowed.
(6) A mail address always comes under a terminal group. A mail address may come under more than one terminal group.

[0026] In the following description, even where not explicitly stated, the device operates so as to maintain data consistency among the databases. Because the user can add, modify, and delete mail addresses and groups, any operation that would break consistency is always checked for and rejected, so that the data in the user word dictionary with mail addresses 106, the mail address database 116, the mail address hierarchy database 117, and the prosody/voice quality setting database 120 remain consistent.

[0027] The user adds, deletes, and changes the contents of the mail address database 116 and the mail address hierarchy database 117 by invoking the mail address management unit 115 through the user interface 101. By invoking the user word dictionary management unit 118, the user adds, deletes, and changes words in the user word dictionary with mail addresses 106. By invoking the prosody/voice quality setting database management unit 119, the user adds, deletes, and changes the contents of the prosody/voice quality setting database 120.

[0028] The user sends and receives mail through the mail transmission/reception unit 112. Received mail, copies of sent mail, and the like are saved in the mail storage unit 113. To convert a mail to speech, the user specifies, through the user interface 101, an e-mail stored in the mail storage unit 113 and invokes the text-to-speech conversion unit 102.

[0029] In the text-to-speech conversion unit 102, the text input unit 103 first reads the mail from the mail storage unit 113. Next, it invokes the text-to-speech conversion information setting unit 123 to obtain, from the destination, sender, and copy destination attached to the mail, the user dictionary search mail address list and the prosody/voice quality setting mail address, and passes them to the subsequent processing units. After the text-to-speech conversion information setting unit 123 finishes, the text analysis unit 104 divides the text of the mail into words, determines the readings, accent positions, intonation onset positions, and pause positions, and generates the intermediate language.

[0030] The synthesis parameter generation unit 107 determines the parameters for phoneme duration, pitch, pause length, and amplitude for the intermediate language output by the text analysis unit. The prediction tables associated with the prosody/voice quality setting mail address determined by the text-to-speech conversion information setting unit 123 are obtained by referring to the prosody/voice quality setting database 120 and the prosody/prediction table correspondence database 121. The prosody is then generated using these prediction tables.

[0031] The speech synthesis unit 109 synthesizes speech from the intermediate language output by the text analysis unit and the parameters output by the synthesis parameter generation unit. The voice quality associated with the prosody/voice quality setting mail address determined by the text-to-speech conversion information setting unit 123 is found by referring to the prosody/voice quality setting database 120 and the voice quality/unit set correspondence database 122. Speech is then synthesized using the corresponding unit set in the speech unit dictionary 110.
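The lookups in paragraphs [0030] and [0031] are both two-step table references: from the setting address to attribute sets (database 120), then from attribute sets to prediction tables (database 121) or a unit set (database 122). A minimal sketch follows. Only the identifier patterns ("Duration" + number, "Male" + number, and so on) come from the text; the attribute values, the rows, and the function name `resolve` are hypothetical.

```python
# Database 120 (hypothetical rows): address or group -> attribute sets.
SETTINGS_120 = {
    "suzuki@oki.co.jp": {"prosody": ("fast", "polite"), "voice": ("male", "young")},
    "speech": {"prosody": ("normal", "casual"), "voice": ("female", "young")},
}

# Database 121 (hypothetical rows): prosody attribute set -> prediction table identifiers.
PROSODY_TABLES_121 = {
    ("fast", "polite"): {
        "duration": "Duration2", "pitch": "Pitch1", "pause": "Pause2", "power": "Power1",
    },
    ("normal", "casual"): {
        "duration": "Duration1", "pitch": "Pitch2", "pause": "Pause1", "power": "Power1",
    },
}

# Database 122 (hypothetical rows): voice quality attribute set -> unit set identifier.
UNIT_SETS_122 = {
    ("male", "young"): "Male1",
    ("female", "young"): "Female1",
}


def resolve(setting_address):
    """Return (prediction table identifiers, unit set identifier) for one address or group."""
    setting = SETTINGS_120[setting_address]
    tables = PROSODY_TABLES_121[setting["prosody"]]
    unit_set = UNIT_SETS_122[setting["voice"]]
    return tables, unit_set
```

Keeping the attribute sets as an intermediate key means a user only ever chooses attributes, while the mapping to concrete tables and unit sets can change without touching per-address settings.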

[0032] FIGS. 3, 4, 5, 6, and 7 are flowcharts of the text-to-speech conversion information setting process, in which the text-to-speech conversion information setting unit 123 sets the user dictionary search mail address list and the prosody/voice quality setting mail address.

[0033] FIG. 3 is the main routine of the text-to-speech conversion information setting process. In processes 301 and 302, the sender, destination, and copy destination attached to the mail are examined to classify the mail as mail addressed to the user, mail sent to the user as a copy, or mail that arrived via a mailing list. Depending on the classification, the subroutine of process 303, process 304, or process 305 is executed. In process 306, the group names of all groups that can be reached by following links upward are added to the user dictionary search mail address list determined in the subroutine. If the list is empty, the "common group" is added unconditionally. In process 307, a subroutine is executed that assigns priorities to the members of the user word dictionary search mail address list and sorts them in descending order of priority.
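Two parts of this main routine can be sketched without knowing the subroutine details: the three-way classification and the upward expansion of process 306. The exact tests made in processes 301 and 302 are defined in FIG. 3 and are not reproduced in the text, so the membership checks in `classify` are an assumption. The hierarchy records and the names `classify` and `add_ancestors` are likewise hypothetical.

```python
def classify(own_address, destinations, copy_destinations):
    """Assumed classification: look for the user's own address in To, then in Cc."""
    if own_address in destinations:
        return "addressed_to_user"
    if own_address in copy_destinations:
        return "sent_as_copy"
    return "via_mailing_list"


def add_ancestors(members, parents, root="common"):
    """Process 306: add every group reachable by following links upward."""
    result = list(members)
    pending = list(members)
    while pending:
        node = pending.pop()
        for parent in parents.get(node, []):
            if parent not in result:
                result.append(parent)
                pending.append(parent)
    if not result:               # an empty list gets the common group unconditionally
        result.append(root)
    return result


# Hypothetical child -> parents records.
PARENTS = {
    "suzuki@oki.co.jp": ["in-house", "speech"],
    "in-house": ["common"],
    "speech": ["common"],
}
```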

【００３４】 FIG. 4 shows the processing flow of information setting for mail addressed to the user. Processes 401, 402, and 403 determine the mail address list for user word dictionary search, and processes 404, 405, and 406 determine the prosody/voice quality setting mail address.

【００３５】 FIG. 5 shows the subroutine for mail received as a copy. Process 501 determines the mail address list for user word dictionary search, and process 502 determines the prosody/voice quality setting mail address.

【００３６】 FIG. 6 shows the processing flow of information setting for mail received via a mailing list. For such mail, the destination attached to the mail is not the user's own address but the address of the mailing list. Processes 601, 602, and 603 determine the mail address list for user word dictionary search, and processes 604, 605, and 606 determine the prosody/voice quality setting mail address.

【００３７】 FIG. 7 shows the flow of the process that sorts the members of the mail address list for user dictionary search. In process 701, the priority of each mail address in the list is set to 0, the priority of each terminal group in the list is set to -1, and the priority of each non-terminal group is set to 1, a value indicating that the priority has not yet been set. Processes 702, 703, 704, 705, and 706 set the priority of each non-terminal group member to the negated distance obtained by tracing nodes downward in the mail address hierarchy database to the nearest mail address. In process 707, the members of the mail address list for user dictionary search are sorted in descending order with the priority as the key. An existing sorting algorithm is used.
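The priority assignment and sort of FIG. 7 can be sketched as follows. The sketch assumes a hypothetical `children` mapping from each group to its direct members and treats any node name containing "@" as a mail address. Instead of the placeholder value 1 used in process 701, it computes every priority directly as the negated breadth-first distance to the nearest mail address, which yields 0 for a mail address and -1 for a terminal group, as in the text.

```python
from collections import deque


def distance_to_nearest_address(node, children):
    """Breadth-first distance from `node` down to the nearest mail address."""
    seen = {node}
    queue = deque([(node, 0)])
    while queue:
        current, distance = queue.popleft()
        if "@" in current:  # assumed test for "this node is a mail address"
            return distance
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append((child, distance + 1))
    raise ValueError(f"no mail address below {node!r}")


def prioritize(members, children):
    """Attach priorities and sort in descending order (process 707)."""
    scored = [(m, -distance_to_nearest_address(m, children)) for m in members]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


CHILDREN = {
    "common group": ["private related group"],
    "private related group": ["climbing group"],
    "climbing group": [
        "noguchi@north.co.jp",
        "kurobe@north.co.jp",
        "shiomi@south.co.jp",
    ],
}
ranked = prioritize(
    ["common group", "climbing group", "private related group"], CHILDREN
)
```

Python's built-in `sorted` stands in for the "existing sorting algorithm" mentioned in the text.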

【００３８】 FIG. 8 shows an example of the addresses attached to a mail (sender, destination, copy destinations) and of the text-to-speech conversion information (the mail address list for user dictionary search and the prosody/voice quality setting mail address) determined by the text-to-speech conversion information setting unit.

【００３９】 The flow of processing that determines the text-to-speech conversion information from the addresses illustrated in FIG. 8 is described below with reference to FIGS. 3, 4, 5, 6, and 7. The processing is assumed to use the databases of Tables 1, 2, 3, and 4.

【００４０】 First, in the main routine of FIG. 3, process 301 determines that the mail is addressed to the user, and process 303 calls subroutine 1. In FIG. 4, process 401 finds that there are copy destinations, so process 403 is executed. Because asahi@iide.co.jp is not registered in the mail address hierarchy database (Table 3), the group to which the largest number of the copy destinations noguchi@north.co.jp, kurobe@north.co.jp, and shiomi@south.co.jp belong is selected. According to the mail address hierarchy database, all three mail addresses belong to the climbing group and none of them belongs to any other group, so the mail address list for user dictionary search becomes {climbing group}.
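The selection in process 403 of the group to which the most copy destinations belong can be sketched as follows. The `groups_of` mapping is a hypothetical stand-in for a lookup in the mail address hierarchy database, and returning every group tied for the highest count is an assumption, since the text does not state how ties are resolved.

```python
from collections import Counter


def majority_groups(addresses, groups_of):
    """Return the group(s) containing the largest number of the addresses."""
    counts = Counter(
        group for address in addresses for group in groups_of.get(address, ())
    )
    if not counts:
        return []
    top = max(counts.values())
    return [group for group, count in counts.items() if count == top]


GROUPS_OF = {
    "noguchi@north.co.jp": ["climbing group"],
    "kurobe@north.co.jp": ["climbing group"],
    "shiomi@south.co.jp": ["climbing group"],
}
selected = majority_groups(
    ["noguchi@north.co.jp", "kurobe@north.co.jp", "shiomi@south.co.jp"],
    GROUPS_OF,
)
```

For the three copy destinations of the example, `selected` contains only the climbing group.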

【００４１】 Next, process 404 is executed. Because the sender asahi@iide.co.jp is not registered in the mail address database (Table 2), process 406 is executed. asahi@iide.co.jp is not registered in the mail address hierarchy database, and the copy destinations noguchi@north.co.jp, kurobe@north.co.jp, and shiomi@south.co.jp all belong to the climbing group and to no other group, so the prosody/voice quality setting mail address becomes "climbing group". Subroutine 1 ends here, and control returns to the main routine of FIG. 3.

【００４２】 In process 306 of FIG. 3, the mail address list for user dictionary search is {climbing group}, so the higher-level groups of the climbing group, the "private related group" and the "common group", are added, and the list becomes {climbing group, private related group, common group}.

【００４３】 Next, process 307 calls subroutine 4. The priority is -1 for the terminal group "climbing group", -2 for the non-terminal group "private related group", and -3 for the "common group", so the mail address list for user dictionary search becomes {climbing group (-1), private related group (-2), common group (-3)}. The number in parentheses after each group name is its priority.

【００４４】 Finally, the mail address list for user dictionary search is {climbing group (-1), private related group (-2), common group (-3)}, and the prosody/voice quality setting mail address is "climbing group". The above processing determines the text-to-speech conversion information of FIG. 8.

【００４５】 FIG. 9 shows an example of an input sentence contained in a mail and the intermediate language generated from it. In creating the intermediate language, the text analysis unit 104 divides the input sentence into words. The candidate words for this division are taken from the system word dictionary 105 and from the user word dictionary with mail addresses 106. When the user word dictionary with mail addresses is searched, a word is taken as a candidate for word division only if the mail address or group name assigned to the word matches one of the members of the mail address list for user dictionary search.
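The restriction on user dictionary lookup described above can be sketched as follows. The `UserWord` record and its field names are hypothetical, and the third dictionary entry (with its "work group" node) is invented purely to show a word being filtered out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserWord:
    surface: str  # notation of the word
    reading: str  # reading in katakana
    node: str     # mail address or group name assigned to the word


def candidate_words(user_dictionary, search_list):
    """Keep only words whose node matches a member of the search list."""
    allowed = set(search_list)
    return [word for word in user_dictionary if word.node in allowed]


USER_DICTIONARY = [
    UserWord("乗越", "ノッコシ", "climbing group"),
    UserWord("乗越", "ノリコシ", "common group"),
    UserWord("市場", "シジョウ", "work group"),  # invented entry for illustration
]
candidates = candidate_words(
    USER_DICTIONARY,
    ["climbing group", "private related group", "common group"],
)
```

Only the two 「乗越」 entries survive the filter, because "work group" is not a member of the search list.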

【００４６】 FIG. 10 shows the processing flow from dictionary lookup to word division. FIG. 11 shows the graph structure generated in process 101. Virtual nodes corresponding to the beginning and the end of the sentence are added to the graph.

【００４７】 In process 102, occurrence probabilities are determined by the method of Reference 3. Reference 3 calculates the occurrence probability from the attributes of a word; in this embodiment, an attribute called "priority" is added. The value of the priority attribute of a word is the priority that the mail address or group name assigned to the word has in the mail address list for user dictionary search. Because the priority of a mail address or group differs from text to text, the "priority" attribute of one and the same word takes different values in different texts. For words taken from the system dictionary, the priority of the "common group" in the mail address list for user dictionary search is used. The common group is always included in the list, so this value can always be determined. The value itself, however, depends on the mail address information. The decision tree used for word division is created in advance: the priority attribute described above is determined for a large amount of mail data that carries mail address information, and a decision tree that includes the priority attribute is built from that data.
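The way a word obtains its "priority" attribute can be sketched as follows. The function name and the `priorities` dictionary are hypothetical; the decision tree of Reference 3 that consumes the attribute is outside the scope of this sketch.

```python
def word_priority(node, priorities, from_system_dictionary=False):
    """Return the priority attribute of a word for the current mail.

    `priorities` maps each member of the mail address list for user
    dictionary search to its priority. System dictionary words carry no
    node of their own and take the priority of the common group.
    """
    if from_system_dictionary:
        return priorities["common group"]
    return priorities[node]


PRIORITIES = {
    "climbing group": -1,
    "private related group": -2,
    "common group": -3,
}
nokkoshi_priority = word_priority("climbing group", PRIORITIES)
system_word_priority = word_priority(None, PRIORITIES, from_system_dictionary=True)
```

Because `PRIORITIES` is rebuilt for every mail, the same word can receive a different priority in a different mail.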

【００４８】 In process 103, the path with the maximum occurrence probability can be selected using the Viterbi algorithm.
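The path selection of process 103 can be sketched with a Viterbi-style dynamic program over the word lattice. The sketch makes a simplifying assumption that is not in the text: each candidate word has a fixed occurrence probability that is independent of its neighbours, so no transition probabilities are used. The probabilities below are invented for illustration only; in the embodiment they would come from the decision tree of process 102.

```python
import math


def best_segmentation(length, candidates):
    """Return the readings on the most probable path from position 0 to `length`.

    `candidates` is a list of (start, end, reading, probability) tuples,
    where start and end are character offsets in the input sentence.
    """
    # best[i] = (log probability, readings) of the best path covering [0, i)
    best = {0: (0.0, [])}
    for end in range(1, length + 1):
        for start, stop, reading, probability in candidates:
            if stop != end or start not in best:
                continue
            score = best[start][0] + math.log(probability)
            if end not in best or score > best[end][0]:
                best[end] = (score, best[start][1] + [reading])
    return best[length][1] if length in best else None


# Lattice for 「乗越が問題だ。」 (7 characters); probabilities are illustrative.
LATTICE = [
    (0, 2, "ノッコシ", 0.6),
    (0, 2, "ノリコシ", 0.4),
    (2, 3, "ガ", 1.0),
    (3, 5, "モンダイ", 1.0),
    (5, 6, "ダ", 1.0),
    (6, 7, "。", 1.0),
]
path = best_segmentation(7, LATTICE)
```

Offsets 0 and `length` play the role of the virtual sentence-initial and sentence-final nodes of FIG. 11.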

【００４９】 In the above processing, the word with the higher priority is not always selected, but when the other attributes are the same, the word with the larger priority attribute value is more likely to be selected.

【００５０】 FIG. 12 illustrates the course of the processing that distinguishes the readings of homographs by word lookup, using the mail address information attached to the mail and the user dictionary. The processing uses the databases of Tables 1, 2, and 3. The input sentence is 「乗越が問題だ。」 ("The 乗越 is the problem.").

【００５１】 From the addresses attached to the mail, the text-to-speech conversion information determination process yields the mail address list for user dictionary search {climbing group (-1), private related group (-2), common group (-3)}.

【００５２】 Meanwhile, the user dictionary with mail addresses of Table 1 contains two words registered as homographs of 「乗越」: 「乗越（ノッコシ）」 (nokkoshi) and 「乗越（ノリコシ）」 (norikoshi). The address of the climbing group is assigned to 「乗越（ノッコシ）」, and the address of the common group is assigned to 「乗越（ノリコシ）」.

【００５３】 In the mail address list for user dictionary search, the priority of the climbing group is -1 and that of the common group is -3. The priority attribute is therefore -1 for 「乗越（ノッコシ）」 and -3 for 「乗越（ノリコシ）」. The occurrence probabilities of the two words are calculated from this attribute and the other attributes. Both are user dictionary words, and in this sentence the difference between a common noun and a sa-hen (verbal) noun does not affect word selection. Given how the decision tree is created, 「乗越（ノッコシ）」, which has the higher priority, is therefore likely to have the higher probability, and in such a case 「乗越（ノッコシ）」 is selected.

【００５４】 Then, after the processing that determines accent positions, phrase onset positions, and pause positions, the intermediate language 「Ｐノッコシガ，モンダイダ。」 is generated. In this sentence there is no word before or after 「乗越」 that gives a clue to its reading, so with attributes other than the priority of this embodiment the reading is unlikely to be distinguished correctly. For example, if the priority attribute of this embodiment is not used and word frequency is used as an attribute, 「乗越（ノリコシ）」, which is more frequent in general text, is likely to be selected.

【００５５】 As described above, the first embodiment provides the following effects.
(1) The range in which words registered by the user are used can be limited, which reduces reading errors caused by unexpected side effects.
(2) Because the field of text in which a registered user word is used is restricted, there is less need to consider adverse effects when registering a word in the user dictionary, which reduces the effort of word registration.
(3) Because the mail is read aloud with a prosody and voice quality that take the sender, destination, and copy destinations into account, the user can grasp the substance of the mail efficiently by listening.

【００５６】 ＜Second Embodiment＞ ＜Configuration＞ In this embodiment, the definitions of "group" and "prosody/voice quality setting mail address" are the same as in Embodiment 1. FIG. 13 shows the configuration of the second embodiment. Reference numeral 201 denotes a user interface for operating the speech synthesizer. 202 denotes a text-to-speech conversion unit that reads out the contents of mail received by the mail transmission/reception unit 214. 203 denotes a text input unit that reads the input text from the received mail, passes the sender, destination, and copy destination information attached to the mail to the text-to-speech conversion information setting unit, and passes the single piece of information returned by that unit to the subsequent processing units.

【００５７】 204 denotes a text analysis unit that generates an intermediate language (reading, accent positions, phrase onset positions, pause positions) from the input text. 205 denotes a system word dictionary that stores the word information used in text analysis (notation, reading, part of speech, accent type, accent combination type, and so on), and 206 denotes a user word dictionary that stores word information added by the user. 207 denotes a reading disambiguation processing unit that distinguishes the readings of homographs, and 208 denotes a disambiguation decision list storage unit that stores the decision lists used for that purpose. A decision list is a list of rules of the form "decide class D given evidence E", arranged in descending order of priority; when the list is applied, the rules are tried in order starting from the one with the highest priority (see Reference 2). In this embodiment, the reading serves as the class of the decision list and the ratio of occurrence frequencies (likelihood ratio) serves as the priority, and the terms "reading" and "likelihood ratio" are used below. A disambiguation decision list is created in advance from a corpus for each individual homograph and stored.

【００５８】 209 denotes a synthesis parameter generation unit that generates synthesis parameters (synthesis unit, duration, pitch, pause, amplitude) from the output of text analysis, and 210 denotes a prediction table storage unit that stores the prediction tables used to determine the synthesis parameters. 211 denotes a speech synthesis unit that generates a waveform from the synthesis parameters, and 212 denotes a speech segment dictionary that stores the speech segment sets used in speech synthesis. The prediction table storage unit 210 stores a duration prediction table, a pitch prediction table, a pause prediction table, and an amplitude prediction table.

【００５９】 213 denotes a mail management unit that sends, receives, and stores mail. 214 denotes a mail transmission/reception unit that sends and receives mail, and 215 denotes a mail storage unit that stores received mail.

【００６０】 216 denotes a database management unit that manages the databases related to mail addresses. 217 denotes a mail address management unit that manages the mail address database and the mail address hierarchy database. 218 denotes an attributed mail address database that stores information about mail addresses and groups, and 219 denotes a mail address hierarchy database that stores the hierarchical relationships among mail addresses and groups. 220 denotes a text-to-speech conversion information setting unit that determines, from the sender, destination, and copy destination information attached to the mail, one piece of information to be used for text-to-speech conversion. The information determined is the same as the "prosody/voice quality setting mail address" of Embodiment 1. Because the determined information is used for a different purpose in Embodiment 2, the "prosody/voice quality setting mail address" is hereinafter called the "reading disambiguation mail address".

【００６１】[0061]

【表７】[Table 7] Table 7 shows part of the contents of the user word dictionary. Unlike in Embodiment 1, the words in the user dictionary carry no mail address or group information.

【００６２】[0062]

【表８】[Table 8] Table 8 shows part of the disambiguation decision list for the homograph 「市場」. Conventional devices use evidence found in the input sentence, such as the "notation of surrounding words" (the notation of content words and some function words within 10 words before and after the target). This embodiment additionally uses attributes from the mail address database as evidence from outside the input sentence. In Table 8, rule 3 is a rule that uses the "business type in the mail address database". In addition, to provide a default value, a rule whose evidence type is "default" is added as the rule with the lowest likelihood. In Table 8, rule 8 is the rule that sets the default value.

【００６３】[0063]

【表９】[Table 9] Table 9 shows the contents of the attributed mail address database. In addition to the mail address or group name and its type, and information such as name, telephone number, and affiliation, the database holds the attributes used by the disambiguation decision lists. Attributes that the user can set easily, such as "business type" and "field", are used. The attribute types and attribute values are prepared in advance, and the user selects from the attribute values displayed by the device.

【００６４】 ＜Operation＞ In the following description, the device operates so as to keep the data consistent across the databases even where this is not stated explicitly. Because the user can add, modify, and delete mail addresses and groups, any operation that would break consistency is always checked for and rejected, so that the data in the attributed mail address database 218 and the mail address hierarchy database 219 remain consistent.

【００６５】 The user adds to, deletes from, and changes the contents of the attributed mail address database 218 and the mail address hierarchy database 219 by calling the mail address management unit 217 through the user interface 201. The user sends and receives mail by calling the mail transmission/reception unit 214 through the user interface 201. Received mail, copies of sent mail, and the like are saved in the mail storage unit 215.

【００６６】 To convert a mail to speech, the user specifies an e-mail stored in the mail storage unit through the user interface 201 and calls the text-to-speech conversion unit 202. In the text-to-speech conversion unit, the text input unit 203 first reads the mail from the mail storage unit 215. It then calls the text-to-speech conversion information setting unit 220, obtains the reading disambiguation mail address from the destination, sender, and copy destinations attached to the mail, and passes it to the subsequent processing units. The reading disambiguation mail address is determined in the same way as the prosody/voice quality setting mail address of Embodiment 1.

【００６７】 After the text-to-speech conversion information setting unit 220 has finished, the text analysis unit 204 takes candidate words for division from the system word dictionary 205 and the user word dictionary 206 and divides the text of the mail into words. It then calls the reading disambiguation processing unit 207 to distinguish the readings of homographs, after which it determines the readings, accent positions, intonation onset positions, and pause positions and generates the intermediate language.

【００６８】 FIG. 14 shows the processing flow of the reading disambiguation processing unit. Reading disambiguation is called after text analysis has divided the input sentence into words. In process 121, the reading disambiguation processing unit looks up the reading disambiguation mail address in the attributed mail address database and retrieves its attributes. In process 122, the scan position is set to the first of the words produced by the text analysis unit. In processes 123 through 126, the words are scanned one at a time toward the end of the sentence; when the scan position moves past the last word of the sentence, the test in process 123 ends the processing. In process 124, the disambiguation decision list storage unit 208 is searched for a decision list for the word being scanned. If a decision list exists, process 125 checks the rules in descending order of likelihood to see whether the evidence type and evidence value of each rule are satisfied, and applies the first rule that is satisfied. Because there is a rule that sets a default, a reading is always determined. In process 126, the scan position moves to the next word, and control returns to process 123.

【００６９】 FIG. 15 shows a concrete example of the reading disambiguation of a homograph. Assume the disambiguation decision list of Table 8 and the attributed mail address database of Table 9, and assume that the input sentence 「どこの市場を調べますか？」 ("Which market will you look into?") has arrived by mail. The text analysis unit divides it into words as shown in the figure. At this point a reading has also been tentatively determined. The text-to-speech conversion information setting unit 220 determines the reading disambiguation mail address to be katoh@aozora-bank.co.jp. The reading disambiguation processing unit scans the sentence one word at a time from the beginning; for n = 1 and n = 2, process 124 finds no decision list, so the scan position simply advances. Consider the state in which process 126 has set n = 3. The third word, 「市場」, has the decision list of Table 8, so the condition of process 124 is satisfied and process 125 is executed.

【００７０】 According to the attributed mail address database of Table 9, the "business type" is finance. The surrounding words are 「どこ」, 「の」, 「を」, 「調べ」, 「ます」, 「か」, and 「？」. The evidence types and values used to search the decision list are listed in FIG. 15. Starting from rule 1, each rule is checked to see whether its evidence type and evidence value are satisfied. The words 「株式」 ("stock") and 「シェア」 ("share") do not occur before or after 「市場」, so rules 1 and 2 are not satisfied and are not applied. For rule 3, there is evidence of type "business type in the attributed mail address database" with the value "finance", so the condition of the rule is satisfied and the rule is applied. The reading of 「市場」 is therefore 「シジョウ」, as given by rule 3. Processes 126, 123, and 124 are then repeated, and when n = 9 there is no corresponding word, so the reading disambiguation process ends.
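The application of the decision list to 「市場」 in this example can be sketched as follows. Table 8 itself is not reproduced in the text, so the rule list below is a partial, hypothetical reconstruction: only the evidence of rules 1 to 3, the reading of rule 3, and the existence of a default rule 8 come from the description, while the readings attached to rules 1, 2, and 8 are assumptions made for illustration. The tuple layout and the function name are likewise hypothetical.

```python
def choose_reading(rules, context_words, address_attributes):
    """Return (rule number, reading) of the first rule whose evidence holds.

    `rules` must already be sorted in descending order of likelihood ratio.
    """
    for number, kind, value, reading in rules:
        if kind == "surrounding_word" and value in context_words:
            return number, reading
        if kind == "business_type" and address_attributes.get("business_type") == value:
            return number, reading
        if kind == "default":
            return number, reading
    raise ValueError("decision list has no default rule")


RULES_MARKET = [
    (1, "surrounding_word", "株式", "シジョウ"),  # reading assumed
    (2, "surrounding_word", "シェア", "シジョウ"),  # reading assumed
    (3, "business_type", "金融", "シジョウ"),  # reading stated in the text
    (8, "default", None, "イチバ"),  # reading assumed
]

words = ["どこ", "の", "市場", "を", "調べ", "ます", "か", "？"]
context = [w for i, w in enumerate(words) if i != 2]  # all words except 市場
result = choose_reading(RULES_MARKET, context, {"business_type": "金融"})
```

Rules 1 and 2 fail because neither 「株式」 nor 「シェア」 is among the surrounding words, so rule 3 supplies the reading. Without the business type attribute, the same call would fall through to the default rule.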

【００７１】 After the reading disambiguation unit 207 has finished, the text analysis unit 204 determines the readings, accent positions, intonation onset positions, and pause positions and generates the intermediate language. The synthesis parameter generation unit 209 determines the parameters for phoneme duration, pitch, pause length, and amplitude for the intermediate language output by the text analysis unit. The speech synthesis unit 211 synthesizes speech from the intermediate language output by the text analysis unit and the parameters output by the synthesis parameter generation unit.

【００７２】 As described above, according to this embodiment, attributes obtained from the mail address are used for reading disambiguation, so readings can be distinguished even when the text itself contains no clue.

【００７３】 The present invention is not limited to the embodiments described above. Embodiments 1 and 2 were applied to reading out e-mail, but they can also be used for reading out web pages. For a web page, the address of the page is used as the sender, there is no copy destination, and the user's mail address is used as the destination.

【００７４】[0074]

【発明の効果】[Effects of the Invention] As described in detail above, the speech synthesizer of the first invention comprises means for registering words used in the text analysis of received mail, means for storing a mail address database, and means for adding to, modifying, and deleting the contents of the mail address database. It further comprises means for grouping the mail addresses in the database and storing the groups as hierarchical data, means for adding a node name of the hierarchical data to a word and registering the word in a user word dictionary, and means for creating a list of node names of the hierarchical data from the sender, destination, and copy destination information of a mail. When the user word dictionary is searched, the members of the created list are collated with the node names added to the words in the user word dictionary, and text analysis is performed using only the words whose node name matches one of the members of the list. This makes it possible to limit the range in which words registered by the user are used, which reduces reading errors caused by unexpected side effects. In addition, because the field of text in which a registered user word is used is restricted, there is less need to consider adverse effects when registering a word in the user dictionary, which reduces the effort of word registration.

【００７５】また、第２発明の音声合成装置において
は、受信メールを読み上げる際の韻律又は声質を制御す
る手段と、メールアドレスのデータベースを格納する手
段と、メールアドレスのデータベースの内容を追加・修
正・削除する手段とを備えた音声合成装置において、前
記データベースのメールアドレスをグルーピングすると
共に、各グループを階層化した階層データとして格納す
る手段と、メールの発信人、宛先、複写送付先の情報に
基づいて、前記階層データのノードを選択する手段と、
を備え、メールを読み上げる際に、前記選択されたノー
ドに対応付けられている韻律及び声質でメールを読み上
げる構成としたので、メールの発信人、宛先、複写送付
先を考慮した韻律・声質で読み上げるため、メールの要
件を効率的に聴取することができる。Further, the speech synthesizer of the second invention comprises means for controlling the prosody or voice quality used when reading out a received mail, means for storing a mail address database, and means for adding, modifying, and deleting the contents of the mail address database. It further comprises means for grouping the mail addresses in the database and storing the groups as hierarchical data in which each group is arranged in a hierarchy, and means for selecting a node of the hierarchical data based on the sender, destination, and copy destination information of a mail. When a mail is read out, it is read with the prosody and voice quality associated with the selected node. Because the prosody and voice quality reflect the sender, destination, and copy destination of the mail, the listener can grasp the substance of the mail efficiently.
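As a rough illustration of the second invention, the following Python sketch selects a node from the mail header and looks up the prosody and voice quality associated with it. All names and values are hypothetical. Two behaviors are assumptions of this sketch rather than statements from the text: the sender is consulted before the destination and copy destination, and a node without its own setting inherits the setting of its nearest ancestor.

```python
from typing import NamedTuple

class VoiceSetting(NamedTuple):
    prosody: str
    voice_quality: str

# Hypothetical hierarchy: node -> parent node (None marks the root).
PARENT = {"All": None, "Work": "All", "Sales": "Work", "Friends": "All"}
# Hypothetical grouping of mail addresses into nodes.
ADDRESS_NODE = {"tanaka@example.com": "Sales", "yamada@example.com": "Friends"}
# Hypothetical settings associated with some of the nodes.
NODE_SETTINGS = {
    "All": VoiceSetting("neutral", "voice_default"),
    "Work": VoiceSetting("formal", "voice_a"),
    "Friends": VoiceSetting("casual", "voice_b"),
}

def select_node(sender, to=(), cc=()):
    """Pick the node of the first known address (assumed order: sender, to, cc)."""
    for addr in (sender, *to, *cc):
        node = ADDRESS_NODE.get(addr)
        if node is not None:
            return node
    return "All"  # unknown correspondents fall back to the root node

def setting_for(sender, to=(), cc=()):
    """Return the setting of the selected node or of its nearest ancestor."""
    node = select_node(sender, to, cc)
    while node is not None:
        if node in NODE_SETTINGS:
            return NODE_SETTINGS[node]
        node = PARENT[node]
    raise LookupError("no prosody/voice quality setting found")

print(setting_for("tanaka@example.com"))
```

Here the "Sales" node has no setting of its own, so a mail from that group is read with the setting of "Work".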

【００７６】更に、第３発明の音声合成装置において
は、決定リストを用いて同形異音語を読み分ける手段
と、メールアドレスのデータベースを格納する手段と、
メールアドレスのデータベースの内容を追加・修正・削
除する手段とを備えた音声合成装置において、前記デー
タベースのメールアドレスをグルーピングすると共に、
各グループを階層化した階層データとして格納する手段
と、メールの発信人・宛先・複写送付先の情報から前記
階層データのノード名を選択する手段とを備え、前記選
択されたノードに設定されている属性を証拠とした規則
を含む決定リストを用いて単語の読み分けを行う構成と
したので、文章内に手掛かりがない場合でも、読み分け
が可能になる。Furthermore, the speech synthesizer of the third invention comprises means for distinguishing the readings of homographs (words written identically but read differently) using a decision list, means for storing a mail address database, and means for adding, modifying, and deleting the contents of the mail address database. It further comprises means for grouping the mail addresses in the database and storing the groups as hierarchical data in which each group is arranged in a hierarchy, and means for selecting a node name of the hierarchical data from the sender, destination, and copy destination information of a mail. Word readings are distinguished using a decision list that includes rules whose evidence is an attribute set on the selected node, so the correct reading can be chosen even when the text itself contains no clue.
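The role of node attributes as evidence can be sketched in Python as follows. A decision list is an ordered list of rules in which the first rule whose evidence holds decides the reading. The rule scores, context words, and the `field` attribute are all hypothetical values chosen for illustration; only the two readings of 「市場」 come from the description of the prior art.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    score: float                   # strength of the evidence (made-up values)
    holds: Callable[[dict], bool]  # test of the evidence against the context
    reading: str                   # reading chosen when the evidence holds

def context_has(word):
    """Evidence taken from the text: a given word appears near the homograph."""
    return lambda ctx: word in ctx["words"]

def node_attr(key, value):
    """Evidence taken from the selected node: an attribute has a given value."""
    return lambda ctx: ctx["node_attrs"].get(key) == value

# Rules for the homograph 「市場」, sorted from strongest to weakest evidence.
RULES = sorted(
    [
        Rule(3.2, context_has("株式"), "シジョウ"),
        Rule(2.9, context_has("魚"), "イチバ"),
        Rule(1.5, node_attr("field", "finance"), "シジョウ"),
        Rule(1.4, node_attr("field", "food"), "イチバ"),
        Rule(0.0, lambda ctx: True, "シジョウ"),  # default rule, always holds
    ],
    key=lambda r: r.score,
    reverse=True,
)

def choose_reading(ctx):
    """Apply the first rule in the list whose evidence holds."""
    for rule in RULES:
        if rule.holds(ctx):
            return rule.reading

# No clue in the sentence: the attribute of the selected node decides.
print(choose_reading({"words": [], "node_attrs": {"field": "food"}}))  # イチバ
```

When the sentence does contain a clue, the stronger textual rule fires first, so the node attribute only matters in the case the paragraph above describes, where the text gives no evidence.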

[Brief description of the drawings]

【図１】第１の実施形態の構成図である。FIG. 1 is a configuration diagram of a first embodiment.

【図２】メールアドレス階層データベースの一例を示す
図である。FIG. 2 is a diagram showing an example of a mail address hierarchy database.

【図３】テキスト音声変換用情報設定処理のメインルー
チンである。FIG. 3 shows the main routine of the text-to-speech conversion information setting process.

【図４】テキスト音声変換用情報設定処理のサブルーチ
ン１である。FIG. 4 shows subroutine 1 of the text-to-speech conversion information setting process.

【図５】テキスト音声変換用情報設定処理のサブルーチ
ン２である。FIG. 5 shows subroutine 2 of the text-to-speech conversion information setting process.

【図６】テキスト音声変換用情報設定処理のサブルーチ
ン３である。FIG. 6 shows subroutine 3 of the text-to-speech conversion information setting process.

【図７】テキスト音声変換用情報設定処理のサブルーチ
ン４である。FIG. 7 shows subroutine 4 of the text-to-speech conversion information setting process.

【図８】メールアドレスに付与されるアドレス及びテキ
スト音声変換用情報の一例を示す図である。FIG. 8 is a diagram showing an example of an address and text-to-speech conversion information given to a mail address.

【図９】入力文章と中間言語の一例を示す図である。FIG. 9 is a diagram illustrating an example of an input sentence and an intermediate language.

【図１０】辞書引きから単語分割までの処理フローであ
る。FIG. 10 is a processing flow from dictionary lookup to word division.

【図１１】図１０の処理１０１で生成するグラフ構造を
示す図である。FIG. 11 is a diagram showing the graph structure generated in process 101 of FIG. 10.

【図１２】メールアドレスとユーザ単語辞書を用いた読
み分け処理の経過説明図である。FIG. 12 is an explanatory diagram showing the progress of reading disambiguation using a mail address and the user word dictionary.

【図１３】第２の実施形態の構成図である。FIG. 13 is a configuration diagram of a second embodiment.

【図１４】読み分け処理の処理フローである。FIG. 14 is a processing flow of the reading disambiguation process.

【図１５】同形異音語「市場」の読み分け処理の説明図
である。FIG. 15 is an explanatory diagram of the reading disambiguation process for the homograph 「市場」 ("market").

[Explanation of symbols]

１０１，２０１ユーザインターフェース１０２，２０２テキスト音声変換部１０３，２０３テキスト入力部１０４，２０４テキスト解析部１０５，２０５システム単語辞書１０６，２０６メールアドレス付きユーザ単語辞書１０７，２０９合成パラメータ生成部１０８，２１０予測テーブル格納部１０９，２１１音声合成部１１０，２１２音声素片辞書１１１，２１３メール管理部１１２，２１４メール送受信部１１３，２１５メール格納部１１４，２１６データベース管理部１１５，２１７メールアドレス管理部１１６，２１８メールアドレスデータベース１１７，２１９メールアドレス階層データベース１１８ユーザ単語辞書管理部１１９韻律・声質設定データベース管理部１２０韻律・声質設定データベース１２１韻律・予測テーブル対応データベース１２２声質・素片セット対応データベース１２３韻律・声質設定データベース２０７読み分け部２０８読み分け用決定リスト格納部２２０テキスト音声変換用情報設定部
101, 201: User interface
102, 202: Text-to-speech conversion unit
103, 203: Text input unit
104, 204: Text analysis unit
105, 205: System word dictionary
106, 206: User word dictionary with mail addresses
107, 209: Synthesis parameter generation unit
108, 210: Prediction table storage unit
109, 211: Speech synthesis unit
110, 212: Speech unit dictionary
111, 213: Mail management unit
112, 214: Mail transmission/reception unit
113, 215: Mail storage unit
114, 216: Database management unit
115, 217: Mail address management unit
116, 218: Mail address database
117, 219: Mail address hierarchy database
118: User word dictionary management unit
119: Prosody/voice quality setting database management unit
120: Prosody/voice quality setting database
121: Prosody/prediction table correspondence database
122: Voice quality/speech unit set correspondence database
123: Prosody/voice quality setting database
207: Reading disambiguation unit
208: Decision list storage unit for reading disambiguation
220: Text-to-speech conversion information setting unit

Claims

[Claims]

1. A speech synthesizer comprising means for registering words used in text analysis of a received mail, means for storing a mail address database, and means for adding, modifying, and deleting the contents of the mail address database, the speech synthesizer further comprising: means for grouping the mail addresses in the database and storing the groups as hierarchical data in which each group is arranged in a hierarchy; means for attaching a node name of the hierarchical data to a word and registering the word in a user word dictionary; and means for creating a list of node names of the hierarchical data from the sender, destination, and copy destination information of a mail, wherein, when the user word dictionary is searched, the members of the created list are compared with the node names attached to the words in the user word dictionary, and text analysis is performed using only those words whose node name matches any member of the list.

2. A speech synthesizer comprising means for controlling the prosody or voice quality used when reading out a received mail, means for storing a mail address database, and means for adding, modifying, and deleting the contents of the mail address database, the speech synthesizer further comprising: means for grouping the mail addresses in the database and storing the groups as hierarchical data in which each group is arranged in a hierarchy; and means for selecting a node of the hierarchical data based on the sender, destination, and copy destination information of a mail, wherein, when the mail is read out, it is read out with the prosody and voice quality associated with the selected node.

3. A speech synthesizer comprising means for distinguishing the readings of homographs using a decision list, means for storing a mail address database, and means for adding, modifying, and deleting the contents of the mail address database, the speech synthesizer further comprising: means for grouping the mail addresses in the database and storing the groups as hierarchical data in which each group is arranged in a hierarchy; and means for selecting a node name of the hierarchical data from the sender, destination, and copy destination information of a mail, wherein word readings are distinguished using a decision list that includes a rule whose evidence is an attribute set on the selected node.